WorldWideScience

Sample records for functional genomics platform1cwoa

  1. Genomics Portals: integrative web-platform for mining genomics data.

    Science.gov (United States)

    Shinde, Kaustubh; Phatak, Mukta; Johannes, Freudenberg M; Chen, Jing; Li, Qian; Vineet, Joshi K; Hu, Zhen; Ghosh, Krishnendu; Meller, Jaroslaw; Medvedovic, Mario

    2010-01-13

    A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc), and the integration with an extensive knowledge base that can be used in such analysis. The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org.

  2. Genomics Portals: integrative web-platform for mining genomics data

    Directory of Open Access Journals (Sweden)

    Ghosh Krishnendu

    2010-01-01

    Full Text Available Abstract Background A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Results Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc, and the integration with an extensive knowledge base that can be used in such analysis. Conclusion The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org.

  3. Towards a TILLING platform for functional genomics in Piel de Sapo melons

    Directory of Open Access Journals (Sweden)

    Pujol Marta

    2011-08-01

    Full Text Available Abstract Background The availability of genetic and genomic resources for melon has increased significantly, but functional genomics resources are still limited for this crop. TILLING is a powerful reverse genetics approach that can be utilized to generate novel mutations in candidate genes. A TILLING resource is available for cantalupensis melons, but not for inodorus melons, the other main commercial group. Results A new ethyl methanesulfonate-mutagenized (EMS melon population was generated for the first time in an andromonoecious non-climacteric inodorus Piel de Sapo genetic background. Diverse mutant phenotypes in seedlings, vines and fruits were observed, some of which were of possible commercial interest. The population was first screened for mutations in three target genes involved in disease resistance and fruit quality (Cm-PDS, Cm-eIF4E and Cm-eIFI(iso4E. The same genes were also tilled in the available monoecious and climacteric cantalupensis EMS melon population. The overall mutation density in this first Piel de Sapo TILLING platform was estimated to be 1 mutation/1.5 Mb by screening four additional genes (Cm-ACO1, Cm-NOR, Cm-DET1 and Cm-DHS. Thirty-three point mutations were found for the seven gene targets, six of which were predicted to have an impact on the function of the protein. The genotype/phenotype correlation was demonstrated for a loss-of-function mutation in the Phytoene desaturase gene, which is involved in carotenoid biosynthesis. Conclusions The TILLING approach was successful at providing new mutations in the genetic background of Piel de Sapo in most of the analyzed genes, even in genes for which natural variation is extremely low. This new resource will facilitate reverse genetics studies in non-climacteric melons, contributing materially to future genomic and breeding studies.

  4. SNUGB: a versatile genome browser supporting comparative and functional fungal genomics

    Directory of Open Access Journals (Sweden)

    Kim Seungill

    2008-12-01

    Full Text Available Abstract Background Since the full genome sequences of Saccharomyces cerevisiae were released in 1996, genome sequences of over 90 fungal species have become publicly available. The heterogeneous formats of genome sequences archived in different sequencing centers hampered the integration of the data for efficient and comprehensive comparative analyses. The Comparative Fungal Genomics Platform (CFGP was developed to archive these data via a single standardized format that can support multifaceted and integrated analyses of the data. To facilitate efficient data visualization and utilization within and across species based on the architecture of CFGP and associated databases, a new genome browser was needed. Results The Seoul National University Genome Browser (SNUGB integrates various types of genomic information derived from 98 fungal/oomycete (137 datasets and 34 plant and animal (38 datasets species, graphically presents germane features and properties of each genome, and supports comparison between genomes. The SNUGB provides three different forms of the data presentation interface, including diagram, table, and text, and six different display options to support visualization and utilization of the stored information. Information for individual species can be quickly accessed via a new tool named the taxonomy browser. In addition, SNUGB offers four useful data annotation/analysis functions, including 'BLAST annotation.' The modular design of SNUGB makes its adoption to support other comparative genomic platforms easy and facilitates continuous expansion. Conclusion The SNUGB serves as a powerful platform supporting comparative and functional genomics within the fungal kingdom and also across other kingdoms. All data and functions are available at the web site http://genomebrowser.snu.ac.kr/.

  5. MicroScope: a platform for microbial genome annotation and comparative genomics.

    Science.gov (United States)

    Vallenet, D; Engelen, S; Mornico, D; Cruveiller, S; Fleury, L; Lajus, A; Rouy, Z; Roche, D; Salvignol, G; Scarpelli, C; Médigue, C

    2009-01-01

    The initial outcome of genome sequencing is the creation of long text strings written in a four letter alphabet. The role of in silico sequence analysis is to assist biologists in the act of associating biological knowledge with these sequences, allowing investigators to make inferences and predictions that can be tested experimentally. A wide variety of software is available to the scientific community, and can be used to identify genomic objects, before predicting their biological functions. However, only a limited number of biologically interesting features can be revealed from an isolated sequence. Comparative genomics tools, on the other hand, by bringing together the information contained in numerous genomes simultaneously, allow annotators to make inferences based on the idea that evolution and natural selection are central to the definition of all biological processes. We have developed the MicroScope platform in order to offer a web-based framework for the systematic and efficient revision of microbial genome annotation and comparative analysis (http://www.genoscope.cns.fr/agc/microscope). Starting with the description of the flow chart of the annotation processes implemented in the MicroScope pipeline, and the development of traditional and novel microbial annotation and comparative analysis tools, this article emphasizes the essential role of expert annotation as a complement of automatic annotation. Several examples illustrate the use of implemented tools for the review and curation of annotations of both new and publicly available microbial genomes within MicroScope's rich integrated genome framework. The platform is used as a viewer in order to browse updated annotation information of available microbial genomes (more than 440 organisms to date), and in the context of new annotation projects (117 bacterial genomes). The human expertise gathered in the MicroScope database (about 280,000 independent annotations) contributes to improve the quality of

  6. Ginseng Genome Database: an open-access platform for genomics of Panax ginseng.

    Science.gov (United States)

    Jayakodi, Murukarthick; Choi, Beom-Soon; Lee, Sang-Choon; Kim, Nam-Hoon; Park, Jee Young; Jang, Woojong; Lakshmanan, Meiyappan; Mohan, Shobhana V G; Lee, Dong-Yup; Yang, Tae-Jin

    2018-04-12

    The ginseng (Panax ginseng C.A. Meyer) is a perennial herbaceous plant that has been used in traditional oriental medicine for thousands of years. Ginsenosides, which have significant pharmacological effects on human health, are the foremost bioactive constituents in this plant. Having realized the importance of this plant to humans, an integrated omics resource becomes indispensable to facilitate genomic research, molecular breeding and pharmacological study of this herb. The first draft genome sequences of P. ginseng cultivar "Chunpoong" were reported recently. Here, using the draft genome, transcriptome, and functional annotation datasets of P. ginseng, we have constructed the Ginseng Genome Database http://ginsengdb.snu.ac.kr /, the first open-access platform to provide comprehensive genomic resources of P. ginseng. The current version of this database provides the most up-to-date draft genome sequence (of approximately 3000 Mbp of scaffold sequences) along with the structural and functional annotations for 59,352 genes and digital expression of genes based on transcriptome data from different tissues, growth stages and treatments. In addition, tools for visualization and the genomic data from various analyses are provided. All data in the database were manually curated and integrated within a user-friendly query page. This database provides valuable resources for a range of research fields related to P. ginseng and other species belonging to the Apiales order as well as for plant research communities in general. Ginseng genome database can be accessed at http://ginsengdb.snu.ac.kr /.

  7. Babelomics: an integrative platform for the analysis of transcriptomics, proteomics and genomic data with advanced functional profiling

    Science.gov (United States)

    Medina, Ignacio; Carbonell, José; Pulido, Luis; Madeira, Sara C.; Goetz, Stefan; Conesa, Ana; Tárraga, Joaquín; Pascual-Montano, Alberto; Nogales-Cadenas, Ruben; Santoyo, Javier; García, Francisco; Marbà, Martina; Montaner, David; Dopazo, Joaquín

    2010-01-01

    Babelomics is a response to the growing necessity of integrating and analyzing different types of genomic data in an environment that allows an easy functional interpretation of the results. Babelomics includes a complete suite of methods for the analysis of gene expression data that include normalization (covering most commercial platforms), pre-processing, differential gene expression (case-controls, multiclass, survival or continuous values), predictors, clustering; large-scale genotyping assays (case controls and TDTs, and allows population stratification analysis and correction). All these genomic data analysis facilities are integrated and connected to multiple options for the functional interpretation of the experiments. Different methods of functional enrichment or gene set enrichment can be used to understand the functional basis of the experiment analyzed. Many sources of biological information, which include functional (GO, KEGG, Biocarta, Reactome, etc.), regulatory (Transfac, Jaspar, ORegAnno, miRNAs, etc.), text-mining or protein–protein interaction modules can be used for this purpose. Finally a tool for the de novo functional annotation of sequences has been included in the system. This provides support for the functional analysis of non-model species. Mirrors of Babelomics or command line execution of their individual components are now possible. Babelomics is available at http://www.babelomics.org. PMID:20478823

  8. FIGENIX: Intelligent automation of genomic annotation: expertise integration in a new software platform

    Directory of Open Access Journals (Sweden)

    Pontarotti Pierre

    2005-08-01

    Full Text Available Abstract Background Two of the main objectives of the genomic and post-genomic era are to structurally and functionally annotate genomes which consists of detecting genes' position and structure, and inferring their function (as well as of other features of genomes. Structural and functional annotation both require the complex chaining of numerous different software, algorithms and methods under the supervision of a biologist. The automation of these pipelines is necessary to manage huge amounts of data released by sequencing projects. Several pipelines already automate some of these complex chaining but still necessitate an important contribution of biologists for supervising and controlling the results at various steps. Results Here we propose an innovative automated platform, FIGENIX, which includes an expert system capable to substitute to human expertise at several key steps. FIGENIX currently automates complex pipelines of structural and functional annotation under the supervision of the expert system (which allows for example to make key decisions, check intermediate results or refine the dataset. The quality of the results produced by FIGENIX is comparable to those obtained by expert biologists with a drastic gain in terms of time costs and avoidance of errors due to the human manipulation of data. Conclusion The core engine and expert system of the FIGENIX platform currently handle complex annotation processes of broad interest for the genomic community. They could be easily adapted to new, or more specialized pipelines, such as for example the annotation of miRNAs, the classification of complex multigenic families, annotation of regulatory elements and other genomic features of interest.

  9. Phytozome Comparative Plant Genomics Portal

    Energy Technology Data Exchange (ETDEWEB)

    Goodstein, David; Batra, Sajeev; Carlson, Joseph; Hayes, Richard; Phillips, Jeremy; Shu, Shengqiang; Schmutz, Jeremy; Rokhsar, Daniel

    2014-09-09

    The Dept. of Energy Joint Genome Institute is a genomics user facility supporting DOE mission science in the areas of Bioenergy, Carbon Cycling, and Biogeochemistry. The Plant Program at the JGI applies genomic, analytical, computational and informatics platforms and methods to: 1. Understand and accelerate the improvement (domestication) of bioenergy crops 2. Characterize and moderate plant response to climate change 3. Use comparative genomics to identify constrained elements and infer gene function 4. Build high quality genomic resource platforms of JGI Plant Flagship genomes for functional and experimental work 5. Expand functional genomic resources for Plant Flagship genomes

  10. A modular open platform for systematic functional studies under physiological conditions

    Science.gov (United States)

    Mulholland, Christopher B.; Smets, Martha; Schmidtmann, Elisabeth; Leidescher, Susanne; Markaki, Yolanda; Hofweber, Mario; Qin, Weihua; Manzo, Massimiliano; Kremmer, Elisabeth; Thanisch, Katharina; Bauer, Christina; Rombaut, Pascaline; Herzog, Franz; Leonhardt, Heinrich; Bultmann, Sebastian

    2015-01-01

    Any profound comprehension of gene function requires detailed information about the subcellular localization, molecular interactions and spatio-temporal dynamics of gene products. We developed a multifunctional integrase (MIN) tag for rapid and versatile genome engineering that serves not only as a genetic entry site for the Bxb1 integrase but also as a novel epitope tag for standardized detection and precipitation. For the systematic study of epigenetic factors, including Dnmt1, Dnmt3a, Dnmt3b, Tet1, Tet2, Tet3 and Uhrf1, we generated MIN-tagged embryonic stem cell lines and created a toolbox of prefabricated modules that can be integrated via Bxb1-mediated recombination. We used these functional modules to study protein interactions and their spatio-temporal dynamics as well as gene expression and specific mutations during cellular differentiation and in response to external stimuli. Our genome engineering strategy provides a versatile open platform for efficient generation of multiple isogenic cell lines to study gene function under physiological conditions. PMID:26007658

  11. The catfish genome database cBARBEL: an informatic platform for genome biology of ictalurid catfish.

    Science.gov (United States)

    Lu, Jianguo; Peatman, Eric; Yang, Qing; Wang, Shaolin; Hu, Zhiliang; Reecy, James; Kucuktas, Huseyin; Liu, Zhanjiang

    2011-01-01

    The catfish genome database, cBARBEL (abbreviated from catfish Breeder And Researcher Bioinformatics Entry Location) is an online open-access database for genome biology of ictalurid catfish (Ictalurus spp.). It serves as a comprehensive, integrative platform for all aspects of catfish genetics, genomics and related data resources. cBARBEL provides BLAST-based, fuzzy and specific search functions, visualization of catfish linkage, physical and integrated maps, a catfish EST contig viewer with SNP information overlay, and GBrowse-based organization of catfish genomic data based on sequence similarity with zebrafish chromosomes. Subsections of the database are tightly related, allowing a user with a sequence or search string of interest to navigate seamlessly from one area to another. As catfish genome sequencing proceeds and ongoing quantitative trait loci (QTL) projects bear fruit, cBARBEL will allow rapid data integration and dissemination within the catfish research community and to interested stakeholders. cBARBEL can be accessed at http://catfishgenome.org.

  12. Creating a platform for collaborative genomic research

    Directory of Open Access Journals (Sweden)

    Mark Smithson

    2017-04-01

    The developed genomics informatics platform provides a step-change in this type of genetic research, accelerating reproducible collaborative research across multiple disparate organisations and data sources, of varying type and complexity.

  13. arrayCGHbase: an analysis platform for comparative genomic hybridization microarrays

    Directory of Open Access Journals (Sweden)

    Moreau Yves

    2005-05-01

    Full Text Available Abstract Background The availability of the human genome sequence as well as the large number of physically accessible oligonucleotides, cDNA, and BAC clones across the entire genome has triggered and accelerated the use of several platforms for analysis of DNA copy number changes, amongst others microarray comparative genomic hybridization (arrayCGH. One of the challenges inherent to this new technology is the management and analysis of large numbers of data points generated in each individual experiment. Results We have developed arrayCGHbase, a comprehensive analysis platform for arrayCGH experiments consisting of a MIAME (Minimal Information About a Microarray Experiment supportive database using MySQL underlying a data mining web tool, to store, analyze, interpret, compare, and visualize arrayCGH results in a uniform and user-friendly format. Following its flexible design, arrayCGHbase is compatible with all existing and forthcoming arrayCGH platforms. Data can be exported in a multitude of formats, including BED files to map copy number information on the genome using the Ensembl or UCSC genome browser. Conclusion ArrayCGHbase is a web based and platform independent arrayCGH data analysis tool, that allows users to access the analysis suite through the internet or a local intranet after installation on a private server. ArrayCGHbase is available at http://medgen.ugent.be/arrayCGHbase/.

  14. Genome Modeling System: A Knowledge Management Platform for Genomics.

    Directory of Open Access Journals (Sweden)

    Malachi Griffith

    2015-07-01

    Full Text Available In this work, we present the Genome Modeling System (GMS, an analysis information management system capable of executing automated genome analysis pipelines at a massive scale. The GMS framework provides detailed tracking of samples and data coupled with reliable and repeatable analysis pipelines. The GMS also serves as a platform for bioinformatics development, allowing a large team to collaborate on data analysis, or an individual researcher to leverage the work of others effectively within its data management system. Rather than separating ad-hoc analysis from rigorous, reproducible pipelines, the GMS promotes systematic integration between the two. As a demonstration of the GMS, we performed an integrated analysis of whole genome, exome and transcriptome sequencing data from a breast cancer cell line (HCC1395 and matched lymphoblastoid line (HCC1395BL. These data are available for users to test the software, complete tutorials and develop novel GMS pipeline configurations. The GMS is available at https://github.com/genome/gms.

  15. PROBING GENOME MAINTENANCE FUNCTIONS OF HUMAN RECQ1

    Directory of Open Access Journals (Sweden)

    Furqan Sami

    2013-03-01

    Full Text Available The RecQ helicases are a highly conserved family of DNA-unwinding enzymes that play key roles in protecting the genome stability in all kingdoms of life.'Human RecQ homologs include RECQ1, BLM, WRN, RECQ4, and RECQ5β.'Although the individual RecQ-related diseases are characterized by a variety of clinical features encompassing growth defects (Bloom Syndrome and Rothmund Thomson Syndrome to premature aging (Werner Syndrome, all these patients have a high risk of cancer predisposition.'Here, we present an overview of recent progress towards elucidating functions of RECQ1 helicase, the most abundant but poorly characterized RecQ homolog in humans.'Consistent with a conserved role in genome stability maintenance, deficiency of RECQ1 results in elevated frequency of spontaneous sister chromatid exchanges, chromosomal instability, increased DNA damage and greater sensitivity to certain genotoxic stress.'Delineating what aspects of RECQ1 catalytic functions contribute to the observed cellular phenotypes, and how this is regulated is critical to establish its biological functions in DNA metabolism.'Recent studies have identified functional specialization of RECQ1 in DNA repair; however, identification of fundamental similarities will be just as critical in developing a unifying theme for RecQ actions, allowing the functions revealed from studying one homolog to be extrapolated and generalized to other RecQ homologs.

  16. compendiumdb: an R package for retrieval and storage of functional genomics data.

    Science.gov (United States)

    Nandal, Umesh K; van Kampen, Antoine H C; Moerland, Perry D

    2016-09-15

    Currently, the Gene Expression Omnibus (GEO) contains public data of over 1 million samples from more than 40 000 microarray-based functional genomics experiments. This provides a rich source of information for novel biological discoveries. However, unlocking this potential often requires retrieving and storing a large number of expression profiles from a wide range of different studies and platforms. The compendiumdb R package provides an environment for downloading functional genomics data from GEO, parsing the information into a local or remote database and interacting with the database using dedicated R functions, thus enabling seamless integration with other tools available in R/Bioconductor. The compendiumdb package is written in R, MySQL and Perl. Source code and binaries are available from CRAN (http://cran.r-project.org/web/packages/compendiumdb/) for all major platforms (Linux, MS Windows and OS X) under the GPLv3 license. p.d.moerland@amc.uva.nl Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  17. Convergent functional genomics of psychiatric disorders.

    Science.gov (United States)

    Niculescu, Alexander B

    2013-10-01

    Genetic and gene expression studies, in humans and animal models of psychiatric and other medical disorders, are becoming increasingly integrated. Particularly for genomics, the convergence and integration of data across species, experimental modalities and technical platforms is providing a fit-to-disease way of extracting reproducible and biologically important signal, in contrast to the fit-to-cohort effect and limited reproducibility of human genetic analyses alone. With the advent of whole-genome sequencing and the realization that a major portion of the non-coding genome may contain regulatory variants, Convergent Functional Genomics (CFG) approaches are going to be essential to identify disease-relevant signal from the tremendous polymorphic variation present in the general population. Such work in psychiatry can provide an example of how to address other genetically complex disorders, and in turn will benefit by incorporating concepts from other areas, such as cancer, cardiovascular diseases, and diabetes. © 2013 Wiley Periodicals, Inc.

  18. Exploring the mycobacteriophage metaproteome: phage genomics as an educational platform.

    Directory of Open Access Journals (Sweden)

    Graham F Hatfull

    2006-06-01

    Full Text Available Bacteriophages are the most abundant forms of life in the biosphere and carry genomes characterized by high genetic diversity and mosaic architectures. The complete sequences of 30 mycobacteriophage genomes show them collectively to encode 101 tRNAs, three tmRNAs, and 3,357 proteins belonging to 1,536 "phamilies" of related sequences, and a statistical analysis predicts that these represent approximately 50% of the total number of phamilies in the mycobacteriophage population. These phamilies contain 2.19 proteins on average; more than half (774 of them contain just a single protein sequence. Only six phamilies have representatives in more than half of the 30 genomes, and only three-encoding tape-measure proteins, lysins, and minor tail proteins-are present in all 30 phages, although these phamilies are themselves highly modular, such that no single amino acid sequence element is present in all 30 mycobacteriophage genomes. Of the 1,536 phamilies, only 230 (15% have amino acid sequence similarity to previously reported proteins, reflecting the enormous genetic diversity of the entire phage population. The abundance and diversity of phages, the simplicity of phage isolation, and the relatively small size of phage genomes support bacteriophage isolation and comparative genomic analysis as a highly suitable platform for discovery-based education.

  19. A chaos wolf optimization algorithm with self-adaptive variable step-size

    Science.gov (United States)

    Zhu, Yong; Jiang, Wanlu; Kong, Xiangdong; Quan, Lingxiao; Zhang, Yongshun

    2017-10-01

    To explore the problem of parameter optimization for complex nonlinear function, a chaos wolf optimization algorithm (CWOA) with self-adaptive variable step-size was proposed. The algorithm was based on the swarm intelligence of wolf pack, which fully simulated the predation behavior and prey distribution way of wolves. It possessed three intelligent behaviors such as migration, summons and siege. And the competition rule as "winner-take-all" and the update mechanism as "survival of the fittest" were also the characteristics of the algorithm. Moreover, it combined the strategies of self-adaptive variable step-size search and chaos optimization. The CWOA was utilized in parameter optimization of twelve typical and complex nonlinear functions. And the obtained results were compared with many existing algorithms, including the classical genetic algorithm, the particle swarm optimization algorithm and the leader wolf pack search algorithm. The investigation results indicate that CWOA possess preferable optimization ability. There are advantages in optimization accuracy and convergence rate. Furthermore, it demonstrates high robustness and global searching ability.

  20. YersiniaBase: a genomic resource and analysis platform for comparative analysis of Yersinia.

    Science.gov (United States)

    Tan, Shi Yang; Dutta, Avirup; Jakubovics, Nicholas S; Ang, Mia Yang; Siow, Cheuk Chuen; Mutha, Naresh Vr; Heydari, Hamed; Wee, Wei Yee; Wong, Guat Jah; Choo, Siew Woh

    2015-01-16

    Yersinia is a Gram-negative bacteria that includes serious pathogens such as the Yersinia pestis, which causes plague, Yersinia pseudotuberculosis, Yersinia enterocolitica. The remaining species are generally considered non-pathogenic to humans, although there is evidence that at least some of these species can cause occasional infections using distinct mechanisms from the more pathogenic species. With the advances in sequencing technologies, many genomes of Yersinia have been sequenced. However, there is currently no specialized platform to hold the rapidly-growing Yersinia genomic data and to provide analysis tools particularly for comparative analyses, which are required to provide improved insights into their biology, evolution and pathogenicity. To facilitate the ongoing and future research of Yersinia, especially those generally considered non-pathogenic species, a well-defined repository and analysis platform is needed to hold the Yersinia genomic data and analysis tools for the Yersinia research community. Hence, we have developed the YersiniaBase, a robust and user-friendly Yersinia resource and analysis platform for the analysis of Yersinia genomic data. YersiniaBase has a total of twelve species and 232 genome sequences, of which the majority are Yersinia pestis. In order to smooth the process of searching genomic data in a large database, we implemented an Asynchronous JavaScript and XML (AJAX)-based real-time searching system in YersiniaBase. Besides incorporating existing tools, which include JavaScript-based genome browser (JBrowse) and Basic Local Alignment Search Tool (BLAST), YersiniaBase also has in-house developed tools: (1) Pairwise Genome Comparison tool (PGC) for comparing two user-selected genomes; (2) Pathogenomics Profiling Tool (PathoProT) for comparative pathogenomics analysis of Yersinia genomes; (3) YersiniaTree for constructing phylogenetic tree of Yersinia. We ran analyses based on the tools and genomic data in YersiniaBase and the

  1. A comprehensive platform for highly multiplexed mammalian functional genetic screens

    Directory of Open Access Journals (Sweden)

    Cheung-Ong Kahlin

    2011-05-01

    Full Text Available Abstract Background Genome-wide screening in human and mouse cells using RNA interference and open reading frame over-expression libraries is rapidly becoming a viable experimental approach for many research labs. There are a variety of gene expression modulation libraries commercially available, however, detailed and validated protocols as well as the reagents necessary for deconvolving genome-scale gene screens using these libraries are lacking. As a solution, we designed a comprehensive platform for highly multiplexed functional genetic screens in human, mouse and yeast cells using popular, commercially available gene modulation libraries. The Gene Modulation Array Platform (GMAP is a single microarray-based detection solution for deconvolution of loss and gain-of-function pooled screens. Results Experiments with specially constructed lentiviral-based plasmid pools containing ~78,000 shRNAs demonstrated that the GMAP is capable of deconvolving genome-wide shRNA "dropout" screens. Further experiments with a larger, ~90,000 shRNA pool demonstrate that equivalent results are obtained from plasmid pools and from genomic DNA derived from lentivirus infected cells. Parallel testing of large shRNA pools using GMAP and next-generation sequencing methods revealed that the two methods provide valid and complementary approaches to deconvolution of genome-wide shRNA screens. Additional experiments demonstrated that GMAP is equivalent to similar microarray-based products when used for deconvolution of open reading frame over-expression screens. Conclusion Herein, we demonstrate four major applications for the GMAP resource, including deconvolution of pooled RNAi screens in cells with at least 90,000 distinct shRNAs. We also provide detailed methodologies for pooled shRNA screen readout using GMAP and compare next-generation sequencing to GMAP (i.e. microarray based deconvolution methods.

  2. Comparison of Ion Personal Genome Machine Platforms for the Detection of Variants in BRCA1 and BRCA2.

    Science.gov (United States)

    Hwang, Sang Mee; Lee, Ki Chan; Lee, Min Seob; Park, Kyoung Un

    2018-01-01

    Transition to next generation sequencing (NGS) for BRCA1 / BRCA2 analysis in clinical laboratories is ongoing but different platforms and/or data analysis pipelines give different results resulting in difficulties in implementation. We have evaluated the Ion Personal Genome Machine (PGM) Platforms (Ion PGM, Ion PGM Dx, Thermo Fisher Scientific) for the analysis of BRCA1 /2. The results of Ion PGM with OTG-snpcaller, a pipeline based on Torrent mapping alignment program and Genome Analysis Toolkit, from 75 clinical samples and 14 reference DNA samples were compared with Sanger sequencing for BRCA1 / BRCA2 . Ten clinical samples and 14 reference DNA samples were additionally sequenced by Ion PGM Dx with Torrent Suite. Fifty types of variants including 18 pathogenic or variants of unknown significance were identified from 75 clinical samples and known variants of the reference samples were confirmed by Sanger sequencing and/or NGS. One false-negative results were present for Ion PGM/OTG-snpcaller for an indel variant misidentified as a single nucleotide variant. However, eight discordant results were present for Ion PGM Dx/Torrent Suite with both false-positive and -negative results. A 40-bp deletion, a 4-bp deletion and a 1-bp deletion variant was not called and a false-positive deletion was identified. Four other variants were misidentified as another variant. Ion PGM/OTG-snpcaller showed acceptable performance with good concordance with Sanger sequencing. However, Ion PGM Dx/Torrent Suite showed many discrepant results not suitable for use in a clinical laboratory, requiring further optimization of the data analysis for calling variants.

  3. The Epilepsy Phenome/Genome Project (EPGP) informatics platform.

    Science.gov (United States)

    Nesbitt, Gerry; McKenna, Kevin; Mays, Vickie; Carpenter, Alan; Miller, Kevin; Williams, Michael

    2013-04-01

    The Epilepsy Phenome/Genome Project (EPGP) is a large-scale, multi-institutional, collaborative network of 27 epilepsy centers throughout the U.S., Australia, and Argentina, with the objective of collecting detailed phenotypic and genetic data on a large number of epilepsy participants. The goals of EPGP are (1) to perform detailed phenotyping on 3750 participants with specific forms of non-acquired epilepsy and 1500 parents without epilepsy, (2) to obtain DNA samples on these individuals, and (3) to ultimately genotype the samples in order to discover novel genes that cause epilepsy. To carry out the project, a reliable and robust informatics platform was needed for standardized electronic data collection and storage, data quality review, and phenotypic analysis involving cases from multiple sites. EPGP developed its own suite of web-based informatics applications for participant tracking, electronic data collection (using electronic case report forms/surveys), data management, phenotypic data review and validation, specimen tracking, electroencephalograph and neuroimaging storage, and issue tracking. We implemented procedures to train and support end-users at each clinical site. Thus far, 3780 study participants have been enrolled and 20,957 web-based study activities have been completed using this informatics platform. Over 95% of respondents to an end-user satisfaction survey felt that the informatics platform was successful almost always or most of the time. The EPGP informatics platform has successfully and effectively allowed study management and efficient and reliable collection of phenotypic data. Our novel informatics platform met the requirements of a large, multicenter research project. The platform has had a high level of end-user acceptance by principal investigators and study coordinators, and can serve as a model for new tools to support future large scale, collaborative research projects collecting extensive phenotypic data. Copyright © 2012

  4. A chaos wolf optimization algorithm with self-adaptive variable step-size

    Directory of Open Access Journals (Sweden)

    Yong Zhu

    2017-10-01

    Full Text Available To explore the problem of parameter optimization for complex nonlinear function, a chaos wolf optimization algorithm (CWOA with self-adaptive variable step-size was proposed. The algorithm was based on the swarm intelligence of wolf pack, which fully simulated the predation behavior and prey distribution way of wolves. It possessed three intelligent behaviors such as migration, summons and siege. And the competition rule as “winner-take-all” and the update mechanism as “survival of the fittest” were also the characteristics of the algorithm. Moreover, it combined the strategies of self-adaptive variable step-size search and chaos optimization. The CWOA was utilized in parameter optimization of twelve typical and complex nonlinear functions. And the obtained results were compared with many existing algorithms, including the classical genetic algorithm, the particle swarm optimization algorithm and the leader wolf pack search algorithm. The investigation results indicate that CWOA possess preferable optimization ability. There are advantages in optimization accuracy and convergence rate. Furthermore, it demonstrates high robustness and global searching ability.

  5. OxyGene: an innovative platform for investigating oxidative-response genes in whole prokaryotic genomes

    Directory of Open Access Journals (Sweden)

    Barloy-Hubler Frédérique

    2008-12-01

    Full Text Available Abstract Background Oxidative stress is a common stress encountered by living organisms and is due to an imbalance between intracellular reactive oxygen and nitrogen species (ROS, RNS and cellular antioxidant defence. To defend themselves against ROS/RNS, bacteria possess a subsystem of detoxification enzymes, which are classified with regard to their substrates. To identify such enzymes in prokaryotic genomes, different approaches based on similarity, enzyme profiles or patterns exist. Unfortunately, several problems persist in the annotation, classification and naming of these enzymes due mainly to some erroneous entries in databases, mistake propagation, absence of updating and disparity in function description. Description In order to improve the current annotation of oxidative stress subsystems, an innovative platform named OxyGene has been developed. It integrates an original database called OxyDB, holding thoroughly tested anchor-based signatures associated to subfamilies of oxidative stress enzymes, and a new anchor-driven annotator, for ab initio detection of ROS/RNS response genes. All complete Bacterial and Archaeal genomes have been re-annotated, and the results stored in the OxyGene repository can be interrogated via a Graphical User Interface. Conclusion OxyGene enables the exploration and comparative analysis of enzymes belonging to 37 detoxification subclasses in 664 microbial genomes. It proposes a new classification that improves both the ontology and the annotation of the detoxification subsystems in prokaryotic whole genomes, while discovering new ORFs and attributing precise function to hypothetical annotated proteins. OxyGene is freely available at: http://www.umr6026.univ-rennes1.fr/english/home/research/basic/software

  6. Genome-wide engineering of an infectious clone of herpes simplex virus type 1 using synthetic genomics assembly methods.

    Science.gov (United States)

    Oldfield, Lauren M; Grzesik, Peter; Voorhies, Alexander A; Alperovich, Nina; MacMath, Derek; Najera, Claudia D; Chandra, Diya Sabrina; Prasad, Sanjana; Noskov, Vladimir N; Montague, Michael G; Friedman, Robert M; Desai, Prashant J; Vashee, Sanjay

    2017-10-17

    Here, we present a transformational approach to genome engineering of herpes simplex virus type 1 (HSV-1), which has a large DNA genome, using synthetic genomics tools. We believe this method will enable more rapid and complex modifications of HSV-1 and other large DNA viruses than previous technologies, facilitating many useful applications. Yeast transformation-associated recombination was used to clone 11 fragments comprising the HSV-1 strain KOS 152 kb genome. Using overlapping sequences between the adjacent pieces, we assembled the fragments into a complete virus genome in yeast, transferred it into an Escherichia coli host, and reconstituted infectious virus following transfection into mammalian cells. The virus derived from this yeast-assembled genome, KOS YA , replicated with kinetics similar to wild-type virus. We demonstrated the utility of this modular assembly technology by making numerous modifications to a single gene, making changes to two genes at the same time and, finally, generating individual and combinatorial deletions to a set of five conserved genes that encode virion structural proteins. While the ability to perform genome-wide editing through assembly methods in large DNA virus genomes raises dual-use concerns, we believe the incremental risks are outweighed by potential benefits. These include enhanced functional studies, generation of oncolytic virus vectors, development of delivery platforms of genes for vaccines or therapy, as well as more rapid development of countermeasures against potential biothreats.

  7. Cloud computing for comparative genomics with windows azure platform.

    Science.gov (United States)

    Kim, Insik; Jung, Jae-Yoon; Deluca, Todd F; Nelson, Tristan H; Wall, Dennis P

    2012-01-01

    Cloud computing services have emerged as a cost-effective alternative for cluster systems as the number of genomes and required computation power to analyze them increased in recent years. Here we introduce the Microsoft Azure platform with detailed execution steps and a cost comparison with Amazon Web Services.

  8. Updating the Micro-Tom TILLING platform.

    Science.gov (United States)

    Okabe, Yoshihiro; Ariizumi, Tohru; Ezura, Hiroshi

    2013-03-01

    The dwarf tomato variety Micro-Tom is regarded as a model system for functional genomics studies in tomato. Various tomato genomic tools in the genetic background of Micro-Tom have been established, such as mutant collections, genome information and a metabolomic database. Recent advances in tomato genome sequencing have brought about a significant need for reverse genetics tools that are accessible to the larger community, because a great number of gene sequences have become available from public databases. To meet the requests from the tomato research community, we have developed the Micro-Tom Targeting-Induced Local Lesions IN Genomes (TILLING) platform, which is comprised of more than 5000 EMS-mutagenized lines. The platform serves as a reverse genetics tool for efficiently identifying mutant alleles in parallel with the development of Micro-Tom mutant collections. The combination of Micro-Tom mutant libraries and the TILLING approach enables researchers to accelerate the isolation of desirable mutants for unraveling gene function or breeding. To upgrade the genomic tool of Micro-Tom, the development of a new mutagenized population is underway. In this paper, the current status of the Micro-Tom TILLING platform and its future prospects are described.

  9. Draft Genome of the Pearl Oyster Pinctada fucata: A Platform for Understanding Bivalve Biology

    Science.gov (United States)

    Takeuchi, Takeshi; Kawashima, Takeshi; Koyanagi, Ryo; Gyoja, Fuki; Tanaka, Makiko; Ikuta, Tetsuro; Shoguchi, Eiichi; Fujiwara, Mayuki; Shinzato, Chuya; Hisata, Kanako; Fujie, Manabu; Usami, Takeshi; Nagai, Kiyohito; Maeyama, Kaoru; Okamoto, Kikuhiko; Aoki, Hideo; Ishikawa, Takashi; Masaoka, Tetsuji; Fujiwara, Atushi; Endo, Kazuyoshi; Endo, Hirotoshi; Nagasawa, Hiromichi; Kinoshita, Shigeharu; Asakawa, Shuichi; Watabe, Shugo; Satoh, Nori

    2012-01-01

    The study of the pearl oyster Pinctada fucata is key to increasing our understanding of the molecular mechanisms involved in pearl biosynthesis and biology of bivalve molluscs. We sequenced ∼1150-Mb genome at ∼40-fold coverage using the Roche 454 GS-FLX and Illumina GAIIx sequencers. The sequences were assembled into contigs with N50 = 1.6 kb (total contig assembly reached to 1024 Mb) and scaffolds with N50 = 14.5 kb. The pearl oyster genome is AT-rich, with a GC content of 34%. DNA transposons, retrotransposons, and tandem repeat elements occupied 0.4, 1.5, and 7.9% of the genome, respectively (a total of 9.8%). Version 1.0 of the P. fucata draft genome contains 23 257 complete gene models, 70% of which are supported by the corresponding expressed sequence tags. The genes include those reported to have an association with bio-mineralization. Genes encoding transcription factors and signal transduction molecules are present in numbers comparable with genomes of other metazoans. Genome-wide molecular phylogeny suggests that the lophotrochozoan represents a distinct clade from ecdysozoans. Our draft genome of the pearl oyster thus provides a platform for the identification of selection markers and genes for calcification, knowledge of which will be important in the pearl industry. PMID:22315334

  10. SL1 revisited: functional analysis of the structure and conformation of HIV-1 genome RNA.

    Science.gov (United States)

    Sakuragi, Sayuri; Yokoyama, Masaru; Shioda, Tatsuo; Sato, Hironori; Sakuragi, Jun-Ichi

    2016-11-11

    The dimer initiation site/dimer linkage sequence (DIS/DLS) region of HIV is located on the 5' end of the viral genome and suggested to form complex secondary/tertiary structures. Within this structure, stem-loop 1 (SL1) is believed to be most important and an essential key to dimerization, since the sequence and predicted secondary structure of SL1 are highly stable and conserved among various virus subtypes. In particular, a six-base palindromic sequence is always present at the hairpin loop of SL1 and the formation of kissing-loop structure at this position between the two strands of genomic RNA is suggested to trigger dimerization. Although the higher-order structure model of SL1 is well accepted and perhaps even undoubted lately, there could be stillroom for consideration to depict the functional SL1 structure while in vivo (in virion or cell). In this study, we performed several analyses to identify the nucleotides and/or basepairing within SL1 which are necessary for HIV-1 genome dimerization, encapsidation, recombination and infectivity. We unexpectedly found that some nucleotides that are believed to contribute the formation of the stem do not impact dimerization or infectivity. On the other hand, we found that one G-C basepair involved in stem formation may serve as an alternative dimer interactive site. We also report on our further investigation of the roles of the palindromic sequences on viral replication. Collectively, we aim to assemble a more-comprehensive functional map of SL1 on the HIV-1 viral life cycle. We discovered several possibilities for a novel structure of SL1 in HIV-1 DLS. The newly proposed structure model suggested that the hairpin loop of SL1 appeared larger, and genome dimerization process might consist of more complicated mechanism than previously understood. Further investigations would be still required to fully understand the genome packaging and dimerization of HIV.

  11. Whole genome analysis of Klebsiella pneumoniae T2-1-1 from human oral cavity

    Directory of Open Access Journals (Sweden)

    Kok-Gan Chan

    2016-03-01

    Full Text Available Klebsiella pneumoniae T2-1-1 was isolated from the human tongue debris and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession JAQL00000000. Keywords: Human tongue surface, Oral cavity, Oral bacteria, Virulence

  12. First TILLING platform in Cucurbita pepo: a new mutant resource for gene function and crop improvement.

    Science.gov (United States)

    Vicente-Dólera, Nelly; Troadec, Christelle; Moya, Manuel; del Río-Celestino, Mercedes; Pomares-Viciana, Teresa; Bendahmane, Abdelhafid; Picó, Belén; Román, Belén; Gómez, Pedro

    2014-01-01

    Although the availability of genetic and genomic resources for Cucurbita pepo has increased significantly, functional genomic resources are still limited for this crop. In this direction, we have developed a high throughput reverse genetic tool: the first TILLING (Targeting Induced Local Lesions IN Genomes) resource for this species. Additionally, we have used this resource to demonstrate that the previous EMS mutant population we developed has the highest mutation density compared with other cucurbits mutant populations. The overall mutation density in this first C. pepo TILLING platform was estimated to be 1/133 Kb by screening five additional genes. In total, 58 mutations confirmed by sequencing were identified in the five targeted genes, thirteen of which were predicted to have an impact on the function of the protein. The genotype/phenotype correlation was studied in a peroxidase gene, revealing that the phenotype of seedling homozygous for one of the isolated mutant alleles was albino. These results indicate that the TILLING approach in this species was successful at providing new mutations and can address the major challenge of linking sequence information to biological function and also the identification of novel variation for crop breeding.

  13. The W22 genome: a foundation for maize functional genomics and transposon biology

    Science.gov (United States)

    The maize W22 inbred has served as a platform for maize genetics since the mid twentieth century. To streamline maize genome analyses, we have sequenced and de novo assembled a W22 reference genome using small-read sequencing technologies. We show that significant structural heterogeneity exists in ...

  14. MicroScope-an integrated resource for community expertise of gene functions and comparative analysis of microbial genomic and metabolic data.

    Science.gov (United States)

    Médigue, Claudine; Calteau, Alexandra; Cruveiller, Stéphane; Gachet, Mathieu; Gautreau, Guillaume; Josso, Adrien; Lajus, Aurélie; Langlois, Jordan; Pereira, Hugo; Planel, Rémi; Roche, David; Rollin, Johan; Rouy, Zoe; Vallenet, David

    2017-09-12

    The overwhelming list of new bacterial genomes becoming available on a daily basis makes accurate genome annotation an essential step that ultimately determines the relevance of thousands of genomes stored in public databanks. The MicroScope platform (http://www.genoscope.cns.fr/agc/microscope) is an integrative resource that supports systematic and efficient revision of microbial genome annotation, data management and comparative analysis. Starting from the results of our syntactic, functional and relational annotation pipelines, MicroScope provides an integrated environment for the expert annotation and comparative analysis of prokaryotic genomes. It combines tools and graphical interfaces to analyze genomes and to perform the manual curation of gene function in a comparative genomics and metabolic context. In this article, we describe the free-of-charge MicroScope services for the annotation and analysis of microbial (meta)genomes, transcriptomic and re-sequencing data. Then, the functionalities of the platform are presented in a way providing practical guidance and help to the nonspecialists in bioinformatics. Newly integrated analysis tools (i.e. prediction of virulence and resistance genes in bacterial genomes) and original method recently developed (the pan-genome graph representation) are also described. Integrated environments such as MicroScope clearly contribute, through the user community, to help maintaining accurate resources. © The Author 2017. Published by Oxford University Press.

  15. A tutorial of diverse genome analysis tools found in the CoGe web-platform using Plasmodium spp. as a model

    Science.gov (United States)

    Castillo, Andreina I; Nelson, Andrew D L; Haug-Baltzell, Asher K; Lyons, Eric

    2018-01-01

    Abstract Integrated platforms for storage, management, analysis and sharing of large quantities of omics data have become fundamental to comparative genomics. CoGe (https://genomevolution.org/coge/) is an online platform designed to manage and study genomic data, enabling both data- and hypothesis-driven comparative genomics. CoGe’s tools and resources can be used to organize and analyse both publicly available and private genomic data from any species. Here, we demonstrate the capabilities of CoGe through three example workflows using 17 Plasmodium genomes as a model. Plasmodium genomes present unique challenges for comparative genomics due to their rapidly evolving and highly variable genomic AT/GC content. These example workflows are intended to serve as templates to help guide researchers who would like to use CoGe to examine diverse aspects of genome evolution. In the first workflow, trends in genome composition and amino acid usage are explored. In the second, changes in genome structure and the distribution of synonymous (Ks) and non-synonymous (Kn) substitution values are evaluated across species with different levels of evolutionary relatedness. In the third workflow, microsyntenic analyses of multigene families’ genomic organization are conducted using two Plasmodium-specific gene families—serine repeat antigen, and cytoadherence-linked asexual gene—as models. In general, these example workflows show how to achieve quick, reproducible and shareable results using the CoGe platform. We were able to replicate previously published results, as well as leverage CoGe’s tools and resources to gain additional insight into various aspects of Plasmodium genome evolution. Our results highlight the usefulness of the CoGe platform, particularly in understanding complex features of genome evolution. Database URL: https://genomevolution.org/coge/

  16. Single-Cell-Based Platform for Copy Number Variation Profiling through Digital Counting of Amplified Genomic DNA Fragments.

    Science.gov (United States)

    Li, Chunmei; Yu, Zhilong; Fu, Yusi; Pang, Yuhong; Huang, Yanyi

    2017-04-26

    We develop a novel single-cell-based platform through digital counting of amplified genomic DNA fragments, named multifraction amplification (mfA), to detect the copy number variations (CNVs) in a single cell. Amplification is required to acquire genomic information from a single cell, while introducing unavoidable bias. Unlike prevalent methods that directly infer CNV profiles from the pattern of sequencing depth, our mfA platform denatures and separates the DNA molecules from a single cell into multiple fractions of a reaction mix before amplification. By examining the sequencing result of each fraction for a specific fragment and applying a segment-merge maximum likelihood algorithm to the calculation of copy number, we digitize the sequencing-depth-based CNV identification and thus provide a method that is less sensitive to the amplification bias. In this paper, we demonstrate a mfA platform through multiple displacement amplification (MDA) chemistry. When performing the mfA platform, the noise of MDA is reduced; therefore, the resolution of single-cell CNV identification can be improved to 100 kb. We can also determine the genomic region free of allelic drop-out with mfA platform, which is impossible for conventional single-cell amplification methods.

  17. Multi-platform next-generation sequencing of the domestic turkey (Meleagris gallopavo: genome assembly and analysis.

    Directory of Open Access Journals (Sweden)

    Rami A Dalloul

    2010-09-01

    Full Text Available A synergistic combination of two next-generation sequencing platforms with a detailed comparative BAC physical contig map provided a cost-effective assembly of the genome sequence of the domestic turkey (Meleagris gallopavo. Heterozygosity of the sequenced source genome allowed discovery of more than 600,000 high quality single nucleotide variants. Despite this heterozygosity, the current genome assembly (∼1.1 Gb includes 917 Mb of sequence assigned to specific turkey chromosomes. Annotation identified nearly 16,000 genes, with 15,093 recognized as protein coding and 611 as non-coding RNA genes. Comparative analysis of the turkey, chicken, and zebra finch genomes, and comparing avian to mammalian species, supports the characteristic stability of avian genomes and identifies genes unique to the avian lineage. Clear differences are seen in number and variety of genes of the avian immune system where expansions and novel genes are less frequent than examples of gene loss. The turkey genome sequence provides resources to further understand the evolution of vertebrate genomes and genetic variation underlying economically important quantitative traits in poultry. This integrated approach may be a model for providing both gene and chromosome level assemblies of other species with agricultural, ecological, and evolutionary interest.

  18. StreptoBase: An Oral Streptococcus mitis Group Genomic Resource and Analysis Platform.

    Directory of Open Access Journals (Sweden)

    Wenning Zheng

    Full Text Available The oral streptococci are spherical Gram-positive bacteria categorized under the phylum Firmicutes which are among the most common causative agents of bacterial infective endocarditis (IE and are also important agents in septicaemia in neutropenic patients. The Streptococcus mitis group is comprised of 13 species including some of the most common human oral colonizers such as S. mitis, S. oralis, S. sanguinis and S. gordonii as well as species such as S. tigurinus, S. oligofermentans and S. australis that have only recently been classified and are poorly understood at present. We present StreptoBase, which provides a specialized free resource focusing on the genomic analyses of oral species from the mitis group. It currently hosts 104 S. mitis group genomes including 27 novel mitis group strains that we sequenced using the high throughput Illumina HiSeq technology platform, and provides a comprehensive set of genome sequences for analyses, particularly comparative analyses and visualization of both cross-species and cross-strain characteristics of S. mitis group bacteria. StreptoBase incorporates sophisticated in-house designed bioinformatics web tools such as Pairwise Genome Comparison (PGC tool and Pathogenomic Profiling Tool (PathoProT, which facilitate comparative pathogenomics analysis of Streptococcus strains. Examples are provided to demonstrate how StreptoBase can be employed to compare genome structure of different S. mitis group bacteria and putative virulence genes profile across multiple streptococcal strains. In conclusion, StreptoBase offers access to a range of streptococci genomic resources as well as analysis tools and will be an invaluable platform to accelerate research in streptococci. Database URL: http://streptococcus.um.edu.my.

  19. StreptoBase: An Oral Streptococcus mitis Group Genomic Resource and Analysis Platform.

    Science.gov (United States)

    Zheng, Wenning; Tan, Tze King; Paterson, Ian C; Mutha, Naresh V R; Siow, Cheuk Chuen; Tan, Shi Yang; Old, Lesley A; Jakubovics, Nicholas S; Choo, Siew Woh

    2016-01-01

    The oral streptococci are spherical Gram-positive bacteria categorized under the phylum Firmicutes which are among the most common causative agents of bacterial infective endocarditis (IE) and are also important agents in septicaemia in neutropenic patients. The Streptococcus mitis group is comprised of 13 species including some of the most common human oral colonizers such as S. mitis, S. oralis, S. sanguinis and S. gordonii as well as species such as S. tigurinus, S. oligofermentans and S. australis that have only recently been classified and are poorly understood at present. We present StreptoBase, which provides a specialized free resource focusing on the genomic analyses of oral species from the mitis group. It currently hosts 104 S. mitis group genomes including 27 novel mitis group strains that we sequenced using the high throughput Illumina HiSeq technology platform, and provides a comprehensive set of genome sequences for analyses, particularly comparative analyses and visualization of both cross-species and cross-strain characteristics of S. mitis group bacteria. StreptoBase incorporates sophisticated in-house designed bioinformatics web tools such as Pairwise Genome Comparison (PGC) tool and Pathogenomic Profiling Tool (PathoProT), which facilitate comparative pathogenomics analysis of Streptococcus strains. Examples are provided to demonstrate how StreptoBase can be employed to compare genome structure of different S. mitis group bacteria and putative virulence genes profile across multiple streptococcal strains. In conclusion, StreptoBase offers access to a range of streptococci genomic resources as well as analysis tools and will be an invaluable platform to accelerate research in streptococci. Database URL: http://streptococcus.um.edu.my.

  20. The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics.

    Directory of Open Access Journals (Sweden)

    Lincoln D Stein

    2003-11-01

    Full Text Available The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome to a high-quality draft stage and compared it to the finished C. elegans sequence. We predict approximately 19,500 protein-coding genes in the C. briggsae genome, roughly the same as in C. elegans. Of these, 12,200 have clear C. elegans orthologs, a further 6,500 have one or more clearly detectable C. elegans homologs, and approximately 800 C. briggsae genes have no detectable matches in C. elegans. Almost all of the noncoding RNAs (ncRNAs known are shared between the two species. The two genomes exhibit extensive colinearity, and the rate of divergence appears to be higher in the chromosomal arms than in the centers. Operons, a distinctive feature of C. elegans, are highly conserved in C. briggsae, with the arrangement of genes being preserved in 96% of cases. The difference in size between the C. briggsae (estimated at approximately 104 Mbp and C. elegans (100.3 Mbp genomes is almost entirely due to repetitive sequence, which accounts for 22.4% of the C. briggsae genome in contrast to 16.5% of the C. elegans genome. Few, if any, repeat families are shared, suggesting that most were acquired after the two species diverged or are undergoing rapid evolution. Coclustering the C. elegans and C. briggsae proteins reveals 2,169 protein families of two or more members. Most of these are shared between the two species, but some appear to be expanding or contracting, and there seem to be as many as several hundred novel C. briggsae gene families. The C. briggsae draft sequence will greatly improve the annotation of the C. elegans genome. Based on similarity to C

  1. Genome projects and the functional-genomic era.

    Science.gov (United States)

    Sauer, Sascha; Konthur, Zoltán; Lehrach, Hans

    2005-12-01

    The problems we face today in public health as a result of the -- fortunately -- increasing age of people and the requirements of developing countries create an urgent need for new and innovative approaches in medicine and in agronomics. Genomic and functional genomic approaches have a great potential to at least partially solve these problems in the future. Important progress has been made by procedures to decode genomic information of humans, but also of other key organisms. The basic comprehension of genomic information (and its transfer) should now give us the possibility to pursue the next important step in life science eventually leading to a basic understanding of biological information flow; the elucidation of the function of all genes and correlative products encoded in the genome, as well as the discovery of their interactions in a molecular context and the response to environmental factors. As a result of the sequencing projects, we are now able to ask important questions about sequence variation and can start to comprehensively study the function of expressed genes on different levels such as RNA, protein or the cell in a systematic context including underlying networks. In this article we review and comment on current trends in large-scale systematic biological research. A particular emphasis is put on technology developments that can provide means to accomplish the tasks of future lines of functional genomics.

  2. Platform comparison for evaluation of ALK protein immunohistochemical expression, genomic copy number and hotspot mutation status in neuroblastomas.

    Directory of Open Access Journals (Sweden)

    Benedict Yan

    Full Text Available ALK is an established causative oncogenic driver in neuroblastoma, and is likely to emerge as a routine biomarker in neuroblastoma diagnostics. At present, the optimal strategy for clinical diagnostic evaluation of ALK protein, genomic and hotspot mutation status is not well-studied. We evaluated ALK immunohistochemical (IHC protein expression using three different antibodies (ALK1, 5A4 and D5F3 clones, ALK genomic status using single-color chromogenic in situ hybridization (CISH, and ALK hotspot mutation status using conventional Sanger sequencing and a next-generation sequencing platform (Ion Torrent Personal Genome Machine (IT-PGM, in archival formalin-fixed, paraffin-embedded neuroblastoma samples. We found a significant difference in IHC results using the three different antibodies, with the highest percentage of positive cases seen on D5F3 immunohistochemistry. Correlation with ALK genomic and hotspot mutational status revealed that the majority of D5F3 ALK-positive cases did not possess either ALK genomic amplification or hotspot mutations. Comparison of sequencing platforms showed a perfect correlation between conventional Sanger and IT-PGM sequencing. Our findings suggest that D5F3 immunohistochemistry, single-color CISH and IT-PGM sequencing are suitable assays for evaluation of ALK status in future neuroblastoma clinical trials.

  3. RNAi-Based Functional Genomics Identifies New Virulence Determinants in Mucormycosis.

    Directory of Open Access Journals (Sweden)

    Trung Anh Trieu

    2017-01-01

    Full Text Available Mucorales are an emerging group of human pathogens that are responsible for the lethal disease mucormycosis. Unfortunately, functional studies on the genetic factors behind the virulence of these organisms are hampered by their limited genetic tractability, since they are reluctant to classical genetic tools like transposable elements or gene mapping. Here, we describe an RNAi-based functional genomic platform that allows the identification of new virulence factors through a forward genetic approach firstly described in Mucorales. This platform contains a whole-genome collection of Mucor circinelloides silenced transformants that presented a broad assortment of phenotypes related to the main physiological processes in fungi, including virulence, hyphae morphology, mycelial and yeast growth, carotenogenesis and asexual sporulation. Selection of transformants with reduced virulence allowed the identification of mcplD, which encodes a Phospholipase D, and mcmyo5, encoding a probably essential cargo transporter of the Myosin V family, as required for a fully virulent phenotype of M. circinelloides. Knock-out mutants for those genes showed reduced virulence in both Galleria mellonella and Mus musculus models, probably due to a delayed germination and polarized growth within macrophages. This study provides a robust approach to study virulence in Mucorales and as a proof of concept identified new virulence determinants in M. circinelloides that could represent promising targets for future antifungal therapies.

  4. Methods for open innovation on a genome-design platform associating scientific, commercial, and educational communities in synthetic biology.

    Science.gov (United States)

    Toyoda, Tetsuro

    2011-01-01

    Synthetic biology requires both engineering efficiency and compliance with safety guidelines and ethics. Focusing on the rational construction of biological systems based on engineering principles, synthetic biology depends on a genome-design platform to explore the combinations of multiple biological components or BIO bricks for quickly producing innovative devices. This chapter explains the differences among various platform models and details a methodology for promoting open innovation within the scope of the statutory exemption of patent laws. The detailed platform adopts a centralized evaluation model (CEM), computer-aided design (CAD) bricks, and a freemium model. It is also important for the platform to support the legal aspects of copyrights as well as patent and safety guidelines because intellectual work including DNA sequences designed rationally by human intelligence is basically copyrightable. An informational platform with high traceability, transparency, auditability, and security is required for copyright proof, safety compliance, and incentive management for open innovation in synthetic biology. GenoCon, which we have organized and explained here, is a competition-styled, open-innovation method involving worldwide participants from scientific, commercial, and educational communities that aims to improve the designs of genomic sequences that confer a desired function on an organism. Using only a Web browser, a participating contributor proposes a design expressed with CAD bricks that generate a relevant DNA sequence, which is then experimentally and intensively evaluated by the GenoCon organizers. The CAD bricks that comprise programs and databases as a Semantic Web are developed, executed, shared, reused, and well stocked on the secure Semantic Web platform called the Scientists' Networking System or SciNetS/SciNeS, based on which a CEM research center for synthetic biology and open innovation should be established. Copyright © 2011 Elsevier Inc

  5. Enabling a community to dissect an organism: overview of the Neurospora functional genomics project.

    Science.gov (United States)

    Dunlap, Jay C; Borkovich, Katherine A; Henn, Matthew R; Turner, Gloria E; Sachs, Matthew S; Glass, N Louise; McCluskey, Kevin; Plamann, Michael; Galagan, James E; Birren, Bruce W; Weiss, Richard L; Townsend, Jeffrey P; Loros, Jennifer J; Nelson, Mary Anne; Lambreghts, Randy; Colot, Hildur V; Park, Gyungsoon; Collopy, Patrick; Ringelberg, Carol; Crew, Christopher; Litvinkova, Liubov; DeCaprio, Dave; Hood, Heather M; Curilla, Susan; Shi, Mi; Crawford, Matthew; Koerhsen, Michael; Montgomery, Phil; Larson, Lisa; Pearson, Matthew; Kasuga, Takao; Tian, Chaoguang; Baştürkmen, Meray; Altamirano, Lorena; Xu, Junhuan

    2007-01-01

    A consortium of investigators is engaged in a functional genomics project centered on the filamentous fungus Neurospora, with an eye to opening up the functional genomic analysis of all the filamentous fungi. The overall goal of the four interdependent projects in this effort is to accomplish functional genomics, annotation, and expression analyses of Neurospora crassa, a filamentous fungus that is an established model for the assemblage of over 250,000 species of non yeast fungi. Building from the completely sequenced 43-Mb Neurospora genome, Project 1 is pursuing the systematic disruption of genes through targeted gene replacements, phenotypic analysis of mutant strains, and their distribution to the scientific community at large. Project 2, through a primary focus in Annotation and Bioinformatics, has developed a platform for electronically capturing community feedback and data about the existing annotation, while building and maintaining a database to capture and display information about phenotypes. Oligonucleotide-based microarrays created in Project 3 are being used to collect baseline expression data for the nearly 11,000 distinguishable transcripts in Neurospora under various conditions of growth and development, and eventually to begin to analyze the global effects of loss of novel genes in strains created by Project 1. cDNA libraries generated in Project 4 document the overall complexity of expressed sequences in Neurospora, including alternative splicing alternative promoters and antisense transcripts. In addition, these studies have driven the assembly of an SNP map presently populated by nearly 300 markers that will greatly accelerate the positional cloning of genes.

  6. [Whole Genome Sequencing of Human mtDNA Based on Ion Torrent PGM™ Platform].

    Science.gov (United States)

    Cao, Y; Zou, K N; Huang, J P; Ma, K; Ping, Y

    2017-08-01

    To analyze and detect the whole genome sequence of human mitochondrial DNA (mtDNA) by Ion Torrent PGM™ platform and to study the differences of mtDNA sequence in different tissues. Samples were collected from 6 unrelated individuals by forensic postmortem examination, including chest blood, hair, costicartilage, nail, skeletal muscle and oral epithelium. Amplification of whole genome sequence of mtDNA was performed by 4 pairs of primer. Libraries were constructed with Ion Shear™ Plus Reagents kit and Ion Plus Fragment Library kit. Whole genome sequencing of mtDNA was performed using Ion Torrent PGM™ platform. Sanger sequencing was used to determine the heteroplasmy positions and the mutation positions on HVⅠ region. The whole genome sequence of mtDNA from all samples were amplified successfully. Six unrelated individuals belonged to 6 different haplotypes. Different tissues in one individual had heteroplasmy difference. The heteroplasmy positions and the mutation positions on HVⅠ region were verified by Sanger sequencing. After a consistency check by the Kappa method, it was found that the results of mtDNA sequence had a high consistency in different tissues. The testing method used in present study for sequencing the whole genome sequence of human mtDNA can detect the heteroplasmy difference in different tissues, which have good consistency. The results provide guidance for the further applications of mtDNA in forensic science. Copyright© by the Editorial Department of Journal of Forensic Medicine

  7. Genomic Enzymology: Web Tools for Leveraging Protein Family Sequence-Function Space and Genome Context to Discover Novel Functions.

    Science.gov (United States)

    Gerlt, John A

    2017-08-22

    The exponentially increasing number of protein and nucleic acid sequences provides opportunities to discover novel enzymes, metabolic pathways, and metabolites/natural products, thereby adding to our knowledge of biochemistry and biology. The challenge has evolved from generating sequence information to mining the databases to integrating and leveraging the available information, i.e., the availability of "genomic enzymology" web tools. Web tools that allow identification of biosynthetic gene clusters are widely used by the natural products/synthetic biology community, thereby facilitating the discovery of novel natural products and the enzymes responsible for their biosynthesis. However, many novel enzymes with interesting mechanisms participate in uncharacterized small-molecule metabolic pathways; their discovery and functional characterization also can be accomplished by leveraging information in protein and nucleic acid databases. This Perspective focuses on two genomic enzymology web tools that assist the discovery novel metabolic pathways: (1) Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST) for generating sequence similarity networks to visualize and analyze sequence-function space in protein families and (2) Enzyme Function Initiative-Genome Neighborhood Tool (EFI-GNT) for generating genome neighborhood networks to visualize and analyze the genome context in microbial and fungal genomes. Both tools have been adapted to other applications to facilitate target selection for enzyme discovery and functional characterization. As the natural products community has demonstrated, the enzymology community needs to embrace the essential role of web tools that allow the protein and genome sequence databases to be leveraged for novel insights into enzymological problems.

  8. Comparison of two Next Generation sequencing platforms for full genome sequencing of Classical Swine Fever Virus

    DEFF Research Database (Denmark)

    Fahnøe, Ulrik; Pedersen, Anders Gorm; Höper, Dirk

    2013-01-01

    to the consensus sequence. Additionally, we got an average sequence depth for the genome of 4000 for the Iontorrent PGM and 400 for the FLX platform making the mapping suitable for single nucleotide variant (SNV) detection. The analysis revealed a single non-silent SNV A10665G leading to the amino acid change D......Next Generation Sequencing (NGS) is becoming more adopted into viral research and will be the preferred technology in the years to come. We have recently sequenced several strains of Classical Swine Fever Virus (CSFV) by NGS on both Genome Sequencer FLX (GS FLX) and Iontorrent PGM platforms...

  9. Development of a Rhizoctonia solani AG1-IB Specific Gene Model Enables Comparative Genome Analyses between Phytopathogenic R. solani AG1-IA, AG1-IB, AG3 and AG8 Isolates.

    Directory of Open Access Journals (Sweden)

    Daniel Wibberg

    Full Text Available Rhizoctonia solani, a soil-born plant pathogenic basidiomycetous fungus, affects various economically important agricultural and horticultural crops. The draft genome sequence for the R. solani AG1-IB isolate 7/3/14 as well as a corresponding transcriptome dataset (Expressed Sequence Tags--ESTs were established previously. Development of a specific R. solani AG1-IB gene model based on GMAP transcript mapping within the eukaryotic gene prediction platform AUGUSTUS allowed detection of new genes and provided insights into the gene structure of this fungus. In total, 12,616 genes were recognized in the genome of the AG1-IB isolate. Analysis of predicted genes by means of different bioinformatics tools revealed new genes whose products potentially are involved in degradation of plant cell wall components, melanin formation and synthesis of secondary metabolites. Comparative genome analyses between members of different R. solani anastomosis groups, namely AG1-IA, AG3 and AG8 and the newly annotated R. solani AG1-IB genome were performed within the comparative genomics platform EDGAR. It appeared that only 21 to 28% of all genes encoded in the draft genomes of the different strains were identified as core genes. Based on Average Nucleotide Identity (ANI and Average Amino-acid Identity (AAI analyses, considerable sequence differences between isolates representing different anastomosis groups were identified. However, R. solani isolates form a distinct cluster in relation to other fungi of the phylum Basidiomycota. The isolate representing AG1-IB encodes significant more genes featuring predictable functions in secondary metabolite production compared to other completely sequenced R. solani strains. The newly established R. solani AG1-IB 7/3/14 gene layout now provides a reliable basis for post-genomics studies.

  10. Assembly of the Lactuca sativa, L. cv. Tizian draft genome sequence reveals differences within major resistance complex 1 as compared to the cv. Salinas reference genome.

    Science.gov (United States)

    Verwaaijen, Bart; Wibberg, Daniel; Nelkner, Johanna; Gordin, Miriam; Rupp, Oliver; Winkler, Anika; Bremges, Andreas; Blom, Jochen; Grosch, Rita; Pühler, Alfred; Schlüter, Andreas

    2018-02-10

    Lettuce (Lactuca sativa, L.) is an important annual plant of the family Asteraceae (Compositae). The commercial lettuce cultivar Tizian has been used in various scientific studies investigating the interaction of the plant with phytopathogens or biological control agents. Here, we present the de novo draft genome sequencing and gene prediction for this specific cultivar derived from transcriptome sequence data. The assembled scaffolds amount to a size of 2.22 Gb. Based on RNAseq data, 31,112 transcript isoforms were identified. Functional predictions for these transcripts were determined within the GenDBE annotation platform. Comparison with the cv. Salinas reference genome revealed a high degree of sequence similarity on genome and transcriptome levels, with an average amino acid identity of 99%. Furthermore, it was observed that two large regions are either missing or are highly divergent within the cv. Tizian genome compared to cv. Salinas. One of these regions covers the major resistance complex 1 region of cv. Salinas. The cv. Tizian draft genome sequence provides a valuable resource for future functional and transcriptome analyses focused on this lettuce cultivar. Copyright © 2017 Elsevier B.V. All rights reserved.

  11. Enabling the democratization of the genomics revolution with a fully integrated web-based bioinformatics platform, Version 1.5 and 1.x.

    Energy Technology Data Exchange (ETDEWEB)

    2017-05-18

    EDGE bioinformatics was developed to help biologists process Next Generation Sequencing data (in the form of raw FASTQ files), even if they have little to no bioinformatics expertise. EDGE is a highly integrated and interactive web-based platform that is capable of running many of the standard analyses that biologists require for viral, bacterial/archaeal, and metagenomic samples. EDGE provides the following analytical workflows: quality trimming and host removal, assembly and annotation, comparisons against known references, taxonomy classification of reads and contigs, whole genome SNP-based phylogenetic analysis, and PCR analysis. EDGE provides an intuitive web-based interface for user input, allows users to visualize and interact with selected results (e.g. JBrowse genome browser), and generates a final detailed PDF report. Results in the form of tables, text files, graphic files, and PDFs can be downloaded. A user management system allows tracking of an individual’s EDGE runs, along with the ability to share, post publicly, delete, or archive their results.

  12. Biodegradation of DDT by Stenotrophomonas sp. DDT-1: Characterization and genome functional analysis.

    Science.gov (United States)

    Pan, Xiong; Lin, Dunli; Zheng, Yuan; Zhang, Qian; Yin, Yuanming; Cai, Lin; Fang, Hua; Yu, Yunlong

    2016-02-18

    A novel bacterium capable of utilizing 1,1,1-trichloro-2,2-bis(p-chlorophenyl)ethane (DDT) as the sole carbon and energy source was isolated from a contaminated soil which was identified as Stenotrophomonas sp. DDT-1 based on morphological characteristics, BIOLOG GN2 microplate profile, and 16S rDNA phylogeny. Genome sequencing and functional annotation of the isolate DDT-1 showed a 4,514,569 bp genome size, 66.92% GC content, 4,033 protein-coding genes, and 76 RNA genes including 8 rRNA genes. Totally, 2,807 protein-coding genes were assigned to Clusters of Orthologous Groups (COGs), and 1,601 protein-coding genes were mapped to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway. The degradation half-lives of DDT increased with substrate concentration from 0.1 to 10.0 mg/l, whereas decreased with temperature from 15 °C to 35 °C. Neutral condition was the most favorable for DDT biodegradation. Based on genome annotation of DDT degradation genes and the metabolites detected by GC-MS, a mineralization pathway was proposed for DDT biodegradation in which it was orderly converted into DDE/DDD, DDMU, DDOH, and DDA via dechlorination, hydroxylation, and carboxylation, and ultimately mineralized to carbon dioxide. The results indicate that the isolate DDT-1 is a promising bacterial resource for the removal or detoxification of DDT residues in the environment.

  13. Functional annotation of rare gene aberration drivers of pancreatic cancer | Office of Cancer Genomics

    Science.gov (United States)

    As we enter the era of precision medicine, characterization of cancer genomes will directly influence therapeutic decisions in the clinic. Here we describe a platform enabling functionalization of rare gene mutations through their high-throughput construction, molecular barcoding and delivery to cancer models for in vivo tumour driver screens. We apply these technologies to identify oncogenic drivers of pancreatic ductal adenocarcinoma (PDAC).

  14. MycoCAP - Mycobacterium Comparative Analysis Platform.

    Science.gov (United States)

    Choo, Siew Woh; Ang, Mia Yang; Dutta, Avirup; Tan, Shi Yang; Siow, Cheuk Chuen; Heydari, Hamed; Mutha, Naresh V R; Wee, Wei Yee; Wong, Guat Jah

    2015-12-15

    Mycobacterium spp. are renowned for being the causative agent of diseases like leprosy, Buruli ulcer and tuberculosis in human beings. With more and more mycobacterial genomes being sequenced, any knowledge generated from comparative genomic analysis would provide better insights into the biology, evolution, phylogeny and pathogenicity of this genus, thus helping in better management of diseases caused by Mycobacterium spp.With this motivation, we constructed MycoCAP, a new comparative analysis platform dedicated to the important genus Mycobacterium. This platform currently provides information of 2108 genome sequences of at least 55 Mycobacterium spp. A number of intuitive web-based tools have been integrated in MycoCAP particularly for comparative analysis including the PGC tool for comparison between two genomes, PathoProT for comparing the virulence genes among the Mycobacterium strains and the SuperClassification tool for the phylogenic classification of the Mycobacterium strains and a specialized classification system for strains of Mycobacterium abscessus. We hope the broad range of functions and easy-to-use tools provided in MycoCAP makes it an invaluable analysis platform to speed up the research discovery on mycobacteria for researchers. Database URL: http://mycobacterium.um.edu.my.

  15. 2004 Structural, Function and Evolutionary Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Douglas L. Brutlag Nancy Ryan Gray

    2005-03-23

    This Gordon conference will cover the areas of structural, functional and evolutionary genomics. It will take a systematic approach to genomics, examining the evolution of proteins, protein functional sites, protein-protein interactions, regulatory networks, and metabolic networks. Emphasis will be placed on what we can learn from comparative genomics and entire genomes and proteomes.

  16. MiSeq: A Next Generation Sequencing Platform for Genomic Analysis.

    Science.gov (United States)

    Ravi, Rupesh Kanchi; Walton, Kendra; Khosroheidari, Mahdieh

    2018-01-01

    MiSeq, Illumina's integrated next generation sequencing instrument, uses reversible-terminator sequencing-by-synthesis technology to provide end-to-end sequencing solutions. The MiSeq instrument is one of the smallest benchtop sequencers that can perform onboard cluster generation, amplification, genomic DNA sequencing, and data analysis, including base calling, alignment and variant calling, in a single run. It performs both single- and paired-end runs with adjustable read lengths from 1 × 36 base pairs to 2 × 300 base pairs. A single run can produce output data of up to 15 Gb in as little as 4 h of runtime and can output up to 25 M single reads and 50 M paired-end reads. Thus, MiSeq provides an ideal platform for rapid turnaround time. MiSeq is also a cost-effective tool for various analyses focused on targeted gene sequencing (amplicon sequencing and target enrichment), metagenomics, and gene expression studies. For these reasons, MiSeq has become one of the most widely used next generation sequencing platforms. Here, we provide a protocol to prepare libraries for sequencing using the MiSeq instrument and basic guidelines for analysis of output data from the MiSeq sequencing run.

  17. Genome-scale metabolic models as platforms for strain design and biological discovery.

    Science.gov (United States)

    Mienda, Bashir Sajo

    2017-07-01

    Genome-scale metabolic models (GEMs) have been developed and used in guiding systems' metabolic engineering strategies for strain design and development. This strategy has been used in fermentative production of bio-based industrial chemicals and fuels from alternative carbon sources. However, computer-aided hypotheses building using established algorithms and software platforms for biological discovery can be integrated into the pipeline for strain design strategy to create superior strains of microorganisms for targeted biosynthetic goals. Here, I described an integrated workflow strategy using GEMs for strain design and biological discovery. Specific case studies of strain design and biological discovery using Escherichia coli genome-scale model are presented and discussed. The integrated workflow presented herein, when applied carefully would help guide future design strategies for high-performance microbial strains that have existing and forthcoming genome-scale metabolic models.

  18. antiSMASH 2.0-a versatile platform for genome mining of secondary metabolite producers

    NARCIS (Netherlands)

    Blin, Kai; Medema, Marnix H.; Kazempour, Daniyal; Fischbach, Michael A.; Breitling, Rainer; Takano, Eriko; Weber, Tilmann

    Microbial secondary metabolites are a potent source of antibiotics and other pharmaceuticals. Genome mining of their biosynthetic gene clusters has become a key method to accelerate their identification and characterization. In 2011, we developed antiSMASH, a web-based analysis platform that

  19. Multi-platform whole-genome microarray analyses refine the epigenetic signature of breast cancer metastasis with gene expression and copy number.

    Directory of Open Access Journals (Sweden)

    Joseph Andrews

    2010-01-01

    Full Text Available We have previously identified genome-wide DNA methylation changes in a cell line model of breast cancer metastasis. These complex epigenetic changes that we observed, along with concurrent karyotype analyses, have led us to hypothesize that complex genomic alterations in cancer cells (deletions, translocations and ploidy are superimposed over promoter-specific methylation events that are responsible for gene-specific expression changes observed in breast cancer metastasis.We undertook simultaneous high-resolution, whole-genome analyses of MDA-MB-468GFP and MDA-MB-468GFP-LN human breast cancer cell lines (an isogenic, paired lymphatic metastasis cell line model using Affymetrix gene expression (U133, promoter (1.0R, and SNP/CNV (SNP 6.0 microarray platforms to correlate data from gene expression, epigenetic (DNA methylation, and combination copy number variant/single nucleotide polymorphism microarrays. Using Partek Software and Ingenuity Pathway Analysis we integrated datasets from these three platforms and detected multiple hypomethylation and hypermethylation events. Many of these epigenetic alterations correlated with gene expression changes. In addition, gene dosage events correlated with the karyotypic differences observed between the cell lines and were reflected in specific promoter methylation patterns. Gene subsets were identified that correlated hyper (and hypo methylation with the loss (or gain of gene expression and in parallel, with gene dosage losses and gains, respectively. Individual gene targets from these subsets were also validated for their methylation, expression and copy number status, and susceptible gene pathways were identified that may indicate how selective advantage drives the processes of tumourigenesis and metastasis.Our approach allows more precisely profiling of functionally relevant epigenetic signatures that are associated with cancer progression and metastasis.

  20. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project.

    Science.gov (United States)

    Birney, Ewan; Stamatoyannopoulos, John A; Dutta, Anindya; Guigó, Roderic; Gingeras, Thomas R; Margulies, Elliott H; Weng, Zhiping; Snyder, Michael; Dermitzakis, Emmanouil T; Thurman, Robert E; Kuehn, Michael S; Taylor, Christopher M; Neph, Shane; Koch, Christoph M; Asthana, Saurabh; Malhotra, Ankit; Adzhubei, Ivan; Greenbaum, Jason A; Andrews, Robert M; Flicek, Paul; Boyle, Patrick J; Cao, Hua; Carter, Nigel P; Clelland, Gayle K; Davis, Sean; Day, Nathan; Dhami, Pawandeep; Dillon, Shane C; Dorschner, Michael O; Fiegler, Heike; Giresi, Paul G; Goldy, Jeff; Hawrylycz, Michael; Haydock, Andrew; Humbert, Richard; James, Keith D; Johnson, Brett E; Johnson, Ericka M; Frum, Tristan T; Rosenzweig, Elizabeth R; Karnani, Neerja; Lee, Kirsten; Lefebvre, Gregory C; Navas, Patrick A; Neri, Fidencio; Parker, Stephen C J; Sabo, Peter J; Sandstrom, Richard; Shafer, Anthony; Vetrie, David; Weaver, Molly; Wilcox, Sarah; Yu, Man; Collins, Francis S; Dekker, Job; Lieb, Jason D; Tullius, Thomas D; Crawford, Gregory E; Sunyaev, Shamil; Noble, William S; Dunham, Ian; Denoeud, France; Reymond, Alexandre; Kapranov, Philipp; Rozowsky, Joel; Zheng, Deyou; Castelo, Robert; Frankish, Adam; Harrow, Jennifer; Ghosh, Srinka; Sandelin, Albin; Hofacker, Ivo L; Baertsch, Robert; Keefe, Damian; Dike, Sujit; Cheng, Jill; Hirsch, Heather A; Sekinger, Edward A; Lagarde, Julien; Abril, Josep F; Shahab, Atif; Flamm, Christoph; Fried, Claudia; Hackermüller, Jörg; Hertel, Jana; Lindemeyer, Manja; Missal, Kristin; Tanzer, Andrea; Washietl, Stefan; Korbel, Jan; Emanuelsson, Olof; Pedersen, Jakob S; Holroyd, Nancy; Taylor, Ruth; Swarbreck, David; Matthews, Nicholas; Dickson, Mark C; Thomas, Daryl J; Weirauch, Matthew T; Gilbert, James; Drenkow, Jorg; Bell, Ian; Zhao, XiaoDong; Srinivasan, K G; Sung, Wing-Kin; Ooi, Hong Sain; Chiu, Kuo Ping; Foissac, Sylvain; Alioto, Tyler; Brent, Michael; Pachter, Lior; Tress, Michael L; Valencia, Alfonso; Choo, Siew Woh; Choo, Chiou Yu; Ucla, Catherine; Manzano, Caroline; Wyss, Carine; Cheung, Evelyn; Clark, Taane G; Brown, James B; Ganesh, Madhavan; Patel, Sandeep; Tammana, Hari; Chrast, Jacqueline; Henrichsen, Charlotte N; Kai, Chikatoshi; Kawai, Jun; Nagalakshmi, Ugrappa; Wu, Jiaqian; Lian, Zheng; Lian, Jin; Newburger, Peter; Zhang, Xueqing; Bickel, Peter; Mattick, John S; Carninci, Piero; Hayashizaki, Yoshihide; Weissman, Sherman; Hubbard, Tim; Myers, Richard M; Rogers, Jane; Stadler, Peter F; Lowe, Todd M; Wei, Chia-Lin; Ruan, Yijun; Struhl, Kevin; Gerstein, Mark; Antonarakis, Stylianos E; Fu, Yutao; Green, Eric D; Karaöz, Ulaş; Siepel, Adam; Taylor, James; Liefer, Laura A; Wetterstrand, Kris A; Good, Peter J; Feingold, Elise A; Guyer, Mark S; Cooper, Gregory M; Asimenos, George; Dewey, Colin N; Hou, Minmei; Nikolaev, Sergey; Montoya-Burgos, Juan I; Löytynoja, Ari; Whelan, Simon; Pardi, Fabio; Massingham, Tim; Huang, Haiyan; Zhang, Nancy R; Holmes, Ian; Mullikin, James C; Ureta-Vidal, Abel; Paten, Benedict; Seringhaus, Michael; Church, Deanna; Rosenbloom, Kate; Kent, W James; Stone, Eric A; Batzoglou, Serafim; Goldman, Nick; Hardison, Ross C; Haussler, David; Miller, Webb; Sidow, Arend; Trinklein, Nathan D; Zhang, Zhengdong D; Barrera, Leah; Stuart, Rhona; King, David C; Ameur, Adam; Enroth, Stefan; Bieda, Mark C; Kim, Jonghwan; Bhinge, Akshay A; Jiang, Nan; Liu, Jun; Yao, Fei; Vega, Vinsensius B; Lee, Charlie W H; Ng, Patrick; Shahab, Atif; Yang, Annie; Moqtaderi, Zarmik; Zhu, Zhou; Xu, Xiaoqin; Squazzo, Sharon; Oberley, Matthew J; Inman, David; Singer, Michael A; Richmond, Todd A; Munn, Kyle J; Rada-Iglesias, Alvaro; Wallerman, Ola; Komorowski, Jan; Fowler, Joanna C; Couttet, Phillippe; Bruce, Alexander W; Dovey, Oliver M; Ellis, Peter D; Langford, Cordelia F; Nix, David A; Euskirchen, Ghia; Hartman, Stephen; Urban, Alexander E; Kraus, Peter; Van Calcar, Sara; Heintzman, Nate; Kim, Tae Hoon; Wang, Kun; Qu, Chunxu; Hon, Gary; Luna, Rosa; Glass, Christopher K; Rosenfeld, M Geoff; Aldred, Shelley Force; Cooper, Sara J; Halees, Anason; Lin, Jane M; Shulha, Hennady P; Zhang, Xiaoling; Xu, Mousheng; Haidar, Jaafar N S; Yu, Yong; Ruan, Yijun; Iyer, Vishwanath R; Green, Roland D; Wadelius, Claes; Farnham, Peggy J; Ren, Bing; Harte, Rachel A; Hinrichs, Angie S; Trumbower, Heather; Clawson, Hiram; Hillman-Jackson, Jennifer; Zweig, Ann S; Smith, Kayla; Thakkapallayil, Archana; Barber, Galt; Kuhn, Robert M; Karolchik, Donna; Armengol, Lluis; Bird, Christine P; de Bakker, Paul I W; Kern, Andrew D; Lopez-Bigas, Nuria; Martin, Joel D; Stranger, Barbara E; Woodroffe, Abigail; Davydov, Eugene; Dimas, Antigone; Eyras, Eduardo; Hallgrímsdóttir, Ingileif B; Huppert, Julian; Zody, Michael C; Abecasis, Gonçalo R; Estivill, Xavier; Bouffard, Gerard G; Guan, Xiaobin; Hansen, Nancy F; Idol, Jacquelyn R; Maduro, Valerie V B; Maskeri, Baishali; McDowell, Jennifer C; Park, Morgan; Thomas, Pamela J; Young, Alice C; Blakesley, Robert W; Muzny, Donna M; Sodergren, Erica; Wheeler, David A; Worley, Kim C; Jiang, Huaiyang; Weinstock, George M; Gibbs, Richard A; Graves, Tina; Fulton, Robert; Mardis, Elaine R; Wilson, Richard K; Clamp, Michele; Cuff, James; Gnerre, Sante; Jaffe, David B; Chang, Jean L; Lindblad-Toh, Kerstin; Lander, Eric S; Koriabine, Maxim; Nefedov, Mikhail; Osoegawa, Kazutoyo; Yoshinaga, Yuko; Zhu, Baoli; de Jong, Pieter J

    2007-06-14

    We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.

  1. Simultaneous and complete genome sequencing of influenza A and B with high coverage by Illumina MiSeq Platform.

    Science.gov (United States)

    Rutvisuttinunt, Wiriya; Chinnawirotpisan, Piyawan; Simasathien, Sriluck; Shrestha, Sanjaya K; Yoon, In-Kyu; Klungthong, Chonticha; Fernandez, Stefan

    2013-11-01

    Active global surveillance and characterization of influenza viruses are essential for better preparation against possible pandemic events. Obtaining comprehensive information about the influenza genome can improve our understanding of the evolution of influenza viruses and emergence of new strains, and improve the accuracy when designing preventive vaccines. This study investigated the use of deep sequencing by the next-generation sequencing (NGS) Illumina MiSeq Platform to obtain complete genome sequence information from influenza virus isolates. The influenza virus isolates were cultured from 6 respiratory acute clinical specimens collected in Thailand and Nepal. DNA libraries obtained from each viral isolate were mixed and all were sequenced simultaneously. Total information of 2.6 Gbases was obtained from a 455±14 K/mm2 density with 95.76% (8,571,655/8,950,724 clusters) of the clusters passing quality control (QC) filters. Approximately 93.7% of all sequences from Read1 and 83.5% from Read2 contained high quality sequences that were ≥Q30, a base calling QC score standard. Alignments analysis identified three seasonal influenza A H3N2 strains, one 2009 pandemic influenza A H1N1 strain and two influenza B strains. The nearly entire genomes of all six virus isolates yielded equal or greater than 600-fold sequence coverage depth. MiSeq Platform identified seasonal influenza A H3N2, 2009 pandemic influenza A H1N1and influenza B in the DNA library mixtures efficiently. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.

  2. Web Apollo: a web-based genomic annotation editing platform.

    Science.gov (United States)

    Lee, Eduardo; Helt, Gregg A; Reese, Justin T; Munoz-Torres, Monica C; Childers, Chris P; Buels, Robert M; Stein, Lincoln; Holmes, Ian H; Elsik, Christine G; Lewis, Suzanna E

    2013-08-30

    Web Apollo is the first instantaneous, collaborative genomic annotation editor available on the web. One of the natural consequences following from current advances in sequencing technology is that there are more and more researchers sequencing new genomes. These researchers require tools to describe the functional features of their newly sequenced genomes. With Web Apollo researchers can use any of the common browsers (for example, Chrome or Firefox) to jointly analyze and precisely describe the features of a genome in real time, whether they are in the same room or working from opposite sides of the world.

  3. IVAG: An Integrative Visualization Application for Various Types of Genomic Data Based on R-Shiny and the Docker Platform.

    Science.gov (United States)

    Lee, Tae-Rim; Ahn, Jin Mo; Kim, Gyuhee; Kim, Sangsoo

    2017-12-01

    Next-generation sequencing (NGS) technology has become a trend in the genomics research area. There are many software programs and automated pipelines to analyze NGS data, which can ease the pain for traditional scientists who are not familiar with computer programming. However, downstream analyses, such as finding differentially expressed genes or visualizing linkage disequilibrium maps and genome-wide association study (GWAS) data, still remain a challenge. Here, we introduce a dockerized web application written in R using the Shiny platform to visualize pre-analyzed RNA sequencing and GWAS data. In addition, we have integrated a genome browser based on the JBrowse platform and an automated intermediate parsing process required for custom track construction, so that users can easily build and navigate their personal genome tracks with in-house datasets. This application will help scientists perform series of downstream analyses and obtain a more integrative understanding about various types of genomic data by interactively visualizing them with customizable options.

  4. Genome-wide resequencing of KRICE_CORE reveals their potential for future breeding, as well as functional and evolutionary studies in the post-genomic era.

    Science.gov (United States)

    Kim, Tae-Sung; He, Qiang; Kim, Kyu-Won; Yoon, Min-Young; Ra, Won-Hee; Li, Feng Peng; Tong, Wei; Yu, Jie; Oo, Win Htet; Choi, Buung; Heo, Eun-Beom; Yun, Byoung-Kook; Kwon, Soon-Jae; Kwon, Soon-Wook; Cho, Yoo-Hyun; Lee, Chang-Yong; Park, Beom-Seok; Park, Yong-Jin

    2016-05-26

    Rice germplasm collections continue to grow in number and size around the world. Since maintaining and screening such massive resources remains challenging, it is important to establish practical methods to manage them. A core collection, by definition, refers to a subset of the entire population that preserves the majority of genetic diversity, enhancing the efficiency of germplasm utilization. Here, we report whole-genome resequencing of the 137 rice mini core collection or Korean rice core set (KRICE_CORE) that represents 25,604 rice germplasms deposited in the Korean genebank of the Rural Development Administration (RDA). We implemented the Illumina HiSeq 2000 and 2500 platform to produce short reads and then assembled those with 9.8 depths using Nipponbare as a reference. Comparisons of the sequences with the reference genome yielded more than 15 million (M) single nucleotide polymorphisms (SNPs) and 1.3 M INDELs. Phylogenetic and population analyses using 2,046,529 high-quality SNPs successfully assigned rice accessions to the relevant rice subgroups, suggesting that these SNPs capture evolutionary signatures that have accumulated in rice subpopulations. Furthermore, genome-wide association studies (GWAS) for four exemplary agronomic traits in the KRIC_CORE manifest the utility of KRICE_CORE; that is, identifying previously defined genes or novel genetic factors that potentially regulate important phenotypes. This study provides strong evidence that the size of KRICE_CORE is small but contains high genetic and functional diversity across the genome. Thus, our resequencing results will be useful for future breeding, as well as functional and evolutionary studies, in the post-genomic era.

  5. Clonal expansion of genome-intact HIV-1 in functionally polarized Th1 CD4+ T cells.

    Science.gov (United States)

    Lee, Guinevere Q; Orlova-Fink, Nina; Einkauf, Kevin; Chowdhury, Fatema Z; Sun, Xiaoming; Harrington, Sean; Kuo, Hsiao-Hsuan; Hua, Stephane; Chen, Hsiao-Rong; Ouyang, Zhengyu; Reddy, Kavidha; Dong, Krista; Ndung'u, Thumbi; Walker, Bruce D; Rosenberg, Eric S; Yu, Xu G; Lichterfeld, Mathias

    2017-06-30

    HIV-1 causes a chronic, incurable disease due to its persistence in CD4+ T cells that contain replication-competent provirus, but exhibit little or no active viral gene expression and effectively resist combination antiretroviral therapy (cART). These latently infected T cells represent an extremely small proportion of all circulating CD4+ T cells but possess a remarkable long-term stability and typically persist throughout life, for reasons that are not fully understood. Here we performed massive single-genome, near-full-length next-generation sequencing of HIV-1 DNA derived from unfractionated peripheral blood mononuclear cells, ex vivo-isolated CD4+ T cells, and subsets of functionally polarized memory CD4+ T cells. This approach identified multiple sets of independent, near-full-length proviral sequences from cART-treated individuals that were completely identical, consistent with clonal expansion of CD4+ T cells harboring intact HIV-1. Intact, near-full-genome HIV-1 DNA sequences that were derived from such clonally expanded CD4+ T cells constituted 62% of all analyzed genome-intact sequences in memory CD4 T cells, were preferentially observed in Th1-polarized cells, were longitudinally detected over a duration of up to 5 years, and were fully replication- and infection-competent. Together, these data suggest that clonal proliferation of Th1-polarized CD4+ T cells encoding for intact HIV-1 represents a driving force for stabilizing the pool of latently infected CD4+ T cells.

  6. Genomic analysis of thermophilic Bacillus coagulans strains: efficient producers for platform bio-chemicals.

    Science.gov (United States)

    Su, Fei; Xu, Ping

    2014-01-29

    Microbial strains with high substrate efficiency and excellent environmental tolerance are urgently needed for the production of platform bio-chemicals. Bacillus coagulans has these merits; however, little genetic information is available about this species. Here, we determined the genome sequences of five B. coagulans strains, and used a comparative genomic approach to reconstruct the central carbon metabolism of this species to explain their fermentation features. A novel xylose isomerase in the xylose utilization pathway was identified in these strains. Based on a genome-wide positive selection scan, the selection pressure on amino acid metabolism may have played a significant role in the thermal adaptation. We also researched the immune systems of B. coagulans strains, which provide them with acquired resistance to phages and mobile genetic elements. Our genomic analysis provides comprehensive insights into the genetic characteristics of B. coagulans and paves the way for improving and extending the uses of this species.

  7. Finding function: evaluation methods for functional genomic data

    Directory of Open Access Journals (Sweden)

    Barrett Daniel R

    2006-07-01

    Full Text Available Abstract Background Accurate evaluation of the quality of genomic or proteomic data and computational methods is vital to our ability to use them for formulating novel biological hypotheses and directing further experiments. There is currently no standard approach to evaluation in functional genomics. Our analysis of existing approaches shows that they are inconsistent and contain substantial functional biases that render the resulting evaluations misleading both quantitatively and qualitatively. These problems make it essentially impossible to compare computational methods or large-scale experimental datasets and also result in conclusions that generalize poorly in most biological applications. Results We reveal issues with current evaluation methods here and suggest new approaches to evaluation that facilitate accurate and representative characterization of genomic methods and data. Specifically, we describe a functional genomics gold standard based on curation by expert biologists and demonstrate its use as an effective means of evaluation of genomic approaches. Our evaluation framework and gold standard are freely available to the community through our website. Conclusion Proper methods for evaluating genomic data and computational approaches will determine how much we, as a community, are able to learn from the wealth of available data. We propose one possible solution to this problem here but emphasize that this topic warrants broader community discussion.

  8. A multi-platform draft de novo genome assembly and comparative analysis for the Scarlet Macaw (Ara macao.

    Directory of Open Access Journals (Sweden)

    Christopher M Seabury

    Full Text Available Data deposition to NCBI Genomes: This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession AMXX00000000 (SMACv1.0, unscaffolded genome assembly. The version described in this paper is the first version (AMXX01000000. The scaffolded assembly (SMACv1.1 has been deposited at DDBJ/EMBL/GenBank under the accession AOUJ00000000, and is also the first version (AOUJ01000000. Strong biological interest in traits such as the acquisition and utilization of speech, cognitive abilities, and longevity catalyzed the utilization of two next-generation sequencing platforms to provide the first-draft de novo genome assembly for the large, new world parrot Ara macao (Scarlet Macaw. Despite the challenges associated with genome assembly for an outbred avian species, including 951,507 high-quality putative single nucleotide polymorphisms, the final genome assembly (>1.035 Gb includes more than 997 Mb of unambiguous sequence data (excluding N's. Cytogenetic analyses including ZooFISH revealed complex rearrangements associated with two scarlet macaw macrochromosomes (AMA6, AMA7, which supports the hypothesis that translocations, fusions, and intragenomic rearrangements are key factors associated with karyotype evolution among parrots. In silico annotation of the scarlet macaw genome provided robust evidence for 14,405 nuclear gene annotation models, their predicted transcripts and proteins, and a complete mitochondrial genome. Comparative analyses involving the scarlet macaw, chicken, and zebra finch genomes revealed high levels of nucleotide-based conservation as well as evidence for overall genome stability among the three highly divergent species. Application of a new whole-genome analysis of divergence involving all three species yielded prioritized candidate genes and noncoding regions for parrot traits of interest (i.e., speech, intelligence, longevity which were independently supported by the results of previous human GWAS

  9. NU-IN: Nucleotide evolution and input module for the EvolSimulator genome simulation platform

    Directory of Open Access Journals (Sweden)

    Barker Michael S

    2010-08-01

    Full Text Available Abstract Background There is increasing demand to test hypotheses that contrast the evolution of genes and gene families among genomes, using simulations that work across these levels of organization. The EvolSimulator program was developed recently to provide a highly flexible platform for forward simulations of amino acid evolution in multiple related lineages of haploid genomes, permitting copy number variation and lateral gene transfer. Synonymous nucleotide evolution is not currently supported, however, and would be highly advantageous for comparisons to full genome, transcriptome, and single nucleotide polymorphism (SNP datasets. In addition, EvolSimulator creates new genomes for each simulation, and does not allow the input of user-specified sequences and gene family information, limiting the incorporation of further biological realism and/or user manipulations of the data. Findings We present modified C++ source code for the EvolSimulator platform, which we provide as the extension module NU-IN. With NU-IN, synonymous and non-synonymous nucleotide evolution is fully implemented, and the user has the ability to use real or previously-simulated sequence data to initiate a simulation of one or more lineages. Gene family membership can be optionally specified, as well as gene retention probabilities that model biased gene retention. We provide PERL scripts to assist the user in deriving this information from previous simulations. We demonstrate the features of NU-IN by simulating genome duplication (polyploidy in the presence of ongoing copy number variation in an evolving lineage. This example is initiated with real genomic data, and produces output that we analyse directly with existing bioinformatic pipelines. Conclusions The NU-IN extension module is a publicly available open source software (GNU GPLv3 license extension to EvolSimulator. With the NU-IN module, users are now able to simulate both drift and selection at the nucleotide

  10. Enabling systematic interrogation of protein-protein interactions in live cells with a versatile ultra-high-throughput biosensor platform | Office of Cancer Genomics

    Science.gov (United States)

    The vast datasets generated by next generation gene sequencing and expression profiling have transformed biological and translational research. However, technologies to produce large-scale functional genomics datasets, such as high-throughput detection of protein-protein interactions (PPIs), are still in early development. While a number of powerful technologies have been employed to detect PPIs, a singular PPI biosensor platform featured with both high sensitivity and robustness in a mammalian cell environment remains to be established.

  11. A Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance

    DEFF Research Database (Denmark)

    Thomsen, Martin Christen Frølund; Ahrenfeldt, Johanne; Bellod Cisneros, Jose Luis

    2016-01-01

    and made publicly available, providing easy-to-use automated analysis of bacterial whole genome sequencing data. The platform may be of immediate relevance as a guide for investigators using whole genome sequencing for clinical diagnostics and surveillance. The platform is freely available at: https://cge.cbs.dtu.dk/services...... and antimicrobial resistance genes. A short printable report for each sample will be provided and an Excel spreadsheet containing all the metadata and a summary of the results for all submitted samples can be downloaded. The pipeline was benchmarked using datasets previously used to test the individual services...

  12. Genomics With Cloud Computing

    OpenAIRE

    Sukhamrit Kaur; Sandeep Kaur

    2015-01-01

    Abstract Genomics is study of genome which provides large amount of data for which large storage and computation power is needed. These issues are solved by cloud computing that provides various cloud platforms for genomics. These platforms provides many services to user like easy access to data easy sharing and transfer providing storage in hundreds of terabytes more computational power. Some cloud platforms are Google genomics DNAnexus and Globus genomics. Various features of cloud computin...

  13. In vivo genome editing of the albumin locus as a platform for protein replacement therapy

    Science.gov (United States)

    Sharma, Rajiv; Anguela, Xavier M.; Doyon, Yannick; Wechsler, Thomas; DeKelver, Russell C.; Sproul, Scott; Paschon, David E.; Miller, Jeffrey C.; Davidson, Robert J.; Shivak, David; Zhou, Shangzhen; Rieders, Julianne; Gregory, Philip D.; Holmes, Michael C.; Rebar, Edward J.

    2015-01-01

    Site-specific genome editing provides a promising approach for achieving long-term, stable therapeutic gene expression. Genome editing has been successfully applied in a variety of preclinical models, generally focused on targeting the diseased locus itself; however, limited targeting efficiency or insufficient expression from the endogenous promoter may impede the translation of these approaches, particularly if the desired editing event does not confer a selective growth advantage. Here we report a general strategy for liver-directed protein replacement therapies that addresses these issues: zinc finger nuclease (ZFN) –mediated site-specific integration of therapeutic transgenes within the albumin gene. By using adeno-associated viral (AAV) vector delivery in vivo, we achieved long-term expression of human factors VIII and IX (hFVIII and hFIX) in mouse models of hemophilia A and B at therapeutic levels. By using the same targeting reagents in wild-type mice, lysosomal enzymes were expressed that are deficient in Fabry and Gaucher diseases and in Hurler and Hunter syndromes. The establishment of a universal nuclease-based platform for secreted protein production would represent a critical advance in the development of safe, permanent, and functional cures for diverse genetic and nongenetic diseases. PMID:26297739

  14. Genome-wide identification, sequence characterization, and protein-protein interaction properties of DDB1 (damaged DNA binding protein-1)-binding WD40-repeat family members in Solanum lycopersicum.

    Science.gov (United States)

    Zhu, Yunye; Huang, Shengxiong; Miao, Min; Tang, Xiaofeng; Yue, Junyang; Wang, Wenjie; Liu, Yongsheng

    2015-06-01

    One hundred DDB1 (damaged DNA binding protein-1)-binding WD40-repeat domain (DWD) family genes were identified in the S. lycopersicum genome. The DWD genes encode proteins presumably functioning as the substrate recognition subunits of the cullin4-ring ubiquitin E3 ligase complex. These findings provide candidate genes and a research platform for further gene functionality and molecular breeding study. A subclass of DDB1 (damaged DNA binding protein-1)-binding WD40-repeat domain (DWD) family proteins has been demonstrated to function as the substrate recognition subunits of the cullin4-ring ubiquitin E3 ligase complex. However, little information is available about the cognate subfamily genes in tomato (S. lycopersicum). In this study, based on the recently released tomato genome sequences, 100 tomato genes encoding DWD proteins that potentially interact with DDB1 were identified and characterized, including analyses of the detailed annotations, chromosome locations and compositions of conserved amino acid domains. In addition, a phylogenetic tree, which comprises of three main groups, of the subfamily genes was constructed. The physical interaction between tomato DDB1 and 14 representative DWD proteins was determined by yeast two-hybrid and co-immunoprecipitation assays. The subcellular localization of these 14 representative DWD proteins was determined. Six of them were localized in both nucleus and cytoplasm, seven proteins exclusively in cytoplasm, and one protein either in nucleus and cytoplasm, or exclusively in cytoplasm. Comparative genomic analysis demonstrated that the expansion of these subfamily members in tomato predominantly resulted from two whole-genome triplication events in the evolution history.

  15. Genomic prediction unifies animal and plant breeding programs to form platforms for biological discovery

    DEFF Research Database (Denmark)

    Hickey, John M.; Chiurugwi, Tinashe; Mackay, Ian

    2017-01-01

    The rate of annual yield increases for major staple crops must more than double relative to current levels in order to feed a predicted global population of 9 billion by 2050. Controlled hybridization and selective breeding have been used for centuries to adapt plant and animal species for human...... that unifies breeding approaches, biological discovery, and tools and methods. Here we compare and contrast some animal and plant breeding approaches to make a case for bringing the two together through the application of genomic selection. We propose a strategy for the use of genomic selection as a unifying...... use. However, achieving higher, sustainable rates of improvement in yields in various species will require renewed genetic interventions and dramatic improvement of agricultural practices. Genomic prediction of breeding values has the potential to improve selection, reduce costs and provide a platform...

  16. Mapping genomic features to functional traits through microbial whole genome sequences.

    Science.gov (United States)

    Zhang, Wei; Zeng, Erliang; Liu, Dan; Jones, Stuart E; Emrich, Scott

    2014-01-01

    Recently, the utility of trait-based approaches for microbial communities has been identified. Increasing availability of whole genome sequences provide the opportunity to explore the genetic foundations of a variety of functional traits. We proposed a machine learning framework to quantitatively link the genomic features with functional traits. Genes from bacteria genomes belonging to different functional traits were grouped to Cluster of Orthologs (COGs), and were used as features. Then, TF-IDF technique from the text mining domain was applied to transform the data to accommodate the abundance and importance of each COG. After TF-IDF processing, COGs were ranked using feature selection methods to identify their relevance to the functional trait of interest. Extensive experimental results demonstrated that functional trait related genes can be detected using our method. Further, the method has the potential to provide novel biological insights.

  17. HelmCoP: an online resource for helminth functional genomics and drug and vaccine targets prioritization.

    Directory of Open Access Journals (Sweden)

    Sahar Abubucker

    Full Text Available A vast majority of the burden from neglected tropical diseases result from helminth infections (nematodes and platyhelminthes. Parasitic helminthes infect over 2 billion, exerting a high collective burden that rivals high-mortality conditions such as AIDS or malaria, and cause devastation to crops and livestock. The challenges to improve control of parasitic helminth infections are multi-fold and no single category of approaches will meet them all. New information such as helminth genomics, functional genomics and proteomics coupled with innovative bioinformatic approaches provide fundamental molecular information about these parasites, accelerating both basic research as well as development of effective diagnostics, vaccines and new drugs. To facilitate such studies we have developed an online resource, HelmCoP (Helminth Control and Prevention, built by integrating functional, structural and comparative genomic data from plant, animal and human helminthes, to enable researchers to develop strategies for drug, vaccine and pesticide prioritization, while also providing a useful comparative genomics platform. HelmCoP encompasses genomic data from several hosts, including model organisms, along with a comprehensive suite of structural and functional annotations, to assist in comparative analyses and to study host-parasite interactions. The HelmCoP interface, with a sophisticated query engine as a backbone, allows users to search for multi-factorial combinations of properties and serves readily accessible information that will assist in the identification of various genes of interest. HelmCoP is publicly available at: http://www.nematode.net/helmcop.html.

  18. GeneDig: a web application for accessing genomic and bioinformatics knowledge.

    Science.gov (United States)

    Suciu, Radu M; Aydin, Emir; Chen, Brian E

    2015-02-28

    With the exponential increase and widespread availability of genomic, transcriptomic, and proteomic data, accessing these '-omics' data is becoming increasingly difficult. The current resources for accessing and analyzing these data have been created to perform highly specific functions intended for specialists, and thus typically emphasize functionality over user experience. We have developed a web-based application, GeneDig.org, that allows any general user access to genomic information with ease and efficiency. GeneDig allows for searching and browsing genes and genomes, while a dynamic navigator displays genomic, RNA, and protein information simultaneously for co-navigation. We demonstrate that our application allows more than five times faster and efficient access to genomic information than any currently available methods. We have developed GeneDig as a platform for bioinformatics integration focused on usability as its central design. This platform will introduce genomic navigation to broader audiences while aiding the bioinformatics analyses performed in everyday biology research.

  19. Genomic Comparison among Lethal Invasive Strains of Streptococcus pyogenes Serotype M1

    Directory of Open Access Journals (Sweden)

    Gabriel R. Fernandes

    2017-10-01

    Full Text Available Streptococcus pyogenes, also known as group A Streptococcus (GAS, is a human pathogen that causes diverse human diseases including streptococcal toxic shock syndrome (STSS. A GAS outbreak occurred in Brasilia, Brazil, during the second half of the year 2011, causing 26 deaths. Whole genome sequencing was performed using Illumina platform. The sequences were assembled and genes were predicted for comparative analysis with emm type 1 strains: MGAS5005 and M1 GAS. Genomics comparison revealed one of the invasive strains that differ from others isolates and from emm 1 reference genomes. Also, the new invasive strain showed differences in the content of virulence factors compared to other isolated in the same outbreak. The evolution of contemporary GAS strains is strongly associated with horizontal gene transfer. This is the first genomic study of a Streptococcal emm 1 outbreak in Brazil, and revealed the rapid bacterial evolution leading to new clones. The emergence of new invasive strains can be a consequence of the injudicious use of antibiotics in Brazil during the past decades.

  20. New developments of RNAi in Paracoccidioides brasiliensis: prospects for high-throughput, genome-wide, functional genomics.

    Directory of Open Access Journals (Sweden)

    Tercio Goes

    2014-10-01

    Full Text Available The Fungal Genome Initiative of the Broad Institute, in partnership with the Paracoccidioides research community, has recently sequenced the genome of representative isolates of this human-pathogen dimorphic fungus: Pb18 (S1, Pb03 (PS2 and Pb01. The accomplishment of future high-throughput, genome-wide, functional genomics will rely upon appropriate molecular tools and straightforward techniques to streamline the generation of stable loss-of-function phenotypes. In the past decades, RNAi has emerged as the most robust genetic technique to modulate or to suppress gene expression in diverse eukaryotes, including fungi. These molecular tools and techniques, adapted for RNAi, were up until now unavailable for P. brasiliensis.In this paper, we report Agrobacterium tumefaciens mediated transformation of yeast cells for high-throughput applications with which higher transformation frequencies of 150±24 yeast cell transformants per 1×106 viable yeast cells were obtained. Our approach is based on a bifunctional selective marker fusion protein consisted of the Streptoalloteichus hindustanus bleomycin-resistance gene (Shble and the intrinsically fluorescent monomeric protein mCherry which was codon-optimized for heterologous expression in P. brasiliensis. We also report successful GP43 gene knock-down through the expression of intron-containing hairpin RNA (ihpRNA from a Gateway-adapted cassette (cALf which was purpose-built for gene silencing in a high-throughput manner. Gp43 transcript levels were reduced by 73.1±22.9% with this approach.We have a firm conviction that the genetic transformation technique and the molecular tools herein described will have a relevant contribution in future Paracoccidioides spp. functional genomics research.

  1. The cacao Criollo genome v2.0: an improved version of the genome for genetic and functional genomic studies.

    Science.gov (United States)

    Argout, X; Martin, G; Droc, G; Fouet, O; Labadie, K; Rivals, E; Aury, J M; Lanaud, C

    2017-09-15

    Theobroma cacao L., native to the Amazonian basin of South America, is an economically important fruit tree crop for tropical countries as a source of chocolate. The first draft genome of the species, from a Criollo cultivar, was published in 2011. Although a useful resource, some improvements are possible, including identifying misassemblies, reducing the number of scaffolds and gaps, and anchoring un-anchored sequences to the 10 chromosomes. We used a NGS-based approach to significantly improve the assembly of the Belizian Criollo B97-61/B2 genome. We combined four Illumina large insert size mate paired libraries with 52x of Pacific Biosciences long reads to correct misassembled regions and reduced the number of scaffolds. We then used genotyping by sequencing (GBS) methods to increase the proportion of the assembly anchored to chromosomes. The scaffold number decreased from 4,792 in assembly V1 to 554 in V2 while the scaffold N50 size has increased from 0.47 Mb in V1 to 6.5 Mb in V2. A total of 96.7% of the assembly was anchored to the 10 chromosomes compared to 66.8% in the previous version. Unknown sites (Ns) were reduced from 10.8% to 5.7%. In addition, we updated the functional annotations and performed a new RefSeq structural annotation based on RNAseq evidence. Theobroma cacao Criollo genome version 2 will be a valuable resource for the investigation of complex traits at the genomic level and for future comparative genomics and genetics studies in cacao tree. New functional tools and annotations are available on the Cocoa Genome Hub ( http://cocoa-genome-hub.southgreen.fr ).

  2. Genomic prediction unifies animal and plant breeding programs to form platforms for biological discovery.

    Science.gov (United States)

    Hickey, John M; Chiurugwi, Tinashe; Mackay, Ian; Powell, Wayne

    2017-08-30

    The rate of annual yield increases for major staple crops must more than double relative to current levels in order to feed a predicted global population of 9 billion by 2050. Controlled hybridization and selective breeding have been used for centuries to adapt plant and animal species for human use. However, achieving higher, sustainable rates of improvement in yields in various species will require renewed genetic interventions and dramatic improvement of agricultural practices. Genomic prediction of breeding values has the potential to improve selection, reduce costs and provide a platform that unifies breeding approaches, biological discovery, and tools and methods. Here we compare and contrast some animal and plant breeding approaches to make a case for bringing the two together through the application of genomic selection. We propose a strategy for the use of genomic selection as a unifying approach to deliver innovative 'step changes' in the rate of genetic gain at scale.

  3. Genomics With Cloud Computing

    Directory of Open Access Journals (Sweden)

    Sukhamrit Kaur

    2015-04-01

    Full Text Available Abstract Genomics is study of genome which provides large amount of data for which large storage and computation power is needed. These issues are solved by cloud computing that provides various cloud platforms for genomics. These platforms provides many services to user like easy access to data easy sharing and transfer providing storage in hundreds of terabytes more computational power. Some cloud platforms are Google genomics DNAnexus and Globus genomics. Various features of cloud computing to genomics are like easy access and sharing of data security of data less cost to pay for resources but still there are some demerits like large time needed to transfer data less network bandwidth.

  4. Functional nanostructured platforms for chemical and biological sensing

    Science.gov (United States)

    Létant, S. E.

    2006-05-01

    The central goal of our work is to combine semiconductor nanotechnology and surface functionalization in order to build platforms for the selective detection of bio-organisms ranging in size from bacteria (micron range) down to viruses, as well as for the detection of chemical agents (nanometer range). We will show on three porous silicon platforms how pore geometry and pore wall chemistry can be combined and optimized to capture and detect specific targets. We developed a synthetic route allowing to directly anchor proteins on silicon surfaces and illustrated the relevance of this technique by immobilizing live enzymes onto electrochemically etched luminescent nano-porous silicon. The powerful association of the specific enzymes with the transducing matrix led to a selective hybrid platform for chemical sensing. We also used light-assisted electrochemistry to produce periodic arrays of through pores on pre-patterned silicon membranes with controlled diameters ranging from many microns down to tens of nanometers. We demonstrated the first covalently functionalized silicon membranes and illustrated their selective capture abilities with antibody-coated micro-beads. These engineered membranes are extremely versatile and could be adapted to specifically recognize the external fingerprints (size and coat composition) of target bio-organisms. Finally, we fabricated locally functionalized single nanopores using a combination of focused ion beam drilling and ion beam assisted oxide deposition. We showed how a silicon oxide ring can be grown around a single nanopore and how it can be functionalized with DNA probes to detect single viral-sized beads. The next step for this platform is the detection of whole viruses and bacteria.

  5. SIGMA: A System for Integrative Genomic Microarray Analysis of Cancer Genomes

    Directory of Open Access Journals (Sweden)

    Davies Jonathan J

    2006-12-01

    Full Text Available Abstract Background The prevalence of high resolution profiling of genomes has created a need for the integrative analysis of information generated from multiple methodologies and platforms. Although the majority of data in the public domain are gene expression profiles, and expression analysis software are available, the increase of array CGH studies has enabled integration of high throughput genomic and gene expression datasets. However, tools for direct mining and analysis of array CGH data are limited. Hence, there is a great need for analytical and display software tailored to cross platform integrative analysis of cancer genomes. Results We have created a user-friendly java application to facilitate sophisticated visualization and analysis such as cross-tumor and cross-platform comparisons. To demonstrate the utility of this software, we assembled array CGH data representing Affymetrix SNP chip, Stanford cDNA arrays and whole genome tiling path array platforms for cross comparison. This cancer genome database contains 267 profiles from commonly used cancer cell lines representing 14 different tissue types. Conclusion In this study we have developed an application for the visualization and analysis of data from high resolution array CGH platforms that can be adapted for analysis of multiple types of high throughput genomic datasets. Furthermore, we invite researchers using array CGH technology to deposit both their raw and processed data, as this will be a continually expanding database of cancer genomes. This publicly available resource, the System for Integrative Genomic Microarray Analysis (SIGMA of cancer genomes, can be accessed at http://sigma.bccrc.ca.

  6. Defining functional DNA elements in the human genome

    Science.gov (United States)

    Kellis, Manolis; Wold, Barbara; Snyder, Michael P.; Bernstein, Bradley E.; Kundaje, Anshul; Marinov, Georgi K.; Ward, Lucas D.; Birney, Ewan; Crawford, Gregory E.; Dekker, Job; Dunham, Ian; Elnitski, Laura L.; Farnham, Peggy J.; Feingold, Elise A.; Gerstein, Mark; Giddings, Morgan C.; Gilbert, David M.; Gingeras, Thomas R.; Green, Eric D.; Guigo, Roderic; Hubbard, Tim; Kent, Jim; Lieb, Jason D.; Myers, Richard M.; Pazin, Michael J.; Ren, Bing; Stamatoyannopoulos, John A.; Weng, Zhiping; White, Kevin P.; Hardison, Ross C.

    2014-01-01

    With the completion of the human genome sequence, attention turned to identifying and annotating its functional DNA elements. As a complement to genetic and comparative genomics approaches, the Encyclopedia of DNA Elements Project was launched to contribute maps of RNA transcripts, transcriptional regulator binding sites, and chromatin states in many cell types. The resulting genome-wide data reveal sites of biochemical activity with high positional resolution and cell type specificity that facilitate studies of gene regulation and interpretation of noncoding variants associated with human disease. However, the biochemically active regions cover a much larger fraction of the genome than do evolutionarily conserved regions, raising the question of whether nonconserved but biochemically active regions are truly functional. Here, we review the strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments, potential sources for the observed differences in estimated genomic coverage, and the biological implications of these discrepancies. We also analyze the relationship between signal intensity, genomic coverage, and evolutionary conservation. Our results reinforce the principle that each approach provides complementary information and that we need to use combinations of all three to elucidate genome function in human biology and disease. PMID:24753594

  7. Evolutionary Fates and Dynamic Functionalization of Young Duplicate Genes in Arabidopsis Genomes1[OPEN

    Science.gov (United States)

    Wang, Jun; Tao, Feng; Marowsky, Nicholas C.; Fan, Chuanzhu

    2016-01-01

    Gene duplication is a primary means to generate genomic novelties, playing an essential role in speciation and adaptation. Particularly in plants, a high abundance of duplicate genes has been maintained for significantly long periods of evolutionary time. To address the manner in which young duplicate genes were derived primarily from small-scale gene duplication and preserved in plant genomes and to determine the underlying driving mechanisms, we generated transcriptomes to produce the expression profiles of five tissues in Arabidopsis thaliana and the closely related species Arabidopsis lyrata and Capsella rubella. Based on the quantitative analysis metrics, we investigated the evolutionary processes of young duplicate genes in Arabidopsis. We determined that conservation, neofunctionalization, and specialization are three main evolutionary processes for Arabidopsis young duplicate genes. We explicitly demonstrated the dynamic functionalization of duplicate genes along the evolutionary time scale. Upon origination, duplicates tend to maintain their ancestral functions; but as they survive longer, they might be likely to develop distinct and novel functions. The temporal evolutionary processes and functionalization of plant duplicate genes are associated with their ancestral functions, dynamic DNA methylation levels, and histone modification abundances. Furthermore, duplicate genes tend to be initially expressed in pollen and then to gain more interaction partners over time. Altogether, our study provides novel insights into the dynamic retention processes of young duplicate genes in plant genomes. PMID:27485883

  8. Standardized analysis and sharing of genome-phenome data for neuromuscular and rare disease research through the RD-Connect platform

    OpenAIRE

    Thompson, Rachel; Beltran, Sergi; Papakonstantinou, Anastasios; Cañada, Andrés; Fernández, Jose Maria; Thompson, Mark; Kaliyaperumal, Rajaram; Lair, Séverine; Sernadela, Pedro; Girdea, Marta; Brudno, Michael; Blavier, André; Lochmüller, Hanns; Roos, Andreas; Straub, Volker

    2016-01-01

    Abstract: RD-Connect (rd-connect.eu) is an EU-funded project building an integrated platform to narrow the gaps in rare disease research, where patient populations, clinical expertise and research communities are small in number and highly fragmented. Guided by the needs of rare disease researchers and with neuromuscular and neurodegenerative researchers as its original collaborators, the RD-Connect platform securely integrates multiple types of omics data (genomics, proteomics and transcript...

  9. Genomic islands predict functional adaptation in marine actinobacteria

    Energy Technology Data Exchange (ETDEWEB)

    Penn, Kevin; Jenkins, Caroline; Nett, Markus; Udwary, Daniel; Gontang, Erin; McGlinchey, Ryan; Foster, Brian; Lapidus, Alla; Podell, Sheila; Allen, Eric; Moore, Bradley; Jensen, Paul

    2009-04-01

    Linking functional traits to bacterial phylogeny remains a fundamental but elusive goal of microbial ecology 1. Without this information, it becomes impossible to resolve meaningful units of diversity and the mechanisms by which bacteria interact with each other and adapt to environmental change. Ecological adaptations among bacterial populations have been linked to genomic islands, strain-specific regions of DNA that house functionally adaptive traits 2. In the case of environmental bacteria, these traits are largely inferred from bioinformatic or gene expression analyses 2, thus leaving few examples in which the functions of island genes have been experimentally characterized. Here we report the complete genome sequences of Salinispora tropica and S. arenicola, the first cultured, obligate marine Actinobacteria 3. These two species inhabit benthic marine environments and dedicate 8-10percent of their genomes to the biosynthesis of secondary metabolites. Despite a close phylogenetic relationship, 25 of 37 secondary metabolic pathways are species-specific and located within 21 genomic islands, thus providing new evidence linking secondary metabolism to ecological adaptation. Species-specific differences are also observed in CRISPR sequences, suggesting that variations in phage immunity provide fitness advantages that contribute to the cosmopolitan distribution of S. arenicola 4. The two Salinispora genomes have evolved by complex processes that include the duplication and acquisition of secondary metabolite genes, the products of which provide immediate opportunities for molecular diversification and ecological adaptation. Evidence that secondary metabolic pathways are exchanged by Horizontal Gene Transfer (HGT) yet are fixed among globally distributed populations 5 supports a functional role for their products and suggests that pathway acquisition represents a previously unrecognized force driving bacterial diversification

  10. Copper-free click-chemistry platform to functionalize cisplatin prodrugs.

    Science.gov (United States)

    Pathak, Rakesh K; McNitt, Christopher D; Popik, Vladimir V; Dhar, Shanta

    2014-06-02

    The ability to rationally design and construct a platform technology to develop new platinum(IV) [Pt(IV)] prodrugs with functionalities for installation of targeting moieties, delivery systems, fluorescent reporters from a single precursor with the ability to release biologically active cisplatin by using well-defined chemistry is critical for discovering new platinum-based therapeutics. With limited numbers of possibilities considering the sensitivity of Pt(IV) centers, we used a strain-promoted azide-alkyne cycloaddition approach to provide a platform, in which new functionalities can easily be installed on cisplatin prodrugs from a single Pt(IV) precursor. The ability of this platform to be incorporated in nanodelivery vehicle and conjugation to fluorescent reporters were also investigated. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  11. FGWAS: Functional genome wide association analysis.

    Science.gov (United States)

    Huang, Chao; Thompson, Paul; Wang, Yalin; Yu, Yang; Zhang, Jingwen; Kong, Dehan; Colen, Rivka R; Knickmeyer, Rebecca C; Zhu, Hongtu

    2017-10-01

    Functional phenotypes (e.g., subcortical surface representation), which commonly arise in imaging genetic studies, have been used to detect putative genes for complexly inherited neuropsychiatric and neurodegenerative disorders. However, existing statistical methods largely ignore the functional features (e.g., functional smoothness and correlation). The aim of this paper is to develop a functional genome-wide association analysis (FGWAS) framework to efficiently carry out whole-genome analyses of functional phenotypes. FGWAS consists of three components: a multivariate varying coefficient model, a global sure independence screening procedure, and a test procedure. Compared with the standard multivariate regression model, the multivariate varying coefficient model explicitly models the functional features of functional phenotypes through the integration of smooth coefficient functions and functional principal component analysis. Statistically, compared with existing methods for genome-wide association studies (GWAS), FGWAS can substantially boost the detection power for discovering important genetic variants influencing brain structure and function. Simulation studies show that FGWAS outperforms existing GWAS methods for searching sparse signals in an extremely large search space, while controlling for the family-wise error rate. We have successfully applied FGWAS to large-scale analysis of data from the Alzheimer's Disease Neuroimaging Initiative for 708 subjects, 30,000 vertices on the left and right hippocampal surfaces, and 501,584 SNPs. Copyright © 2017 Elsevier Inc. All rights reserved.

  12. Genome-wide siRNA-based functional genomics of pigmentation identifies novel genes and pathways that impact melanogenesis in human cells.

    Directory of Open Access Journals (Sweden)

    Anand K Ganesan

    2008-12-01

    Full Text Available Melanin protects the skin and eyes from the harmful effects of UV irradiation, protects neural cells from toxic insults, and is required for sound conduction in the inner ear. Aberrant regulation of melanogenesis underlies skin disorders (melasma and vitiligo, neurologic disorders (Parkinson's disease, auditory disorders (Waardenburg's syndrome, and opthalmologic disorders (age related macular degeneration. Much of the core synthetic machinery driving melanin production has been identified; however, the spectrum of gene products participating in melanogenesis in different physiological niches is poorly understood. Functional genomics based on RNA-mediated interference (RNAi provides the opportunity to derive unbiased comprehensive collections of pharmaceutically tractable single gene targets supporting melanin production. In this study, we have combined a high-throughput, cell-based, one-well/one-gene screening platform with a genome-wide arrayed synthetic library of chemically synthesized, small interfering RNAs to identify novel biological pathways that govern melanin biogenesis in human melanocytes. Ninety-two novel genes that support pigment production were identified with a low false discovery rate. Secondary validation and preliminary mechanistic studies identified a large panel of targets that converge on tyrosinase expression and stability. Small molecule inhibition of a family of gene products in this class was sufficient to impair chronic tyrosinase expression in pigmented melanoma cells and UV-induced tyrosinase expression in primary melanocytes. Isolation of molecular machinery known to support autophagosome biosynthesis from this screen, together with in vitro and in vivo validation, exposed a close functional relationship between melanogenesis and autophagy. In summary, these studies illustrate the power of RNAi-based functional genomics to identify novel genes, pathways, and pharmacologic agents that impact a biological phenotype

  13. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project

    DEFF Research Database (Denmark)

    Birney, Ewan; Stamatoyannopoulos, John A; Dutta, Anindya

    2007-01-01

    We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses...

  14. The Functional Genomics Initiative at Oak Ridge National Laboratory

    Energy Technology Data Exchange (ETDEWEB)

    Johnson, Dabney; Justice, Monica; Beattle, Ken; Buchanan, Michelle; Ramsey, Michael; Ramsey, Rose; Paulus, Michael; Ericson, Nance; Allison, David; Kress, Reid; Mural, Richard; Uberbacher, Ed; Mann, Reinhold

    1997-12-31

    The Functional Genomics Initiative at the Oak Ridge National Laboratory integrates outstanding capabilities in mouse genetics, bioinformatics, and instrumentation. The 50 year investment by the DOE in mouse genetics/mutagenesis has created a one-of-a-kind resource for generating mutations and understanding their biological consequences. It is generally accepted that, through the mouse as a surrogate for human biology, we will come to understand the function of human genes. In addition to this world class program in mammalian genetics, ORNL has also been a world leader in developing bioinformatics tools for the analysis, management and visualization of genomic data. Combining this expertise with new instrumentation technologies will provide a unique capability to understand the consequences of mutations in the mouse at both the organism and molecular levels. The goal of the Functional Genomics Initiative is to develop the technology and methodology necessary to understand gene function on a genomic scale and apply these technologies to megabase regions of the human genome. The effort is scoped so as to create an effective and powerful resource for functional genomics. ORNL is partnering with the Joint Genome Institute and other large scale sequencing centers to sequence several multimegabase regions of both human and mouse genomic DNA, to identify all the genes in these regions, and to conduct fundamental surveys to examine gene function at the molecular and organism level. The Initiative is designed to be a pilot for larger scale deployment in the post-genome era. Technologies will be applied to the examination of gene expression and regulation, metabolism, gene networks, physiology and development.

  15. Functional genomics of beer-related physiological processes in yeast

    NARCIS (Netherlands)

    Hazelwood, L.A.

    2009-01-01

    Since the release of the entire genome sequence of the S. cerevisiae laboratory strain S288C in 1996, many functional genomics tools have been introduced in fundamental and application-oriented yeast research. In this thesis, the applicability of functional genomics for the improvement of yeast in

  16. Gain-of-function mutagenesis approaches in rice for functional genomics and improvement of crop productivity.

    Science.gov (United States)

    Moin, Mazahar; Bakshi, Achala; Saha, Anusree; Dutta, Mouboni; Kirti, P B

    2017-07-01

    The epitome of any genome research is to identify all the existing genes in a genome and investigate their roles. Various techniques have been applied to unveil the functions either by silencing or over-expressing the genes by targeted expression or random mutagenesis. Rice is the most appropriate model crop for generating a mutant resource for functional genomic studies because of the availability of high-quality genome sequence and relatively smaller genome size. Rice has syntenic relationships with members of other cereals. Hence, characterization of functionally unknown genes in rice will possibly provide key genetic insights and can lead to comparative genomics involving other cereals. The current review attempts to discuss the available gain-of-function mutagenesis techniques for functional genomics, emphasizing the contemporary approach, activation tagging and alterations to this method for the enhancement of yield and productivity of rice. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  17. MeMo: a hybrid SQL/XML approach to metabolomic data management for functional genomics

    Directory of Open Access Journals (Sweden)

    Hardy Nigel

    2006-06-01

    Full Text Available Abstract Background The genome sequencing projects have shown our limited knowledge regarding gene function, e.g. S. cerevisiae has 5–6,000 genes of which nearly 1,000 have an uncertain function. Their gross influence on the behaviour of the cell can be observed using large-scale metabolomic studies. The metabolomic data produced need to be structured and annotated in a machine-usable form to facilitate the exploration of the hidden links between the genes and their functions. Description MeMo is a formal model for representing metabolomic data and the associated metadata. Two predominant platforms (SQL and XML are used to encode the model. MeMo has been implemented as a relational database using a hybrid approach combining the advantages of the two technologies. It represents a practical solution for handling the sheer volume and complexity of the metabolomic data effectively and efficiently. The MeMo model and the associated software are available at http://dbkgroup.org/memo/. Conclusion The maturity of relational database technology is used to support efficient data processing. The scalability and self-descriptiveness of XML are used to simplify the relational schema and facilitate the extensibility of the model necessitated by the creation of new experimental techniques. Special consideration is given to data integration issues as part of the systems biology agenda. MeMo has been physically integrated and cross-linked to related metabolomic and genomic databases. Semantic integration with other relevant databases has been supported through ontological annotation. Compatibility with other data formats is supported by automatic conversion.

  18. Functional Coverage of the Human Genome by Existing Structures, Structural Genomics Targets, and Homology Models.

    Directory of Open Access Journals (Sweden)

    2005-08-01

    Full Text Available The bias in protein structure and function space resulting from experimental limitations and targeting of particular functional classes of proteins by structural biologists has long been recognized, but never continuously quantified. Using the Enzyme Commission and the Gene Ontology classifications as a reference frame, and integrating structure data from the Protein Data Bank (PDB, target sequences from the structural genomics projects, structure homology derived from the SUPERFAMILY database, and genome annotations from Ensembl and NCBI, we provide a quantified view, both at the domain and whole-protein levels, of the current and projected coverage of protein structure and function space relative to the human genome. Protein structures currently provide at least one domain that covers 37% of the functional classes identified in the genome; whole structure coverage exists for 25% of the genome. If all the structural genomics targets were solved (twice the current number of structures in the PDB, it is estimated that structures of one domain would cover 69% of the functional classes identified and complete structure coverage would be 44%. Homology models from existing experimental structures extend the 37% coverage to 56% of the genome as single domains and 25% to 31% for complete structures. Coverage from homology models is not evenly distributed by protein family, reflecting differing degrees of sequence and structure divergence within families. While these data provide coverage, conversely, they also systematically highlight functional classes of proteins for which structures should be determined. Current key functional families without structure representation are highlighted here; updated information on the "most wanted list" that should be solved is available on a weekly basis from http://function.rcsb.org:8080/pdb/function_distribution/index.html.

  19. RTEL1 maintains genomic stability by suppressing homologous recombination.

    Science.gov (United States)

    Barber, Louise J; Youds, Jillian L; Ward, Jordan D; McIlwraith, Michael J; O'Neil, Nigel J; Petalcorin, Mark I R; Martin, Julie S; Collis, Spencer J; Cantor, Sharon B; Auclair, Melissa; Tissenbaum, Heidi; West, Stephen C; Rose, Ann M; Boulton, Simon J

    2008-10-17

    Homologous recombination (HR) is an important conserved process for DNA repair and ensures maintenance of genome integrity. Inappropriate HR causes gross chromosomal rearrangements and tumorigenesis in mammals. In yeast, the Srs2 helicase eliminates inappropriate recombination events, but the functional equivalent of Srs2 in higher eukaryotes has been elusive. Here, we identify C. elegans RTEL-1 as a functional analog of Srs2 and describe its vertebrate counterpart, RTEL1, which is required for genome stability and tumor avoidance. We find that rtel-1 mutant worms and RTEL1-depleted human cells share characteristic phenotypes with yeast srs2 mutants: lethality upon deletion of the sgs1/BLM homolog, hyperrecombination, and DNA damage sensitivity. In vitro, purified human RTEL1 antagonizes HR by promoting the disassembly of D loop recombination intermediates in a reaction dependent upon ATP hydrolysis. We propose that loss of HR control after deregulation of RTEL1 may be a critical event that drives genome instability and cancer.

  20. Current development and application of soybean genomics

    Institute of Scientific and Technical Information of China (English)

    Lingli HE; Jing ZHAO; Man ZHAO; Chaoying HE

    2011-01-01

    Soybean (Glycine max),an important domesticated species originated in China,constitutes a major source of edible oils and high-quality plant proteins worldwide.In spite of its complex genome as a consequence of an ancient tetraploidilization,platforms for map-based genomics,sequence-based genomics,comparative genomics and functional genomics have been well developed in the last decade,thus rich repertoires of genomic tools and resources are available,which have been influencing the soybean genetic improvement.Here we mainly review the progresses of soybean (including its wild relative Glycine soja) genomics and its impetus for soybean breeding,and raise the major biological questions needing to be addressed.Genetic maps,physical maps,QTL and EST mapping have been so well achieved that the marker assisted selection and positional cloning in soybean is feasible and even routine.Whole genome sequencing and transcriptomic analyses provide a large collection of molecular markers and predicted genes,which are instrumental to comparative genomics and functional genomics.Comparative genomics has started to reveal the evolution of soybean genome and the molecular basis of soybean domestication process.Microarrays resources,mutagenesis and efficient transformation systems become essential components of soybean functional genomics.Furthermore,phenotypic functional genomics via both forward and reverse genetic approaches has inferred functions of many genes involved in plant and seed development,in response to abiotic stresses,functioning in plant-pathogenic microbe interactions,and controlling the oil and protein content of seed.These achievements have paved the way for generation of transgenic or genetically modified (GM) soybean crops.

  1. Determining and comparing protein function in Bacterial genome sequences

    DEFF Research Database (Denmark)

    Vesth, Tammi Camilla

    of this class have very little homology to other known genomes making functional annotation based on sequence similarity very difficult. Inspired in part by this analysis, an approach for comparative functional annotation was created based public sequenced genomes, CMGfunc. Functionally related groups......In November 2013, there was around 21.000 different prokaryotic genomes sequenced and publicly available, and the number is growing daily with another 20.000 or more genomes expected to be sequenced and deposited by the end of 2014. An important part of the analysis of this data is the functional...... annotation of genes – the descriptions assigned to genes that describe the likely function of the encoded proteins. This process is limited by several factors, including the definition of a function which can be more or less specific as well as how many genes can actually be assigned a function based...

  2. Characterizing genomic alterations in cancer by complementary functional associations.

    Science.gov (United States)

    Kim, Jong Wook; Botvinnik, Olga B; Abudayyeh, Omar; Birger, Chet; Rosenbluh, Joseph; Shrestha, Yashaswi; Abazeed, Mohamed E; Hammerman, Peter S; DiCara, Daniel; Konieczkowski, David J; Johannessen, Cory M; Liberzon, Arthur; Alizad-Rahvar, Amir Reza; Alexe, Gabriela; Aguirre, Andrew; Ghandi, Mahmoud; Greulich, Heidi; Vazquez, Francisca; Weir, Barbara A; Van Allen, Eliezer M; Tsherniak, Aviad; Shao, Diane D; Zack, Travis I; Noble, Michael; Getz, Gad; Beroukhim, Rameen; Garraway, Levi A; Ardakani, Masoud; Romualdi, Chiara; Sales, Gabriele; Barbie, David A; Boehm, Jesse S; Hahn, William C; Mesirov, Jill P; Tamayo, Pablo

    2016-05-01

    Systematic efforts to sequence the cancer genome have identified large numbers of mutations and copy number alterations in human cancers. However, elucidating the functional consequences of these variants, and their interactions to drive or maintain oncogenic states, remains a challenge in cancer research. We developed REVEALER, a computational method that identifies combinations of mutually exclusive genomic alterations correlated with functional phenotypes, such as the activation or gene dependency of oncogenic pathways or sensitivity to a drug treatment. We used REVEALER to uncover complementary genomic alterations associated with the transcriptional activation of β-catenin and NRF2, MEK-inhibitor sensitivity, and KRAS dependency. REVEALER successfully identified both known and new associations, demonstrating the power of combining functional profiles with extensive characterization of genomic alterations in cancer genomes.

  3. Gene Expression Analysis of Escherichia Coli Grown in Miniaturized Bioreactor Platforms for High-Throughput Analysis of Growth and genomic Data

    DEFF Research Database (Denmark)

    Boccazzi, P.; Zanzotto, A.; Szita, Nicolas

    2005-01-01

    Combining high-throughput growth physiology and global gene expression data analysis is of significant value for integrating metabolism and genomics. We compared global gene expression using 500 ng of total RNA from Escherichia coli cultures grown in rich or defined minimal media in a miniaturize...... cultures using just 500 ng of total RNA indicate that high-throughput integration of growth physiology and genomics will be possible with novel biochemical platforms and improved detection technologies....

  4. A Parvovirus B19 synthetic genome: sequence features and functional competence.

    Science.gov (United States)

    Manaresi, Elisabetta; Conti, Ilaria; Bua, Gloria; Bonvicini, Francesca; Gallinella, Giorgio

    2017-08-01

    Central to genetic studies for Parvovirus B19 (B19V) is the availability of genomic clones that may possess functional competence and ability to generate infectious virus. In our study, we established a new model genetic system for Parvovirus B19. A synthetic approach was followed, by design of a reference genome sequence, by generation of a corresponding artificial construct and its molecular cloning in a complete and functional form, and by setup of an efficient strategy to generate infectious virus, via transfection in UT7/EpoS1 cells and amplification in erythroid progenitor cells. The synthetic genome was able to generate virus with biological properties paralleling those of native virus, its infectious activity being dependent on the preservation of self-complementarity and sequence heterogeneity within the terminal regions. A virus of defined genome sequence, obtained from controlled cell culture conditions, can constitute a reference tool for investigation of the structural and functional characteristics of the virus. Copyright © 2017 Elsevier Inc. All rights reserved.

  5. Genome engineering via TALENs and CRISPR/Cas9 systems: challenges and perspectives

    KAUST Repository

    Mahfouz, Magdy M.

    2014-09-24

    The ability to precisely modify genome sequence and regulate gene expression patterns in a site-specific manner holds much promise in plant biotechnology. Genome-engineering technologies that enable such highly specific and efficient modification are advancing with unprecedented pace. Transcription activator-like effectors (TALEs) provide customizable DNA-binding modules designed to bind to any sequence of interest. Thus, TALEs have been used as a DNA targeting module fused to functional domains for a variety of targeted genomic and epigenomic modifications. TALE nucleases (TALENs) have been used with much success across eukaryotic species to edit genomes. Recently, clustered regularly interspaced palindromic repeats (CRISPRs) that are used as guide RNAs for Cas9 nuclease-specific digestion has been introduced as a highly efficient DNA-targeting platform for genome editing and regulation. Here, we review the discovery, development and limitations of TALENs and CRIPSR/Cas9 systems as genome-engineering platforms in plants. We discuss the current questions, potential improvements and the development of the next-generation genome-editing platforms with an emphasis on producing designer plants to address the needs of agriculture and basic plant biology.

  6. Genome engineering via TALENs and CRISPR/Cas9 systems: challenges and perspectives

    KAUST Repository

    Mahfouz, Magdy M.; Piatek, Agnieszka Anna; Stewart, Charles Neal

    2014-01-01

    The ability to precisely modify genome sequence and regulate gene expression patterns in a site-specific manner holds much promise in plant biotechnology. Genome-engineering technologies that enable such highly specific and efficient modification are advancing with unprecedented pace. Transcription activator-like effectors (TALEs) provide customizable DNA-binding modules designed to bind to any sequence of interest. Thus, TALEs have been used as a DNA targeting module fused to functional domains for a variety of targeted genomic and epigenomic modifications. TALE nucleases (TALENs) have been used with much success across eukaryotic species to edit genomes. Recently, clustered regularly interspaced palindromic repeats (CRISPRs) that are used as guide RNAs for Cas9 nuclease-specific digestion has been introduced as a highly efficient DNA-targeting platform for genome editing and regulation. Here, we review the discovery, development and limitations of TALENs and CRIPSR/Cas9 systems as genome-engineering platforms in plants. We discuss the current questions, potential improvements and the development of the next-generation genome-editing platforms with an emphasis on producing designer plants to address the needs of agriculture and basic plant biology.

  7. Gene Content and Virtual Gene Order of Barley Chromosome 1H1,[C],[W],[OA

    Czech Academy of Sciences Publication Activity Database

    Mayer, K. F. X.; Taudien, S.; Martis, M.; Šimková, Hana; Suchánková, Pavla; Gundlach, H.; Wicker, T.; Petzold, A.; Felder, M.; Steuernagel, B.; Scholz, U.; Graner, A.; Platzer, M.; Doležel, Jaroslav; Stein, N.

    2009-01-01

    Roč. 151, č. 2 (2009), s. 496-505 ISSN 0032-0889 R&D Projects: GA MŠk(CZ) LC06004 Institutional research plan: CEZ:AV0Z50380511 Keywords : WHOLE-GENOME AMPLIFICATION * RICE GENOME * DNA Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 6.235, year: 2009

  8. A Guide to the PLAZA 3.0 Plant Comparative Genomic Database.

    Science.gov (United States)

    Vandepoele, Klaas

    2017-01-01

    PLAZA 3.0 is an online resource for comparative genomics and offers a versatile platform to study gene functions and gene families or to analyze genome organization and evolution in the green plant lineage. Starting from genome sequence information for over 35 plant species, precomputed comparative genomic data sets cover homologous gene families, multiple sequence alignments, phylogenetic trees, and genomic colinearity information within and between species. Complementary functional data sets, a Workbench, and interactive visualization tools are available through a user-friendly web interface, making PLAZA an excellent starting point to translate sequence or omics data sets into biological knowledge. PLAZA is available at http://bioinformatics.psb.ugent.be/plaza/ .

  9. Comparison of gene coverage of mouse oligonucleotide microarray platforms

    Directory of Open Access Journals (Sweden)

    Medrano Juan F

    2006-03-01

    Full Text Available Abstract Background The increasing use of DNA microarrays for genetical genomics studies generates a need for platforms with complete coverage of the genome. We have compared the effective gene coverage in the mouse genome of different commercial and noncommercial oligonucleotide microarray platforms by performing an in-house gene annotation of probes. We only used information about probes that is available from vendors and followed a process that any researcher may take to find the gene targeted by a given probe. In order to make consistent comparisons between platforms, probes in each microarray were annotated with an Entrez Gene id and the chromosomal position for each gene was obtained from the UCSC Genome Browser Database. Gene coverage was estimated as the percentage of Entrez Genes with a unique position in the UCSC Genome database that is tested by a given microarray platform. Results A MySQL relational database was created to store the mapping information for 25,416 mouse genes and for the probes in five microarray platforms (gene coverage level in parenthesis: Affymetrix430 2.0 (75.6%, ABI Genome Survey (81.24%, Agilent (79.33%, Codelink (78.09%, Sentrix (90.47%; and four array-ready oligosets: Sigma (47.95%, Operon v.3 (69.89%, Operon v.4 (84.03%, and MEEBO (84.03%. The differences in coverage between platforms were highly conserved across chromosomes. Differences in the number of redundant and unspecific probes were also found among arrays. The database can be queried to compare specific genomic regions using a web interface. The software used to create, update and query the database is freely available as a toolbox named ArrayGene. Conclusion The software developed here allows researchers to create updated custom databases by using public or proprietary information on genes for any organisms. ArrayGene allows easy comparisons of gene coverage between microarray platforms for any region of the genome. The comparison presented here

  10. MicroScope in 2017: an expanding and evolving integrated resource for community expertise of microbial genomes.

    Science.gov (United States)

    Vallenet, David; Calteau, Alexandra; Cruveiller, Stéphane; Gachet, Mathieu; Lajus, Aurélie; Josso, Adrien; Mercier, Jonathan; Renaux, Alexandre; Rollin, Johan; Rouy, Zoe; Roche, David; Scarpelli, Claude; Médigue, Claudine

    2017-01-04

    The annotation of genomes from NGS platforms needs to be automated and fully integrated. However, maintaining consistency and accuracy in genome annotation is a challenging problem because millions of protein database entries are not assigned reliable functions. This shortcoming limits the knowledge that can be extracted from genomes and metabolic models. Launched in 2005, the MicroScope platform (http://www.genoscope.cns.fr/agc/microscope) is an integrative resource that supports systematic and efficient revision of microbial genome annotation, data management and comparative analysis. Effective comparative analysis requires a consistent and complete view of biological data, and therefore, support for reviewing the quality of functional annotation is critical. MicroScope allows users to analyze microbial (meta)genomes together with post-genomic experiment results if any (i.e. transcriptomics, re-sequencing of evolved strains, mutant collections, phenotype data). It combines tools and graphical interfaces to analyze genomes and to perform the expert curation of gene functions in a comparative context. Starting with a short overview of the MicroScope system, this paper focuses on some major improvements of the Web interface, mainly for the submission of genomic data and on original tools and pipelines that have been developed and integrated in the platform: computation of pan-genomes and prediction of biosynthetic gene clusters. Today the resource contains data for more than 6000 microbial genomes, and among the 2700 personal accounts (65% of which are now from foreign countries), 14% of the users are performing expert annotations, on at least a weekly basis, contributing to improve the quality of microbial genome annotations. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. Complete genome sequence of Halorhodospira halophila SL1

    Energy Technology Data Exchange (ETDEWEB)

    Challacombe, Jean F [ORNL; Majid, Sophia [University of Chicago; Deole, Ratnakar [Oklahoma State University; Brettin, Thomas S. [Argonne National Laboratory (ANL); Bruce, David [Los Alamos National Laboratory (LANL); Delano, Susana [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Gleasner, Cheryl D. [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Misra, Monica [Los Alamos National Laboratory (LANL); Reitenga, Krista K. [Los Alamos National Laboratory (LANL); Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Saunders, Elizabeth H [Los Alamos National Laboratory (LANL); Tapia, Roxanne [Los Alamos National Laboratory (LANL); Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Hoff, Wouter D. [Oklahoma State University

    2013-01-01

    Halorhodospira halophila is among the most halophilic organisms known. It is an obligately photosynthetic and anaerobic purple sulfur bacterium that exhibits autotrophic growth up to saturated NaCl concentrations. The type strain H. halophila SL1 was isolated from a hypersaline lake in Oregon. Here we report the determination of its entire genome in a single contig. This is the first genome of a phototrophic extreme halophile. The genome consists of 2,678,452 bp, encoding 2493 predicted genes as determined by automated genome annotation. Of the 2407 predicted proteins, 1905 were assigned to a putative function. Future detailed analysis of this genome promises to yield insights into the halophilic adaptations of this organism, its ability for photoautotrophic growth under extreme conditions, and its characteristic sulfur metabolism.

  12. Functional Insights from Structural Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Forouhar,F.; Kuzin, A.; Seetharaman, J.; Lee, I.; Zhou, W.; Abashidze, M.; Chen, Y.; Montelione, G.; Tong, L.; et al

    2007-01-01

    Structural genomics efforts have produced structural information, either directly or by modeling, for thousands of proteins over the past few years. While many of these proteins have known functions, a large percentage of them have not been characterized at the functional level. The structural information has provided valuable functional insights on some of these proteins, through careful structural analyses, serendipity, and structure-guided functional screening. Some of the success stories based on structures solved at the Northeast Structural Genomics Consortium (NESG) are reported here. These include a novel methyl salicylate esterase with important role in plant innate immunity, a novel RNA methyltransferase (H. influenzae yggJ (HI0303)), a novel spermidine/spermine N-acetyltransferase (B. subtilis PaiA), a novel methyltransferase or AdoMet binding protein (A. fulgidus AF{_}0241), an ATP:cob(I)alamin adenosyltransferase (B. subtilis YvqK), a novel carboxysome pore (E. coli EutN), a proline racemase homolog with a disrupted active site (B. melitensis BME11586), an FMN-dependent enzyme (S. pneumoniae SP{_}1951), and a 12-stranded {beta}-barrel with a novel fold (V. parahaemolyticus VPA1032).

  13. Resurrection of DNA function in vivo from an extinct genome.

    Directory of Open Access Journals (Sweden)

    Andrew J Pask

    2008-05-01

    Full Text Available There is a burgeoning repository of information available from ancient DNA that can be used to understand how genomes have evolved and to determine the genetic features that defined a particular species. To assess the functional consequences of changes to a genome, a variety of methods are needed to examine extinct DNA function. We isolated a transcriptional enhancer element from the genome of an extinct marsupial, the Tasmanian tiger (Thylacinus cynocephalus or thylacine, obtained from 100 year-old ethanol-fixed tissues from museum collections. We then examined the function of the enhancer in vivo. Using a transgenic approach, it was possible to resurrect DNA function in transgenic mice. The results demonstrate that the thylacine Col2A1 enhancer directed chondrocyte-specific expression in this extinct mammalian species in the same way as its orthologue does in mice. While other studies have examined extinct coding DNA function in vitro, this is the first example of the restoration of extinct non-coding DNA and examination of its function in vivo. Our method using transgenesis can be used to explore the function of regulatory and protein-coding sequences obtained from any extinct species in an in vivo model system, providing important insights into gene evolution and diversity.

  14. From proteomics to systems biology: MAPA, MASS WESTERN, PROMEX, and COVAIN as a user-oriented platform.

    Science.gov (United States)

    Weckwerth, Wolfram; Wienkoop, Stefanie; Hoehenwarter, Wolfgang; Egelhofer, Volker; Sun, Xiaoliang

    2014-01-01

    Genome sequencing and systems biology are revolutionizing life sciences. Proteomics emerged as a fundamental technique of this novel research area as it is the basis for gene function analysis and modeling of dynamic protein networks. Here a complete proteomics platform suited for functional genomics and systems biology is presented. The strategy includes MAPA (mass accuracy precursor alignment; http://www.univie.ac.at/mosys/software.html ) as a rapid exploratory analysis step; MASS WESTERN for targeted proteomics; COVAIN ( http://www.univie.ac.at/mosys/software.html ) for multivariate statistical analysis, data integration, and data mining; and PROMEX ( http://www.univie.ac.at/mosys/databases.html ) as a database module for proteogenomics and proteotypic peptides for targeted analysis. Moreover, the presented platform can also be utilized to integrate metabolomics and transcriptomics data for the analysis of metabolite-protein-transcript correlations and time course analysis using COVAIN. Examples for the integration of MAPA and MASS WESTERN data, proteogenomic and metabolic modeling approaches for functional genomics, phosphoproteomics by integration of MOAC (metal-oxide affinity chromatography) with MAPA, and the integration of metabolomics, transcriptomics, proteomics, and physiological data using this platform are presented. All software and step-by-step tutorials for data processing and data mining can be downloaded from http://www.univie.ac.at/mosys/software.html.

  15. Clinical Implications of Human Population Differences in Genome-wide Rates of Functional Genotypes

    Directory of Open Access Journals (Sweden)

    Ali eTorkamani

    2012-11-01

    Full Text Available There have been a number of recent successes in the use of whole genome sequencing and sophisticated bioinformatics techniques to identify pathogenic DNA sequence variants responsible for individual idiopathic congenital conditions. However, the success of this identification process is heavily influenced by the ancestry or genetic background of a patient with an idiopathic condition. This is so because potential pathogenic variants in a patient’s genome must be contrasted with variants in a reference set of genomes made up of other individuals’ genomes of the same ancestry as the patient. We explored the effect of ignoring the ancestries of both an individual patient and the individuals used to construct reference genomes. We pursued this exploration in two major steps. We first considered variation in the per-genome number and rates likely functional derived (i.e., non-ancestral, based on the chimp genome single nucleotide variants and small indels in 52 individual whole human genomes sampled from 10 different global populations. We took advantage of a suite of computational and bioinformatics techniques to predict the functional effect of over 24 million genomic variants, both coding and non-coding, across these genomes. We found that the typical human genome harbors ~5.5-6.1 million total derived variants, of which ~12,000 are likely to have a functional effect (~5000 coding and ~7000 non-coding. We also found that the rates of functional genotypes per the total number of genotypes in individual whole genomes differ dramatically between human populations. We then created tables showing how the use of comparator or reference genome panels comprised of genomes from individuals that do not have the same ancestral background as a patient can negatively impact pathogenic variant identification. Our results have important implications for clinical sequencing initiatives.

  16. Sequencing and Analysis of the Pseudomonas fluorescens GcM5-1A Genome: A Pathogen Living in the Surface Coat of Bursaphelenchus xylophilus.

    Directory of Open Access Journals (Sweden)

    Kai Feng

    Full Text Available It is known that several bacteria are adherent to the surface coat of pine wood nematode (Bursaphelenchus xylophilus, but their function and role in the pathogenesis of pine wilt disease remains debatable. The Pseudomonas fluorescens GcM5-1A is a bacterium isolated from the surface coat of pine wood nematodes. In previous studies, GcM5-1A was evident in connection with the pathogenicity of pine wilt disease. In this study, we report the de novo sequencing of the GcM5-1A genome. A 600-Mb collection of high-quality reads was obtained and assembled into sequence contigs spanning a 6.01-Mb length. Sequence annotation predicted 5,413 open reading frames, of which 2,988 were homologous to genes in the other four sequenced P. fluorescens isolates (SBW25, WH6, Pf0-1 and Pf-5 and 1,137 were unique to GcM5-1A. Phylogenetic studies and genome comparison revealed that GcM5-1A is more closely related to SBW25 and WH6 isolates than to Pf0-1 and Pf-5 isolates. Towards study of pathogenesis, we identified 79 candidate virulence factors in the genome of GcM5-1A, including the Alg, Fl, Waa gene families, and genes coding the major pathogenic protein fliC. In addition, genes for a complete T3SS system were identified in the genome of GcM5-1A. Such systems have proved to play a critical role in subverting and colonizing the host organisms of many gram-negative pathogenic bacteria. Although the functions of the candidate virulence factors need yet to be deciphered experimentally, the availability of this genome provides a basic platform to obtain informative clues to be addressed in future studies by the pine wilt disease research community.

  17. Genome Sequence of Australian Indigenous Wine Yeast Torulaspora delbrueckii COFT1 Using Nanopore Sequencing.

    Science.gov (United States)

    Tondini, Federico; Jiranek, Vladimir; Grbin, Paul R; Onetto, Cristobal A

    2018-04-26

    Here, we report the first sequenced genome of an indigenous Australian wine isolate of Torulaspora delbrueckii using the Oxford Nanopore MinION and Illumina HiSeq sequencing platforms. The genome size is 9.4 Mb and contains 4,831 genes. Copyright © 2018 Tondini et al.

  18. Long Read Alignment with Parallel MapReduce Cloud Platform

    Directory of Open Access Journals (Sweden)

    Ahmed Abdulhakim Al-Absi

    2015-01-01

    Full Text Available Genomic sequence alignment is an important technique to decode genome sequences in bioinformatics. Next-Generation Sequencing technologies produce genomic data of longer reads. Cloud platforms are adopted to address the problems arising from storage and analysis of large genomic data. Existing genes sequencing tools for cloud platforms predominantly consider short read gene sequences and adopt the Hadoop MapReduce framework for computation. However, serial execution of map and reduce phases is a problem in such systems. Therefore, in this paper, we introduce Burrows-Wheeler Aligner’s Smith-Waterman Alignment on Parallel MapReduce (BWASW-PMR cloud platform for long sequence alignment. The proposed cloud platform adopts a widely accepted and accurate BWA-SW algorithm for long sequence alignment. A custom MapReduce platform is developed to overcome the drawbacks of the Hadoop framework. A parallel execution strategy of the MapReduce phases and optimization of Smith-Waterman algorithm are considered. Performance evaluation results exhibit an average speed-up of 6.7 considering BWASW-PMR compared with the state-of-the-art Bwasw-Cloud. An average reduction of 30% in the map phase makespan is reported across all experiments comparing BWASW-PMR with Bwasw-Cloud. Optimization of Smith-Waterman results in reducing the execution time by 91.8%. The experimental study proves the efficiency of BWASW-PMR for aligning long genomic sequences on cloud platforms.

  19. Long Read Alignment with Parallel MapReduce Cloud Platform

    Science.gov (United States)

    Al-Absi, Ahmed Abdulhakim; Kang, Dae-Ki

    2015-01-01

    Genomic sequence alignment is an important technique to decode genome sequences in bioinformatics. Next-Generation Sequencing technologies produce genomic data of longer reads. Cloud platforms are adopted to address the problems arising from storage and analysis of large genomic data. Existing genes sequencing tools for cloud platforms predominantly consider short read gene sequences and adopt the Hadoop MapReduce framework for computation. However, serial execution of map and reduce phases is a problem in such systems. Therefore, in this paper, we introduce Burrows-Wheeler Aligner's Smith-Waterman Alignment on Parallel MapReduce (BWASW-PMR) cloud platform for long sequence alignment. The proposed cloud platform adopts a widely accepted and accurate BWA-SW algorithm for long sequence alignment. A custom MapReduce platform is developed to overcome the drawbacks of the Hadoop framework. A parallel execution strategy of the MapReduce phases and optimization of Smith-Waterman algorithm are considered. Performance evaluation results exhibit an average speed-up of 6.7 considering BWASW-PMR compared with the state-of-the-art Bwasw-Cloud. An average reduction of 30% in the map phase makespan is reported across all experiments comparing BWASW-PMR with Bwasw-Cloud. Optimization of Smith-Waterman results in reducing the execution time by 91.8%. The experimental study proves the efficiency of BWASW-PMR for aligning long genomic sequences on cloud platforms. PMID:26839887

  20. Comparative analyses identified species-specific functional roles in oral microbial genomes

    Science.gov (United States)

    Chen, Tsute; Gajare, Prasad; Olsen, Ingar; Dewhirst, Floyd E.

    2017-01-01

    ABSTRACT The advent of next generation sequencing is producing more genomic sequences for various strains of many human oral microbial species and allows for insightful functional comparisons at both intra- and inter-species levels. This study performed in-silico functional comparisons for currently available genomic sequences of major species associated with periodontitis including Aggregatibacter actinomycetemcomitans (AA), Porphyromonas gingivalis (PG), Treponema denticola (TD), and Tannerella forsythia (TF), as well as several cariogenic and commensal streptococcal species. Complete or draft sequences were annotated with the RAST to infer structured functional subsystems for each genome. The subsystems profiles were clustered to groups of functions with similar patterns. Functional enrichment and depletion were evaluated based on hypergeometric distribution to identify subsystems that are unique or missing between two groups of genomes. Unique or missing metabolic pathways and biological functions were identified in different species. For example, components involved in flagellar motility were found only in the motile species TD, as expected, with few exceptions scattered in several streptococcal species, likely associated with chemotaxis. Transposable elements were only found in the two Bacteroidales species PG and TF, and half of the AA genomes. Genes involved in CRISPR were prevalent in most oral species. Furthermore, prophage related subsystems were also commonly found in most species except for PG and Streptococcus mutans, in which very few genomes contain prophage components. Comparisons between pathogenic (P) and nonpathogenic (NP) genomes also identified genes potentially important for virulence. Two such comparisons were performed between AA (P) and several A. aphrophilus (NP) strains, and between S. mutans + S. sobrinus (P) and other oral streptococcal species (NP). This comparative genomics approach can be readily used to identify functions unique to

  1. GeNemo: a search engine for web-based functional genomic data.

    Science.gov (United States)

    Zhang, Yongqing; Cao, Xiaoyi; Zhong, Sheng

    2016-07-08

    A set of new data types emerged from functional genomic assays, including ChIP-seq, DNase-seq, FAIRE-seq and others. The results are typically stored as genome-wide intensities (WIG/bigWig files) or functional genomic regions (peak/BED files). These data types present new challenges to big data science. Here, we present GeNemo, a web-based search engine for functional genomic data. GeNemo searches user-input data against online functional genomic datasets, including the entire collection of ENCODE and mouse ENCODE datasets. Unlike text-based search engines, GeNemo's searches are based on pattern matching of functional genomic regions. This distinguishes GeNemo from text or DNA sequence searches. The user can input any complete or partial functional genomic dataset, for example, a binding intensity file (bigWig) or a peak file. GeNemo reports any genomic regions, ranging from hundred bases to hundred thousand bases, from any of the online ENCODE datasets that share similar functional (binding, modification, accessibility) patterns. This is enabled by a Markov Chain Monte Carlo-based maximization process, executed on up to 24 parallel computing threads. By clicking on a search result, the user can visually compare her/his data with the found datasets and navigate the identified genomic regions. GeNemo is available at www.genemo.org. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  2. SPAR1/RTEL1 maintains genomic stability by suppressing homologous recombination

    Science.gov (United States)

    Barber, Louise J.; Youds, Jillian L.; Ward, Jordan D.; McIlwraith, Michael J.; O’Neil, Nigel J.; Petalcorin, Mark I.R.; Martin, Julie S.; Collis, Spencer J.; Cantor, Sharon B.; Auclair, Melissa; Tissenbaum, Heidi; West, Stephen C.; Rose, Ann M.; Boulton, Simon J.

    2013-01-01

    SUMMARY Inappropriate homologous recombination (HR) can cause gross chromosomal rearrangements that in mammalian cells may lead to tumorigenesis. In yeast, the Srs2 protein is an anti-recombinase that eliminates inappropriate recombination events, but the functional equivalent of Srs2 in higher eukaryotes has proven to be elusive. In this work, we identify C. elegans SPAR-1 as a functional analogue of Srs2 and describe its vertebrate counterpart, SPAR1/RTEL1, which is required for genome stability and tumour avoidance. We find that spar-1 mutant worms and SPAR1 knockdown human cells share characteristic phenotypes with yeast srs2 mutants, including inviability upon deletion of the sgs1/BLM homologue, hyper-recombination, and DNA damage sensitivity. In vitro, purified human SPAR1 antagonises HR by promoting the disassembly of D loop recombination intermediates in a reaction dependent upon ATP hydrolysis. We propose that loss of HR control following deregulation of SPAR1/RTEL1 may be a critical event that drives genome instability and cancer. PMID:18957201

  3. Chloroplast genome of Aconitum barbatum var. puberulum (Ranunculaceae) derived from CCS reads using the PacBio RS platform.

    Science.gov (United States)

    Chen, Xiaochen; Li, Qiushi; Li, Ying; Qian, Jun; Han, Jianping

    2015-01-01

    The chloroplast genome (cp genome) of Aconitum barbatum var. puberulum was sequenced using the third-generation sequencing platform based on the single-molecule real-time (SMRT) sequencing approach. To our knowledge, this is the first reported complete cp genome of Aconitum, and we anticipate that it will have great value for phylogenetic studies of the Ranunculaceae family. In total, 23,498 CCS reads and 20,685,462 base pairs were generated, the mean read length was 880 bp, and the longest read was 2,261 bp. Genome coverage of 100% was achieved with a mean coverage of 132× and no gaps. The accuracy of the assembled genome is 99.973%; the assembly was validated using Sanger sequencing of six selected genes from the cp genome. The complete cp genome of A. barbatum var. puberulum is 156,749 bp in length, including a large single-copy region of 87,630 bp and a small single-copy region of 16,941 bp separated by two inverted repeats of 26,089 bp. The cp genome contains 130 genes, including 84 protein-coding genes, 34 tRNA genes and eight rRNA genes. Four forward, five inverted and eight tandem repeats were identified. According to the SSR analysis, the longest poly structure is a 20-T repeat. Our results presented in this paper will facilitate the phylogenetic studies and molecular authentication on Aconitum.

  4. D-Amino acid oxidase bio-functionalized platforms: Toward an enhanced enzymatic bio-activity

    Science.gov (United States)

    Herrera, Elisa; Valdez Taubas, Javier; Giacomelli, Carla E.

    2015-11-01

    The purpose of this work is to study the adsorption process and surface bio-activity of His-tagged D-amino acid oxidase (DAAO) from Rhodotorula gracilis (His6-RgDAAO) as the first step for the development of an electrochemical bio-functionalized platform. With such a purpose this work comprises: (a) the His6-RgDAAO bio-activity in solution determined by amperometry, (b) the adsorption mechanism of His6-RgDAAO on bare gold and carboxylated modified substrates in the absence (substrate/COO-) and presence of Ni(II) (substrate/COO- + Ni(II)) determined by reflectometry, and (c) the bio-activity of the His6-RgDAAO bio-functionalized platforms determined by amperometry. Comparing the adsorption behavior and bio-activity of His6-RgDAAO on these different solid substrates allows understanding the contribution of the diverse interactions responsible for the platform performance. His6-RgDAAO enzymatic performance in solution is highly improved when compared to the previously used pig kidney (pk) DAAO. His6-RgDAAO exhibits an amperometrically detectable bio-activity at concentrations as low as those expected on a bio-functional platform; hence, it is a viable bio-recognition element of D-amino acids to be coupled to electrochemical platforms. Moreover, His6-RgDAAO bio-functionalized platforms exhibit a higher surface activity than pkDAAO physically adsorbed on gold. The platform built on Ni(II) modified substrates present enhanced bio-activity because the surface complexes histidine-Ni(II) provide with site-oriented, native-like enzymes. The adsorption mechanism responsible of the excellent performance of the bio-functionalized platform takes place in two steps involving electrostatic and bio-affinity interactions whose prevalence depends on the degree of surface coverage.

  5. Functional profiling of cyanobacterial genomes and its role in ecological adaptations

    Directory of Open Access Journals (Sweden)

    Ratna Prabha

    2016-09-01

    Full Text Available With the availability of complete genome sequences of many cyanobacterial species, it is becoming feasible to study the broad prospective of the environmental adaptation and the overall changes at transcriptional and translational level in these organisms. In the evolutionary phase, niche-specific competitive forces have resulted in specific features of the cyanobacterial genomes. In this study, functional composition of the 84 different cyanobacterial genomes and their adaptations to different environments was examined by identifying the genomic composition for specific cellular processes, which reflect their genomic functional profile and ecological adaptation. It was identified that among cyanobacterial genomes, metabolic genes have major share over other categories and differentiation of genomic functional profile was observed for the species inhabiting different habitats. The cyanobacteria of freshwater and other habitats accumulate large number of poorly characterized genes. Strain specific functions were also reported in many cyanobacterial members, of which an important feature was the occurrence of phage-related sequences. From this study, it can be speculated that habitat is one of the major factors in giving the shape of functional composition of cyanobacterial genomes towards their ecological adaptations.

  6. MicrobesFlux: a web platform for drafting metabolic models from the KEGG database

    Directory of Open Access Journals (Sweden)

    Feng Xueyang

    2012-08-01

    Full Text Available Abstract Background Concurrent with the efforts currently underway in mapping microbial genomes using high-throughput sequencing methods, systems biologists are building metabolic models to characterize and predict cell metabolisms. One of the key steps in building a metabolic model is using multiple databases to collect and assemble essential information about genome-annotations and the architecture of the metabolic network for a specific organism. To speed up metabolic model development for a large number of microorganisms, we need a user-friendly platform to construct metabolic networks and to perform constraint-based flux balance analysis based on genome databases and experimental results. Results We have developed a semi-automatic, web-based platform (MicrobesFlux for generating and reconstructing metabolic models for annotated microorganisms. MicrobesFlux is able to automatically download the metabolic network (including enzymatic reactions and metabolites of ~1,200 species from the KEGG database (Kyoto Encyclopedia of Genes and Genomes and then convert it to a metabolic model draft. The platform also provides diverse customized tools, such as gene knockouts and the introduction of heterologous pathways, for users to reconstruct the model network. The reconstructed metabolic network can be formulated to a constraint-based flux model to predict and analyze the carbon fluxes in microbial metabolisms. The simulation results can be exported in the SBML format (The Systems Biology Markup Language. Furthermore, we also demonstrated the platform functionalities by developing an FBA model (including 229 reactions for a recent annotated bioethanol producer, Thermoanaerobacter sp. strain X514, to predict its biomass growth and ethanol production. Conclusion MicrobesFlux is an installation-free and open-source platform that enables biologists without prior programming knowledge to develop metabolic models for annotated microorganisms in the KEGG

  7. A Comparison and Integration of MiSeq and MinION Platforms for Sequencing Single Source and Mixed Mitochondrial Genomes.

    Directory of Open Access Journals (Sweden)

    Michael R Lindberg

    Full Text Available Single source and multiple donor (mixed samples of human mitochondrial DNA were analyzed and compared using the MinION and the MiSeq platforms. A generalized variant detection strategy was employed to provide a cursory framework for evaluating the reliability and accuracy of mitochondrial sequences produced by the MinION. The feasibility of long-read phasing was investigated to establish its efficacy in quantitatively distinguishing and deconvolving individuals in a mixture. Finally, a proof-of-concept was demonstrated by integrating both platforms in a hybrid assembly that leverages solely mixture data to accurately reconstruct full mitochondrial genomes.

  8. Genomic resources for gene discovery, functional genome annotation, and evolutionary studies of maize and its close relatives.

    Science.gov (United States)

    Wang, Chao; Shi, Xue; Liu, Lin; Li, Haiyan; Ammiraju, Jetty S S; Kudrna, David A; Xiong, Wentao; Wang, Hao; Dai, Zhaozhao; Zheng, Yonglian; Lai, Jinsheng; Jin, Weiwei; Messing, Joachim; Bennetzen, Jeffrey L; Wing, Rod A; Luo, Meizhong

    2013-11-01

    Maize is one of the most important food crops and a key model for genetics and developmental biology. A genetically anchored and high-quality draft genome sequence of maize inbred B73 has been obtained to serve as a reference sequence. To facilitate evolutionary studies in maize and its close relatives, much like the Oryza Map Alignment Project (OMAP) (www.OMAP.org) bacterial artificial chromosome (BAC) resource did for the rice community, we constructed BAC libraries for maize inbred lines Zheng58, Chang7-2, and Mo17 and maize wild relatives Zea mays ssp. parviglumis and Tripsacum dactyloides. Furthermore, to extend functional genomic studies to maize and sorghum, we also constructed binary BAC (BIBAC) libraries for the maize inbred B73 and the sorghum landrace Nengsi-1. The BAC/BIBAC vectors facilitate transfer of large intact DNA inserts from BAC clones to the BIBAC vector and functional complementation of large DNA fragments. These seven Zea Map Alignment Project (ZMAP) BAC/BIBAC libraries have average insert sizes ranging from 92 to 148 kb, organellar DNA from 0.17 to 2.3%, empty vector rates between 0.35 and 5.56%, and genome equivalents of 4.7- to 8.4-fold. The usefulness of the Parviglumis and Tripsacum BAC libraries was demonstrated by mapping clones to the reference genome. Novel genes and alleles present in these ZMAP libraries can now be used for functional complementation studies and positional or homology-based cloning of genes for translational genomics.

  9. Semantic Web repositories for genomics data using the eXframe platform.

    Science.gov (United States)

    Merrill, Emily; Corlosquet, Stéphane; Ciccarese, Paolo; Clark, Tim; Das, Sudeshna

    2014-01-01

    With the advent of inexpensive assay technologies, there has been an unprecedented growth in genomics data as well as the number of databases in which it is stored. In these databases, sample annotation using ontologies and controlled vocabularies is becoming more common. However, the annotation is rarely available as Linked Data, in a machine-readable format, or for standardized queries using SPARQL. This makes large-scale reuse, or integration with other knowledge bases very difficult. To address this challenge, we have developed the second generation of our eXframe platform, a reusable framework for creating online repositories of genomics experiments. This second generation model now publishes Semantic Web data. To accomplish this, we created an experiment model that covers provenance, citations, external links, assays, biomaterials used in the experiment, and the data collected during the process. The elements of our model are mapped to classes and properties from various established biomedical ontologies. Resource Description Framework (RDF) data is automatically produced using these mappings and indexed in an RDF store with a built-in Sparql Protocol and RDF Query Language (SPARQL) endpoint. Using the open-source eXframe software, institutions and laboratories can create Semantic Web repositories of their experiments, integrate it with heterogeneous resources and make it interoperable with the vast Semantic Web of biomedical knowledge.

  10. IMGMD: A platform for the integration and standardisation of In silico Microbial Genome-scale Metabolic Models.

    Science.gov (United States)

    Ye, Chao; Xu, Nan; Dong, Chuan; Ye, Yuannong; Zou, Xuan; Chen, Xiulai; Guo, Fengbiao; Liu, Liming

    2017-04-07

    Genome-scale metabolic models (GSMMs) constitute a platform that combines genome sequences and detailed biochemical information to quantify microbial physiology at the system level. To improve the unity, integrity, correctness, and format of data in published GSMMs, a consensus IMGMD database was built in the LAMP (Linux + Apache + MySQL + PHP) system by integrating and standardizing 328 GSMMs constructed for 139 microorganisms. The IMGMD database can help microbial researchers download manually curated GSMMs, rapidly reconstruct standard GSMMs, design pathways, and identify metabolic targets for strategies on strain improvement. Moreover, the IMGMD database facilitates the integration of wet-lab and in silico data to gain an additional insight into microbial physiology. The IMGMD database is freely available, without any registration requirements, at http://imgmd.jiangnan.edu.cn/database.

  11. PBX1 Genomic Pioneer Function Drives ERα Signaling Underlying Progression in Breast Cancer

    Science.gov (United States)

    Magnani, Luca; Ballantyne, Elizabeth B.; Zhang, Xiaoyang; Lupien, Mathieu

    2011-01-01

    Altered transcriptional programs are a hallmark of diseases, yet how these are established is still ill-defined. PBX1 is a TALE homeodomain protein involved in the development of different types of cancers. The estrogen receptor alpha (ERα) is central to the development of two-thirds of all breast cancers. Here we demonstrate that PBX1 acts as a pioneer factor and is essential for the ERα-mediated transcriptional response driving aggressive tumors in breast cancer. Indeed, PBX1 expression correlates with ERα in primary breast tumors, and breast cancer cells depleted of PBX1 no longer proliferate following estrogen stimulation. Profiling PBX1 recruitment and chromatin accessibility across the genome of breast cancer cells through ChIP-seq and FAIRE-seq reveals that PBX1 is loaded and promotes chromatin openness at specific genomic locations through its capacity to read specific epigenetic signatures. Accordingly, PBX1 guides ERα recruitment to a specific subset of sites. Expression profiling studies demonstrate that PBX1 controls over 70% of the estrogen response. More importantly, the PBX1-dependent transcriptional program is associated with poor-outcome in breast cancer patients. Correspondingly, PBX1 expression alone can discriminate a priori the outcome in ERα-positive breast cancer patients. These features are markedly different from the previously characterized ERα-associated pioneer factor FoxA1. Indeed, PBX1 is the only pioneer factor identified to date that discriminates outcome such as metastasis in ERα-positive breast cancer patients. Together our results reveal that PBX1 is a novel pioneer factor defining aggressive ERα-positive breast tumors, as it guides ERα genomic activity to unique genomic regions promoting a transcriptional program favorable to breast cancer progression. PMID:22125492

  12. Structure-based inference of molecular functions of proteins of unknown function from Berkeley Structural Genomics Center

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Sung-Hou; Shin, Dong Hae; Hou, Jingtong; Chandonia, John-Marc; Das, Debanu; Choi, In-Geol; Kim, Rosalind; Kim, Sung-Hou

    2007-09-02

    Advances in sequence genomics have resulted in an accumulation of a huge number of protein sequences derived from genome sequences. However, the functions of a large portion of them cannot be inferred based on the current methods of sequence homology detection to proteins of known functions. Three-dimensional structure can have an important impact in providing inference of molecular function (physical and chemical function) of a protein of unknown function. Structural genomics centers worldwide have been determining many 3-D structures of the proteins of unknown functions, and possible molecular functions of them have been inferred based on their structures. Combined with bioinformatics and enzymatic assay tools, the successful acceleration of the process of protein structure determination through high throughput pipelines enables the rapid functional annotation of a large fraction of hypothetical proteins. We present a brief summary of the process we used at the Berkeley Structural Genomics Center to infer molecular functions of proteins of unknown function.

  13. In vitro analysis of integrated global high-resolution DNA methylation profiling with genomic imbalance and gene expression in osteosarcoma.

    Directory of Open Access Journals (Sweden)

    Bekim Sadikovic

    Full Text Available Genetic and epigenetic changes contribute to deregulation of gene expression and development of human cancer. Changes in DNA methylation are key epigenetic factors regulating gene expression and genomic stability. Recent progress in microarray technologies resulted in developments of high resolution platforms for profiling of genetic, epigenetic and gene expression changes. OS is a pediatric bone tumor with characteristically high level of numerical and structural chromosomal changes. Furthermore, little is known about DNA methylation changes in OS. Our objective was to develop an integrative approach for analysis of high-resolution epigenomic, genomic, and gene expression profiles in order to identify functional epi/genomic differences between OS cell lines and normal human osteoblasts. A combination of Affymetrix Promoter Tilling Arrays for DNA methylation, Agilent array-CGH platform for genomic imbalance and Affymetrix Gene 1.0 platform for gene expression analysis was used. As a result, an integrative high-resolution approach for interrogation of genome-wide tumour-specific changes in DNA methylation was developed. This approach was used to provide the first genomic DNA methylation maps, and to identify and validate genes with aberrant DNA methylation in OS cell lines. This first integrative analysis of global cancer-related changes in DNA methylation, genomic imbalance, and gene expression has provided comprehensive evidence of the cumulative roles of epigenetic and genetic mechanisms in deregulation of gene expression networks.

  14. pico-PLAZA, a genome database of microbial photosynthetic eukaryotes.

    Science.gov (United States)

    Vandepoele, Klaas; Van Bel, Michiel; Richard, Guilhem; Van Landeghem, Sofie; Verhelst, Bram; Moreau, Hervé; Van de Peer, Yves; Grimsley, Nigel; Piganeau, Gwenael

    2013-08-01

    With the advent of next generation genome sequencing, the number of sequenced algal genomes and transcriptomes is rapidly growing. Although a few genome portals exist to browse individual genome sequences, exploring complete genome information from multiple species for the analysis of user-defined sequences or gene lists remains a major challenge. pico-PLAZA is a web-based resource (http://bioinformatics.psb.ugent.be/pico-plaza/) for algal genomics that combines different data types with intuitive tools to explore genomic diversity, perform integrative evolutionary sequence analysis and study gene functions. Apart from homologous gene families, multiple sequence alignments, phylogenetic trees, Gene Ontology, InterPro and text-mining functional annotations, different interactive viewers are available to study genome organization using gene collinearity and synteny information. Different search functions, documentation pages, export functions and an extensive glossary are available to guide non-expert scientists. To illustrate the versatility of the platform, different case studies are presented demonstrating how pico-PLAZA can be used to functionally characterize large-scale EST/RNA-Seq data sets and to perform environmental genomics. Functional enrichments analysis of 16 Phaeodactylum tricornutum transcriptome libraries offers a molecular view on diatom adaptation to different environments of ecological relevance. Furthermore, we show how complementary genomic data sources can easily be combined to identify marker genes to study the diversity and distribution of algal species, for example in metagenomes, or to quantify intraspecific diversity from environmental strains. © 2013 John Wiley & Sons Ltd and Society for Applied Microbiology.

  15. Chloroplast genome of Aconitum barbatum var. puberulum (Ranunculaceae derived from CCS reads using the PacBio RS platform

    Directory of Open Access Journals (Sweden)

    Xiaochen eChen

    2015-02-01

    Full Text Available The chloroplast genome (cp genome of Aconitum barbatum var. puberulum was sequenced using the third-generation sequencing platform based on the single-molecule real-time (SMRT sequencing approach. To our knowledge, this is the first reported complete cp genome of Aconitum, and we anticipate that it will have great value for phylogenetic studies of the Ranunculaceae family. In total, 23,498 CCS reads and 20,685,462 base pairs were generated, the mean read length was 880 bp, and the longest read was 2,261 bp. Genome coverage of 100% was achieved with a mean coverage of 132× and no gaps. The accuracy of the assembled genome is 99.973%; the assembly was validated using Sanger sequencing of six selected genes from the cp genome. The complete cp genome of Aconitum barbatum var. puberulum is 156,749 bp in length, including a large single-copy region of 87,630 bp and a small single-copy region of 16,941 bp separated by two inverted repeats of 26,089 bp. The cp genome contains 130 genes, including 84 protein-coding genes, 34 tRNA genes and eight rRNA genes. Four forward, five inverted and eight tandem repeats were identified. According to the SSR analysis, the longest poly structure is a 20-T repeat. Our results presented in this paper will facilitate the phylogenetic studies and molecular authentication on Aconitum.

  16. Comparative genomic and functional analyses: unearthing the diversity and specificity of nematicidal factors in Pseudomonas putida strain 1A00316

    Science.gov (United States)

    Guo, Jing; Jing, Xueping; Peng, Wen-Lei; Nie, Qiyu; Zhai, Yile; Shao, Zongze; Zheng, Longyu; Cai, Minmin; Li, Guangyu; Zuo, Huaiyu; Zhang, Zhitao; Wang, Rui-Ru; Huang, Dian; Cheng, Wanli; Yu, Ziniu; Chen, Ling-Ling; Zhang, Jibin

    2016-01-01

    We isolated Pseudomonas putida (P. putida) strain 1A00316 from Antarctica. This bacterium has a high efficiency against Meloidogyne incognita (M. incognita) in vitro and under greenhouse conditions. The complete genome of P. putida 1A00316 was sequenced using PacBio single molecule real-time (SMRT) technology. A comparative genomic analysis of 16 Pseudomonas strains revealed that although P. putida 1A00316 belonged to P. putida, it was phenotypically more similar to nematicidal Pseudomonas fluorescens (P. fluorescens) strains. We characterized the diversity and specificity of nematicidal factors in P. putida 1A00316 with comparative genomics and functional analysis, and found that P. putida 1A00316 has diverse nematicidal factors including protein alkaline metalloproteinase AprA and two secondary metabolites, hydrogen cyanide and cyclo-(l-isoleucyl-l-proline). We show for the first time that cyclo-(l-isoleucyl-l-proline) exhibit nematicidal activity in P. putida. Interestingly, our study had not detected common nematicidal factors such as 2,4-diacetylphloroglucinol (2,4-DAPG) and pyrrolnitrin in P. putida 1A00316. The results of the present study reveal the diversity and specificity of nematicidal factors in P. putida strain 1A00316. PMID:27384076

  17. Annotating functional RNAs in genomes using Infernal.

    Science.gov (United States)

    Nawrocki, Eric P

    2014-01-01

    Many different types of functional non-coding RNAs participate in a wide range of important cellular functions but the large majority of these RNAs are not routinely annotated in published genomes. Several programs have been developed for identifying RNAs, including specific tools tailored to a particular RNA family as well as more general ones designed to work for any family. Many of these tools utilize covariance models (CMs), statistical models of the conserved sequence, and structure of an RNA family. In this chapter, as an illustrative example, the Infernal software package and CMs from the Rfam database are used to identify RNAs in the genome of the archaeon Methanobrevibacter ruminantium, uncovering some additional RNAs not present in the genome's initial annotation. Analysis of the results and comparison with family-specific methods demonstrate some important strengths and weaknesses of this general approach.

  18. Budding off: bringing functional genomics to Candida albicans

    Science.gov (United States)

    Anderson, Matthew Z.

    2016-01-01

    Candida species are the most prevalent human fungal pathogens, with Candida albicans being the most clinically relevant species. Candida albicans resides as a commensal of the human gastrointestinal tract but is a frequent cause of opportunistic mucosal and systemic infections. Investigation of C. albicans virulence has traditionally relied on candidate gene approaches, but recent advances in functional genomics have now facilitated global, unbiased studies of gene function. Such studies include comparative genomics (both between and within Candida species), analysis of total RNA expression, and regulation and delineation of protein–DNA interactions. Additionally, large collections of mutant strains have begun to aid systematic screening of clinically relevant phenotypes. Here, we will highlight the development of functional genomics in C. albicans and discuss the use of these approaches to addressing both commensalism and pathogenesis in this species. PMID:26424829

  19. Exploring Protein Function Using the Saccharomyces Genome Database.

    Science.gov (United States)

    Wong, Edith D

    2017-01-01

    Elucidating the function of individual proteins will help to create a comprehensive picture of cell biology, as well as shed light on human disease mechanisms, possible treatments, and cures. Due to its compact genome, and extensive history of experimentation and annotation, the budding yeast Saccharomyces cerevisiae is an ideal model organism in which to determine protein function. This information can then be leveraged to infer functions of human homologs. Despite the large amount of research and biological data about S. cerevisiae, many proteins' functions remain unknown. Here, we explore ways to use the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org ) to predict the function of proteins and gain insight into their roles in various cellular processes.

  20. Exploring the post-genomic world: differing explanatory and manipulatory functions of post-genomic sciences.

    Science.gov (United States)

    Holmes, Christina; Carlson, Siobhan M; McDonald, Fiona; Jones, Mavis; Graham, Janice

    2016-01-02

    Richard Lewontin proposed that the ability of a scientific field to create a narrative for public understanding garners it social relevance. This article applies Lewontin's conceptual framework of the functions of science (manipulatory and explanatory) to compare and explain the current differences in perceived societal relevance of genetics/genomics and proteomics. We provide three examples to illustrate the social relevance and strong cultural narrative of genetics/genomics for which no counterpart exists for proteomics. We argue that the major difference between genetics/genomics and proteomics is that genomics has a strong explanatory function, due to the strong cultural narrative of heredity. Based on qualitative interviews and observations of proteomics conferences, we suggest that the nature of proteins, lack of public understanding, and theoretical complexity exacerbates this difference for proteomics. Lewontin's framework suggests that social scientists may find that omics sciences affect social relations in different ways than past analyses of genetics.

  1. CAGO: a software tool for dynamic visual comparison and correlation measurement of genome organization.

    Directory of Open Access Journals (Sweden)

    Yi-Feng Chang

    Full Text Available CAGO (Comparative Analysis of Genome Organization is developed to address two critical shortcomings of conventional genome atlas plotters: lack of dynamic exploratory functions and absence of signal analysis for genomic properties. With dynamic exploratory functions, users can directly manipulate chromosome tracks of a genome atlas and intuitively identify distinct genomic signals by visual comparison. Signal analysis of genomic properties can further detect inconspicuous patterns from noisy genomic properties and calculate correlations between genomic properties across various genomes. To implement dynamic exploratory functions, CAGO presents each genome atlas in Scalable Vector Graphics (SVG format and allows users to interact with it using a SVG viewer through JavaScript. Signal analysis functions are implemented using R statistical software and a discrete wavelet transformation package waveslim. CAGO is not only a plotter for generating complex genome atlases, but also a platform for exploring genome atlases with dynamic exploratory functions for visual comparison and with signal analysis for comparing genomic properties across multiple organisms. The web-based application of CAGO, its source code, user guides, video demos, and live examples are publicly available and can be accessed at http://cbs.ym.edu.tw/cago.

  2. Genomic analysis of Xenopus organizer function

    Directory of Open Access Journals (Sweden)

    Suhai Sándor

    2006-06-01

    Full Text Available Abstract Background Studies of the Xenopus organizer have laid the foundation for our understanding of the conserved signaling pathways that pattern vertebrate embryos during gastrulation. The two primary activities of the organizer, BMP and Wnt inhibition, can regulate a spectrum of genes that pattern essentially all aspects of the embryo during gastrulation. As our knowledge of organizer signaling grows, it is imperative that we begin knitting together our gene-level knowledge into genome-level signaling models. The goal of this paper was to identify complete lists of genes regulated by different aspects of organizer signaling, thereby providing a deeper understanding of the genomic mechanisms that underlie these complex and fundamental signaling events. Results To this end, we ectopically overexpress Noggin and Dkk-1, inhibitors of the BMP and Wnt pathways, respectively, within ventral tissues. After isolating embryonic ventral halves at early and late gastrulation, we analyze the transcriptional response to these molecules within the generated ectopic organizers using oligonucleotide microarrays. An efficient statistical analysis scheme, combined with a new Gene Ontology biological process annotation of the Xenopus genome, allows reliable and faithful clustering of molecules based upon their roles during gastrulation. From this data, we identify new organizer-related expression patterns for 19 genes. Moreover, our data sub-divides organizer genes into separate head and trunk organizing groups, which each show distinct responses to Noggin and Dkk-1 activity during gastrulation. Conclusion Our data provides a genomic view of the cohorts of genes that respond to Noggin and Dkk-1 activity, allowing us to separate the role of each in organizer function. These patterns demonstrate a model where BMP inhibition plays a largely inductive role during early developmental stages, thereby initiating the suites of genes needed to pattern dorsal tissues

  3. An integrated map of genetic variation from 1.092 human genomes

    DEFF Research Database (Denmark)

    Abecasis, Goncalo R.; Auton, Adam; Brooks, Lisa D.

    2012-01-01

    By characterizing the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help to understand the genetic contribution to disease. Here we describe the genomes of 1,092 individuals from 14 populations, constructed using a combination ...

  4. Identification of two novel functional p53 responsive elements in the herpes simplex virus-1 genome.

    Science.gov (United States)

    Hsieh, Jui-Cheng; Kuta, Ryan; Armour, Courtney R; Boehmer, Paul E

    2014-07-01

    Analysis of the herpes simplex virus-1 (HSV-1) genome reveals two candidate p53 responsive elements (p53RE), located in proximity to the replication origins oriL and oriS, referred to as p53RE-L and p53RE-S, respectively. The sequences of p53RE-L and p53RE-S conform to the p53 consensus site and are present in HSV-1 strains KOS, 17, and F. p53 binds to both elements in vitro and in virus-infected cells. Both p53RE-L and p53RE-S are capable of conferring p53-dependent transcriptional activation onto a heterologous reporter gene. Importantly, expression of the essential immediate early viral transactivator ICP4 and the essential DNA replication protein ICP8, that are adjacent to p53RE-S and p53RE-L, are repressed in a p53-dependent manner. Taken together, this study identifies two novel functional p53RE in the HSV-1 genome and suggests a complex mechanism of viral gene regulation by p53 which may determine progression of the lytic viral replication cycle or the establishment of latency. Copyright © 2014 Elsevier Inc. All rights reserved.

  5. Empowered genome community: leveraging a bioinformatics platform as a citizen–scientist collaboration tool

    Directory of Open Access Journals (Sweden)

    Katherine Wendelsdorf

    2015-09-01

    Full Text Available There is on-going effort in the biomedical research community to leverage Next Generation Sequencing (NGS technology to identify genetic variants that affect our health. The main challenge facing researchers is getting enough samples from individuals either sick or healthy – to be able to reliably identify the few variants that are causal for a phenotype among all other variants typically seen among individuals. At the same time, more and more individuals are having their genome sequenced either out of curiosity or to identify the cause of an illness. These individuals may benefit from of a way to view and understand their data. QIAGEN's Ingenuity Variant Analysis is an online application that allows users with and without extensive bioinformatics training to incorporate information from published experiments, genetic databases, and a variety of statistical models to identify variants, from a long list of candidates, that are most likely causal for a phenotype as well as annotate variants with what is already known about them in the literature and databases. Ingenuity Variant Analysis is also an information sharing platform where users may exchange samples and analyses. The Empowered Genome Community (EGC is a new program in which QIAGEN is making this on-line tool freely available to any individual who wishes to analyze their own genetic sequence. EGC members are then able to make their data available to other Ingenuity Variant Analysis users to be used in research. Here we present and describe the Empowered Genome Community in detail. We also present a preliminary, proof-of-concept study that utilizes the 200 genomes currently available through the EGC. The goal of this program is to allow individuals to access and understand their own data as well as facilitate citizen–scientist collaborations that can drive research forward and spur quality scientific dialogue in the general public.

  6. Empowered genome community: leveraging a bioinformatics platform as a citizen-scientist collaboration tool.

    Science.gov (United States)

    Wendelsdorf, Katherine; Shah, Sohela

    2015-09-01

    There is on-going effort in the biomedical research community to leverage Next Generation Sequencing (NGS) technology to identify genetic variants that affect our health. The main challenge facing researchers is getting enough samples from individuals either sick or healthy - to be able to reliably identify the few variants that are causal for a phenotype among all other variants typically seen among individuals. At the same time, more and more individuals are having their genome sequenced either out of curiosity or to identify the cause of an illness. These individuals may benefit from of a way to view and understand their data. QIAGEN's Ingenuity Variant Analysis is an online application that allows users with and without extensive bioinformatics training to incorporate information from published experiments, genetic databases, and a variety of statistical models to identify variants, from a long list of candidates, that are most likely causal for a phenotype as well as annotate variants with what is already known about them in the literature and databases. Ingenuity Variant Analysis is also an information sharing platform where users may exchange samples and analyses. The Empowered Genome Community (EGC) is a new program in which QIAGEN is making this on-line tool freely available to any individual who wishes to analyze their own genetic sequence. EGC members are then able to make their data available to other Ingenuity Variant Analysis users to be used in research. Here we present and describe the Empowered Genome Community in detail. We also present a preliminary, proof-of-concept study that utilizes the 200 genomes currently available through the EGC. The goal of this program is to allow individuals to access and understand their own data as well as facilitate citizen-scientist collaborations that can drive research forward and spur quality scientific dialogue in the general public.

  7. Functional noncoding sequences derived from SINEs in the mammalian genome.

    Science.gov (United States)

    Nishihara, Hidenori; Smit, Arian F A; Okada, Norihiro

    2006-07-01

    Recent comparative analyses of mammalian sequences have revealed that a large number of nonprotein-coding genomic regions are under strong selective constraint. Here, we report that some of these loci have been derived from a newly defined family of ancient SINEs (short interspersed repetitive elements). This is a surprising result, as SINEs and other transposable elements are commonly thought to be genomic parasites. We named the ancient SINE family AmnSINE1, for Amniota SINE1, because we found it to be present in mammals as well as in birds, and some copies predate the mammalian-bird split 310 million years ago (Mya). AmnSINE1 has a chimeric structure of a 5S rRNA and a tRNA-derived SINE, and is related to five tRNA-derived SINE families that we characterized here in the coelacanth, dogfish shark, hagfish, and amphioxus genomes. All of the newly described SINE families have a common central domain that is also shared by zebrafish SINE3, and we collectively name them the DeuSINE (Deuterostomia SINE) superfamily. Notably, of the approximately 1000 still identifiable copies of AmnSINE1 in the human genome, 105 correspond to loci phylogenetically highly conserved among mammalian orthologs. The conservation is strongest over the central domain. Thus, AmnSINE1 appears to be the best example of a transposable element of which a significant fraction of the copies have acquired genomic functionality.

  8. Cas9 versus Cas12a/Cpf1: Structure-function comparisons and implications for genome editing.

    Science.gov (United States)

    Swarts, Daan C; Jinek, Martin

    2018-05-22

    Cas9 and Cas12a are multidomain CRISPR-associated nucleases that can be programmed with a guide RNA to bind and cleave complementary DNA targets. The guide RNA sequence can be varied, making these effector enzymes versatile tools for genome editing and gene regulation applications. While Cas9 is currently the best-characterized and most widely used nuclease for such purposes, Cas12a (previously named Cpf1) has recently emerged as an alternative for Cas9. Cas9 and Cas12a have distinct evolutionary origins and exhibit different structural architectures, resulting in distinct molecular mechanisms. Here we compare the structural and mechanistic features that distinguish Cas9 and Cas12a, and describe how these features modulate their activity. We discuss implications for genome editing, and how they may influence the choice of Cas9 or Cas12a for specific applications. Finally, we review recent studies in which Cas12a has been utilized as a genome editing tool. This article is categorized under: RNA Interactions with Proteins and Other Molecules > Protein-RNA Interactions: Functional Implications Regulatory RNAs/RNAi/Riboswitches > Biogenesis of Effector Small RNAs RNA Interactions with Proteins and Other Molecules > RNA-Protein Complexes. © 2018 Wiley Periodicals, Inc.

  9. Whole-genome sequence-based analysis of thyroid function

    DEFF Research Database (Denmark)

    Taylor, Peter N.; Porcu, Eleonora; Chew, Shelby

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N = 2,287). Using additional whole-genome seque...

  10. Mammalian-specific genomic functions: Newly acquired traits generated by genomic imprinting and LTR retrotransposon-derived genes in mammals.

    Science.gov (United States)

    Kaneko-Ishino, Tomoko; Ishino, Fumitoshi

    2015-01-01

    Mammals, including human beings, have evolved a unique viviparous reproductive system and a highly developed central nervous system. How did these unique characteristics emerge in mammalian evolution, and what kinds of changes did occur in the mammalian genomes as evolution proceeded? A key conceptual term in approaching these issues is "mammalian-specific genomic functions", a concept covering both mammalian-specific epigenetics and genetics. Genomic imprinting and LTR retrotransposon-derived genes are reviewed as the representative, mammalian-specific genomic functions that are essential not only for the current mammalian developmental system, but also mammalian evolution itself. First, the essential roles of genomic imprinting in mammalian development, especially related to viviparous reproduction via placental function, as well as the emergence of genomic imprinting in mammalian evolution, are discussed. Second, we introduce the novel concept of "mammalian-specific traits generated by mammalian-specific genes from LTR retrotransposons", based on the finding that LTR retrotransposons served as a critical driving force in the mammalian evolution via generating mammalian-specific genes.

  11. CRISPR/Cas9 for Human Genome Engineering and Disease Research.

    Science.gov (United States)

    Xiong, Xin; Chen, Meng; Lim, Wendell A; Zhao, Dehua; Qi, Lei S

    2016-08-31

    The clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated 9 (Cas9) system, a versatile RNA-guided DNA targeting platform, has been revolutionizing our ability to modify, manipulate, and visualize the human genome, which greatly advances both biological research and therapeutics development. Here, we review the current development of CRISPR/Cas9 technologies for gene editing, transcription regulation, genome imaging, and epigenetic modification. We discuss the broad application of this system to the study of functional genomics, especially genome-wide genetic screening, and to therapeutics development, including establishing disease models, correcting defective genetic mutations, and treating diseases.

  12. A survey of single nucleotide polymorphisms identified from whole-genome sequencing and their functional effect in the porcine genome.

    Science.gov (United States)

    Keel, B N; Nonneman, D J; Rohrer, G A

    2017-08-01

    Genetic variants detected from sequence have been used to successfully identify causal variants and map complex traits in several organisms. High and moderate impact variants, those expected to alter or disrupt the protein coded by a gene and those that regulate protein production, likely have a more significant effect on phenotypic variation than do other types of genetic variants. Hence, a comprehensive list of these functional variants would be of considerable interest in swine genomic studies, particularly those targeting fertility and production traits. Whole-genome sequence was obtained from 72 of the founders of an intensely phenotyped experimental swine herd at the U.S. Meat Animal Research Center (USMARC). These animals included all 24 of the founding boars (12 Duroc and 12 Landrace) and 48 Yorkshire-Landrace composite sows. Sequence reads were mapped to the Sscrofa10.2 genome build, resulting in a mean of 6.1 fold (×) coverage per genome. A total of 22 342 915 high confidence SNPs were identified from the sequenced genomes. These included 21 million previously reported SNPs and 79% of the 62 163 SNPs on the PorcineSNP60 BeadChip assay. Variation was detected in the coding sequence or untranslated regions (UTRs) of 87.8% of the genes in the porcine genome: loss-of-function variants were predicted in 504 genes, 10 202 genes contained nonsynonymous variants, 10 773 had variation in UTRs and 13 010 genes contained synonymous variants. Approximately 139 000 SNPs were classified as loss-of-function, nonsynonymous or regulatory, which suggests that over 99% of the variation detected in our pigs could potentially be ignored, allowing us to focus on a much smaller number of functional SNPs during future analyses. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.

  13. Functional genomics in renal transplantation and chronic kidney disease

    International Nuclear Information System (INIS)

    Wilflingseder, J.

    2010-01-01

    For the past decade, the development of genomic technology has revolutionized modern biological research. Functional genomic analyses enable biologists to study genetic events on a genome wide scale. Examples of applications are gene discovery, biomarker determination, disease classification, and drug target identification. Global expression profiles performed with microarrays enable a better understanding of molecular signature of human disease, including acute and chronic kidney disease. About 10 % of the population in western industrialized nations suffers from chronic kidney disease (CKD). Treatment of end stage renal disease, the final stage of CKD is performed by either hemo- or peritoneal dialysis or renal transplantation. The preferred treatment is renal transplantation, because of the higher quality of life. But the pathophysiology of the disease on a molecular level is not well enough understood and early biomarkers for acute and chronic kidney disease are missing. In my studies I focused on genomics of allograft biopsies, prevention of delayed graft function after renal transplantation, anemia after renal transplantation, biocompatibility of hemodialysis membranes and peritoneal dialysis fluids and cardiovascular diseases and bone disorders in CKD patients. Gene expression profiles, pathway analysis and protein-protein interaction networks were used to elucidate the underlying pathophysiological mechanism of the disease or phenomena, identifying early biomarkers or predictors of disease state and potentially drug targets. In summery my PhD thesis represents the application of functional genomic analyses in chronic kidney disease and renal transplantation. The results provide a deeper view into the molecular and cellular mechanisms of kidney disease. Nevertheless, future multicenter collaborative studies, meta-analyses of existing data, incorporation of functional genomics into large-scale prospective clinical trials are needed and will give biomedical

  14. Resources for Functional Genomics Studies in Drosophila melanogaster

    Science.gov (United States)

    Mohr, Stephanie E.; Hu, Yanhui; Kim, Kevin; Housden, Benjamin E.; Perrimon, Norbert

    2014-01-01

    Drosophila melanogaster has become a system of choice for functional genomic studies. Many resources, including online databases and software tools, are now available to support design or identification of relevant fly stocks and reagents or analysis and mining of existing functional genomic, transcriptomic, proteomic, etc. datasets. These include large community collections of fly stocks and plasmid clones, “meta” information sites like FlyBase and FlyMine, and an increasing number of more specialized reagents, databases, and online tools. Here, we introduce key resources useful to plan large-scale functional genomics studies in Drosophila and to analyze, integrate, and mine the results of those studies in ways that facilitate identification of highest-confidence results and generation of new hypotheses. We also discuss ways in which existing resources can be used and might be improved and suggest a few areas of future development that would further support large- and small-scale studies in Drosophila and facilitate use of Drosophila information by the research community more generally. PMID:24653003

  15. Regulatory RNA-assisted genome engineering in microorganisms.

    Science.gov (United States)

    Si, Tong; HamediRad, Mohammad; Zhao, Huimin

    2015-12-01

    Regulatory RNAs are increasingly recognized and utilized as key modulators of gene expression in diverse organisms. Thanks to their modular and programmable nature, trans-acting regulatory RNAs are especially attractive in genome-scale applications. Here we discuss the recent examples in microbial genome engineering implementing various trans-acting RNA platforms, including sRNA, RNAi, asRNA and CRISRP-Cas. In particular, we focus on how the scalable and multiplex nature of trans-acting RNAs has been used to tackle the challenges in creating genome-wide and combinatorial diversity for functional genomics and metabolic engineering applications. Advances in computational design and context-dependent regulation are also discussed for their contribution in improving fine-tuning capabilities of trans-acting RNAs. Copyright © 2015 Elsevier Ltd. All rights reserved.

  16. MIPS bacterial genomes functional annotation benchmark dataset.

    Science.gov (United States)

    Tetko, Igor V; Brauner, Barbara; Dunger-Kaltenbach, Irmtraud; Frishman, Goar; Montrone, Corinna; Fobo, Gisela; Ruepp, Andreas; Antonov, Alexey V; Surmeli, Dimitrij; Mewes, Hans-Wernen

    2005-05-15

    Any development of new methods for automatic functional annotation of proteins according to their sequences requires high-quality data (as benchmark) as well as tedious preparatory work to generate sequence parameters required as input data for the machine learning methods. Different program settings and incompatible protocols make a comparison of the analyzed methods difficult. The MIPS Bacterial Functional Annotation Benchmark dataset (MIPS-BFAB) is a new, high-quality resource comprising four bacterial genomes manually annotated according to the MIPS functional catalogue (FunCat). These resources include precalculated sequence parameters, such as sequence similarity scores, InterPro domain composition and other parameters that could be used to develop and benchmark methods for functional annotation of bacterial protein sequences. These data are provided in XML format and can be used by scientists who are not necessarily experts in genome annotation. BFAB is available at http://mips.gsf.de/proj/bfab

  17. Novel genomes and genome constitutions identified by GISH and 5S rDNA and knotted1 genomic sequences in the genus Setaria.

    Science.gov (United States)

    Zhao, Meicheng; Zhi, Hui; Doust, Andrew N; Li, Wei; Wang, Yongfang; Li, Haiquan; Jia, Guanqing; Wang, Yongqiang; Zhang, Ning; Diao, Xianmin

    2013-04-11

    The Setaria genus is increasingly of interest to researchers, as its two species, S. viridis and S. italica, are being developed as models for understanding C4 photosynthesis and plant functional genomics. The genome constitution of Setaria species has been studied in the diploid species S. viridis, S. adhaerans and S. grisebachii, where three genomes A, B and C were identified respectively. Two allotetraploid species, S. verticillata and S. faberi, were found to have AABB genomes, and one autotetraploid species, S. queenslandica, with an AAAA genome, has also been identified. The genomes and genome constitutions of most other species remain unknown, even though it was thought there are approximately 125 species in the genus distributed world-wide. GISH was performed to detect the genome constitutions of Eurasia species of S. glauca, S. plicata, and S. arenaria, with the known A, B and C genomes as probes. No or very poor hybridization signal was detected indicating that their genomes are different from those already described. GISH was also performed reciprocally between S. glauca, S. plicata, and S. arenaria genomes, but no hybridization signals between each other were found. The two sets of chromosomes of S. lachnea both hybridized strong signals with only the known C genome of S. grisebachii. Chromosomes of Qing 9, an accession formerly considered as S. viridis, hybridized strong signal only to B genome of S. adherans. Phylogenetic trees constructed with 5S rDNA and knotted1 markers, clearly classify the samples in this study into six clusters, matching the GISH results, and suggesting that the F genome of S. arenaria is basal in the genus. Three novel genomes in the Setaria genus were identified and designated as genome D (S. glauca), E (S. plicata) and F (S. arenaria) respectively. The genome constitution of tetraploid S. lachnea is putatively CCC'C'. Qing 9 is a B genome species indigenous to China and is hypothesized to be a newly identified species. The

  18. The Chlamydomonas Genome Reveals the Evolution of Key Animal and Plant Functions

    Energy Technology Data Exchange (ETDEWEB)

    Merchant, Sabeeha S

    2007-04-09

    Chlamydomonas reinhardtii is a unicellular green alga whose lineage diverged from land plants over 1 billion years ago. It is a model system for studying chloroplast-based photosynthesis, as well as the structure, assembly, and function of eukaryotic flagella (cilia), which were inherited from the common ancestor of plants and animals, but lost in land plants. We sequenced the 120-megabase nuclear genome of Chlamydomonas and performed comparative phylogenomic analyses, identifying genes encoding uncharacterized proteins that are likely associated with the function and biogenesis of chloroplasts or eukaryotic flagella. Analyses of the Chlamydomonas genome advance our understanding of the ancestral eukaryotic cell, reveal previously unknown genes associated with photosynthetic and flagellar functions, and establish links between ciliopathy and the composition and function of flagella.

  19. Distinguishing between "function" and "effect" in genome biology.

    Science.gov (United States)

    Doolittle, W Ford; Brunet, Tyler D P; Linquist, Stefan; Gregory, T Ryan

    2014-05-09

    Much confusion in genome biology results from conflation of possible meanings of the word "function." We suggest that, in this connection, attention should be paid to evolutionary biologists and philosophers who have previously dealt with this problem. We need only decide that although all genomic structures have effects, only some of them should be said to have functions. Although it will very often be difficult or impossible to establish function (strictly defined), it should not automatically be assumed. We enjoin genomicists in particular to pay greater attention to parsing biological effects. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  20. Plant Metabolomics : the missiong link in functional genomics strategies

    NARCIS (Netherlands)

    Hall, R.D.; Beale, M.; Fiehn, O.; Hardy, N.; Summer, L.; Bino, R.

    2002-01-01

    After the establishment of technologies for high-throughput DNA sequencing (genomics), gene expression analysis (transcriptomics), and protein analysis (proteomics), the remaining functional genomics challenge is that of metabolomics. Metabolomics is the term coined for essentially comprehensive,

  1. OrthoVenn: a web server for genome wide comparison and annotation of orthologous clusters across multiple species

    Science.gov (United States)

    Genome wide analysis of orthologous clusters is an important component of comparative genomics studies. Identifying the overlap among orthologous clusters can enable us to elucidate the function and evolution of proteins across multiple species. Here, we report a web platform named OrthoVenn that i...

  2. Snap: an integrated SNP annotation platform

    DEFF Research Database (Denmark)

    Li, Shengting; Ma, Lijia; Li, Heng

    2007-01-01

    Snap (Single Nucleotide Polymorphism Annotation Platform) is a server designed to comprehensively analyze single genes and relationships between genes basing on SNPs in the human genome. The aim of the platform is to facilitate the study of SNP finding and analysis within the framework of medical...

  3. Identification of novel biomass-degrading enzymes from genomic dark matter: Populating genomic sequence space with functional annotation.

    Science.gov (United States)

    Piao, Hailan; Froula, Jeff; Du, Changbin; Kim, Tae-Wan; Hawley, Erik R; Bauer, Stefan; Wang, Zhong; Ivanova, Nathalia; Clark, Douglas S; Klenk, Hans-Peter; Hess, Matthias

    2014-08-01

    Although recent nucleotide sequencing technologies have significantly enhanced our understanding of microbial genomes, the function of ∼35% of genes identified in a genome currently remains unknown. To improve the understanding of microbial genomes and consequently of microbial processes it will be crucial to assign a function to this "genomic dark matter." Due to the urgent need for additional carbohydrate-active enzymes for improved production of transportation fuels from lignocellulosic biomass, we screened the genomes of more than 5,500 microorganisms for hypothetical proteins that are located in the proximity of already known cellulases. We identified, synthesized and expressed a total of 17 putative cellulase genes with insufficient sequence similarity to currently known cellulases to be identified as such using traditional sequence annotation techniques that rely on significant sequence similarity. The recombinant proteins of the newly identified putative cellulases were subjected to enzymatic activity assays to verify their hydrolytic activity towards cellulose and lignocellulosic biomass. Eleven (65%) of the tested enzymes had significant activity towards at least one of the substrates. This high success rate highlights that a gene context-based approach can be used to assign function to genes that are otherwise categorized as "genomic dark matter" and to identify biomass-degrading enzymes that have little sequence similarity to already known cellulases. The ability to assign function to genes that have no related sequence representatives with functional annotation will be important to enhance our understanding of microbial processes and to identify microbial proteins for a wide range of applications. © 2014 Wiley Periodicals, Inc.

  4. Functional genomics approaches in parasitic helminths.

    Science.gov (United States)

    Hagen, J; Lee, E F; Fairlie, W D; Kalinna, B H

    2012-01-01

    As research on parasitic helminths is moving into the post-genomic era, an enormous effort is directed towards deciphering gene function and to achieve gene annotation. The sequences that are available in public databases undoubtedly hold information that can be utilized for new interventions and control but the exploitation of these resources has until recently remained difficult. Only now, with the emergence of methods to genetically manipulate and transform parasitic worms will it be possible to gain a comprehensive understanding of the molecular mechanisms involved in nutrition, metabolism, developmental switches/maturation and interaction with the host immune system. This review focuses on functional genomics approaches in parasitic helminths that are currently used, to highlight potential applications of these technologies in the areas of cell biology, systems biology and immunobiology of parasitic helminths. © 2011 Blackwell Publishing Ltd.

  5. D2.1 Analysis of existing MOOC platforms and services

    NARCIS (Netherlands)

    Ortega, Sergio; Brouns, Francis; Fueyo Gutiérrez, Aquilina; Fano, Santiago; Tomasini, Alessandra; Silva, Alejandro; Rocio, Vítor; Jansen, Darco; Gutiérrez, Alfonso; Dornaletetxe, Jon; López, Esther; Fernández Pérez, María Dolores; Barbas, Ángel; Sáez Lopez, José Manuel

    2014-01-01

    The main objective of this task is to analyze features and services of MOOC platforms that are used in ECO and, secondly, in other commonly used MOOC platforms. This task takes into account the functionality that is required by the different pilots from two viewpoints: technological and pedagogical

  6. The Ssr protein (T1E_1405) from Pseudomonas putida DOT-T1E enables oligonucleotide-based recombineering in platform strain P. putida EM42

    DEFF Research Database (Denmark)

    Aparicio, Tomás; Ingemann Jensen, Sheila; Nielsen, Alex Toftgaard

    2016-01-01

    Some strains of the soil bacterium Pseudomonas putida have become in recent years platforms of choice for hosting biotransformations of industrial interest. Despite availability of many genetic tools for this microorganism, genomic editing of the cell factory P. putida EM42 (a derivative of refer......Some strains of the soil bacterium Pseudomonas putida have become in recent years platforms of choice for hosting biotransformations of industrial interest. Despite availability of many genetic tools for this microorganism, genomic editing of the cell factory P. putida EM42 (a derivative...

  7. Permanent Draft Genome of Strain ESFC-1: Ecological Genomics of a Newly Discovered Lineage of Filamentous Diazotrophic Cyanobacteria

    Science.gov (United States)

    Everroad, R. Craig; Stuart, Rhona K.; Bebout, Brad M.; Detweiler, Angela M.; Lee, Jackson Zan; Woebken, Dagmar; Bebout, Leslie E.; Pett-Ridge, Jennifer

    2016-01-01

    The nonheterocystous filamentous cyanobacterium, strain ESFC-1, is a recently described member of the order Oscillatoriales within the Cyanobacteria. ESFC-1 has been shown to be a major diazotroph in the intertidal microbial mat system at Elkhorn Slough, CA, USA. Based on phylogenetic analyses of the 16S RNA gene, ESFC-1 appears to belong to a unique, genus-level divergence; the draft genome sequence of this strain has now been determined. Here we report features of this genome as they relate to the ecological functions and capabilities of strain ESFC-1. The 5,632,035 bp genome sequence encodes 4914 protein-coding genes and 92 RNA genes. One striking feature of this cyanobacterium is the apparent lack of either uptake or bi-directional hydrogenases typically expected within a diazotroph. Additionally, a large genomic island is found that contains numerous low GC-content genes and genes related to extracellular polysaccharide production and cell wall synthesis and maintenance.

  8. Genome cluster database. A sequence family analysis platform for Arabidopsis and rice.

    Science.gov (United States)

    Horan, Kevin; Lauricha, Josh; Bailey-Serres, Julia; Raikhel, Natasha; Girke, Thomas

    2005-05-01

    The genome-wide protein sequences from Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa) spp. japonica were clustered into families using sequence similarity and domain-based clustering. The two fundamentally different methods resulted in separate cluster sets with complementary properties to compensate the limitations for accurate family analysis. Functional names for the identified families were assigned with an efficient computational approach that uses the description of the most common molecular function gene ontology node within each cluster. Subsequently, multiple alignments and phylogenetic trees were calculated for the assembled families. All clustering results and their underlying sequences were organized in the Web-accessible Genome Cluster Database (http://bioinfo.ucr.edu/projects/GCD) with rich interactive and user-friendly sequence family mining tools to facilitate the analysis of any given family of interest for the plant science community. An automated clustering pipeline ensures current information for future updates in the annotations of the two genomes and clustering improvements. The analysis allowed the first systematic identification of family and singlet proteins present in both organisms as well as those restricted to one of them. In addition, the established Web resources for mining these data provide a road map for future studies of the composition and structure of protein families between the two species.

  9. Coordinated international action to accelerate genome-to-phenome with FAANG, the Functional Annotation of Animal Genomes project : open letter

    NARCIS (Netherlands)

    Archibald, A.L.; Bottema, C.D.; Brauning, R.; Burgess, S.C.; Burt, D.W.; Casas, E.; Cheng, H.H.; Clarke, L.; Couldrey, C.; Dalrymple, B.P.; Elsik, C.G.; Foissac, S.; Giuffra, E.; Groenen, M.A.M.; Hayes, B.J.; Huang, L.S.; Khatib, H.; Kijas, J.W.; Kim, H.; Lunney, J.K.; McCarthy, F.M.; McEwan, J.; Moore, S.; Nanduri, B.; Notredame, C.; Palti, Y.; Plastow, G.S.; Reecy, J.M.; Rohrer, G.; Sarropoulou, E.; Schmidt, C.J.; Silverstein, J.; Tellam, R.L.; Tixier-Boichard, M.; Tosser-klopp, G.; Tuggle, C.K.; Vilkki, J.; White, S.N.; Zhao, S.; Zhou, H.

    2015-01-01

    We describe the organization of a nascent international effort, the Functional Annotation of Animal Genomes (FAANG) project, whose aim is to produce comprehensive maps of functional elements in the genomes of domesticated animal species.

  10. Toxicogenomics: Applications of new functional genomics technologies in toxicology

    NARCIS (Netherlands)

    Heijne, W.H.M.

    2004-01-01

    Toxicogenomics studies toxic effects of substances on organisms in relation to the composition of the genome. It applies the functional genomics technologies transcriptomics, proteomics and metabolomics that determine expression of the genes, proteins and metabolites in a sample. These methods could

  11. Brachypodium distachyon. A New Model System for Functional Genomics in Grasses1

    Science.gov (United States)

    Draper, John; Mur, Luis A.J.; Jenkins, Glyn; Ghosh-Biswas, Gadab C.; Bablak, Pauline; Hasterok, Robert; Routledge, Andrew P.M.

    2001-01-01

    A new model for grass functional genomics is described based on Brachypodium distachyon, which in the evolution of the Pooideae diverged just prior to the clade of “core pooid” genera that contain the majority of important temperate cereals and forage grasses. Diploid ecotypes of B. distachyon (2n = 10) have five easily distinguishable chromosomes that display high levels of chiasma formation at meiosis. The B. distachyon nuclear genome was indistinguishable in size from that of Arabidopsis, making it the simplest genome described in grasses to date. B. distachyon is a self-fertile, inbreeding annual with a life cycle of less than 4 months. These features, coupled with its small size (approximately 20 cm at maturity), lack of seed-head shatter, and undemanding growth requirements should make it amenable to high-throughput genetics and mutant screens. Immature embryos exhibited a high capacity for plant regeneration via somatic embryogenesis. Regenerated plants display very low levels of albinism and have normal fertility. A simple transformation system has been developed based on microprojectile bombardment of embryogenic callus and hygromycin selection. Selected B. distachyon ecotypes were resistant to all tested cereal-adapted Blumeria graminis species and cereal brown rusts (Puccinia reconditia). In contrast, different ecotypes displayed resistance or disease symptoms following challenge with the rice blast pathogen (Magnaporthe grisea) and wheat/barley yellow stripe rusts (Puccinia striformis). Despite its small stature, B. distachyon has large seeds that should prove useful for studies on grain filling. Such biological characteristics represent important traits for study in temperate cereals. PMID:11743099

  12. XRCC1 and PCNA are loading platforms with distinct kinetic properties and different capacities to respond to multiple DNA lesions

    Directory of Open Access Journals (Sweden)

    Leonhardt Heinrich

    2007-09-01

    Full Text Available Abstract Background Genome integrity is constantly challenged and requires the coordinated recruitment of multiple enzyme activities to ensure efficient repair of DNA lesions. We investigated the dynamics of XRCC1 and PCNA that act as molecular loading platforms and play a central role in this coordination. Results Local DNA damage was introduced by laser microirradation and the recruitment of fluorescent XRCC1 and PCNA fusion proteins was monitored by live cell microscopy. We found an immediate and fast recruitment of XRCC1 preceding the slow and continuous recruitment of PCNA. Fluorescence bleaching experiments (FRAP and FLIP revealed a stable association of PCNA with DNA repair sites, contrasting the high turnover of XRCC1. When cells were repeatedly challenged with multiple DNA lesions we observed a gradual depletion of the nuclear pool of PCNA, while XRCC1 dynamically redistributed even to lesions inflicted last. Conclusion These results show that PCNA and XRCC1 have distinct kinetic properties with functional consequences for their capacity to respond to successive DNA damage events.

  13. XRCC1 and PCNA are loading platforms with distinct kinetic properties and different capacities to respond to multiple DNA lesions

    Science.gov (United States)

    Mortusewicz, Oliver; Leonhardt, Heinrich

    2007-01-01

    Background Genome integrity is constantly challenged and requires the coordinated recruitment of multiple enzyme activities to ensure efficient repair of DNA lesions. We investigated the dynamics of XRCC1 and PCNA that act as molecular loading platforms and play a central role in this coordination. Results Local DNA damage was introduced by laser microirradation and the recruitment of fluorescent XRCC1 and PCNA fusion proteins was monitored by live cell microscopy. We found an immediate and fast recruitment of XRCC1 preceding the slow and continuous recruitment of PCNA. Fluorescence bleaching experiments (FRAP and FLIP) revealed a stable association of PCNA with DNA repair sites, contrasting the high turnover of XRCC1. When cells were repeatedly challenged with multiple DNA lesions we observed a gradual depletion of the nuclear pool of PCNA, while XRCC1 dynamically redistributed even to lesions inflicted last. Conclusion These results show that PCNA and XRCC1 have distinct kinetic properties with functional consequences for their capacity to respond to successive DNA damage events. PMID:17880707

  14. The Association Between Video Game Play and Cognitive Function: Does Gaming Platform Matter?

    Science.gov (United States)

    Huang, Vivian; Young, Michaelia; Fiocco, Alexandra J

    2017-11-01

    Despite consumer growth, few studies have evaluated the cognitive effects of gaming using mobile devices. This study examined the association between video game play platform and cognitive performance. Furthermore, the differential effect of video game genre (action versus nonaction) was explored. Sixty undergraduate students completed a video game experience questionnaire, and we divided them into three groups: mobile video game players (MVGPs), console/computer video game players (CVGPs), and nonvideo game players (NVGPs). Participants completed a cognitive battery to assess executive function, and learning and memory. Controlling for sex and ethnicity, analyses showed that frequent video game play is associated with enhanced executive function, but not learning and memory. MVGPs were significantly more accurate on working memory performances than NVGPs. Both MVGPs and CVGPs were similarly associated with enhanced cognitive function, suggesting that platform does not significantly determine the benefits of frequent video game play. Video game platform was found to differentially associate with preference for action video game genre and motivation for gaming. Exploratory analyses show that sex significantly effects frequent video game play, platform and genre preference, and cognitive function. This study represents a novel exploration of the relationship between mobile video game play and cognition and adds support to the cognitive benefits of frequent video game play.

  15. RPA-coated single-stranded DNA as a platform for post-translational modifications in the DNA damage response.

    Science.gov (United States)

    Maréchal, Alexandre; Zou, Lee

    2015-01-01

    The Replication Protein A (RPA) complex is an essential regulator of eukaryotic DNA metabolism. RPA avidly binds to single-stranded DNA (ssDNA) through multiple oligonucleotide/oligosaccharide-binding folds and coordinates the recruitment and exchange of genome maintenance factors to regulate DNA replication, recombination and repair. The RPA-ssDNA platform also constitutes a key physiological signal which activates the master ATR kinase to protect and repair stalled or collapsed replication forks during replication stress. In recent years, the RPA complex has emerged as a key target and an important regulator of post-translational modifications in response to DNA damage, which is critical for its genome guardian functions. Phosphorylation and SUMOylation of the RPA complex, and more recently RPA-regulated ubiquitination, have all been shown to control specific aspects of DNA damage signaling and repair by modulating the interactions between RPA and its partners. Here, we review our current understanding of the critical functions of the RPA-ssDNA platform in the maintenance of genome stability and its regulation through an elaborate network of covalent modifications.

  16. Mms1 binds to G-rich regions in Saccharomyces cerevisiae and influences replication and genome stability

    NARCIS (Netherlands)

    Wanzek, Katharina; Schwindt, Eike; Capra, John A.; Paeschke, Katrin

    2017-01-01

    The regulation of replication is essential to preserve genome integrity. Mms1 is part of the E3 ubiquitin ligase complex that is linked to replication fork progression. By identifying Mms1 binding sites genome-wide in Saccharomyces cerevisiae we connected Mms1 function to genome integrity and

  17. [Development of Plant Metabolomics and Medicinal Plant Genomics].

    Science.gov (United States)

    Saito, Kazuki

    2018-01-01

     A variety of chemicals produced by plants, often referred to as 'phytochemicals', have been used as medicines, food, fuels and industrial raw materials. Recent advances in the study of genomics and metabolomics in plant science have accelerated our understanding of the mechanisms, regulation and evolution of the biosynthesis of specialized plant products. We can now address such questions as how the metabolomic diversity of plants is originated at the levels of genome, and how we should apply this knowledge to drug discovery, industry and agriculture. Our research group has focused on metabolomics-based functional genomics over the last 15 years and we have developed a new research area called 'Phytochemical Genomics'. In this review, the development of a research platform for plant metabolomics is discussed first, to provide a better understanding of the chemical diversity of plants. Then, representative applications of metabolomics to functional genomics in a model plant, Arabidopsis thaliana, are described. The extension of integrated multi-omics analyses to non-model specialized plants, e.g., medicinal plants, is presented, including the identification of novel genes, metabolites and networks for the biosynthesis of flavonoids, alkaloids, sulfur-containing metabolites and terpenoids. Further, functional genomics studies on a variety of medicinal plants is presented. I also discuss future trends in pharmacognosy and related sciences.

  18. The functional genomic studies of curcumin.

    Science.gov (United States)

    Huminiecki, Lukasz; Horbańczuk, Jarosław; Atanasov, Atanas G

    2017-10-01

    Curcumin is a natural plant-derived compound that has attracted a lot of attention for its anti-cancer activities. Curcumin can slow proliferation of and induce apoptosis in cancer cell lines, but the precise mechanisms of these effects are not fully understood. However, many lines of evidence suggested that curcumin has a potent impact on gene expression profiles; thus, functional genomics should be the key to understanding how curcumin exerts its anti-cancer activities. Here, we review the published functional genomic studies of curcumin focusing on cancer. Typically, a cancer cell line or a grafted tumor were exposed to curcumin and profiled with microarrays, methylation assays, or RNA-seq. Crucially, these studies are in agreement that curcumin has a powerful effect on gene expression. In the majority of the studies, among differentially expressed genes we found genes involved in cell signaling, apoptosis, and the control of cell cycle. Curcumin can also induce specific methylation changes, and is a powerful regulator of the expression of microRNAs which control oncogenesis. We also reflect on how the broader technological progress in transcriptomics has been reflected on the field of curcumin. We conclude by discussing the areas where more functional genomic studies are highly desirable. Integrated OMICS approaches will clearly be the key to understanding curcumin's anticancer and chemopreventive effects. Such strategies may become a template for elucidating the mode of action of other natural products; many natural products have pleiotropic effects that are well suited for a systems-level analysis. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. Genome-wide comparative analysis of four Indian Drosophila species.

    Science.gov (United States)

    Mohanty, Sujata; Khanna, Radhika

    2017-12-01

    Comparative analysis of multiple genomes of closely or distantly related Drosophila species undoubtedly creates excitement among evolutionary biologists in exploring the genomic changes with an ecology and evolutionary perspective. We present herewith the de novo assembled whole genome sequences of four Drosophila species, D. bipectinata, D. takahashii, D. biarmipes and D. nasuta of Indian origin using Next Generation Sequencing technology on an Illumina platform along with their detailed assembly statistics. The comparative genomics analysis, e.g. gene predictions and annotations, functional and orthogroup analysis of coding sequences and genome wide SNP distribution were performed. The whole genome of Zaprionus indianus of Indian origin published earlier by us and the genome sequences of previously sequenced 12 Drosophila species available in the NCBI database were included in the analysis. The present work is a part of our ongoing genomics project of Indian Drosophila species.

  20. Canine adenovirus type 2 vector generation via I-Sce1-mediated intracellular genome release.

    Directory of Open Access Journals (Sweden)

    Sandy Ibanes

    Full Text Available When canine adenovirus type 2 (CAdV-2, or also commonly referred to as CAV-2 vectors are injected into the brain parenchyma they preferentially transduce neurons, are capable of efficient axonal transport to afferent regions, and allow transgene expression for at last >1 yr. Yet, translating these data into a user-friendly vector platform has been limited because CAV-2 vector generation is challenging. Generation of E1-deleted adenovirus vectors often requires transfection of linear DNA fragments of >30 kb containing the vector genome into an E1-transcomplementing cell line. In contrast to human adenovirus type 5 vector generation, CAV-2 vector generation is less efficient due, in part, to a reduced ability to initiate replication and poor transfectibility of canine cells with large, linear DNA fragments. To improve CAV-2 vector generation, we generated an E1-transcomplementing cell line expressing the estrogen receptor (ER fused to I-SceI, a yeast meganuclease, and plasmids containing the I-SceI recognition sites flanking the CAV-2 vector genome. Using transfection of supercoiled plasmid and intracellular genome release via 4-OH-tamoxifen-induced nuclear translocation of I-SceI, we improved CAV-2 vector titers 1,000 fold, and in turn increased the efficacy of CAV-2 vector generation.

  1. Canine Adenovirus Type 2 Vector Generation via I-Sce1-Mediated Intracellular Genome Release

    Science.gov (United States)

    Ibanes, Sandy; Kremer, Eric J.

    2013-01-01

    When canine adenovirus type 2 (CAdV-2, or also commonly referred to as CAV-2) vectors are injected into the brain parenchyma they preferentially transduce neurons, are capable of efficient axonal transport to afferent regions, and allow transgene expression for at last >1 yr. Yet, translating these data into a user-friendly vector platform has been limited because CAV-2 vector generation is challenging. Generation of E1-deleted adenovirus vectors often requires transfection of linear DNA fragments of >30 kb containing the vector genome into an E1-transcomplementing cell line. In contrast to human adenovirus type 5 vector generation, CAV-2 vector generation is less efficient due, in part, to a reduced ability to initiate replication and poor transfectibility of canine cells with large, linear DNA fragments. To improve CAV-2 vector generation, we generated an E1-transcomplementing cell line expressing the estrogen receptor (ER) fused to I-SceI, a yeast meganuclease, and plasmids containing the I-SceI recognition sites flanking the CAV-2 vector genome. Using transfection of supercoiled plasmid and intracellular genome release via 4-OH-tamoxifen-induced nuclear translocation of I-SceI, we improved CAV-2 vector titers 1,000 fold, and in turn increased the efficacy of CAV-2 vector generation. PMID:23936483

  2. Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures

    DEFF Research Database (Denmark)

    Stark, Alexander; Lin, Michael F; Kheradpour, Pouya

    2007-01-01

    Sequencing of multiple related species followed by comparative genomics analysis constitutes a powerful approach for the systematic understanding of any genome. Here, we use the genomes of 12 Drosophila species for the de novo discovery of functional elements in the fly. Each type of functional e...... individual motif instances with high confidence. We also study how discovery power scales with the divergence and number of species compared, and we provide general guidelines for comparative studies....

  3. NeisseriaBase: a specialised Neisseria genomic resource and analysis platform.

    Science.gov (United States)

    Zheng, Wenning; Mutha, Naresh V R; Heydari, Hamed; Dutta, Avirup; Siow, Cheuk Chuen; Jakubovics, Nicholas S; Wee, Wei Yee; Tan, Shi Yang; Ang, Mia Yang; Wong, Guat Jah; Choo, Siew Woh

    2016-01-01

    Background. The gram-negative Neisseria is associated with two of the most potent human epidemic diseases: meningococcal meningitis and gonorrhoea. In both cases, disease is caused by bacteria colonizing human mucosal membrane surfaces. Overall, the genus shows great diversity and genetic variation mainly due to its ability to acquire and incorporate genetic material from a diverse range of sources through horizontal gene transfer. Although a number of databases exist for the Neisseria genomes, they are mostly focused on the pathogenic species. In this present study we present the freely available NeisseriaBase, a database dedicated to the genus Neisseria encompassing the complete and draft genomes of 15 pathogenic and commensal Neisseria species. Methods. The genomic data were retrieved from National Center for Biotechnology Information (NCBI) and annotated using the RAST server which were then stored into the MySQL database. The protein-coding genes were further analyzed to obtain information such as calculation of GC content (%), predicted hydrophobicity and molecular weight (Da) using in-house Perl scripts. The web application was developed following the secure four-tier web application architecture: (1) client workstation, (2) web server, (3) application server, and (4) database server. The web interface was constructed using PHP, JavaScript, jQuery, AJAX and CSS, utilizing the model-view-controller (MVC) framework. The in-house developed bioinformatics tools implemented in NeisseraBase were developed using Python, Perl, BioPerl and R languages. Results. Currently, NeisseriaBase houses 603,500 Coding Sequences (CDSs), 16,071 RNAs and 13,119 tRNA genes from 227 Neisseria genomes. The database is equipped with interactive web interfaces. Incorporation of the JBrowse genome browser in the database enables fast and smooth browsing of Neisseria genomes. NeisseriaBase includes the standard BLAST program to facilitate homology searching, and for Virulence Factor

  4. NeisseriaBase: a specialised Neisseria genomic resource and analysis platform

    Directory of Open Access Journals (Sweden)

    Wenning Zheng

    2016-03-01

    Full Text Available Background. The gram-negative Neisseria is associated with two of the most potent human epidemic diseases: meningococcal meningitis and gonorrhoea. In both cases, disease is caused by bacteria colonizing human mucosal membrane surfaces. Overall, the genus shows great diversity and genetic variation mainly due to its ability to acquire and incorporate genetic material from a diverse range of sources through horizontal gene transfer. Although a number of databases exist for the Neisseria genomes, they are mostly focused on the pathogenic species. In this present study we present the freely available NeisseriaBase, a database dedicated to the genus Neisseria encompassing the complete and draft genomes of 15 pathogenic and commensal Neisseria species. Methods. The genomic data were retrieved from National Center for Biotechnology Information (NCBI and annotated using the RAST server which were then stored into the MySQL database. The protein-coding genes were further analyzed to obtain information such as calculation of GC content (%, predicted hydrophobicity and molecular weight (Da using in-house Perl scripts. The web application was developed following the secure four-tier web application architecture: (1 client workstation, (2 web server, (3 application server, and (4 database server. The web interface was constructed using PHP, JavaScript, jQuery, AJAX and CSS, utilizing the model-view-controller (MVC framework. The in-house developed bioinformatics tools implemented in NeisseraBase were developed using Python, Perl, BioPerl and R languages. Results. Currently, NeisseriaBase houses 603,500 Coding Sequences (CDSs, 16,071 RNAs and 13,119 tRNA genes from 227 Neisseria genomes. The database is equipped with interactive web interfaces. Incorporation of the JBrowse genome browser in the database enables fast and smooth browsing of Neisseria genomes. NeisseriaBase includes the standard BLAST program to facilitate homology searching, and for Virulence

  5. Dramatic improvement in genome assembly achieved using doubled-haploid genomes.

    Science.gov (United States)

    Zhang, Hong; Tan, Engkong; Suzuki, Yutaka; Hirose, Yusuke; Kinoshita, Shigeharu; Okano, Hideyuki; Kudoh, Jun; Shimizu, Atsushi; Saito, Kazuyoshi; Watabe, Shugo; Asakawa, Shuichi

    2014-10-27

    Improvement in de novo assembly of large genomes is still to be desired. Here, we improved draft genome sequence quality by employing doubled-haploid individuals. We sequenced wildtype and doubled-haploid Takifugu rubripes genomes, under the same conditions, using the Illumina platform and assembled contigs with SOAPdenovo2. We observed 5.4-fold and 2.6-fold improvement in the sizes of the N50 contig and scaffold of doubled-haploid individuals, respectively, compared to the wildtype, indicating that the use of a doubled-haploid genome aids in accurate genome analysis.

  6. Selective Gene Delivery for Integrating Exogenous DNA into Plastid and Mitochondrial Genomes Using Peptide-DNA Complexes.

    Science.gov (United States)

    Yoshizumi, Takeshi; Oikawa, Kazusato; Chuah, Jo-Ann; Kodama, Yutaka; Numata, Keiji

    2018-05-14

    Selective gene delivery into organellar genomes (mitochondrial and plastid genomes) has been limited because of a lack of appropriate platform technology, even though these organelles are essential for metabolite and energy production. Techniques for selective organellar modification are needed to functionally improve organelles and produce transplastomic/transmitochondrial plants. However, no method for mitochondrial genome modification has yet been established for multicellular organisms including plants. Likewise, modification of plastid genomes has been limited to a few plant species and algae. In the present study, we developed ionic complexes of fusion peptides containing organellar targeting signal and plasmid DNA for selective delivery of exogenous DNA into the plastid and mitochondrial genomes of intact plants. This is the first report of exogenous DNA being integrated into the mitochondrial genomes of not only plants, but also multicellular organisms in general. This fusion peptide-mediated gene delivery system is a breakthrough platform for both plant organellar biotechnology and gene therapy for mitochondrial diseases in animals.

  7. Improvement in the amine glass platform by bubbling method for a DNA microarray

    Directory of Open Access Journals (Sweden)

    Jee SH

    2015-10-01

    Full Text Available Seung Hyun Jee,1 Jong Won Kim,2 Ji Hyeong Lee,2 Young Soo Yoon11Department of Chemical and Biological Engineering, Gachon University, Seongnam, Gyeonggi, Republic of Korea; 2Genomics Clinical Research Institute, LabGenomics Co., Ltd., Bundang-gu, Seongnam-si, Gyeonggi-do, Republic of KoreaAbstract: A glass platform with high sensitivity for sexually transmitted diseases microarray is described here. An amino-silane-based self-assembled monolayer was coated on the surface of a glass platform using a novel bubbling method. The optimized surface of the glass platform had highly uniform surface modifications using this method, as well as improved hybridization properties with capture probes in the DNA microarray. On the basis of these results, the improved glass platform serves as a highly reliable and optimal material for the DNA microarray. Moreover, in this study, we demonstrated that our glass platform, manufactured by utilizing the bubbling method, had higher uniformity, shorter processing time, lower background signal, and higher spot signal than the platforms manufactured by the general dipping method. The DNA microarray manufactured with a glass platform prepared using bubbling method can be used as a clinical diagnostic tool. Keywords: DNA microarray, glass platform, bubbling method, self-assambled monolayer

  8. The Influence of LINE-1 and SINE Retrotransposons on Mammalian Genomes.

    Science.gov (United States)

    Richardson, Sandra R; Doucet, Aurélien J; Kopera, Huira C; Moldovan, John B; Garcia-Perez, José Luis; Moran, John V

    2015-04-01

    Transposable elements have had a profound impact on the structure and function of mammalian genomes. The retrotransposon Long INterspersed Element-1 (LINE-1 or L1), by virtue of its replicative mobilization mechanism, comprises ∼17% of the human genome. Although the vast majority of human LINE-1 sequences are inactive molecular fossils, an estimated 80-100 copies per individual retain the ability to mobilize by a process termed retrotransposition. Indeed, LINE-1 is the only active, autonomous retrotransposon in humans and its retrotransposition continues to generate both intra-individual and inter-individual genetic diversity. Here, we briefly review the types of transposable elements that reside in mammalian genomes. We will focus our discussion on LINE-1 retrotransposons and the non-autonomous Short INterspersed Elements (SINEs) that rely on the proteins encoded by LINE-1 for their mobilization. We review cases where LINE-1-mediated retrotransposition events have resulted in genetic disease and discuss how the characterization of these mutagenic insertions led to the identification of retrotransposition-competent LINE-1s in the human and mouse genomes. We then discuss how the integration of molecular genetic, biochemical, and modern genomic technologies have yielded insight into the mechanism of LINE-1 retrotransposition, the impact of LINE-1-mediated retrotransposition events on mammalian genomes, and the host cellular mechanisms that protect the genome from unabated LINE-1-mediated retrotransposition events. Throughout this review, we highlight unanswered questions in LINE-1 biology that provide exciting opportunities for future research. Clearly, much has been learned about LINE-1 and SINE biology since the publication of Mobile DNA II thirteen years ago. Future studies should continue to yield exciting discoveries about how these retrotransposons contribute to genetic diversity in mammalian genomes.

  9. ABrowse - a customizable next-generation genome browser framework

    Directory of Open Access Journals (Sweden)

    Kong Lei

    2012-01-01

    Full Text Available Abstract Background With the rapid growth of genome sequencing projects, genome browser is becoming indispensable, not only as a visualization system but also as an interactive platform to support open data access and collaborative work. Thus a customizable genome browser framework with rich functions and flexible configuration is needed to facilitate various genome research projects. Results Based on next-generation web technologies, we have developed a general-purpose genome browser framework ABrowse which provides interactive browsing experience, open data access and collaborative work support. By supporting Google-map-like smooth navigation, ABrowse offers end users highly interactive browsing experience. To facilitate further data analysis, multiple data access approaches are supported for external platforms to retrieve data from ABrowse. To promote collaborative work, an online user-space is provided for end users to create, store and share comments, annotations and landmarks. For data providers, ABrowse is highly customizable and configurable. The framework provides a set of utilities to import annotation data conveniently. To build ABrowse on existing annotation databases, data providers could specify SQL statements according to database schema. And customized pages for detailed information display of annotation entries could be easily plugged in. For developers, new drawing strategies could be integrated into ABrowse for new types of annotation data. In addition, standard web service is provided for data retrieval remotely, providing underlying machine-oriented programming interface for open data access. Conclusions ABrowse framework is valuable for end users, data providers and developers by providing rich user functions and flexible customization approaches. The source code is published under GNU Lesser General Public License v3.0 and is accessible at http://www.abrowse.org/. To demonstrate all the features of ABrowse, a live demo for

  10. ABrowse--a customizable next-generation genome browser framework.

    Science.gov (United States)

    Kong, Lei; Wang, Jun; Zhao, Shuqi; Gu, Xiaocheng; Luo, Jingchu; Gao, Ge

    2012-01-05

    With the rapid growth of genome sequencing projects, genome browser is becoming indispensable, not only as a visualization system but also as an interactive platform to support open data access and collaborative work. Thus a customizable genome browser framework with rich functions and flexible configuration is needed to facilitate various genome research projects. Based on next-generation web technologies, we have developed a general-purpose genome browser framework ABrowse which provides interactive browsing experience, open data access and collaborative work support. By supporting Google-map-like smooth navigation, ABrowse offers end users highly interactive browsing experience. To facilitate further data analysis, multiple data access approaches are supported for external platforms to retrieve data from ABrowse. To promote collaborative work, an online user-space is provided for end users to create, store and share comments, annotations and landmarks. For data providers, ABrowse is highly customizable and configurable. The framework provides a set of utilities to import annotation data conveniently. To build ABrowse on existing annotation databases, data providers could specify SQL statements according to database schema. And customized pages for detailed information display of annotation entries could be easily plugged in. For developers, new drawing strategies could be integrated into ABrowse for new types of annotation data. In addition, standard web service is provided for data retrieval remotely, providing underlying machine-oriented programming interface for open data access. ABrowse framework is valuable for end users, data providers and developers by providing rich user functions and flexible customization approaches. The source code is published under GNU Lesser General Public License v3.0 and is accessible at http://www.abrowse.org/. To demonstrate all the features of ABrowse, a live demo for Arabidopsis thaliana genome has been built at http://arabidopsis.cbi.edu.cn/.

  11. ABrowse - a customizable next-generation genome browser framework

    Science.gov (United States)

    2012-01-01

    Background With the rapid growth of genome sequencing projects, genome browser is becoming indispensable, not only as a visualization system but also as an interactive platform to support open data access and collaborative work. Thus a customizable genome browser framework with rich functions and flexible configuration is needed to facilitate various genome research projects. Results Based on next-generation web technologies, we have developed a general-purpose genome browser framework ABrowse which provides interactive browsing experience, open data access and collaborative work support. By supporting Google-map-like smooth navigation, ABrowse offers end users highly interactive browsing experience. To facilitate further data analysis, multiple data access approaches are supported for external platforms to retrieve data from ABrowse. To promote collaborative work, an online user-space is provided for end users to create, store and share comments, annotations and landmarks. For data providers, ABrowse is highly customizable and configurable. The framework provides a set of utilities to import annotation data conveniently. To build ABrowse on existing annotation databases, data providers could specify SQL statements according to database schema. And customized pages for detailed information display of annotation entries could be easily plugged in. For developers, new drawing strategies could be integrated into ABrowse for new types of annotation data. In addition, standard web service is provided for data retrieval remotely, providing underlying machine-oriented programming interface for open data access. Conclusions ABrowse framework is valuable for end users, data providers and developers by providing rich user functions and flexible customization approaches. The source code is published under GNU Lesser General Public License v3.0 and is accessible at http://www.abrowse.org/. To demonstrate all the features of ABrowse, a live demo for Arabidopsis thaliana genome

  12. NSD1 mutations generate a genome-wide DNA methylation signature.

    LENUS (Irish Health Repository)

    Choufani, S

    2015-12-22

    Sotos syndrome (SS) represents an important human model system for the study of epigenetic regulation; it is an overgrowth\\/intellectual disability syndrome caused by mutations in a histone methyltransferase, NSD1. As layered epigenetic modifications are often interdependent, we propose that pathogenic NSD1 mutations have a genome-wide impact on the most stable epigenetic mark, DNA methylation (DNAm). By interrogating DNAm in SS patients, we identify a genome-wide, highly significant NSD1(+\\/-)-specific signature that differentiates pathogenic NSD1 mutations from controls, benign NSD1 variants and the clinically overlapping Weaver syndrome. Validation studies of independent cohorts of SS and controls assigned 100% of these samples correctly. This highly specific and sensitive NSD1(+\\/-) signature encompasses genes that function in cellular morphogenesis and neuronal differentiation, reflecting cardinal features of the SS phenotype. The identification of SS-specific genome-wide DNAm alterations will facilitate both the elucidation of the molecular pathophysiology of SS and the development of improved diagnostic testing.

  13. CRISPR/Cas9 Platforms for Genome Editing in Plants: Developments and Applications.

    Science.gov (United States)

    Ma, Xingliang; Zhu, Qinlong; Chen, Yuanling; Liu, Yao-Guang

    2016-07-06

    The clustered regularly interspaced short palindromic repeat (CRISPR)-associated protein9 (Cas9) genome editing system (CRISPR/Cas9) is adapted from the prokaryotic type II adaptive immunity system. The CRISPR/Cas9 tool surpasses other programmable nucleases, such as ZFNs and TALENs, for its simplicity and high efficiency. Various plant-specific CRISPR/Cas9 vector systems have been established for adaption of this technology to many plant species. In this review, we present an overview of current advances on applications of this technology in plants, emphasizing general considerations for establishment of CRISPR/Cas9 vector platforms, strategies for multiplex editing, methods for analyzing the induced mutations, factors affecting editing efficiency and specificity, and features of the induced mutations and applications of the CRISPR/Cas9 system in plants. In addition, we provide a perspective on the challenges of CRISPR/Cas9 technology and its significance for basic plant research and crop genetic improvement. Copyright © 2016 The Author. Published by Elsevier Inc. All rights reserved.

  14. Experimental-confirmation and functional-annotation of predicted proteins in the chicken genome

    Directory of Open Access Journals (Sweden)

    McCarthy Fiona M

    2007-11-01

    Full Text Available Abstract Background The chicken genome was sequenced because of its phylogenetic position as a non-mammalian vertebrate, its use as a biomedical model especially to study embryology and development, its role as a source of human disease organisms and its importance as the major source of animal derived food protein. However, genomic sequence data is, in itself, of limited value; generally it is not equivalent to understanding biological function. The benefit of having a genome sequence is that it provides a basis for functional genomics. However, the sequence data currently available is poorly structurally and functionally annotated and many genes do not have standard nomenclature assigned. Results We analysed eight chicken tissues and improved the chicken genome structural annotation by providing experimental support for the in vivo expression of 7,809 computationally predicted proteins, including 30 chicken proteins that were only electronically predicted or hypothetical translations in human. To improve functional annotation (based on Gene Ontology, we mapped these identified proteins to their human and mouse orthologs and used this orthology to transfer Gene Ontology (GO functional annotations to the chicken proteins. The 8,213 orthology-based GO annotations that we produced represent an 8% increase in currently available chicken GO annotations. Orthologous chicken products were also assigned standardized nomenclature based on current chicken nomenclature guidelines. Conclusion We demonstrate the utility of high-throughput expression proteomics for rapid experimental structural annotation of a newly sequenced eukaryote genome. These experimentally-supported predicted proteins were further annotated by assigning the proteins with standardized nomenclature and functional annotation. This method is widely applicable to a diverse range of species. Moreover, information from one genome can be used to improve the annotation of other genomes and

  15. The FUN of identifying gene function in bacterial pathogens; insights from Salmonella functional genomics.

    Science.gov (United States)

    Hammarlöf, Disa L; Canals, Rocío; Hinton, Jay C D

    2013-10-01

    The availability of thousands of genome sequences of bacterial pathogens poses a particular challenge because each genome contains hundreds of genes of unknown function (FUN). How can we easily discover which FUN genes encode important virulence factors? One solution is to combine two different functional genomic approaches. First, transcriptomics identifies bacterial FUN genes that show differential expression during the process of mammalian infection. Second, global mutagenesis identifies individual FUN genes that the pathogen requires to cause disease. The intersection of these datasets can reveal a small set of candidate genes most likely to encode novel virulence attributes. We demonstrate this approach with the Salmonella infection model, and propose that a similar strategy could be used for other bacterial pathogens. Copyright © 2013 Elsevier Ltd. All rights reserved.

  16. The functional genome of CA1 and CA3 neurons under native conditions and in response to ischemia

    Directory of Open Access Journals (Sweden)

    Rossner Moritz

    2007-10-01

    Full Text Available Abstract Background The different physiological repertoire of CA3 and CA1 neurons in the hippocampus, as well as their differing behaviour after noxious stimuli are ultimately based upon differences in the expressed genome. We have compared CA3 and CA1 gene expression in the uninjured brain, and after cerebral ischemia using laser microdissection (LMD, RNA amplification, and array hybridization. Results Profiling in CA1 vs. CA3 under normoxic conditions detected more than 1000 differentially expressed genes that belong to different, physiologically relevant gene ontology groups in both cell types. The comparison of each region under normoxic and ischemic conditions revealed more than 5000 ischemia-regulated genes for each individual cell type. Surprisingly, there was a high co-regulation in both regions. In the ischemic state, only about 100 genes were found to be differentially expressed in CA3 and CA1. The majority of these genes were also different in the native state. A minority of interesting genes (e.g. inhibinbetaA displayed divergent expression preference under native and ischemic conditions with partially opposing directions of regulation in both cell types. Conclusion The differences found in two morphologically very similar cell types situated next to each other in the CNS are large providing a rational basis for physiological differences. Unexpectedly, the genomic response to ischemia is highly similar in these two neuron types, leading to a substantial attenuation of functional genomic differences in these two cell types. Also, the majority of changes that exist in the ischemic state are not generated de novo by the ischemic stimulus, but are preexistant from the genomic repertoire in the native situation. This unexpected influence of a strong noxious stimulus on cell-specific gene expression differences can be explained by the activation of a cell-type independent conserved gene-expression program. Our data generate both novel

  17. Functional interrogation of non-coding DNA through CRISPR genome editing.

    Science.gov (United States)

    Canver, Matthew C; Bauer, Daniel E; Orkin, Stuart H

    2017-05-15

    Methodologies to interrogate non-coding regions have lagged behind coding regions despite comprising the vast majority of the genome. However, the rapid evolution of clustered regularly interspaced short palindromic repeats (CRISPR)-based genome editing has provided a multitude of novel techniques for laboratory investigation including significant contributions to the toolbox for studying non-coding DNA. CRISPR-mediated loss-of-function strategies rely on direct disruption of the underlying sequence or repression of transcription without modifying the targeted DNA sequence. CRISPR-mediated gain-of-function approaches similarly benefit from methods to alter the targeted sequence through integration of customized sequence into the genome as well as methods to activate transcription. Here we review CRISPR-based loss- and gain-of-function techniques for the interrogation of non-coding DNA. Copyright © 2017 Elsevier Inc. All rights reserved.

  18. Application and Mechanics Analysis of Multi-Function Construction Platforms in Prefabricated-Concrete Construction

    Science.gov (United States)

    Wang, Meihua; Li, Rongshuai; Zhang, Wenze

    2017-11-01

    Multi-function construction platforms (MCPs) as an “old construction technology, new application” of the building facade construction equipment, its efforts to reduce labour intensity, improve labour productivity, ensure construction safety, shorten the duration of construction and other aspects of the effect are significant. In this study, the functional analysis of the multi-function construction platforms is carried out in the construction of the assembly building. Based on the general finite element software ANSYS, the static calculation and dynamic characteristics analysis of the MCPs structure are analysed, the simplified finite element model is constructed, and the selection of the unit, the processing and solution of boundary are under discussion and research. The maximum deformation value, the maximum stress value and the structural dynamic characteristic model are obtained. The dangerous parts of the platform structure are analysed, too. Multiple types of MCPs under engineering construction conditions are calculated, so as to put forward the rationalization suggestions for engineering application of the MCPs.

  19. Cloning, functional characterization and genomic organization of 1,8-cineole synthases from Lavandula.

    Science.gov (United States)

    Demissie, Zerihun A; Cella, Monica A; Sarker, Lukman S; Thompson, Travis J; Rheault, Mark R; Mahmoud, Soheil S

    2012-07-01

    Several members of the genus Lavandula produce valuable essential oils (EOs) that are primarily constituted of the low molecular weight isoprenoids, particularly monoterpenes. We isolated over 8,000 ESTs from the glandular trichomes of L. x intermedia flowers (where bulk of the EO is synthesized) to facilitate the discovery of genes that control the biosynthesis of EO constituents. The expression profile of these ESTs in L. x intermedia and its parents L. angustifolia and L. latifolia was established using microarrays. The resulting data highlighted a differentially expressed, previously uncharacterized cDNA with strong homology to known 1,8-cineole synthase (CINS) genes. The ORF, excluding the transit peptide, of this cDNA was expressed in E. coli, purified by Ni-NTA agarose affinity chromatography and functionally characterized in vitro. The ca. 63 kDa bacterially produced recombinant protein, designated L. x intermedia CINS (LiCINS), converted geranyl diphosphate (the linear monoterpene precursor) primarily to 1,8-cineole with K ( m ) and k ( cat ) values of 5.75 μM and 8.8 × 10(-3) s(-1), respectively. The genomic DNA of CINS in the studied Lavandula species had identical exon-intron architecture and coding sequences, except for a single polymorphic nucleotide in the L. angustifolia ortholog which did not alter protein function. Additional nucleotide variations restricted to L. angustifolia introns were also observed, suggesting that LiCINS was most likely inherited from L. latifolia. The LiCINS mRNA levels paralleled the 1,8-cineole content in mature flowers of the three lavender species, and in developmental stages of L. x intermedia inflorescence indicating that the production of 1,8 cineole in Lavandula is most likely controlled through transcriptional regulation of LiCINS.

  20. Translational and functional oncogenomics. From cancer-oriented genomic screenings to new diagnostic tools and improved cancer treatment.

    Science.gov (United States)

    Medico, Enzo

    2008-01-01

    We present here an experimental pipeline for the systematic identification and functional characterization of genes with high potential diagnostic and therapeutic value in human cancer. Complementary competences and resources have been brought together in the TRANSFOG Consortium to reach the following integrated research objectives: 1) execution of cancer-oriented genomic screenings on tumor tissues and experimental models and merging of the results to generate a prioritized panel of candidate genes involved in cancer progression and metastasis; 2) setup of systems for high-throughput delivery of full-length cDNAs, for gain-of-function analysis of the prioritized candidate genes; 3) collection of vectors and oligonucleotides for systematic, RNA interference-mediated down-regulation of the candidate genes; 4) adaptation of existing cell-based and model organism assays to a systematic analysis of gain and loss of function of the candidate genes, for identification and preliminary validation of novel potential therapeutic targets; 5) proteomic analysis of signal transduction and protein-protein interaction for better dissection of aberrant cancer signaling pathways; 6) validation of the diagnostic potential of the identified cancer genes towards the clinical use of diagnostic molecular signatures; 7) generation of a shared informatics platform for data handling and gene functional annotation. The results of the first three years of activity of the TRANSFOG Consortium are also briefly presented and discussed.

  1. Genome-Wide Association and Functional Follow-Up Reveals New Loci for Kidney Function

    Science.gov (United States)

    Fuchsberger, Christian; Olden, Matthias; Chen, Ming-Huei; Tin, Adrienne; Taliun, Daniel; Li, Man; Gao, Xiaoyi; Gorski, Mathias; Yang, Qiong; Hundertmark, Claudia; Foster, Meredith C.; O'Seaghdha, Conall M.; Glazer, Nicole; Isaacs, Aaron; Liu, Ching-Ti; Smith, Albert V.; O'Connell, Jeffrey R.; Struchalin, Maksim; Tanaka, Toshiko; Li, Guo; Johnson, Andrew D.; Gierman, Hinco J.; Feitosa, Mary; Hwang, Shih-Jen; Atkinson, Elizabeth J.; Lohman, Kurt; Cornelis, Marilyn C.; Johansson, Åsa; Tönjes, Anke; Dehghan, Abbas; Chouraki, Vincent; Holliday, Elizabeth G.; Sorice, Rossella; Kutalik, Zoltan; Lehtimäki, Terho; Esko, Tõnu; Deshmukh, Harshal; Ulivi, Sheila; Chu, Audrey Y.; Murgia, Federico; Trompet, Stella; Imboden, Medea; Kollerits, Barbara; Pistis, Giorgio; Harris, Tamara B.; Launer, Lenore J.; Aspelund, Thor; Eiriksdottir, Gudny; Mitchell, Braxton D.; Boerwinkle, Eric; Schmidt, Helena; Cavalieri, Margherita; Rao, Madhumathi; Hu, Frank B.; Demirkan, Ayse; Oostra, Ben A.; de Andrade, Mariza; Turner, Stephen T.; Ding, Jingzhong; Andrews, Jeanette S.; Freedman, Barry I.; Koenig, Wolfgang; Illig, Thomas; Döring, Angela; Wichmann, H.-Erich; Kolcic, Ivana; Zemunik, Tatijana; Boban, Mladen; Minelli, Cosetta; Wheeler, Heather E.; Igl, Wilmar; Zaboli, Ghazal; Wild, Sarah H.; Wright, Alan F.; Campbell, Harry; Ellinghaus, David; Nöthlings, Ute; Jacobs, Gunnar; Biffar, Reiner; Endlich, Karlhans; Ernst, Florian; Homuth, Georg; Kroemer, Heyo K.; Nauck, Matthias; Stracke, Sylvia; Völker, Uwe; Völzke, Henry; Kovacs, Peter; Stumvoll, Michael; Mägi, Reedik; Hofman, Albert; Uitterlinden, Andre G.; Rivadeneira, Fernando; Aulchenko, Yurii S.; Polasek, Ozren; Hastie, Nick; Vitart, Veronique; Helmer, Catherine; Wang, Jie Jin; Ruggiero, Daniela; Bergmann, Sven; Kähönen, Mika; Viikari, Jorma; Nikopensius, Tiit; Province, Michael; Ketkar, Shamika; Colhoun, Helen; Doney, Alex; Robino, Antonietta; Giulianini, Franco; Krämer, Bernhard K.; Portas, Laura; Ford, Ian; Buckley, Brendan M.; Adam, Martin; Thun, Gian-Andri; Paulweber, Bernhard; Haun, Margot; Sala, Cinzia; Metzger, Marie; Mitchell, Paul; Ciullo, Marina; Kim, Stuart K.; Vollenweider, Peter; Raitakari, Olli; Metspalu, Andres; Palmer, Colin; Gasparini, Paolo; Pirastu, Mario; Jukema, J. Wouter; Probst-Hensch, Nicole M.; Kronenberg, Florian; Toniolo, Daniela; Gudnason, Vilmundur; Shuldiner, Alan R.; Coresh, Josef; Schmidt, Reinhold; Ferrucci, Luigi; Siscovick, David S.; van Duijn, Cornelia M.; Borecki, Ingrid; Kardia, Sharon L. R.; Liu, Yongmei; Curhan, Gary C.; Rudan, Igor; Gyllensten, Ulf; Wilson, James F.; Franke, Andre; Pramstaller, Peter P.; Rettig, Rainer; Prokopenko, Inga; Witteman, Jacqueline C. M.; Hayward, Caroline; Ridker, Paul; Parsa, Afshin; Bochud, Murielle; Heid, Iris M.; Goessling, Wolfram; Chasman, Daniel I.; Kao, W. H. Linda; Fox, Caroline S.

    2012-01-01

    Chronic kidney disease (CKD) is an important public health problem with a genetic component. We performed genome-wide association studies in up to 130,600 European ancestry participants overall, and stratified for key CKD risk factors. We uncovered 6 new loci in association with estimated glomerular filtration rate (eGFR), the primary clinical measure of CKD, in or near MPPED2, DDX1, SLC47A1, CDK12, CASP9, and INO80. Morpholino knockdown of mpped2 and casp9 in zebrafish embryos revealed podocyte and tubular abnormalities with altered dextran clearance, suggesting a role for these genes in renal function. By providing new insights into genes that regulate renal function, these results could further our understanding of the pathogenesis of CKD. PMID:22479191

  2. Genome-wide association and functional follow-up reveals new loci for kidney function.

    Science.gov (United States)

    Pattaro, Cristian; Köttgen, Anna; Teumer, Alexander; Garnaas, Maija; Böger, Carsten A; Fuchsberger, Christian; Olden, Matthias; Chen, Ming-Huei; Tin, Adrienne; Taliun, Daniel; Li, Man; Gao, Xiaoyi; Gorski, Mathias; Yang, Qiong; Hundertmark, Claudia; Foster, Meredith C; O'Seaghdha, Conall M; Glazer, Nicole; Isaacs, Aaron; Liu, Ching-Ti; Smith, Albert V; O'Connell, Jeffrey R; Struchalin, Maksim; Tanaka, Toshiko; Li, Guo; Johnson, Andrew D; Gierman, Hinco J; Feitosa, Mary; Hwang, Shih-Jen; Atkinson, Elizabeth J; Lohman, Kurt; Cornelis, Marilyn C; Johansson, Åsa; Tönjes, Anke; Dehghan, Abbas; Chouraki, Vincent; Holliday, Elizabeth G; Sorice, Rossella; Kutalik, Zoltan; Lehtimäki, Terho; Esko, Tõnu; Deshmukh, Harshal; Ulivi, Sheila; Chu, Audrey Y; Murgia, Federico; Trompet, Stella; Imboden, Medea; Kollerits, Barbara; Pistis, Giorgio; Harris, Tamara B; Launer, Lenore J; Aspelund, Thor; Eiriksdottir, Gudny; Mitchell, Braxton D; Boerwinkle, Eric; Schmidt, Helena; Cavalieri, Margherita; Rao, Madhumathi; Hu, Frank B; Demirkan, Ayse; Oostra, Ben A; de Andrade, Mariza; Turner, Stephen T; Ding, Jingzhong; Andrews, Jeanette S; Freedman, Barry I; Koenig, Wolfgang; Illig, Thomas; Döring, Angela; Wichmann, H-Erich; Kolcic, Ivana; Zemunik, Tatijana; Boban, Mladen; Minelli, Cosetta; Wheeler, Heather E; Igl, Wilmar; Zaboli, Ghazal; Wild, Sarah H; Wright, Alan F; Campbell, Harry; Ellinghaus, David; Nöthlings, Ute; Jacobs, Gunnar; Biffar, Reiner; Endlich, Karlhans; Ernst, Florian; Homuth, Georg; Kroemer, Heyo K; Nauck, Matthias; Stracke, Sylvia; Völker, Uwe; Völzke, Henry; Kovacs, Peter; Stumvoll, Michael; Mägi, Reedik; Hofman, Albert; Uitterlinden, Andre G; Rivadeneira, Fernando; Aulchenko, Yurii S; Polasek, Ozren; Hastie, Nick; Vitart, Veronique; Helmer, Catherine; Wang, Jie Jin; Ruggiero, Daniela; Bergmann, Sven; Kähönen, Mika; Viikari, Jorma; Nikopensius, Tiit; Province, Michael; Ketkar, Shamika; Colhoun, Helen; Doney, Alex; Robino, Antonietta; Giulianini, Franco; Krämer, Bernhard K; Portas, Laura; Ford, Ian; Buckley, Brendan M; Adam, Martin; Thun, Gian-Andri; Paulweber, Bernhard; Haun, Margot; Sala, Cinzia; Metzger, Marie; Mitchell, Paul; Ciullo, Marina; Kim, Stuart K; Vollenweider, Peter; Raitakari, Olli; Metspalu, Andres; Palmer, Colin; Gasparini, Paolo; Pirastu, Mario; Jukema, J Wouter; Probst-Hensch, Nicole M; Kronenberg, Florian; Toniolo, Daniela; Gudnason, Vilmundur; Shuldiner, Alan R; Coresh, Josef; Schmidt, Reinhold; Ferrucci, Luigi; Siscovick, David S; van Duijn, Cornelia M; Borecki, Ingrid; Kardia, Sharon L R; Liu, Yongmei; Curhan, Gary C; Rudan, Igor; Gyllensten, Ulf; Wilson, James F; Franke, Andre; Pramstaller, Peter P; Rettig, Rainer; Prokopenko, Inga; Witteman, Jacqueline C M; Hayward, Caroline; Ridker, Paul; Parsa, Afshin; Bochud, Murielle; Heid, Iris M; Goessling, Wolfram; Chasman, Daniel I; Kao, W H Linda; Fox, Caroline S

    2012-01-01

    Chronic kidney disease (CKD) is an important public health problem with a genetic component. We performed genome-wide association studies in up to 130,600 European ancestry participants overall, and stratified for key CKD risk factors. We uncovered 6 new loci in association with estimated glomerular filtration rate (eGFR), the primary clinical measure of CKD, in or near MPPED2, DDX1, SLC47A1, CDK12, CASP9, and INO80. Morpholino knockdown of mpped2 and casp9 in zebrafish embryos revealed podocyte and tubular abnormalities with altered dextran clearance, suggesting a role for these genes in renal function. By providing new insights into genes that regulate renal function, these results could further our understanding of the pathogenesis of CKD.

  3. Genome-wide association and functional follow-up reveals new loci for kidney function.

    Directory of Open Access Journals (Sweden)

    Cristian Pattaro

    Full Text Available Chronic kidney disease (CKD is an important public health problem with a genetic component. We performed genome-wide association studies in up to 130,600 European ancestry participants overall, and stratified for key CKD risk factors. We uncovered 6 new loci in association with estimated glomerular filtration rate (eGFR, the primary clinical measure of CKD, in or near MPPED2, DDX1, SLC47A1, CDK12, CASP9, and INO80. Morpholino knockdown of mpped2 and casp9 in zebrafish embryos revealed podocyte and tubular abnormalities with altered dextran clearance, suggesting a role for these genes in renal function. By providing new insights into genes that regulate renal function, these results could further our understanding of the pathogenesis of CKD.

  4. Functional genomics in forage and turf - present status and future ...

    African Journals Online (AJOL)

    The combination of bioinformatics and genomics will enhance our understanding ... This review focuses on recent advances and applications of functional genomics for large-scale EST projects, global gene expression analyses, proteomics, and ... ESTs, microarray, proteomics, metabolomics, Medicago truncatula, legume.

  5. Using functional genomics to study PINK1 and metabolic physiology

    DEFF Research Database (Denmark)

    Scheele, Camilla; Larsson, Ola; Timmons, James A

    2009-01-01

    Genome sequencing projects have provided the substrate for an unimaginable number of biological experiments. Further, genomic technologies such as microarrays and quantitative and exquisitely sensitive techniques such as real-time quantitative polymerase chain reaction have made it possible to re...... to be simpler than the in vivo mammalian tissue and thus the methods discussed largely apply to this cell biology phase. We apologize for not referring to all relevant publications and for any technical considerations we have also failed to factor into our discussion....

  6. Functional Analysis of Shewanella, a cross genome comparison.

    Energy Technology Data Exchange (ETDEWEB)

    Serres, Margrethe H.

    2009-05-15

    The bacterial genus Shewanella includes a group of highly versatile organisms that have successfully adapted to life in many environments ranging from aquatic (fresh and marine) to sedimentary (lake and marine sediments, subsurface sediments, sea vent). A unique respiratory capability of the Shewanellas, initially observed for Shewanella oneidensis MR-1, is the ability to use metals and metalloids, including radioactive compounds, as electron acceptors. Members of the Shewanella genus have also been shown to degrade environmental pollutants i.e. halogenated compounds, making this group highly applicable for the DOE mission. S. oneidensis MR-1 has in addition been found to utilize a diverse set of nutrients and to have a large set of genes dedicated to regulation and to sensing of the environment. The sequencing of the S. oneidensis MR-1 genome facilitated experimental and bioinformatics analyses by a group of collaborating researchers, the Shewanella Federation. Through the joint effort and with support from Department of Energy S. oneidensis MR-1 has become a model organism of study. Our work has been a functional analysis of S. oneidensis MR-1, both by itself and as part of a comparative study. We have improved the annotation of gene products, assigned metabolic functions, and analyzed protein families present in S. oneidensis MR-1. The data has been applied to analysis of experimental data (i.e. gene expression, proteome) generated for S. oneidensis MR-1. Further, this work has formed the basis for a comparative study of over 20 members of the Shewanella genus. The species and strains selected for genome sequencing represented an evolutionary gradient of DNA relatedness, ranging from close to intermediate, and to distant. The organisms selected have also adapted to a variety of ecological niches. Through our work we have been able to detect and interpret genome similarities and differences between members of the genus. We have in this way contributed to the

  7. ScreenBEAM: a novel meta-analysis algorithm for functional genomics screens via Bayesian hierarchical modeling | Office of Cancer Genomics

    Science.gov (United States)

    Functional genomics (FG) screens, using RNAi or CRISPR technology, have become a standard tool for systematic, genome-wide loss-of-function studies for therapeutic target discovery. As in many large-scale assays, however, off-target effects, variable reagents' potency and experimental noise must be accounted for appropriately control for false positives.

  8. A Rad53 independent function of Rad9 becomes crucial for genome maintenance in the absence of the Recq helicase Sgs1.

    Directory of Open Access Journals (Sweden)

    Ida Nielsen

    Full Text Available The conserved family of RecQ DNA helicases consists of caretaker tumour suppressors, that defend genome integrity by acting on several pathways of DNA repair that maintain genome stability. In budding yeast, Sgs1 is the sole RecQ helicase and it has been implicated in checkpoint responses, replisome stability and dissolution of double Holliday junctions during homologous recombination. In this study we investigate a possible genetic interaction between SGS1 and RAD9 in the cellular response to methyl methane sulphonate (MMS induced damage and compare this with the genetic interaction between SGS1 and RAD24. The Rad9 protein, an adaptor for effector kinase activation, plays well-characterized roles in the DNA damage checkpoint response, whereas Rad24 is characterized as a sensor protein also in the DNA damage checkpoint response. Here we unveil novel insights into the cellular response to MMS-induced damage. Specifically, we show a strong synergistic functionality between SGS1 and RAD9 for recovery from MMS induced damage and for suppression of gross chromosomal rearrangements, which is not the case for SGS1 and RAD24. Intriguingly, it is a Rad53 independent function of Rad9, which becomes crucial for genome maintenance in the absence of Sgs1. Despite this, our dissection of the MMS checkpoint response reveals parallel, but unequal pathways for Rad53 activation and highlights significant differences between MMS- and hydroxyurea (HU-induced checkpoint responses with relation to the requirement of the Sgs1 interacting partner Topoisomerase III (Top3. Thus, whereas earlier studies have documented a Top3-independent role of Sgs1 for an HU-induced checkpoint response, we show here that upon MMS treatment, Sgs1 and Top3 together define a minor but parallel pathway to that of Rad9.

  9. Millstone: software for multiplex microbial genome analysis and engineering.

    Science.gov (United States)

    Goodman, Daniel B; Kuznetsov, Gleb; Lajoie, Marc J; Ahern, Brian W; Napolitano, Michael G; Chen, Kevin Y; Chen, Changping; Church, George M

    2017-05-25

    Inexpensive DNA sequencing and advances in genome editing have made computational analysis a major rate-limiting step in adaptive laboratory evolution and microbial genome engineering. We describe Millstone, a web-based platform that automates genotype comparison and visualization for projects with up to hundreds of genomic samples. To enable iterative genome engineering, Millstone allows users to design oligonucleotide libraries and create successive versions of reference genomes. Millstone is open source and easily deployable to a cloud platform, local cluster, or desktop, making it a scalable solution for any lab.

  10. Meta-analyses of genome-wide association studies identify multiple loci associated with pulmonary function

    NARCIS (Netherlands)

    D.B. Hancock (Dana); M. Eijgelsheim (Mark); J.B. Wilk (Jemma); S.A. Gharib (Sina); L.R. Loehr (Laura); K. Marciante (Kristin); N. Franceschini (Nora); Y.M.T.A. van Durme; T.H. Chen; R.G. Barr (Graham); M.B. Schabath (Matthew); D.J. Couper (David); G.G. Brusselle (Guy); B.M. Psaty (Bruce); P. Tikka-Kleemola (Päivi); J.I. Rotter (Jerome); A.G. Uitterlinden (André); A. Hofman (Albert); N.M. Punjabi (Naresh); F. Rivadeneira Ramirez (Fernando); A.C. Morrison (Alanna); P.L. Enright (Paul); K.E. North (Kari); S.R. Heckbert (Susan); T. Lumley (Thomas); B.H.Ch. Stricker (Bruno); G.T. O'Connor (George); S.J. London (Stephanie)

    2010-01-01

    textabstractSpirometric measures of lung function are heritable traits that reflect respiratory health and predict morbidity and mortality. We meta-analyzed genome-wide association studies for two clinically important lung-function measures: forced expiratory volume in the first second (FEV1) and

  11. Comparison of the Agilent, ROMA/NimbleGen and Illumina platforms for classification of copy number alterations in human breast tumors

    Directory of Open Access Journals (Sweden)

    Naume B

    2008-08-01

    Full Text Available Abstract Background Microarray Comparative Genomic Hybridization (array CGH provides a means to examine DNA copy number aberrations. Various platforms, brands and underlying technologies are available, facing the user with many choices regarding platform sensitivity and number, localization, and density distribution of probes. Results We evaluate three different platforms presenting different nature and arrangement of the probes: The Agilent Human Genome CGH Microarray 44 k, the ROMA/NimbleGen Representational Oligonucleotide Microarray 82 k, and the Illumina Human-1 Genotyping 109 k BeadChip, with Agilent being gene oriented, ROMA/NimbleGen being genome oriented, and Illumina being genotyping oriented. We investigated copy number changes in 20 human breast tumor samples representing different gene expression subclasses, using a suite of graphical and statistical methods designed to work across platforms. Despite substantial differences in the composition and spatial distribution of probes, the comparison revealed high overall concordance. Notably however, some short amplifications and deletions of potential biological importance were not detected by all platforms. Both correlation and cluster analysis indicate a somewhat higher similarity between ROMA/NimbleGen and Illumina than between Agilent and the other two platforms. The programs developed for the analysis are available from http://www.ifi.uio.no/bioinf/Projects/. Conclusion We conclude that platforms based on different technology principles reveal similar aberration patterns, although we observed some unique amplification or deletion peaks at various locations, only detected by one of the platforms. The correct platform choice for a particular study is dependent on whether the appointed research intention is gene, genome, or genotype oriented.

  12. The Use of Functional Genomics in Conjunction with Metabolomics for Mycobacterium tuberculosis Research

    Science.gov (United States)

    Swanepoel, Conrad C.

    2014-01-01

    Tuberculosis (TB), caused by Mycobacterium tuberculosis, is a fatal infectious disease, resulting in 1.4 million deaths globally per annum. Over the past three decades, genomic studies have been conducted in an attempt to elucidate the functionality of the genome of the pathogen. However, many aspects of this complex genome remain largely unexplored, as approaches like genomics, proteomics, and transcriptomics have failed to characterize them successfully. In turn, metabolomics, which is relatively new to the “omics” revolution, has shown great potential for investigating biological systems or their modifications. Furthermore, when these data are interpreted in combination with previously acquired genomics, proteomics and transcriptomics data, using what is termed a systems biology approach, a more holistic understanding of these systems can be achieved. In this review we discuss how metabolomics has contributed so far to characterizing TB, with emphasis on the resulting improved elucidation of M. tuberculosis in terms of (1) metabolism, (2) growth and replication, (3) pathogenicity, and (4) drug resistance, from the perspective of systems biology. PMID:24771957

  13. The Use of Functional Genomics in Conjunction with Metabolomics for Mycobacterium tuberculosis Research

    Directory of Open Access Journals (Sweden)

    Conrad C. Swanepoel

    2014-01-01

    Full Text Available Tuberculosis (TB, caused by Mycobacterium tuberculosis, is a fatal infectious disease, resulting in 1.4 million deaths globally per annum. Over the past three decades, genomic studies have been conducted in an attempt to elucidate the functionality of the genome of the pathogen. However, many aspects of this complex genome remain largely unexplored, as approaches like genomics, proteomics, and transcriptomics have failed to characterize them successfully. In turn, metabolomics, which is relatively new to the “omics” revolution, has shown great potential for investigating biological systems or their modifications. Furthermore, when these data are interpreted in combination with previously acquired genomics, proteomics and transcriptomics data, using what is termed a systems biology approach, a more holistic understanding of these systems can be achieved. In this review we discuss how metabolomics has contributed so far to characterizing TB, with emphasis on the resulting improved elucidation of M. tuberculosis in terms of (1 metabolism, (2 growth and replication, (3 pathogenicity, and (4 drug resistance, from the perspective of systems biology.

  14. [CRISPR/Cas system for genome editing in pluripotent stem cells].

    Science.gov (United States)

    Vasil'eva, E A; Melino, D; Barlev, N A

    2015-01-01

    Genome editing systems based on site-specific nucleases became very popular for genome editing in modern bioengineering. Human pluripotent stem cells provide a unique platform for genes function study, disease modeling, and drugs testing. Consequently, technology for fast, accurate and well controlled genome manipulation is required. CRISPR/Cas (clustered regularly interspaced short palindromic repeat/CRISPR-associated) system could be employed for these purposes. This system is based on site-specific programmable nuclease Cas9. Numerous advantages of the CRISPR/Cas system and its successful application to human stem cells provide wide opportunities for genome therapy and regeneration medicine. In this publication, we describe and compare the main genome editing systems based on site-specific programmable nucleases and discuss opportunities and perspectives of the CRISPR/Cas system for application to pluripotent stem cells.

  15. Synthetic Biology to Engineer Bacteriophage Genomes.

    Science.gov (United States)

    Rita Costa, Ana; Milho, Catarina; Azeredo, Joana; Pires, Diana Priscila

    2018-01-01

    Recent advances in the synthetic biology field have enabled the development of new molecular biology techniques used to build specialized bacteriophages with new functionalities. Bacteriophages have been engineered towards a wide range of applications including pathogen control and detection, targeted drug delivery, or even assembly of new materials.In this chapter, two strategies that have been successfully used to genetically engineer bacteriophage genomes are addressed: a yeast-based platform and bacteriophage recombineering of electroporated DNA.

  16. compendiumdb: an R package for retrieval and storage of functional genomics data

    NARCIS (Netherlands)

    Nandal, Umesh K.; van Kampen, Antoine H. C.; Moerland, Perry D.

    2016-01-01

    Currently, the Gene Expression Omnibus (GEO) contains public data of over 1 million samples from more than 40 000 microarray-based functional genomics experiments. This provides a rich source of information for novel biological discoveries. However, unlocking this potential often requires retrieving

  17. Draft genome of the gayal, Bos frontalis

    Science.gov (United States)

    Wang, Ming-Shan; Zeng, Yan; Wang, Xiao; Nie, Wen-Hui; Wang, Jin-Huan; Su, Wei-Ting; Xiong, Zi-Jun; Wang, Sheng; Qu, Kai-Xing; Yan, Shou-Qing; Yang, Min-Min; Wang, Wen; Dong, Yang; Zhang, Ya-Ping

    2017-01-01

    Abstract Gayal (Bos frontalis), also known as mithan or mithun, is a large endangered semi-domesticated bovine that has a limited geographical distribution in the hill-forests of China, Northeast India, Bangladesh, Myanmar, and Bhutan. Many questions about the gayal such as its origin, population history, and genetic basis of local adaptation remain largely unresolved. De novo sequencing and assembly of the whole gayal genome provides an opportunity to address these issues. We report a high-depth sequencing, de novo assembly, and annotation of a female Chinese gayal genome. Based on the Illumina genomic sequencing platform, we have generated 350.38 Gb of raw data from 16 different insert-size libraries. A total of 276.86 Gb of clean data is retained after quality control. The assembled genome is about 2.85 Gb with scaffold and contig N50 sizes of 2.74 Mb and 14.41 kb, respectively. Repetitive elements account for 48.13% of the genome. Gene annotation has yielded 26 667 protein-coding genes, of which 97.18% have been functionally annotated. BUSCO assessment shows that our assembly captures 93% (3183 of 4104) of the core eukaryotic genes and 83.1% of vertebrate universal single-copy orthologs. We provide the first comprehensive de novo genome of the gayal. This genetic resource is integral for investigating the origin of the gayal and performing comparative genomic studies to improve understanding of the speciation and divergence of bovine species. The assembled genome could be used as reference in future population genetic studies of gayal. PMID:29048483

  18. Functional genomics of tomato

    Indian Academy of Sciences (India)

    2014-10-20

    Oct 20, 2014 ... 1Repository of Tomato Genomics Resources, Department of Plant Sciences, School .... Due to its position at the crossroads of Sanger's sequencing .... replacement for the microarray-based expression profiling. .... during RNA fragmentation step prior to library construction, ...... tomato pollen as a test case.

  19. CROPPER: a metagene creator resource for cross-platform and cross-species compendium studies.

    Science.gov (United States)

    Paananen, Jussi; Storvik, Markus; Wong, Garry

    2006-09-22

    Current genomic research methods provide researchers with enormous amounts of data. Combining data from different high-throughput research technologies commonly available in biological databases can lead to novel findings and increase research efficiency. However, combining data from different heterogeneous sources is often a very arduous task. These sources can be different microarray technology platforms, genomic databases, or experiments performed on various species. Our aim was to develop a software program that could facilitate the combining of data from heterogeneous sources, and thus allow researchers to perform genomic cross-platform/cross-species studies and to use existing experimental data for compendium studies. We have developed a web-based software resource, called CROPPER that uses the latest genomic information concerning different data identifiers and orthologous genes from the Ensembl database. CROPPER can be used to combine genomic data from different heterogeneous sources, allowing researchers to perform cross-platform/cross-species compendium studies without the need for complex computational tools or the requirement of setting up one's own in-house database. We also present an example of a simple cross-platform/cross-species compendium study based on publicly available Parkinson's disease data derived from different sources. CROPPER is a user-friendly and freely available web-based software resource that can be successfully used for cross-species/cross-platform compendium studies.

  20. Comparative Genomic Analysis of Lactobacillus plantarum GB-LP1 Isolated from Traditional Korean Fermented Food.

    Science.gov (United States)

    Yu, Jihyun; Ahn, Sojin; Kim, Kwondo; Caetano-Anolles, Kelsey; Lee, Chanho; Kang, Jungsun; Cho, Kyungjin; Yoon, Sook Hee; Kang, Dae-Kyung; Kim, Heebal

    2017-08-28

    As probiotics play an important role in maintaining a healthy gut flora environment through antitoxin activity and inhibition of pathogen colonization, they have been of interest to the medical research community for quite some time now. Probiotic bacteria such as Lactobacillus plantarum , which can be found in fermented food, are of particular interest given their easy accessibility. We performed whole-genome sequencing and genomic analysis on a GB-LP1 strain of L. plantarum isolated from Korean traditional fermented food; this strain is well known for its functions in immune response, suppression of pathogen growth, and antitoxin effects. The complete genome sequence of GB-LP1 is a single chromosome of 3,040,388 bp with 2,899 predicted open reading frames. Genomic analysis of GB-LP1 revealed two CRISPR regions and genes showing accelerated evolution, which may have antibiotic and antitoxin functions. The aim of the present study was to predict strain specific-genomic characteristics and assess the potential of this new strain as lactic acid bacteria at the genomic level using in silico analysis. These results provide insight into the L. plantarum species as well as confirm the possibility of its utility as a candidate probiotic.

  1. Comparative genomic analysis of single-molecule sequencing and hybrid approaches for finishing the Clostridium autoethanogenum JA1-1 strain DSM 10061 genome

    Energy Technology Data Exchange (ETDEWEB)

    Brown, Steven D [ORNL; Nagaraju, Shilpa [LanzaTech; Utturkar, Sagar M [ORNL; De Tissera, Sashini [LanzaTech; Segovia, Simón [LanzaTech; Mitchell, Wayne [LanzaTech; Land, Miriam L [ORNL; Dassanayake, Asela [LanzaTech; Köpke, Michael [LanzaTech

    2014-01-01

    Background Clostridium autoethanogenum strain JA1-1 (DSM 10061) is an acetogen capable of fermenting CO, CO2 and H2 (e.g. from syngas or waste gases) into biofuel ethanol and commodity chemicals such as 2,3-butanediol. A draft genome sequence consisting of 100 contigs has been published. Results A closed, high-quality genome sequence for C. autoethanogenum DSM10061 was generated using only the latest single-molecule DNA sequencing technology and without the need for manual finishing. It is assigned to the most complex genome classification based upon genome features such as repeats, prophage, nine copies of the rRNA gene operons. It has a low G + C content of 31.1%. Illumina, 454, Illumina/454 hybrid assemblies were generated and then compared to the draft and PacBio assemblies using summary statistics, CGAL, QUAST and REAPR bioinformatics tools and comparative genomic approaches. Assemblies based upon shorter read DNA technologies were confounded by the large number repeats and their size, which in the case of the rRNA gene operons were ~5 kb. CRISPR (Clustered Regularly Interspaced Short Paloindromic Repeats) systems among biotechnologically relevant Clostridia were classified and related to plasmid content and prophages. Potential associations between plasmid content and CRISPR systems may have implications for historical industrial scale Acetone-Butanol-Ethanol (ABE) fermentation failures and future large scale bacterial fermentations. While C. autoethanogenum contains an active CRISPR system, no such system is present in the closely related Clostridium ljungdahlii DSM 13528. A common prophage inserted into the Arg-tRNA shared between the strains suggests a common ancestor. However, C. ljungdahlii contains several additional putative prophages and it has more than double the amount of prophage DNA compared to C. autoethanogenum. Other differences include important metabolic genes for central metabolism (as an additional hydrogenase and the absence of a

  2. GWIS: Genome-Wide Inferred Statistics for Functions of Multiple Phenotypes

    NARCIS (Netherlands)

    Nieuwboer, H.A.; Pool, R.; Dolan, C.V.; Boomsma, D.I.; Nivard, M.G.

    2016-01-01

    Here we present a method of genome-wide inferred study (GWIS) that provides an approximation of genome-wide association study (GWAS) summary statistics for a variable that is a function of phenotypes for which GWAS summary statistics, phenotypic means, and covariances are available. A GWIS can be

  3. Mutant power: using mutant allele collections for yeast functional genomics.

    Science.gov (United States)

    Norman, Kaitlyn L; Kumar, Anuj

    2016-03-01

    The budding yeast has long served as a model eukaryote for the functional genomic analysis of highly conserved signaling pathways, cellular processes and mechanisms underlying human disease. The collection of reagents available for genomics in yeast is extensive, encompassing a growing diversity of mutant collections beyond gene deletion sets in the standard wild-type S288C genetic background. We review here three main types of mutant allele collections: transposon mutagen collections, essential gene collections and overexpression libraries. Each collection provides unique and identifiable alleles that can be utilized in genome-wide, high-throughput studies. These genomic reagents are particularly informative in identifying synthetic phenotypes and functions associated with essential genes, including those modeled most effectively in complex genetic backgrounds. Several examples of genomic studies in filamentous/pseudohyphal backgrounds are provided here to illustrate this point. Additionally, the limitations of each approach are examined. Collectively, these mutant allele collections in Saccharomyces cerevisiae and the related pathogenic yeast Candida albicans promise insights toward an advanced understanding of eukaryotic molecular and cellular biology. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  4. An Upper Limit on the Functional Fraction of the Human Genome.

    Science.gov (United States)

    Graur, Dan

    2017-07-01

    For the human population to maintain a constant size from generation to generation, an increase in fertility must compensate for the reduction in the mean fitness of the population caused, among others, by deleterious mutations. The required increase in fertility due to this mutational load depends on the number of sites in the genome that are functional, the mutation rate, and the fraction of deleterious mutations among all mutations in functional regions. These dependencies and the fact that there exists a maximum tolerable replacement level fertility can be used to put an upper limit on the fraction of the human genome that can be functional. Mutational load considerations lead to the conclusion that the functional fraction within the human genome cannot exceed 25%, and is probably considerably lower. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  5. IVSPlat 1.0: an integrated virtual screening platform with a molecular graphical interface.

    Science.gov (United States)

    Sun, Yin Xue; Huang, Yan Xin; Li, Feng Li; Wang, Hong Yan; Fan, Cong; Bao, Yong Li; Sun, Lu Guo; Ma, Zhi Qiang; Kong, Jun; Li, Yu Xin

    2012-01-05

    The virtual screening (VS) of lead compounds using molecular docking and pharmacophore detection is now an important tool in drug discovery. VS tasks typically require a combination of several software tools and a molecular graphics system. Thus, the integration of all the requisite tools in a single operating environment could reduce the complexity of running VS experiments. However, only a few freely available integrated software platforms have been developed. A free open-source platform, IVSPlat 1.0, was developed in this study for the management and automation of VS tasks. We integrated several VS-related programs into a molecular graphics system to provide a comprehensive platform for the solution of VS tasks based on molecular docking, pharmacophore detection, and a combination of both methods. This tool can be used to visualize intermediate and final results of the VS execution, while also providing a clustering tool for the analysis of VS results. A case study was conducted to demonstrate the applicability of this platform. IVSPlat 1.0 provides a plug-in-based solution for the management, automation, and visualization of VS tasks. IVSPlat 1.0 is an open framework that allows the integration of extra software to extend its functionality and modified versions can be freely distributed. The open source code and documentation are available at http://kyc.nenu.edu.cn/IVSPlat/.

  6. Development of FuGO: An Ontology for Functional Genomics Investigations

    Science.gov (United States)

    Whetzel, Patricia L.; Brinkman, Ryan R.; Causton, Helen C.; Fan, Liju; Field, Dawn; Fostel, Jennifer; Fragoso, Gilberto; Gray, Tanya; Heiskanen, Mervi; Hernandez-Boussard, Tina; Morrison, Norman; Parkinson, Helen; Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Schober, Daniel; Smith, Barry; Stevens, Robert; Stoeckert, Christian J.; Taylor, Chris; White, Joe; Wood, Andrew

    2009-01-01

    The development of the Functional Genomics Investigation Ontology (FuGO) is a collaborative, international effort that will provide a resource for annotating functional genomics investigations, including the study design, protocols and instrumentation used, the data generated and the types of analysis performed on the data. FuGO will contain both terms that are universal to all functional genomics investigations and those that are domain specific. In this way, the ontology will serve as the “semantic glue” to provide a common understanding of data from across these disparate data sources. In addition, FuGO will reference out to existing mature ontologies to avoid the need to duplicate these resources, and will do so in such a way as to enable their ease of use in annotation. This project is in the early stages of development; the paper will describe efforts to initiate the project, the scope and organization of the project, the work accomplished to date, and the challenges encountered, as well as future plans. PMID:16901226

  7. [Efficient genome editing in human pluripotent stem cells through CRISPR/Cas9].

    Science.gov (United States)

    Liu, Gai-gai; Li, Shuang; Wei, Yu-da; Zhang, Yong-xian; Ding, Qiu-rong

    2015-11-01

    The RNA-guided CRISPR (clustered regularly interspaced short palindromic repeat)-associated Cas9 nuclease has offered a new platform for genome editing with high efficiency. Here, we report the use of CRISPR/Cas9 technology to target a specific genomic region in human pluripotent stem cells. We show that CRISPR/Cas9 can be used to disrupt a gene by introducing frameshift mutations to gene coding region; to knock in specific sequences (e.g. FLAG tag DNA sequence) to targeted genomic locus via homology directed repair; to induce large genomic deletion through dual-guide multiplex. Our results demonstrate the versatile application of CRISPR/Cas9 in stem cell genome editing, which can be widely utilized for functional studies of genes or genome loci in human pluripotent stem cells.

  8. Intelligent Parking Assistant - A Showcase of the MOBiNET Platform Functionalities

    DEFF Research Database (Denmark)

    Mikkelsen, Lars Møller; Toledo, Raphael; Agerholm, Niels

    availability, location and pricing schemes. The intelligence of the IPA system consists of automatically fetching parking lot information from the relevant service, automatically registering when the vehicle is stopped, evaluating whether the vehicle is stopped inside a payed parking lot, and automatically......The Intelligent Parking Assistant (IPA) system is developed based on work done in the Danish ITS Platform project. The IPA system includes an Android app that helps the user chose, find and pay for parkings. The app uses information about parking lots fetched from a web service, including live...... initiating the payment for the parking. The IPA app and the back end services, are published via the MOBiNET platform which is a European-wide platform supporting ITS services, and offering functionalities enabling easy migration of services. The migration is enabled by defining a common methodology...

  9. Detection of Inter-lineage Natural Recombination in Avian Paramyxovirus Serotype 1 using Simplified Deep Sequencing Platform

    Directory of Open Access Journals (Sweden)

    Dilan Amila Satharasinghe

    2016-11-01

    Full Text Available Newcastle disease virus (NDV is a prototype member of avian paramyxovirus serotype 1 (APMV-1, which causes severe and contagious disease in the commercial poultry and wild birds. Despite extensive vaccination programs and other control measures, the disease remains endemic around the globe especially in Asia, Africa, and the Middle East. Being a single serotype, genotype II based vaccines remained most acceptable means of immunization. However, the evidence is emerging on failures of vaccines mainly due to evolving nature of the virus and higher genetic gaps between vaccine and field strains of APMV-1. Most of the epidemiological and genetic characterizations of APMVs are based on conventional methods, which are prone to mask the diverse population of viruses in complex samples. In this study, we report the application of a simple, robust, and less resource-demanding methodology for the whole genome sequencing of NDV, using next-generation sequencing on the Illumina MiSeq platform. Using this platform, we sequenced full genomes of five virulent Malaysian NDV strains collected during 2004-2013. All isolates clustered within highly prevalent lineage 5 (specifically in lineage 5a; however, a significantly greater genetic divergence was observed in isolates collected from 2004 to 2011. Interestingly, genetic characterization of one isolate collected in 2013 (IBS025/13 shown natural recombination between lineage 2 and lineage 5. In the event of recombination, the isolate (IBS025/13 carried nucleocapsid protein consist of 55-1801 nucleotides (nts and near-complete phosphoprotein (1804-3254 nts genes of lineage 2 whereas surface glycoproteins (fusion, hemagglutinin-neuraminidase and large polymerase of lineage 5. Additionally, the recombinant virus has a genome size of 15,186 nts which is characteristics for the old genotypes I to IV isolated from 1930 to 1960. Taken together, we report the occurrence of a natural recombination in circulating strains

  10. Detection of Inter-Lineage Natural Recombination in Avian Paramyxovirus Serotype 1 Using Simplified Deep Sequencing Platform.

    Science.gov (United States)

    Satharasinghe, Dilan A; Murulitharan, Kavitha; Tan, Sheau W; Yeap, Swee K; Munir, Muhammad; Ideris, Aini; Omar, Abdul R

    2016-01-01

    Newcastle disease virus (NDV) is a prototype member of avian paramyxovirus serotype 1 (APMV-1), which causes severe and contagious disease in the commercial poultry and wild birds. Despite extensive vaccination programs and other control measures, the disease remains endemic around the globe especially in Asia, Africa, and the Middle East. Being a single serotype, genotype II based vaccines remained most acceptable means of immunization. However, the evidence is emerging on failures of vaccines mainly due to evolving nature of the virus and higher genetic gaps between vaccine and field strains of APMV-1. Most of the epidemiological and genetic characterizations of APMVs are based on conventional methods, which are prone to mask the diverse population of viruses in complex samples. In this study, we report the application of a simple, robust, and less resource-demanding methodology for the whole genome sequencing of NDV, using next-generation sequencing (NGS) on the Illumina MiSeq platform. Using this platform, we sequenced full genomes of five virulent Malaysian NDV strains collected during 2004-2013. All isolates clustered within highly prevalent lineage 5 (specifically in lineage 5a); however, a significantly greater genetic divergence was observed in isolates collected from 2004 to 2011. Interestingly, genetic characterization of one isolate collected in 2013 (IBS025/13) shown natural recombination between lineage 2 and lineage 5. In the event of recombination, the isolate (IBS025/13) carried nucleocapsid protein consist of 55-1801 nucleotides (nts) and near-complete phosphoprotein (1804-3254 nts) genes of lineage 2 whereas surface glycoproteins (fusion, hemagglutinin-neuraminidase) and large polymerase of lineage 5. Additionally, the recombinant virus has a genome size of 15,186 nts which is characteristics for the old genotypes I-IV isolated from 1930 to 1960. Taken together, we report the occurrence of a natural recombination in circulating strains of NDV in

  11. Evolution of linear chromosomes and multipartite genomes in yeast mitochondria

    Science.gov (United States)

    Valach, Matus; Farkas, Zoltan; Fricova, Dominika; Kovac, Jakub; Brejova, Brona; Vinar, Tomas; Pfeiffer, Ilona; Kucsera, Judit; Tomaska, Lubomir; Lang, B. Franz; Nosek, Jozef

    2011-01-01

    Mitochondrial genome diversity in closely related species provides an excellent platform for investigation of chromosome architecture and its evolution by means of comparative genomics. In this study, we determined the complete mitochondrial DNA sequences of eight Candida species and analyzed their molecular architectures. Our survey revealed a puzzling variability of genome architecture, including circular- and linear-mapping and multipartite linear forms. We propose that the arrangement of large inverted repeats identified in these genomes plays a crucial role in alterations of their molecular architectures. In specific arrangements, the inverted repeats appear to function as resolution elements, allowing genome conversion among different topologies, eventually leading to genome fragmentation into multiple linear DNA molecules. We suggest that molecular transactions generating linear mitochondrial DNA molecules with defined telomeric structures may parallel the evolutionary emergence of linear chromosomes and multipartite genomes in general and may provide clues for the origin of telomeres and pathways implicated in their maintenance. PMID:21266473

  12. Development of electronic barcodes for use in plant pathology and functional genomics.

    Science.gov (United States)

    Kumagai, Monto H; Miller, Philip

    2006-06-01

    We have developed a novel 'electronic barcode' system that uses radio frequency identification (RFID) tags, cell phones, and portable computers to link phenotypic, environmental, and genomic data. We describe a secure, inexpensive system to record and retrieve data from plant samples. It utilizes RFID tags, computers, PDAs, and cell phones to link, record, and retrieve positional, and functional genomic data. Our results suggest that RFID tags can be used in functional genomic screens to record information that is involved in plant development or disease.

  13. Analysis of the complete genome sequence of Nocardia seriolae UTF1, the causative agent of fish nocardiosis: The first reference genome sequence of the fish pathogenic Nocardia species.

    Science.gov (United States)

    Yasuike, Motoshige; Nishiki, Issei; Iwasaki, Yuki; Nakamura, Yoji; Fujiwara, Atushi; Shimahara, Yoshiko; Kamaishi, Takashi; Yoshida, Terutoyo; Nagai, Satoshi; Kobayashi, Takanori; Katoh, Masaya

    2017-01-01

    Nocardiosis caused by Nocardia seriolae is one of the major threats in the aquaculture of Seriola species (yellowtail; S. quinqueradiata, amberjack; S. dumerili and kingfish; S. lalandi) in Japan. Here, we report the complete nucleotide genome sequence of N. seriolae UTF1, isolated from a cultured yellowtail. The genome is a circular chromosome of 8,121,733 bp with a G+C content of 68.1% that encodes 7,697 predicted proteins. In the N. seriolae UTF1 predicted genes, we found orthologs of virulence factors of pathogenic mycobacteria and human clinical Nocardia isolates involved in host cell invasion, modulation of phagocyte function and survival inside the macrophages. The virulence factor candidates provide an essential basis for understanding their pathogenic mechanisms at the molecular level by the fish nocardiosis research community in future studies. We also found many potential antibiotic resistance genes on the N. seriolae UTF1 chromosome. Comparative analysis with the four existing complete genomes, N. farcinica IFM 10152, N. brasiliensis HUJEG-1 and N. cyriacigeorgica GUH-2 and N. nova SH22a, revealed that 2,745 orthologous genes were present in all five Nocardia genomes (core genes) and 1,982 genes were unique to N. seriolae UTF1. In particular, the N. seriolae UTF1 genome contains a greater number of mobile elements and genes of unknown function that comprise the differences in structure and gene content from the other Nocardia genomes. In addition, a lot of the N. seriolae UTF1-specific genes were assigned to the ABC transport system. Because of limited resources in ocean environments, these N. seriolae UTF1 specific ABC transporters might facilitate adaptation strategies essential for marine environment survival. Thus, the availability of the complete N. seriolae UTF1 genome sequence will provide a valuable resource for comparative genomic studies of N. seriolae isolates, as well as provide new insights into the ecological and functional diversity of

  14. Data for constructing insect genome content matrices for phylogenetic analysis and functional annotation

    Directory of Open Access Journals (Sweden)

    Jeffrey Rosenfeld

    2016-03-01

    Full Text Available Twenty one fully sequenced and well annotated insect genomes were used to construct genome content matrices for phylogenetic analysis and functional annotation of insect genomes. To examine the role of e-value cutoff in ortholog determination we used scaled e-value cutoffs and a single linkage clustering approach.. The present communication includes (1 a list of the genomes used to construct the genome content phylogenetic matrices, (2 a nexus file with the data matrices used in phylogenetic analysis, (3 a nexus file with the Newick trees generated by phylogenetic analysis, (4 an excel file listing the Core (CORE genes and Unique (UNI genes found in five insect groups, and (5 a figure showing a plot of consistency index (CI versus percent of unannotated genes that are apomorphies in the data set for gene losses and gains and bar plots of gains and losses for four consistency index (CI cutoffs.

  15. The functional matrix hypothesis revisited. 3. The genomic thesis.

    Science.gov (United States)

    Moss, M L

    1997-09-01

    Although the initial versions of the functional matrix hypothesis (FMH) theoretically posited the ontogenetic primacy of "function," it is only in recent years that advances in the morphogenetic, engineering, and computer sciences provided an integrated experimental and numerical data base that permitted recent significant revisions of the FMH--revisions that strongly support the primary role of function in craniofacial growth and development. Acknowledging that the currently dominant scientific paradigm suggests that genomic, instead of epigenetic (functional) factors, regulate (cause, control) such growth, an analysis of this continuing controversy was deemed useful. Accordingly the method of dialectical analysis, is employed, stating a thesis, an antithesis, and a resolving synthesis based primarily on an extensive review of the pertinent current literature. This article extensively reviews the genomic hypothesis and offers a critique intended to remove some of the unintentional conceptual obscurantism that has recently come to surround it.

  16. A functional genomics study of extracellular protease production by Aspergillus niger

    OpenAIRE

    Braaksma, Machtelt

    2010-01-01

    The objective of the project described in this thesis was to study the complex induction of extracellular proteases in the filamentous fungus Aspergillus niger using information gathered with functional genomics technologies. A special emphasis is given to the requirements for performing a successful systems biology study and addressing the challenges met in analyzing the large, information-rich data sets generated with functional genomics technologies. The role that protease activity plays i...

  17. Functional genomics strategies with transposons in rice

    NARCIS (Netherlands)

    Greco, R.

    2003-01-01

    Rice is a major staple food crop and a recognizedmonocotylenedousmodel plant from which gene function discovery is projected to contribute to improvements in a variety of cereals like wheat and maize. The recent release of rough drafts of the rice genome sequence for public

  18. Functional annotation from the genome sequence of the giant panda.

    Science.gov (United States)

    Huo, Tong; Zhang, Yinjie; Lin, Jianping

    2012-08-01

    The giant panda is one of the most critically endangered species due to the fragmentation and loss of its habitat. Studying the functions of proteins in this animal, especially specific trait-related proteins, is therefore necessary to protect the species. In this work, the functions of these proteins were investigated using the genome sequence of the giant panda. Data on 21,001 proteins and their functions were stored in the Giant Panda Protein Database, in which the proteins were divided into two groups: 20,179 proteins whose functions can be predicted by GeneScan formed the known-function group, whereas 822 proteins whose functions cannot be predicted by GeneScan comprised the unknown-function group. For the known-function group, we further classified the proteins by molecular function, biological process, cellular component, and tissue specificity. For the unknown-function group, we developed a strategy in which the proteins were filtered by cross-Blast to identify panda-specific proteins under the assumption that proteins related to the panda-specific traits in the unknown-function group exist. After this filtering procedure, we identified 32 proteins (2 of which are membrane proteins) specific to the giant panda genome as compared against the dog and horse genomes. Based on their amino acid sequences, these 32 proteins were further analyzed by functional classification using SVM-Prot, motif prediction using MyHits, and interacting protein prediction using the Database of Interacting Proteins. Nineteen proteins were predicted to be zinc-binding proteins, thus affecting the activities of nucleic acids. The 32 panda-specific proteins will be further investigated by structural and functional analysis.

  19. Leveraging Comparative Genomics to Identify and Functionally Characterize Genes Associated with Sperm Phenotypes in Python bivittatus (Burmese Python

    Directory of Open Access Journals (Sweden)

    Kristopher J. L. Irizarry

    2016-01-01

    Full Text Available Comparative genomics approaches provide a means of leveraging functional genomics information from a highly annotated model organism’s genome (such as the mouse genome in order to make physiological inferences about the role of genes and proteins in a less characterized organism’s genome (such as the Burmese python. We employed a comparative genomics approach to produce the functional annotation of Python bivittatus genes encoding proteins associated with sperm phenotypes. We identify 129 gene-phenotype relationships in the python which are implicated in 10 specific sperm phenotypes. Results obtained through our systematic analysis identified subsets of python genes exhibiting associations with gene ontology annotation terms. Functional annotation data was represented in a semantic scatter plot. Together, these newly annotated Python bivittatus genome resources provide a high resolution framework from which the biology relating to reptile spermatogenesis, fertility, and reproduction can be further investigated. Applications of our research include (1 production of genetic diagnostics for assessing fertility in domestic and wild reptiles; (2 enhanced assisted reproduction technology for endangered and captive reptiles; and (3 novel molecular targets for biotechnology-based approaches aimed at reducing fertility and reproduction of invasive reptiles. Additional enhancements to reptile genomic resources will further enhance their value.

  20. Cattle genomics and its implications for future nutritional strategies for dairy cattle.

    Science.gov (United States)

    Seo, S; Larkin, D M; Loor, J J

    2013-03-01

    The recently sequenced cattle (Bos taurus) genome unraveled the unique genomic features of the species and provided the molecular basis for applying a systemic approach to systematically link genomic information to metabolic traits. Comparative analysis has identified a variety of evolutionary adaptive features in the cattle genome, such as an expansion of the gene families related to the rumen function, large number of chromosomal rearrangements affecting regulation of genes for lactation, and chromosomal rearrangements that are associated with segmental duplications and copy number variations. Metabolic reconstruction of the cattle genome has revealed that core metabolic pathways are highly conserved among mammals although five metabolic genes are deleted or highly diverged and seven metabolic genes are present in duplicate in the cattle genome compared to their human counter parts. The evolutionary loss and gain of metabolic genes in the cattle genome may reflect metabolic adaptations of cattle. Metabolic reconstruction also provides a platform for better understanding of metabolic regulation in cattle and ruminants. A substantial body of transcriptomics data from dairy and beef cattle under different nutritional management and across different stages of growth and lactation are already available and will aid in linking the genome with metabolism and nutritional physiology of cattle. Application of cattle genomics has great potential for future development of nutritional strategies to improve efficiency and sustainability of beef and milk production. One of the biggest challenges is to integrate genomic and phenotypic data and interpret them in a biological and practical platform. Systems biology, a holistic and systemic approach, will be very useful in overcoming this challenge.

  1. A plant-based chemical genomics screen for the identification of flowering inducers.

    Science.gov (United States)

    Fiers, Martijn; Hoogenboom, Jorin; Brunazzi, Alice; Wennekes, Tom; Angenent, Gerco C; Immink, Richard G H

    2017-01-01

    Floral timing is a carefully regulated process, in which the plant determines the optimal moment to switch from the vegetative to reproductive phase. While there are numerous genes known that control flowering time, little information is available on chemical compounds that are able to influence this process. We aimed to discover novel compounds that are able to induce flowering in the model plant Arabidopsis. For this purpose we developed a plant-based screening platform that can be used in a chemical genomics study. Here we describe the set-up of the screening platform and various issues and pitfalls that need to be addressed in order to perform a chemical genomics screening on Arabidopsis plantlets. We describe the choice for a molecular marker, in combination with a sensitive reporter that's active in plants and is sufficiently sensitive for detection. In this particular screen, the firefly Luciferase marker was used, fused to the regulatory sequences of the floral meristem identity gene APETALA1 (AP1) , which is an early marker for flowering. Using this screening platform almost 9000 compounds were screened, in triplicate, in 96-well plates at a concentration of 25 µM. One of the identified potential flowering inducing compounds was studied in more detail and named Flowering1 (F1). F1 turned out to be an analogue of the plant hormone Salicylic acid (SA) and appeared to be more potent than SA in the induction of flowering. The effect could be confirmed by watering Arabidopsis plants with SA or F1, in which F1 gave a significant reduction in time to flowering in comparison to SA treatment or the control. In this study a chemical genomics screening platform was developed to discover compounds that can induce flowering in Arabidopsis. This platform was used successfully, to identify a compound that can speed-up flowering in Arabidopsis.

  2. Comparative genome analysis of the high pathogenicity Salmonella Typhimurium strain UK-1.

    Directory of Open Access Journals (Sweden)

    Yingqin Luo

    Full Text Available Salmonella enterica serovar Typhimurium, a gram-negative facultative rod-shaped bacterium causing salmonellosis and foodborne disease, is one of the most common isolated Salmonella serovars in both developed and developing nations. Several S. Typhimurium genomes have been completed and many more genome-sequencing projects are underway. Comparative genome analysis of the multiple strains leads to a better understanding of the evolution of S. Typhimurium and its pathogenesis. S. Typhimurium strain UK-1 (belongs to phage type 1 is highly virulent when orally administered to mice and chickens and efficiently colonizes lymphoid tissues of these species. These characteristics make this strain a good choice for use in vaccine development. In fact, UK-1 has been used as the parent strain for a number of nonrecombinant and recombinant vaccine strains, including several commercial vaccines for poultry. In this study, we conducted a thorough comparative genome analysis of the UK-1 strain with other S. Typhimurium strains and examined the phenotypic impact of several genomic differences. Whole genomic comparison highlights an extremely close relationship between the UK-1 strain and other S. Typhimurium strains; however, many interesting genetic and genomic variations specific to UK-1 were explored. In particular, the deletion of a UK-1-specific gene that is highly similar to the gene encoding the T3SS effector protein NleC exhibited a significant decrease in oral virulence in BALB/c mice. The complete genetic complements in UK-1, especially those elements that contribute to virulence or aid in determining the diversity within bacterial species, provide key information in evaluating the functional characterization of important genetic determinants and for development of vaccines.

  3. Meta genome-wide network from functional linkages of genes in human gut microbial ecosystems.

    Science.gov (United States)

    Ji, Yan; Shi, Yixiang; Wang, Chuan; Dai, Jianliang; Li, Yixue

    2013-03-01

    The human gut microbial ecosystem (HGME) exerts an important influence on the human health. In recent researches, meta-genomics provided deep insights into the HGME in terms of gene contents, metabolic processes and genome constitutions of meta-genome. Here we present a novel methodology to investigate the HGME on the basis of a set of functionally coupled genes regardless of their genome origins when considering the co-evolution properties of genes. By analyzing these coupled genes, we showed some basic properties of HGME significantly associated with each other, and further constructed a protein interaction map of human gut meta-genome to discover some functional modules that may relate with essential metabolic processes. Compared with other studies, our method provides a new idea to extract basic function elements from meta-genome systems and investigate complex microbial environment by associating its biological traits with co-evolutionary fingerprints encoded in it.

  4. The house spider genome reveals an ancient whole-genome duplication during arachnid evolution.

    Science.gov (United States)

    Schwager, Evelyn E; Sharma, Prashant P; Clarke, Thomas; Leite, Daniel J; Wierschin, Torsten; Pechmann, Matthias; Akiyama-Oda, Yasuko; Esposito, Lauren; Bechsgaard, Jesper; Bilde, Trine; Buffry, Alexandra D; Chao, Hsu; Dinh, Huyen; Doddapaneni, HarshaVardhan; Dugan, Shannon; Eibner, Cornelius; Extavour, Cassandra G; Funch, Peter; Garb, Jessica; Gonzalez, Luis B; Gonzalez, Vanessa L; Griffiths-Jones, Sam; Han, Yi; Hayashi, Cheryl; Hilbrant, Maarten; Hughes, Daniel S T; Janssen, Ralf; Lee, Sandra L; Maeso, Ignacio; Murali, Shwetha C; Muzny, Donna M; Nunes da Fonseca, Rodrigo; Paese, Christian L B; Qu, Jiaxin; Ronshaugen, Matthew; Schomburg, Christoph; Schönauer, Anna; Stollewerk, Angelika; Torres-Oliva, Montserrat; Turetzek, Natascha; Vanthournout, Bram; Werren, John H; Wolff, Carsten; Worley, Kim C; Bucher, Gregor; Gibbs, Richard A; Coddington, Jonathan; Oda, Hiroki; Stanke, Mario; Ayoub, Nadia A; Prpic, Nikola-Michael; Flot, Jean-François; Posnien, Nico; Richards, Stephen; McGregor, Alistair P

    2017-07-31

    The duplication of genes can occur through various mechanisms and is thought to make a major contribution to the evolutionary diversification of organisms. There is increasing evidence for a large-scale duplication of genes in some chelicerate lineages including two rounds of whole genome duplication (WGD) in horseshoe crabs. To investigate this further, we sequenced and analyzed the genome of the common house spider Parasteatoda tepidariorum. We found pervasive duplication of both coding and non-coding genes in this spider, including two clusters of Hox genes. Analysis of synteny conservation across the P. tepidariorum genome suggests that there has been an ancient WGD in spiders. Comparison with the genomes of other chelicerates, including that of the newly sequenced bark scorpion Centruroides sculpturatus, suggests that this event occurred in the common ancestor of spiders and scorpions, and is probably independent of the WGDs in horseshoe crabs. Furthermore, characterization of the sequence and expression of the Hox paralogs in P. tepidariorum suggests that many have been subject to neo-functionalization and/or sub-functionalization since their duplication. Our results reveal that spiders and scorpions are likely the descendants of a polyploid ancestor that lived more than 450 MYA. Given the extensive morphological diversity and ecological adaptations found among these animals, rivaling those of vertebrates, our study of the ancient WGD event in Arachnopulmonata provides a new comparative platform to explore common and divergent evolutionary outcomes of polyploidization events across eukaryotes.

  5. Reverse gyrase functions in genome integrity maintenance by protecting DNA breaks in vivo

    DEFF Research Database (Denmark)

    Han, Wenyuan; Feng, Xu; She, Qunxin

    2017-01-01

    Reverse gyrase introduces positive supercoils to circular DNA and is implicated in genome stability maintenance in thermophiles. The extremely thermophilic crenarchaeon Sulfolobus encodes two reverse gyrase proteins, TopR1 (topoisomerase reverse gyrase 1) and TopR2, whose functions in thermophilic...... and subsequent DNA degradation. The former occurred immediately after drug treatment, leading to chromosomal DNA degradation that concurred with TopR1 degradation, followed by chromatin protein degradation and DNA-less cell formation. To gain a further insight into TopR1 function, the expression of the enzyme...

  6. Enabling a Community to Dissect an Organism: Overview of the Neurospora Functional Genomics Project

    OpenAIRE

    Dunlap, Jay C.; Borkovich, Katherine A.; Henn, Matthew R.; Turner, Gloria E.; Sachs, Matthew S.; Glass, N. Louise; McCluskey, Kevin; Plamann, Michael; Galagan, James E.; Birren, Bruce W.; Weiss, Richard L.; Townsend, Jeffrey P.; Loros, Jennifer J.; Nelson, Mary Anne; Lambreghts, Randy

    2007-01-01

    A consortium of investigators is engaged in a functional genomics project centered on the filamentous fungus Neurospora, with an eye to opening up the functional genomic analysis of all the filamentous fungi. The overall goal of the four interdependent projects in this effort is to acccomplish functional genomics, annotation, and expression analyses of Neurospora crassa, a filamentous fungus that is an established model for the assemblage of over 250,000 species of nonyeast fungi. Building fr...

  7. A bacterial genetic screen identifies functional coding sequences of the insect mariner transposable element Famar1 amplified from the genome of the earwig, Forficula auricularia.

    Science.gov (United States)

    Barry, Elizabeth G; Witherspoon, David J; Lampe, David J

    2004-02-01

    Transposons of the mariner family are widespread in animal genomes and have apparently infected them by horizontal transfer. Most species carry only old defective copies of particular mariner transposons that have diverged greatly from their active horizontally transferred ancestor, while a few contain young, very similar, and active copies. We report here the use of a whole-genome screen in bacteria to isolate somewhat diverged Famar1 copies from the European earwig, Forficula auricularia, that encode functional transposases. Functional and nonfunctional coding sequences of Famar1 and nonfunctional copies of Ammar1 from the European honey bee, Apis mellifera, were sequenced to examine their molecular evolution. No selection for sequence conservation was detected in any clade of a tree derived from these sequences, not even on branches leading to functional copies. This agrees with the current model for mariner transposon evolution that expects neutral evolution within particular hosts, with selection for function occurring only upon horizontal transfer to a new host. Our results further suggest that mariners are not finely tuned genetic entities and that a greater amount of sequence diversification than had previously been appreciated can occur in functional copies in a single host lineage. Finally, this method of isolating active copies can be used to isolate other novel active transposons without resorting to reconstruction of ancestral sequences.

  8. Partnering for functional genomics research conference: Abstracts of poster presentations

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1998-06-01

    This reports contains abstracts of poster presentations presented at the Functional Genomics Research Conference held April 16--17, 1998 in Oak Ridge, Tennessee. Attention is focused on the following areas: mouse mutagenesis and genomics; phenotype screening; gene expression analysis; DNA analysis technology development; bioinformatics; comparative analyses of mouse, human, and yeast sequences; and pilot projects to evaluate methodologies.

  9. Plantagora: modeling whole genome sequencing and assembly of plant genomes.

    Directory of Open Access Journals (Sweden)

    Roger Barthelson

    Full Text Available BACKGROUND: Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. METHODOLOGY/PRINCIPAL FINDINGS: For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. CONCLUSIONS/SIGNIFICANCE: Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly

  10. Methylobacterium genome sequences: a reference blueprint to investigate microbial metabolism of C1 compounds from natural and industrial sources.

    Directory of Open Access Journals (Sweden)

    Stéphane Vuilleumier

    Full Text Available Methylotrophy describes the ability of organisms to grow on reduced organic compounds without carbon-carbon bonds. The genomes of two pink-pigmented facultative methylotrophic bacteria of the Alpha-proteobacterial genus Methylobacterium, the reference species Methylobacterium extorquens strain AM1 and the dichloromethane-degrading strain DM4, were compared.The 6.88 Mb genome of strain AM1 comprises a 5.51 Mb chromosome, a 1.26 Mb megaplasmid and three plasmids, while the 6.12 Mb genome of strain DM4 features a 5.94 Mb chromosome and two plasmids. The chromosomes are highly syntenic and share a large majority of genes, while plasmids are mostly strain-specific, with the exception of a 130 kb region of the strain AM1 megaplasmid which is syntenic to a chromosomal region of strain DM4. Both genomes contain large sets of insertion elements, many of them strain-specific, suggesting an important potential for genomic plasticity. Most of the genomic determinants associated with methylotrophy are nearly identical, with two exceptions that illustrate the metabolic and genomic versatility of Methylobacterium. A 126 kb dichloromethane utilization (dcm gene cluster is essential for the ability of strain DM4 to use DCM as the sole carbon and energy source for growth and is unique to strain DM4. The methylamine utilization (mau gene cluster is only found in strain AM1, indicating that strain DM4 employs an alternative system for growth with methylamine. The dcm and mau clusters represent two of the chromosomal genomic islands (AM1: 28; DM4: 17 that were defined. The mau cluster is flanked by mobile elements, but the dcm cluster disrupts a gene annotated as chelatase and for which we propose the name "island integration determinant" (iid.These two genome sequences provide a platform for intra- and interspecies genomic comparisons in the genus Methylobacterium, and for investigations of the adaptive mechanisms which allow bacterial lineages to acquire

  11. Methylobacterium genome sequences: a reference blueprint to investigate microbial metabolism of C1 compounds from natural and industrial sources.

    Science.gov (United States)

    Vuilleumier, Stéphane; Chistoserdova, Ludmila; Lee, Ming-Chun; Bringel, Françoise; Lajus, Aurélie; Zhou, Yang; Gourion, Benjamin; Barbe, Valérie; Chang, Jean; Cruveiller, Stéphane; Dossat, Carole; Gillett, Will; Gruffaz, Christelle; Haugen, Eric; Hourcade, Edith; Levy, Ruth; Mangenot, Sophie; Muller, Emilie; Nadalig, Thierry; Pagni, Marco; Penny, Christian; Peyraud, Rémi; Robinson, David G; Roche, David; Rouy, Zoé; Saenampechek, Channakhone; Salvignol, Grégory; Vallenet, David; Wu, Zaining; Marx, Christopher J; Vorholt, Julia A; Olson, Maynard V; Kaul, Rajinder; Weissenbach, Jean; Médigue, Claudine; Lidstrom, Mary E

    2009-01-01

    Methylotrophy describes the ability of organisms to grow on reduced organic compounds without carbon-carbon bonds. The genomes of two pink-pigmented facultative methylotrophic bacteria of the Alpha-proteobacterial genus Methylobacterium, the reference species Methylobacterium extorquens strain AM1 and the dichloromethane-degrading strain DM4, were compared. The 6.88 Mb genome of strain AM1 comprises a 5.51 Mb chromosome, a 1.26 Mb megaplasmid and three plasmids, while the 6.12 Mb genome of strain DM4 features a 5.94 Mb chromosome and two plasmids. The chromosomes are highly syntenic and share a large majority of genes, while plasmids are mostly strain-specific, with the exception of a 130 kb region of the strain AM1 megaplasmid which is syntenic to a chromosomal region of strain DM4. Both genomes contain large sets of insertion elements, many of them strain-specific, suggesting an important potential for genomic plasticity. Most of the genomic determinants associated with methylotrophy are nearly identical, with two exceptions that illustrate the metabolic and genomic versatility of Methylobacterium. A 126 kb dichloromethane utilization (dcm) gene cluster is essential for the ability of strain DM4 to use DCM as the sole carbon and energy source for growth and is unique to strain DM4. The methylamine utilization (mau) gene cluster is only found in strain AM1, indicating that strain DM4 employs an alternative system for growth with methylamine. The dcm and mau clusters represent two of the chromosomal genomic islands (AM1: 28; DM4: 17) that were defined. The mau cluster is flanked by mobile elements, but the dcm cluster disrupts a gene annotated as chelatase and for which we propose the name "island integration determinant" (iid). These two genome sequences provide a platform for intra- and interspecies genomic comparisons in the genus Methylobacterium, and for investigations of the adaptive mechanisms which allow bacterial lineages to acquire methylotrophic

  12. Biofilm function and variability in a hydrothermal ecosystem: insights from environmental genomes

    Science.gov (United States)

    Meyer-Dombard, D. R.; Raymond, J.; Shock, E. L.

    2007-12-01

    The ability to adapt to variable environmental conditions is key to survival for all organisms, but may be especially crucial to microorganisms in extreme environments such as hydrothermal systems. Streamer biofilm communities (SBCs) made up of thermophilic chemotrophic microorganisms are common in alkaline-chloride geothermal environments worldwide, but the in situ physiochemical growth parameters and requirements of SBCs are largely unknown [1]. Hot springs in Yellowstone National Park's alkaline geyser basins support SBC growth. However, despite the relative geochemical homogeneity of source pools and widespread ecosystem suitability in these regions (as indicated by energetic profiling [2]), SBCs are not ubiquitous in these ecosystems. The ability of hydrothermal systems to support the growth of SBCs, the relationship between these geochemically driven environments and the microbes that live there, and the function of individuals in these communities are aspects that are adressed here by applying environmental genomics. Analysis of 16S rRNA and total membrane lipid extracts have revealed that community composition of SBCs in "Bison Pool" varies as a function of changing environmental conditions along the outflow channel. In addition, a significant crenarchaeal component was discovered in the "Bison Pool" SBCs. In general, the SBC bacterial diversity triples while the archaeal component varies little (from 3 to 2 genera) in a 5-10°C gradient with distance from the source. While these SBCs are low in overall diversity, the majority of the taxa identified represent uncultured groups of Bacteria and Archaea. As a result, the community function of these taxa and their role in the formation of the biofilms is unknown. However, recent genomic analysis from environmental DNA affords insight into the roles of specific organisms within SBCs at "Bison Pool," and integration of these data with an extensive corresponding geochemical dataset may indicate shifting community

  13. Reciprocal Regulation between DNA-PKcs and Snail1 Conferring Genomic Instability

    International Nuclear Information System (INIS)

    Seo, Haeng Ran; Lee, Hae June; Jin, Yeung Bae; Bae, Sang Woo; Lee, Yun Sil; Kim, Nam Hee; Kim, Hyun Sil; Nam, Hyung Wook; Yook, Jong In

    2010-01-01

    Although the roles of DNA-dependent protein kinase catalytic subunit (DNA-PKcs) involving non-homologous end joining (NHEJ) of DNA repair are well recognized, the biological mechanisms and regulators by which DNA-PKcs regulate genomic instability are not clearly defined. We show herein that DNA-PKcs activity resulting from DNA damage caused by ionizing radiation (IR) phosphorylates Snail1 at serine 100, which results in increased Snail1 expression and its function by inhibition of GSK-3-mediated phosphorylation. Furthermore, Snail1 phosphorylated at serine 100 can reciprocally inhibit kinase activity of DNA-PKcs, resulting in an inhibition to recruit DNA-PKcs or Ku70/80 to a DNA double-strand break site, and ultimately inhibition of DNA repair activity. The impairment of repair activity by a direct interaction between Snail1 and DNA-PKcs increases the resistance to DNA damaging agents, such as IR, and genomic instability. Our findings provide a novel cellular mechanism for induction of genomic instability by reciprocal regulation of DNA-PKcs and Snail1

  14. NMR Studies of the Structure and Function of the HIV-1 5′-Leader

    Directory of Open Access Journals (Sweden)

    Sarah C. Keane

    2016-12-01

    Full Text Available The 5′-leader of the human immunodeficiency virus type 1 (HIV-1 genome plays several critical roles during viral replication, including differentially establishing mRNA versus genomic RNA (gRNA fates. As observed for proteins, the function of the RNA is tightly regulated by its structure, and a common paradigm has been that genome function is temporally modulated by structural changes in the 5′-leader. Over the past 30 years, combinations of nucleotide reactivity mapping experiments with biochemistry, mutagenesis, and phylogenetic studies have provided clues regarding the secondary structures of stretches of residues within the leader that adopt functionally discrete domains. More recently, nuclear magnetic resonance (NMR spectroscopy approaches have been developed that enable direct detection of intra- and inter-molecular interactions within the intact leader, providing detailed insights into the structural determinants and mechanisms that regulate HIV-1 genome packaging and function.

  15. Functional Genomics Reveals That a Compact Terpene Synthase Gene Family Can Account for Terpene Volatile Production in Apple1[W

    Science.gov (United States)

    Nieuwenhuizen, Niels J.; Green, Sol A.; Chen, Xiuyin; Bailleul, Estelle J.D.; Matich, Adam J.; Wang, Mindy Y.; Atkinson, Ross G.

    2013-01-01

    Terpenes are specialized plant metabolites that act as attractants to pollinators and as defensive compounds against pathogens and herbivores, but they also play an important role in determining the quality of horticultural food products. We show that the genome of cultivated apple (Malus domestica) contains 55 putative terpene synthase (TPS) genes, of which only 10 are predicted to be functional. This low number of predicted functional TPS genes compared with other plant species was supported by the identification of only eight potentially functional TPS enzymes in apple ‘Royal Gala’ expressed sequence tag databases, including the previously characterized apple (E,E)-α-farnesene synthase. In planta functional characterization of these TPS enzymes showed that they could account for the majority of terpene volatiles produced in cv Royal Gala, including the sesquiterpenes germacrene-D and (E)-β-caryophyllene, the monoterpenes linalool and α-pinene, and the homoterpene (E)-4,8-dimethyl-1,3,7-nonatriene. Relative expression analysis of the TPS genes indicated that floral and vegetative tissues were the primary sites of terpene production in cv Royal Gala. However, production of cv Royal Gala floral-specific terpenes and TPS genes was observed in the fruit of some heritage apple cultivars. Our results suggest that the apple TPS gene family has been shaped by a combination of ancestral and more recent genome-wide duplication events. The relatively small number of functional enzymes suggests that the remaining terpenes produced in floral and vegetative and fruit tissues are maintained under a positive selective pressure, while the small number of terpenes found in the fruit of modern cultivars may be related to commercial breeding strategies. PMID:23256150

  16. New bioinformatic tool for quick identification of functionally relevant endogenous retroviral inserts in human genome.

    Science.gov (United States)

    Garazha, Andrew; Ivanova, Alena; Suntsova, Maria; Malakhova, Galina; Roumiantsev, Sergey; Zhavoronkov, Alex; Buzdin, Anton

    2015-01-01

    Endogenous retroviruses (ERVs) and LTR retrotransposons (LRs) occupy ∼8% of human genome. Deep sequencing technologies provide clues to understanding of functional relevance of individual ERVs/LRs by enabling direct identification of transcription factor binding sites (TFBS) and other landmarks of functional genomic elements. Here, we performed the genome-wide identification of human ERVs/LRs containing TFBS according to the ENCODE project. We created the first interactive ERV/LRs database that groups the individual inserts according to their familial nomenclature, number of mapped TFBS and divergence from their consensus sequence. Information on any particular element can be easily extracted by the user. We also created a genome browser tool, which enables quick mapping of any ERV/LR insert according to genomic coordinates, known human genes and TFBS. These tools can be used to easily explore functionally relevant individual ERV/LRs, and for studying their impact on the regulation of human genes. Overall, we identified ∼110,000 ERV/LR genomic elements having TFBS. We propose a hypothesis of "domestication" of ERV/LR TFBS by the genome milieu including subsequent stages of initial epigenetic repression, partial functional release, and further mutation-driven reshaping of TFBS in tight coevolution with the enclosing genomic loci.

  17. Codanin-1, mutated in the anaemic disease CDAI, regulates Asf1 function in S-phase histone supply

    DEFF Research Database (Denmark)

    Ask, Katrine; Jasencakova, Zusana; Menard, Patrice

    2012-01-01

    Efficient supply of new histones during DNA replication is critical to restore chromatin organization and maintain genome function. The histone chaperone anti-silencing function 1 (Asf1) serves a key function in providing H3.1-H4 to CAF-1 for replication-coupled nucleosome assembly. We identify C...

  18. Repeat associated mechanisms of genome evolution and function revealed by the Mus caroli and Mus pahari genomes

    Science.gov (United States)

    Thybert, David; Roller, Maša; Navarro, Fábio C.P.; Fiddes, Ian; Streeter, Ian; Feig, Christine; Martin-Galvez, David; Kolmogorov, Mikhail; Janoušek, Václav; Akanni, Wasiu; Aken, Bronwen; Aldridge, Sarah; Chakrapani, Varshith; Chow, William; Clarke, Laura; Cummins, Carla; Doran, Anthony; Dunn, Matthew; Goodstadt, Leo; Howe, Kerstin; Howell, Matthew; Josselin, Ambre-Aurore; Karn, Robert C.; Laukaitis, Christina M.; Jingtao, Lilue; Martin, Fergal; Muffato, Matthieu; Nachtweide, Stefanie; Quail, Michael A.; Sisu, Cristina; Stanke, Mario; Stefflova, Klara; Van Oosterhout, Cock; Veyrunes, Frederic; Ward, Ben; Yang, Fengtang; Yazdanifar, Golbahar; Zadissa, Amonida; Adams, David J.; Brazma, Alvis; Gerstein, Mark; Paten, Benedict; Pham, Son; Keane, Thomas M.; Odom, Duncan T.; Flicek, Paul

    2018-01-01

    Understanding the mechanisms driving lineage-specific evolution in both primates and rodents has been hindered by the lack of sister clades with a similar phylogenetic structure having high-quality genome assemblies. Here, we have created chromosome-level assemblies of the Mus caroli and Mus pahari genomes. Together with the Mus musculus and Rattus norvegicus genomes, this set of rodent genomes is similar in divergence times to the Hominidae (human-chimpanzee-gorilla-orangutan). By comparing the evolutionary dynamics between the Muridae and Hominidae, we identified punctate events of chromosome reshuffling that shaped the ancestral karyotype of Mus musculus and Mus caroli between 3 and 6 million yr ago, but that are absent in the Hominidae. Hominidae show between four- and sevenfold lower rates of nucleotide change and feature turnover in both neutral and functional sequences, suggesting an underlying coherence to the Muridae acceleration. Our system of matched, high-quality genome assemblies revealed how specific classes of repeats can play lineage-specific roles in related species. Recent LINE activity has remodeled protein-coding loci to a greater extent across the Muridae than the Hominidae, with functional consequences at the species level such as reproductive isolation. Furthermore, we charted a Muridae-specific retrotransposon expansion at unprecedented resolution, revealing how a single nucleotide mutation transformed a specific SINE element into an active CTCF binding site carrier specifically in Mus caroli, which resulted in thousands of novel, species-specific CTCF binding sites. Our results show that the comparison of matched phylogenetic sets of genomes will be an increasingly powerful strategy for understanding mammalian biology. PMID:29563166

  19. Pdsg1 and Pdsg2, novel proteins involved in developmental genome remodelling in Paramecium.

    Directory of Open Access Journals (Sweden)

    Miroslav Arambasic

    Full Text Available The epigenetic influence of maternal cells on the development of their progeny has long been studied in various eukaryotes. Multicellular organisms usually provide their zygotes not only with nutrients but also with functional elements required for proper development, such as coding and non-coding RNAs. These maternally deposited RNAs exhibit a variety of functions, from regulating gene expression to assuring genome integrity. In ciliates, such as Paramecium these RNAs participate in the programming of large-scale genome reorganization during development, distinguishing germline-limited DNA, which is excised, from somatic-destined DNA. Only a handful of proteins playing roles in this process have been identified so far, including typical RNAi-derived factors such as Dicer-like and Piwi proteins. Here we report and characterize two novel proteins, Pdsg1 and Pdsg2 (Paramecium protein involved in Development of the Somatic Genome 1 and 2, involved in Paramecium genome reorganization. We show that these proteins are necessary for the excision of germline-limited DNA during development and the survival of sexual progeny. Knockdown of PDSG1 and PDSG2 genes affects the populations of small RNAs known to be involved in the programming of DNA elimination (scanRNAs and iesRNAs and chromatin modification patterns during development. Our results suggest an association between RNA-mediated trans-generational epigenetic signal and chromatin modifications in the process of Paramecium genome reorganization.

  20. Pdsg1 and Pdsg2, novel proteins involved in developmental genome remodelling in Paramecium.

    Science.gov (United States)

    Arambasic, Miroslav; Sandoval, Pamela Y; Hoehener, Cristina; Singh, Aditi; Swart, Estienne C; Nowacki, Mariusz

    2014-01-01

    The epigenetic influence of maternal cells on the development of their progeny has long been studied in various eukaryotes. Multicellular organisms usually provide their zygotes not only with nutrients but also with functional elements required for proper development, such as coding and non-coding RNAs. These maternally deposited RNAs exhibit a variety of functions, from regulating gene expression to assuring genome integrity. In ciliates, such as Paramecium these RNAs participate in the programming of large-scale genome reorganization during development, distinguishing germline-limited DNA, which is excised, from somatic-destined DNA. Only a handful of proteins playing roles in this process have been identified so far, including typical RNAi-derived factors such as Dicer-like and Piwi proteins. Here we report and characterize two novel proteins, Pdsg1 and Pdsg2 (Paramecium protein involved in Development of the Somatic Genome 1 and 2), involved in Paramecium genome reorganization. We show that these proteins are necessary for the excision of germline-limited DNA during development and the survival of sexual progeny. Knockdown of PDSG1 and PDSG2 genes affects the populations of small RNAs known to be involved in the programming of DNA elimination (scanRNAs and iesRNAs) and chromatin modification patterns during development. Our results suggest an association between RNA-mediated trans-generational epigenetic signal and chromatin modifications in the process of Paramecium genome reorganization.

  1. Producing genome structure populations with the dynamic and automated PGS software.

    Science.gov (United States)

    Hua, Nan; Tjong, Harianto; Shin, Hanjun; Gong, Ke; Zhou, Xianghong Jasmine; Alber, Frank

    2018-05-01

    Chromosome conformation capture technologies such as Hi-C are widely used to investigate the spatial organization of genomes. Because genome structures can vary considerably between individual cells of a population, interpreting ensemble-averaged Hi-C data can be challenging, in particular for long-range and interchromosomal interactions. We pioneered a probabilistic approach for the generation of a population of distinct diploid 3D genome structures consistent with all the chromatin-chromatin interaction probabilities from Hi-C experiments. Each structure in the population is a physical model of the genome in 3D. Analysis of these models yields new insights into the causes and the functional properties of the genome's organization in space and time. We provide a user-friendly software package, called PGS, which runs on local machines (for practice runs) and high-performance computing platforms. PGS takes a genome-wide Hi-C contact frequency matrix, along with information about genome segmentation, and produces an ensemble of 3D genome structures entirely consistent with the input. The software automatically generates an analysis report, and provides tools to extract and analyze the 3D coordinates of specific domains. Basic Linux command-line knowledge is sufficient for using this software. A typical running time of the pipeline is ∼3 d with 300 cores on a computer cluster to generate a population of 1,000 diploid genome structures at topological-associated domain (TAD)-level resolution.

  2. UFO: a web server for ultra-fast functional profiling of whole genome protein sequences.

    Science.gov (United States)

    Meinicke, Peter

    2009-09-02

    Functional profiling is a key technique to characterize and compare the functional potential of entire genomes. The estimation of profiles according to an assignment of sequences to functional categories is a computationally expensive task because it requires the comparison of all protein sequences from a genome with a usually large database of annotated sequences or sequence families. Based on machine learning techniques for Pfam domain detection, the UFO web server for ultra-fast functional profiling allows researchers to process large protein sequence collections instantaneously. Besides the frequencies of Pfam and GO categories, the user also obtains the sequence specific assignments to Pfam domain families. In addition, a comparison with existing genomes provides dissimilarity scores with respect to 821 reference proteomes. Considering the underlying UFO domain detection, the results on 206 test genomes indicate a high sensitivity of the approach. In comparison with current state-of-the-art HMMs, the runtime measurements show a considerable speed up in the range of four orders of magnitude. For an average size prokaryotic genome, the computation of a functional profile together with its comparison typically requires about 10 seconds of processing time. For the first time the UFO web server makes it possible to get a quick overview on the functional inventory of newly sequenced organisms. The genome scale comparison with a large number of precomputed profiles allows a first guess about functionally related organisms. The service is freely available and does not require user registration or specification of a valid email address.

  3. UFO: a web server for ultra-fast functional profiling of whole genome protein sequences

    Directory of Open Access Journals (Sweden)

    Meinicke Peter

    2009-09-01

    Full Text Available Abstract Background Functional profiling is a key technique to characterize and compare the functional potential of entire genomes. The estimation of profiles according to an assignment of sequences to functional categories is a computationally expensive task because it requires the comparison of all protein sequences from a genome with a usually large database of annotated sequences or sequence families. Description Based on machine learning techniques for Pfam domain detection, the UFO web server for ultra-fast functional profiling allows researchers to process large protein sequence collections instantaneously. Besides the frequencies of Pfam and GO categories, the user also obtains the sequence specific assignments to Pfam domain families. In addition, a comparison with existing genomes provides dissimilarity scores with respect to 821 reference proteomes. Considering the underlying UFO domain detection, the results on 206 test genomes indicate a high sensitivity of the approach. In comparison with current state-of-the-art HMMs, the runtime measurements show a considerable speed up in the range of four orders of magnitude. For an average size prokaryotic genome, the computation of a functional profile together with its comparison typically requires about 10 seconds of processing time. Conclusion For the first time the UFO web server makes it possible to get a quick overview on the functional inventory of newly sequenced organisms. The genome scale comparison with a large number of precomputed profiles allows a first guess about functionally related organisms. The service is freely available and does not require user registration or specification of a valid email address.

  4. MicrobesOnline: an integrated portal for comparative and functional genomics

    Energy Technology Data Exchange (ETDEWEB)

    Dehal, Paramvir S.; Joachimiak, Marcin P.; Price, Morgan N.; Bates, John T.; Baumohl, Jason K.; Chivian, Dylan; Friedland, Greg D.; Huang, Katherine H.; Keller, Keith; Novichkov, Pavel S.; Dubchak, Inna L.; Alm, Eric J.; Arkin, Adam P.

    2009-09-17

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.

  5. MicrobesOnline: an integrated portal for comparative and functional genomics

    Energy Technology Data Exchange (ETDEWEB)

    Dehal, Paramvir; Joachimiak, Marcin; Price, Morgan; Bates, John; Baumohl, Jason; Chivian, Dylan; Friedland, Greg; Huang, Kathleen; Keller, Keith; Novichkov, Pavel; Dubchak, Inna; Alm, Eric; Arkin, Adam

    2011-07-14

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.

  6. BGI-RIS: an integrated information resource and comparative analysis workbench for rice genomics

    DEFF Research Database (Denmark)

    Zhao, Wenming; Wang, Jing; He, Ximiao

    2004-01-01

    Rice is a major food staple for the world's population and serves as a model species in cereal genome research. The Beijing Genomics Institute (BGI) has long been devoting itself to sequencing, information analysis and biological research of the rice and other crop genomes. In order to facilitate....... Designed as a basic platform, BGI-RIS presents the sequenced genomes and related information in systematic and graphical ways for the convenience of in-depth comparative studies (http://rise.genomics.org.cn/). Udgivelsesdato: 2004-Jan-1...

  7. Unleashing the genome of Brassica rapa

    Directory of Open Access Journals (Sweden)

    Haibao eTang

    2012-07-01

    Full Text Available The completion and release of the Brassica rapa genome is of great benefit to researchers of the Brassicas, Arabidopsis, and genome evolution. While its lineage is closely related to the model organism Arabidopsis thaliana, the Brassicas experienced a whole genome triplication subsequent to their divergence. This event contemporaneously created three copies of its ancestral genome, which had diploidized through the process of homeologous gene loss known as fractionation. By the fractionation of homeologous gene content and genetic regulatory binding sites, Brassica’s genome is well placed to use comparative genomic techniques to identify syntenic regions, homeologous gene duplications, and putative regulatory sequences. Here, we use the comparative genomics platform CoGe to perform several different genomic analyses with which to study structural changes of its genome and dynamics of various genetic elements. Starting with whole genome comparisons, the Brassica paleohexaploidy is characterized, syntenic regions with Arabidopsis thaliana are identified, and the TOC1 gene in the circadian rhythm pathway from Arabidopsis thaliana is used to find duplicated orthologs in Brassica rapa. These TOC1 genes are further analyzed to identify conserved noncoding sequences that contain cis-acting regulatory elements and promoter sequences previously implicated in circadian rhythmicity. Each 'cookbook style' analysis includes a step-by-step walkthrough with links to CoGe to quickly reproduce each step of the analytical process.

  8. Functionalization of single-walled carbon nanotubes with protein by click chemistry as sensing platform for sensitized electrochemical immunoassay

    International Nuclear Information System (INIS)

    Qi Honglan; Ling Chen; Huang Ru; Qiu Xiaoying; Shangguan Li; Gao Qiang; Zhang Chengxiao

    2012-01-01

    Highlights: ► Single-walled carbon nanotubes were functionalized with protein by click chemistry. ► The SWNTs conjugated with protein showed excellent dispersion in water and kept good bioacitvity. ► A competitive electrochemical immunoassay for the determination of anti-IgG was developed with high sensitivity and good stability. - Abstract: The application of the Cu(I)-catalyzed [3 + 2] Huisgen cycloaddition to the functionalization of single-walled carbon nanotubes (SWNTs) with the protein and the use of the artificial SWNTs as a sensing platform for sensitive immunoassay were reported. Covalent functionalization of azide decorated SWNTs with alkyne modified protein was firstly accomplished by the Cu(I)-catalyzed [3 + 2] Huisgen cycloaddition. FT-IR spectroscopy, Raman spectroscopy, X-ray photoelectron spectroscopy, scanning electron microscopy and transmission electron micrograph were used to characterize the protein-functionalized SWNTs. It was found that the SWNTs conjugated with the proteins showed excellent dispersion in water and kept good bioacitivity when immunoglobulin (IgG) and horseradish peroxidase (HRP) were chosen as model proteins. As a proof-of-concept, IgG-functionalized SWNTs were immobilized onto the surface of a glassy carbon electrode by simple casting method as immunosensing platform and a sensitive competitive electrochemical immunoassay was developed for the determination of anti-immunoglobulin (anti-IgG) using HRP as enzyme label. The fabrication of the immunosensor were characterized by cyclic voltammetry and electrochemical impedance spectroscopy with the redox probe [Fe(CN) 6 ] 3−/4− . The SWNTs as immobilization platform showed better sensitizing effect, a detection limit of 30 pg mL −1 (S/N = 3) was obtained for anti-IgG. The proposed strategy provided a stable immobilization method and sensitized recognition platform for analytes. This work demonstrated that the click coupling of SWNTs with protein was an effective

  9. Controversy and debate on clinical genomics sequencing-paper 1: genomics is not exceptional: rigorous evaluations are necessary for clinical applications of genomic sequencing.

    Science.gov (United States)

    Wilson, Brenda J; Miller, Fiona Alice; Rousseau, François

    2017-12-01

    Next generation genomic sequencing (NGS) technologies-whole genome and whole exome sequencing-are now cheap enough to be within the grasp of many health care organizations. To many, NGS is symbolic of cutting edge health care, offering the promise of "precision" and "personalized" medicine. Historically, research and clinical application has been a two-way street in clinical genetics: research often driven directly by the desire to understand and try to solve immediate clinical problems affecting real, identifiable patients and families, accompanied by a low threshold of willingness to apply research-driven interventions without resort to formal empirical evaluations. However, NGS technologies are not simple substitutes for older technologies and need careful evaluation for use as screening, diagnostic, or prognostic tools. We have concerns across three areas. First, at the moment, analytic validity is unknown because technical platforms are not yet stable, laboratory quality assurance programs are in their infancy, and data interpretation capabilities are badly underdeveloped. Second, clinical validity of genomic findings for patient populations without pre-existing high genetic risk is doubtful, as most clinical experience with NGS technologies relates to patients with a high prior likelihood of a genetic etiology. Finally, we are concerned that proponents argue not only for clinically driven approaches to assessing a patient's genome, but also for seeking out variants associated with unrelated conditions or susceptibilities-so-called "secondary targets"-this is screening on a genomic scale. We argue that clinical uses of genomic sequencing should remain limited to specialist and research settings, that screening for secondary findings in clinical testing should be limited to the maximum extent possible, and that the benefits, harms, and economic implications of their routine use be systematically evaluated. All stakeholders have a responsibility to ensure that

  10. Functional food ingredients against colorectal cancer. An example project integrating functional genomics, nutrition and health

    NARCIS (Netherlands)

    Stierum, R.; Burgemeister, R.; Helvoort, van A.; Peijnenburg, A.; Schütze, K.; Seidelin, M.; Vang, O.; Ommen, van B.

    2001-01-01

    Functional Food Ingredients Against Colorectal Cancer is one of the first European Union funded Research Projects at the cross-road of functional genomics [comprising transcriptomics, the measurement of the expression of all messengers RNA (mRNAs) and proteomics, the measurement of expression/state

  11. Platform Constellations

    DEFF Research Database (Denmark)

    Staykova, Kalina Stefanova; Damsgaard, Jan

    2016-01-01

    This research paper presents an initial attempt to introduce and explain the emergence of new phenomenon, which we refer to as platform constellations. Functioning as highly modular systems, the platform constellations are collections of highly connected platforms which co-exist in parallel and a......’ acquisition and users’ engagement rates as well as unlock new sources of value creation and diversify revenue streams....

  12. Multi-function microfluidic platform for sensor integration.

    Science.gov (United States)

    Fernandes, Ana C; Semenova, Daria; Panjan, Peter; Sesay, Adama M; Gernaey, Krist V; Krühne, Ulrich

    2018-03-06

    The limited availability of metabolite-specific sensors for continuous sampling and monitoring is one of the main bottlenecks contributing to failures in bioprocess development. Furthermore, only a limited number of approaches exist to connect currently available measurement systems with high throughput reactor units. This is especially relevant in the biocatalyst screening and characterization stage of process development. In this work, a strategy for sensor integration in microfluidic platforms is demonstrated, to address the need for rapid, cost-effective and high-throughput screening in bioprocesses. This platform is compatible with different sensor formats by enabling their replacement and was built in order to be highly flexible and thus suitable for a wide range of applications. Moreover, this re-usable platform can easily be connected to analytical equipment, such as HPLC, laboratory scale reactors or other microfluidic chips through the use of standardized fittings. In addition, the developed platform includes a two-sensor system interspersed with a mixing channel, which allows the detection of samples that might be outside the first sensor's range of detection, through dilution of the sample solution up to 10 times. In order to highlight the features of the proposed platform, inline monitoring of glucose levels is presented and discussed. Glucose was chosen due to its importance in biotechnology as a relevant substrate. The platform demonstrated continuous measurement of substrate solutions for up to 12 h. Furthermore, the influence of the fluid velocity on substrate diffusion was observed, indicating the need for in-flow calibration to achieve a good quantitative output. Copyright © 2018 Elsevier B.V. All rights reserved.

  13. SpirPep: an in silico digestion-based platform to assist bioactive peptides discovery from a genome-wide database.

    Science.gov (United States)

    Anekthanakul, Krittima; Hongsthong, Apiradee; Senachak, Jittisak; Ruengjitchatchawalya, Marasri

    2018-04-20

    Bioactive peptides, including biological sources-derived peptides with different biological activities, are protein fragments that influence the functions or conditions of organisms, in particular humans and animals. Conventional methods of identifying bioactive peptides are time-consuming and costly. To quicken the processes, several bioinformatics tools are recently used to facilitate screening of the potential peptides prior their activity assessment in vitro and/or in vivo. In this study, we developed an efficient computational method, SpirPep, which offers many advantages over the currently available tools. The SpirPep web application tool is a one-stop analysis and visualization facility to assist bioactive peptide discovery. The tool is equipped with 15 customized enzymes and 1-3 miscleavage options, which allows in silico digestion of protein sequences encoded by protein-coding genes from single, multiple, or genome-wide scaling, and then directly classifies the peptides by bioactivity using an in-house database that contains bioactive peptides collected from 13 public databases. With this tool, the resulting peptides are categorized by each selected enzyme, and shown in a tabular format where the peptide sequences can be tracked back to their original proteins. The developed tool and webpages are coded in PHP and HTML with CSS/JavaScript. Moreover, the tool allows protein-peptide alignment visualization by Generic Genome Browser (GBrowse) to display the region and details of the proteins and peptides within each parameter, while considering digestion design for the desirable bioactivity. SpirPep is efficient; it takes less than 20 min to digest 3000 proteins (751,860 amino acids) with 15 enzymes and three miscleavages for each enzyme, and only a few seconds for single enzyme digestion. Obviously, the tool identified more bioactive peptides than that of the benchmarked tool; an example of validated pentapeptide (FLPIL) from LC-MS/MS was demonstrated. The

  14. Why close a bacterial genome? The plasmid of Alteromonas macleodii HOT1A3 is a vector for inter-specific transfer of a flexible genomic island

    Directory of Open Access Journals (Sweden)

    Eduard eFadeev

    2016-03-01

    Full Text Available Genome sequencing is rapidly becoming a staple technique in environmental and clinical microbiology, yet computational challenges still remain, leading to many draft genomes which are typically fragmented into many contigs. We sequenced and completely assembled the genome of a marine heterotrophic bacterium, Alteromonas macleodii HOT1A3, and compared its full genome to several draft genomes obtained using different reference-based and de-novo methods. In general, the de-novo assemblies clearly outperformed the reference-based or hybrid ones, covering>99% of the genes and representing essentially all of the gene functions. However, only the fully closed genome (~4.5Mbp allowed us to identify the presence of a large, 148 kbp plasmid, pAM1A3. While HOT1A3 belongs to Alteromonas macleodii, typically found in surface waters (surface ecotype, this plasmid consists of an almost complete flexible genomic island, containing many genes involved in metal resistance previously identified in the genomes of Alteromonas mediterranea (deep ecotype. Indeed, similar to A. mediterranea, A. macleodii HOT1A3 grows at concentrations of zinc, mercury and copper that are inhibitory for other A. macleodii strains. The presence of a plasmid encoding almost an entire flexible genomic island suggests that wholesale genomic exchange between heterotrophic marine bacteria belonging to related but ecologically different populations is not uncommon.

  15. Functional regression method for whole genome eQTL epistasis analysis with sequencing data.

    Science.gov (United States)

    Xu, Kelin; Jin, Li; Xiong, Momiao

    2017-05-18

    Epistasis plays an essential rule in understanding the regulation mechanisms and is an essential component of the genetic architecture of the gene expressions. However, interaction analysis of gene expressions remains fundamentally unexplored due to great computational challenges and data availability. Due to variation in splicing, transcription start sites, polyadenylation sites, post-transcriptional RNA editing across the entire gene, and transcription rates of the cells, RNA-seq measurements generate large expression variability and collectively create the observed position level read count curves. A single number for measuring gene expression which is widely used for microarray measured gene expression analysis is highly unlikely to sufficiently account for large expression variation across the gene. Simultaneously analyzing epistatic architecture using the RNA-seq and whole genome sequencing (WGS) data poses enormous challenges. We develop a nonlinear functional regression model (FRGM) with functional responses where the position-level read counts within a gene are taken as a function of genomic position, and functional predictors where genotype profiles are viewed as a function of genomic position, for epistasis analysis with RNA-seq data. Instead of testing the interaction of all possible pair-wises SNPs, the FRGM takes a gene as a basic unit for epistasis analysis, which tests for the interaction of all possible pairs of genes and use all the information that can be accessed to collectively test interaction between all possible pairs of SNPs within two genome regions. By large-scale simulations, we demonstrate that the proposed FRGM for epistasis analysis can achieve the correct type 1 error and has higher power to detect the interactions between genes than the existing methods. The proposed methods are applied to the RNA-seq and WGS data from the 1000 Genome Project. The numbers of pairs of significantly interacting genes after Bonferroni correction

  16. Genome-wide and phase-specific DNA-binding rhythms of BMAL1 control circadian output functions in mouse liver.

    Directory of Open Access Journals (Sweden)

    Guillaume Rey

    2011-02-01

    Full Text Available The mammalian circadian clock uses interlocked negative feedback loops in which the heterodimeric basic helix-loop-helix transcription factor BMAL1/CLOCK is a master regulator. While there is prominent control of liver functions by the circadian clock, the detailed links between circadian regulators and downstream targets are poorly known. Using chromatin immunoprecipitation combined with deep sequencing we obtained a time-resolved and genome-wide map of BMAL1 binding in mouse liver, which allowed us to identify over 2,000 binding sites, with peak binding narrowly centered around Zeitgeber time 6. Annotation of BMAL1 targets confirms carbohydrate and lipid metabolism as the major output of the circadian clock in mouse liver. Moreover, transcription regulators are largely overrepresented, several of which also exhibit circadian activity. Genes of the core circadian oscillator stand out as strongly bound, often at promoter and distal sites. Genomic sequence analysis of the sites identified E-boxes and tandem E1-E2 consensus elements. Electromobility shift assays showed that E1-E2 sites are bound by a dimer of BMAL1/CLOCK heterodimers with a spacing-dependent cooperative interaction, a finding that was further validated in transactivation assays. BMAL1 target genes showed cyclic mRNA expression profiles with a phase distribution centered at Zeitgeber time 10. Importantly, sites with E1-E2 elements showed tighter phases both in binding and mRNA accumulation. Finally, analyzing the temporal profiles of BMAL1 binding, precursor mRNA and mature mRNA levels showed how transcriptional and post-transcriptional regulation contribute differentially to circadian expression phase. Together, our analysis of a dynamic protein-DNA interactome uncovered how genes of the core circadian oscillator crosstalk and drive phase-specific circadian output programs in a complex tissue.

  17. Repeat associated mechanisms of genome evolution and function revealed by the Mus caroli and Mus pahari genomes.

    Science.gov (United States)

    Thybert, David; Roller, Maša; Navarro, Fábio C P; Fiddes, Ian; Streeter, Ian; Feig, Christine; Martin-Galvez, David; Kolmogorov, Mikhail; Janoušek, Václav; Akanni, Wasiu; Aken, Bronwen; Aldridge, Sarah; Chakrapani, Varshith; Chow, William; Clarke, Laura; Cummins, Carla; Doran, Anthony; Dunn, Matthew; Goodstadt, Leo; Howe, Kerstin; Howell, Matthew; Josselin, Ambre-Aurore; Karn, Robert C; Laukaitis, Christina M; Jingtao, Lilue; Martin, Fergal; Muffato, Matthieu; Nachtweide, Stefanie; Quail, Michael A; Sisu, Cristina; Stanke, Mario; Stefflova, Klara; Van Oosterhout, Cock; Veyrunes, Frederic; Ward, Ben; Yang, Fengtang; Yazdanifar, Golbahar; Zadissa, Amonida; Adams, David J; Brazma, Alvis; Gerstein, Mark; Paten, Benedict; Pham, Son; Keane, Thomas M; Odom, Duncan T; Flicek, Paul

    2018-04-01

    Understanding the mechanisms driving lineage-specific evolution in both primates and rodents has been hindered by the lack of sister clades with a similar phylogenetic structure having high-quality genome assemblies. Here, we have created chromosome-level assemblies of the Mus caroli and Mus pahari genomes. Together with the Mus musculus and Rattus norvegicus genomes, this set of rodent genomes is similar in divergence times to the Hominidae (human-chimpanzee-gorilla-orangutan). By comparing the evolutionary dynamics between the Muridae and Hominidae, we identified punctate events of chromosome reshuffling that shaped the ancestral karyotype of Mus musculus and Mus caroli between 3 and 6 million yr ago, but that are absent in the Hominidae. Hominidae show between four- and sevenfold lower rates of nucleotide change and feature turnover in both neutral and functional sequences, suggesting an underlying coherence to the Muridae acceleration. Our system of matched, high-quality genome assemblies revealed how specific classes of repeats can play lineage-specific roles in related species. Recent LINE activity has remodeled protein-coding loci to a greater extent across the Muridae than the Hominidae, with functional consequences at the species level such as reproductive isolation. Furthermore, we charted a Muridae-specific retrotransposon expansion at unprecedented resolution, revealing how a single nucleotide mutation transformed a specific SINE element into an active CTCF binding site carrier specifically in Mus caroli , which resulted in thousands of novel, species-specific CTCF binding sites. Our results show that the comparison of matched phylogenetic sets of genomes will be an increasingly powerful strategy for understanding mammalian biology. © 2018 Thybert et al.; Published by Cold Spring Harbor Laboratory Press.

  18. Targeted viral-mediated plant genome editing using crispr/cas9

    KAUST Repository

    Mahfouz, Magdy M.; Ali, Zahir

    2015-01-01

    The present disclosure provides a viral-mediated genome-editing platform that facilitates multiplexing, obviates stable transformation, and is applicable across plant species. The RNA2 genome of the tobacco rattle virus (TRV) was engineered to carry and systemically deliver a guide RNA molecules into plants overexpressing Cas9 endonuclease. High genomic modification frequencies were observed in inoculated as well as systemic leaves including the plant growing points. This system facilitates multiplexing and can lead to germinal transmission of the genomic modifications in the progeny, thereby obviating the requirements of repeated transformations and tissue culture. The editing platform of the disclosure is useful in plant genome engineering and applicable across plant species amenable to viral infections for agricultural biotechnology applications.

  19. Targeted viral-mediated plant genome editing using crispr/cas9

    KAUST Repository

    Mahfouz, Magdy M.

    2015-12-17

    The present disclosure provides a viral-mediated genome-editing platform that facilitates multiplexing, obviates stable transformation, and is applicable across plant species. The RNA2 genome of the tobacco rattle virus (TRV) was engineered to carry and systemically deliver a guide RNA molecules into plants overexpressing Cas9 endonuclease. High genomic modification frequencies were observed in inoculated as well as systemic leaves including the plant growing points. This system facilitates multiplexing and can lead to germinal transmission of the genomic modifications in the progeny, thereby obviating the requirements of repeated transformations and tissue culture. The editing platform of the disclosure is useful in plant genome engineering and applicable across plant species amenable to viral infections for agricultural biotechnology applications.

  20. Colony size measurement of the yeast gene deletion strains for functional genomics

    Directory of Open Access Journals (Sweden)

    Mir-Rashed Nadereh

    2007-04-01

    Full Text Available Abstract Background Numerous functional genomics approaches have been developed to study the model organism yeast, Saccharomyces cerevisiae, with the aim of systematically understanding the biology of the cell. Some of these techniques are based on yeast growth differences under different conditions, such as those generated by gene mutations, chemicals or both. Manual inspection of the yeast colonies that are grown under different conditions is often used as a method to detect such growth differences. Results Here, we developed a computerized image analysis system called Growth Detector (GD, to automatically acquire quantitative and comparative information for yeast colony growth. GD offers great convenience and accuracy over the currently used manual growth measurement method. It distinguishes true yeast colonies in a digital image and provides an accurate coordinate oriented map of the colony areas. Some post-processing calculations are also conducted. Using GD, we successfully detected a genetic linkage between the molecular activity of the plant-derived antifungal compound berberine and gene expression components, among other cellular processes. A novel association for the yeast mek1 gene with DNA damage repair was also identified by GD and confirmed by a plasmid repair assay. The results demonstrate the usefulness of GD for yeast functional genomics research. Conclusion GD offers significant improvement over the manual inspection method to detect relative yeast colony size differences. The speed and accuracy associated with GD makes it an ideal choice for large-scale functional genomics investigations.

  1. Cross-family translational genomics of abiotic stress-responsive genes between Arabidopsis and Medicago truncatula.

    Directory of Open Access Journals (Sweden)

    Daejin Hyung

    Full Text Available Cross-species translation of genomic information may play a pivotal role in applying biological knowledge gained from relatively simple model system to other less studied, but related, genomes. The information of abiotic stress (ABS-responsive genes in Arabidopsis was identified and translated into the legume model system, Medicago truncatula. Various data resources, such as TAIR/AtGI DB, expression profiles and literatures, were used to build a genome-wide list of ABS genes. tBlastX/BlastP similarity search tools and manual inspection of alignments were used to identify orthologous genes between the two genomes. A total of 1,377 genes were finally collected and classified into 18 functional criteria of gene ontology (GO. The data analysis according to the expression cues showed that there was substantial level of interaction among three major types (i.e., drought, salinity and cold stress of abiotic stresses. In an attempt to translate the ABS genes between these two species, genomic locations for each gene were mapped using an in-house-developed comparative analysis platform. The comparative analysis revealed that fragmental colinearity, represented by only 37 synteny blocks, existed between Arabidopsis and M. truncatula. Based on the combination of E-value and alignment remarks, estimated translation rate was 60.2% for this cross-family translation. As a prelude of the functional comparative genomic approaches, in-silico gene network/interactome analyses were conducted to predict key components in the ABS responses, and one of the sub-networks was integrated with corresponding comparative map. The results demonstrated that core members of the sub-network were well aligned with previously reported ABS regulatory networks. Taken together, the results indicate that network-based integrative approaches of comparative and functional genomics are important to interpret and translate genomic information for complex traits such as abiotic stresses.

  2. Functional RNA structures throughout the Hepatitis C Virus genome.

    Science.gov (United States)

    Adams, Rebecca L; Pirakitikulr, Nathan; Pyle, Anna Marie

    2017-06-01

    The single-stranded Hepatitis C Virus (HCV) genome adopts a set of elaborate RNA structures that are involved in every stage of the viral lifecycle. Recent advances in chemical probing, sequencing, and structural biology have facilitated analysis of RNA folding on a genome-wide scale, revealing novel structures and networks of interactions. These studies have underscored the active role played by RNA in every function of HCV and they open the door to new types of RNA-targeted therapeutics. Copyright © 2017 Elsevier B.V. All rights reserved.

  3. High-throughput sequencing of three Lemnoideae (duckweeds chloroplast genomes from total DNA.

    Directory of Open Access Journals (Sweden)

    Wenqin Wang

    Full Text Available BACKGROUND: Chloroplast genomes provide a wealth of information for evolutionary and population genetic studies. Chloroplasts play a particularly important role in the adaption for aquatic plants because they float on water and their major surface is exposed continuously to sunlight. The subfamily of Lemnoideae represents such a collection of aquatic species that because of photosynthesis represents one of the fastest growing plant species on earth. METHODS: We sequenced the chloroplast genomes from three different genera of Lemnoideae, Spirodela polyrhiza, Wolffiella lingulata and Wolffia australiana by high-throughput DNA sequencing of genomic DNA using the SOLiD platform. Unfractionated total DNA contains high copies of plastid DNA so that sequences from the nucleus and mitochondria can easily be filtered computationally. Remaining sequence reads were assembled into contiguous sequences (contigs using SOLiD software tools. Contigs were mapped to a reference genome of Lemna minor and gaps, selected by PCR, were sequenced on the ABI3730xl platform. CONCLUSIONS: This combinatorial approach yielded whole genomic contiguous sequences in a cost-effective manner. Over 1,000-time coverage of chloroplast from total DNA were reached by the SOLiD platform in a single spot on a quadrant slide without purification. Comparative analysis indicated that the chloroplast genome was conserved in gene number and organization with respect to the reference genome of L. minor. However, higher nucleotide substitution, abundant deletions and insertions occurred in non-coding regions of these genomes, indicating a greater genomic dynamics than expected from the comparison of other related species in the Pooideae. Noticeably, there was no transition bias over transversion in Lemnoideae. The data should have immediate applications in evolutionary biology and plant taxonomy with increased resolution and statistical power.

  4. Endocrine-Therapy-Resistant ESR1 Variants Revealed by Genomic Characterization of Breast-Cancer-Derived Xenografts

    Directory of Open Access Journals (Sweden)

    Shunqiang Li

    2013-09-01

    Full Text Available To characterize patient-derived xenografts (PDXs for functional studies, we made whole-genome comparisons with originating breast cancers representative of the major intrinsic subtypes. Structural and copy number aberrations were found to be retained with high fidelity. However, at the single-nucleotide level, variable numbers of PDX-specific somatic events were documented, although they were only rarely functionally significant. Variant allele frequencies were often preserved in the PDXs, demonstrating that clonal representation can be transplantable. Estrogen-receptor-positive PDXs were associated with ESR1 ligand-binding-domain mutations, gene amplification, or an ESR1/YAP1 translocation. These events produced different endocrine-therapy-response phenotypes in human, cell line, and PDX endocrine-response studies. Hence, deeply sequenced PDX models are an important resource for the search for genome-forward treatment options and capture endocrine-drug-resistance etiologies that are not observed in standard cell lines. The originating tumor genome provides a benchmark for assessing genetic drift and clonal representation after transplantation.

  5. Symbiodinium genomes reveal adaptive evolution of functions related to symbiosis

    KAUST Repository

    Liu, Huanle; Stephens, Timothy G.; Gonzá lez-Pech, Raú l; Beltran, Victor H.; Lapeyre, Bruno; Bongaerts, Pim; Cooke, Ira; Bourne, David G.; Forê t, Sylvain; Miller, David John; van Oppen, Madeleine J. H.; Voolstra, Christian R.; Ragan, Mark A.; Chan, Cheong Xin

    2017-01-01

    Symbiosis between dinoflagellates of the genus Symbiodinium and reef-building corals forms the trophic foundation of the world's coral reef ecosystems. Here we present the first draft genome of Symbiodinium goreaui (Clade C, type C1: 1.03 Gbp), one of the most ubiquitous endosymbionts associated with corals, and an improved draft genome of Symbiodinium kawagutii (Clade F, strain CS-156: 1.05 Gbp), previously sequenced as strain CCMP2468, to further elucidate genomic signatures of this symbiosis. Comparative analysis of four available Symbiodinium genomes against other dinoflagellate genomes led to the identification of 2460 nuclear gene families that show evidence of positive selection, including genes involved in photosynthesis, transmembrane ion transport, synthesis and modification of amino acids and glycoproteins, and stress response. Further, we identified extensive sets of genes for meiosis and response to light stress. These draft genomes provide a foundational resource for advancing our understanding Symbiodinium biology and the coral-algal symbiosis.

  6. Symbiodinium genomes reveal adaptive evolution of functions related to symbiosis

    KAUST Repository

    Liu, Huanle

    2017-10-06

    Symbiosis between dinoflagellates of the genus Symbiodinium and reef-building corals forms the trophic foundation of the world\\'s coral reef ecosystems. Here we present the first draft genome of Symbiodinium goreaui (Clade C, type C1: 1.03 Gbp), one of the most ubiquitous endosymbionts associated with corals, and an improved draft genome of Symbiodinium kawagutii (Clade F, strain CS-156: 1.05 Gbp), previously sequenced as strain CCMP2468, to further elucidate genomic signatures of this symbiosis. Comparative analysis of four available Symbiodinium genomes against other dinoflagellate genomes led to the identification of 2460 nuclear gene families that show evidence of positive selection, including genes involved in photosynthesis, transmembrane ion transport, synthesis and modification of amino acids and glycoproteins, and stress response. Further, we identified extensive sets of genes for meiosis and response to light stress. These draft genomes provide a foundational resource for advancing our understanding Symbiodinium biology and the coral-algal symbiosis.

  7. Suicidal function of DNA methylation in age-related genome disintegration.

    Science.gov (United States)

    Mazin, Alexander L

    2009-10-01

    This article is dedicated to the 60th anniversary of 5-methylcytosine discovery in DNA. Cytosine methylation can affect genetic and epigenetic processes, works as a part of the genome-defense system and has mutagenic activity; however, the biological functions of this enzymatic modification are not well understood. This review will put forward the hypothesis that the host-defense role of DNA methylation in silencing and mutational destroying of retroviruses and other intragenomic parasites was extended during evolution to most host genes that have to be inactivated in differentiated somatic cells, where it acquired a new function in age-related self-destruction of the genome. The proposed model considers DNA methylation as the generator of 5mC>T transitions that induce 40-70% of all spontaneous somatic mutations of the multiple classes at CpG and CpNpG sites and flanking nucleotides in the p53, FIX, hprt, gpt human genes and some transgenes. The accumulation of 5mC-dependent mutations explains: global changes in the structure of the vertebrate genome throughout evolution; the loss of most 5mC from the DNA of various species over their lifespan and the Hayflick limit of normal cells; the polymorphism of methylation sites, including asymmetric mCpNpN sites; cyclical changes of methylation and demethylation in genes. The suicidal function of methylation may be a special genetic mechanism for increasing DNA damage and the programmed genome disintegration responsible for cell apoptosis and organism aging and death.

  8. Proteus genomic island 1 (PGI1), a new resistance genomic island from two Proteus mirabilis French clinical isolates.

    Science.gov (United States)

    Siebor, Eliane; Neuwirth, Catherine

    2014-12-01

    To analyse the genetic environment of the antibiotic resistance genes in two clinical Proteus mirabilis isolates resistant to multiple antibiotics. PCR, gene walking and whole-genome sequencing were used to determine the sequence of the resistance regions, the surrounding genetic structure and the flanking chromosomal regions. A genomic island of 81.1 kb named Proteus genomic island 1 (PGI1) located at the 3'-end of trmE (formerly known as thdF) was characterized. The large MDR region of PGI1 (55.4 kb) included a class 1 integron (aadB and aadA2) and regions deriving from several transposons: Tn2 (blaTEM-135), Tn21, Tn6020-like transposon (aphA1b), a hybrid Tn502/Tn5053 transposon, Tn501, a hybrid Tn1696/Tn1721 transposon [tetA(A)] carrying a class 1 integron (aadA1) and Tn5393 (strA and strB). Several ISs were also present (IS4321, IS1R and IS26). The PGI1 backbone (25.7 kb) was identical to that identified in Salmonella Heidelberg SL476 and shared some identity with the Salmonella genomic island 1 (SGI1) backbone. An IS26-mediated recombination event caused the division of the MDR region into two parts separated by a large chromosomal DNA fragment of 197 kb, the right end of PGI1 and this chromosomal sequence being in inverse orientation. PGI1 is a new resistance genomic island from P. mirabilis belonging to the same island family as SGI1. The role of PGI1 in the spread of antimicrobial resistance genes among Enterobacteriaceae of medical importance needs to be evaluated. © The Author 2014. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  9. Smart Aging Platform for Evaluating Cognitive Functions in Aging: A Comparison with the MoCA in a Normal Population

    Directory of Open Access Journals (Sweden)

    Sara Bottiroli

    2017-11-01

    Full Text Available Background: Smart Aging is a Serious games (SGs platform in a 3D virtual environment in which users perform a set of screening tests that address various cognitive skills. The tests are structured as 5 tasks of activities of daily life in a familiar environment. The main goal of the present study is to compare a cognitive evaluation made with Smart Aging with those of a classic standardized screening test, the Montreal Cognitive Assessment (MoCA.Methods: One thousand one-hundred thirty-one healthy adults aged between 50 and 80 (M = 64.3 ± 8.3 were enrolled in the study. They received a cognitive evaluation with the MoCA and the Smart Aging platform. Participants were grouped according to their MoCA global and specific cognitive domain (i.e., memory, executive functions, working memory, visual spatial elaboration, language, and orientation scores and we explored differences among these groups in the Smart Aging indices.Results: One thousand eighty-six older adults (M = 64.0 ± 8.0 successfully completed the study and were stratified according to their MoCA score: Group 1 with MoCA < 27 (n = 360; Group 2 with 27 ≥ MoCA < 29 (n = 453; and Group 3 with MoCA ≥ 29 (n = 273. MoCA groups significantly differed in most of the Smart Aging indices considered, in particular as concerns accuracy (ps < 0.001 and time (ps < 0.001 for completing most of the platform tasks. Group 1 was outperformed by the other two Groups and was slower than them in these tasks, which were those supposed to assess memory and executive functions. In addition, significant differences across groups also emerged when considering the single cognitive domains of the MoCA and the corresponding performances in each Smart Aging task. In particular, this platform seems to be a good proxy for assessing memory, executive functions, working memory, and visual spatial processes.Conclusion: These findings demonstrate the validity of Smart Aging for assessing cognitive functions in normal

  10. Characterization of Aeromonas hydrophila wound pathotypes by comparative genomic and functional analyses of virulence genes.

    Science.gov (United States)

    Grim, Christopher J; Kozlova, Elena V; Sha, Jian; Fitts, Eric C; van Lier, Christina J; Kirtley, Michelle L; Joseph, Sandeep J; Read, Timothy D; Burd, Eileen M; Tall, Ben D; Joseph, Sam W; Horneman, Amy J; Chopra, Ashok K; Shak, Joshua R

    2013-04-23

    Aeromonas hydrophila has increasingly been implicated as a virulent and antibiotic-resistant etiologic agent in various human diseases. In a previously published case report, we described a subject with a polymicrobial wound infection that included a persistent and aggressive strain of A. hydrophila (E1), as well as a more antibiotic-resistant strain of A. hydrophila (E2). To better understand the differences between pathogenic and environmental strains of A. hydrophila, we conducted comparative genomic and functional analyses of virulence-associated genes of these two wound isolates (E1 and E2), the environmental type strain A. hydrophila ATCC 7966(T), and four other isolates belonging to A. aquariorum, A. veronii, A. salmonicida, and A. caviae. Full-genome sequencing of strains E1 and E2 revealed extensive differences between the two and strain ATCC 7966(T). The more persistent wound infection strain, E1, harbored coding sequences for a cytotoxic enterotoxin (Act), a type 3 secretion system (T3SS), flagella, hemolysins, and a homolog of exotoxin A found in Pseudomonas aeruginosa. Corresponding phenotypic analyses with A. hydrophila ATCC 7966(T) and SSU as reference strains demonstrated the functionality of these virulence genes, with strain E1 displaying enhanced swimming and swarming motility, lateral flagella on electron microscopy, the presence of T3SS effector AexU, and enhanced lethality in a mouse model of Aeromonas infection. By combining sequence-based analysis and functional assays, we characterized an A. hydrophila pathotype, exemplified by strain E1, that exhibited increased virulence in a mouse model of infection, likely because of encapsulation, enhanced motility, toxin secretion, and cellular toxicity. Aeromonas hydrophila is a common aquatic bacterium that has increasingly been implicated in serious human infections. While many determinants of virulence have been identified in Aeromonas, rapid identification of pathogenic versus nonpathogenic

  11. Functional genomics in the study of mind-body therapies.

    Science.gov (United States)

    Niles, Halsey; Mehta, Darshan H; Corrigan, Alexandra A; Bhasin, Manoj K; Denninger, John W

    2014-01-01

    Mind-body therapies (MBTs) are used throughout the world in treatment, disease prevention, and health promotion. However, the mechanisms by which MBTs exert their positive effects are not well understood. Investigations into MBTs using functional genomics have revolutionized the understanding of MBT mechanisms and their effects on human physiology. We searched the literature for the effects of MBTs on functional genomics determinants using MEDLINE, supplemented by a manual search of additional journals and a reference list review. We reviewed 15 trials that measured global or targeted transcriptomic, epigenomic, or proteomic changes in peripheral blood. Sample sizes ranged from small pilot studies (n=2) to large trials (n=500). While the reliability of individual genes from trial to trial was often inconsistent, genes related to inflammatory response, particularly those involved in the nuclear factor-kappa B (NF-κB) pathway, were consistently downregulated across most studies. In general, existing trials focusing on gene expression changes brought about by MBTs have revealed intriguing connections to the immune system through the NF-κB cascade, to telomere maintenance, and to apoptotic regulation. However, these findings are limited to a small number of trials and relatively small sample sizes. More rigorous randomized controlled trials of healthy subjects and specific disease states are warranted. Future research should investigate functional genomics areas both upstream and downstream of MBT-related gene expression changes-from epigenomics to proteomics and metabolomics.

  12. Genome-wide identification, functional and evolutionary analysis of terpene synthases in pineapple.

    Science.gov (United States)

    Chen, Xiaoe; Yang, Wei; Zhang, Liqin; Wu, Xianmiao; Cheng, Tian; Li, Guanglin

    2017-10-01

    Terpene synthases (TPSs) are vital for the biosynthesis of active terpenoids, which have important physiological, ecological and medicinal value. Although terpenoids have been reported in pineapple (Ananas comosus), genome-wide investigations of the TPS genes responsible for pineapple terpenoid synthesis are still lacking. By integrating pineapple genome and proteome data, twenty-one putative terpene synthase genes were found in pineapple and divided into five subfamilies. Tandem duplication is the cause of TPS gene family duplication. Furthermore, functional differentiation between each TPS subfamily may have occurred for several reasons. Sixty-two key amino acid sites were identified as being type-II functionally divergence between TPS-a and TPS-c subfamily. Finally, coevolution analysis indicated that multiple amino acid residues are involved in coevolutionary processes. In addition, the enzyme activity of two TPSs were tested. This genome-wide identification, functional and evolutionary analysis of pineapple TPS genes provide a new insight into understanding the roles of TPS family and lay the basis for further characterizing the function and evolution of TPS gene family. Copyright © 2017 Elsevier Ltd. All rights reserved.

  13. W-curve alignments for HIV-1 genomic comparisons.

    Directory of Open Access Journals (Sweden)

    Douglas J Cork

    2010-06-01

    Full Text Available The W-curve was originally developed as a graphical visualization technique for viewing DNA and RNA sequences. Its ability to render features of DNA also makes it suitable for computational studies. Its main advantage in this area is utilizing a single-pass algorithm for comparing the sequences. Avoiding recursion during sequence alignments offers advantages for speed and in-process resources. The graphical technique also allows for multiple models of comparison to be used depending on the nucleotide patterns embedded in similar whole genomic sequences. The W-curve approach allows us to compare large numbers of samples quickly.We are currently tuning the algorithm to accommodate quirks specific to HIV-1 genomic sequences so that it can be used to aid in diagnostic and vaccine efforts. Tracking the molecular evolution of the virus has been greatly hampered by gap associated problems predominantly embedded within the envelope gene of the virus. Gaps and hypermutation of the virus slow conventional string based alignments of the whole genome. This paper describes the W-curve algorithm itself, and how we have adapted it for comparison of similar HIV-1 genomes. A treebuilding method is developed with the W-curve that utilizes a novel Cylindrical Coordinate distance method and gap analysis method. HIV-1 C2-V5 env sequence regions from a Mother/Infant cohort study are used in the comparison.The output distance matrix and neighbor results produced by the W-curve are functionally equivalent to those from Clustal for C2-V5 sequences in the mother/infant pairs infected with CRF01_AE.Significant potential exists for utilizing this method in place of conventional string based alignment of HIV-1 genomes, such as Clustal X. With W-curve heuristic alignment, it may be possible to obtain clinically useful results in a short time-short enough to affect clinical choices for acute treatment. A description of the W-curve generation process, including a comparison

  14. W-curve alignments for HIV-1 genomic comparisons.

    Science.gov (United States)

    Cork, Douglas J; Lembark, Steven; Tovanabutra, Sodsai; Robb, Merlin L; Kim, Jerome H

    2010-06-01

    The W-curve was originally developed as a graphical visualization technique for viewing DNA and RNA sequences. Its ability to render features of DNA also makes it suitable for computational studies. Its main advantage in this area is utilizing a single-pass algorithm for comparing the sequences. Avoiding recursion during sequence alignments offers advantages for speed and in-process resources. The graphical technique also allows for multiple models of comparison to be used depending on the nucleotide patterns embedded in similar whole genomic sequences. The W-curve approach allows us to compare large numbers of samples quickly. We are currently tuning the algorithm to accommodate quirks specific to HIV-1 genomic sequences so that it can be used to aid in diagnostic and vaccine efforts. Tracking the molecular evolution of the virus has been greatly hampered by gap associated problems predominantly embedded within the envelope gene of the virus. Gaps and hypermutation of the virus slow conventional string based alignments of the whole genome. This paper describes the W-curve algorithm itself, and how we have adapted it for comparison of similar HIV-1 genomes. A treebuilding method is developed with the W-curve that utilizes a novel Cylindrical Coordinate distance method and gap analysis method. HIV-1 C2-V5 env sequence regions from a Mother/Infant cohort study are used in the comparison. The output distance matrix and neighbor results produced by the W-curve are functionally equivalent to those from Clustal for C2-V5 sequences in the mother/infant pairs infected with CRF01_AE. Significant potential exists for utilizing this method in place of conventional string based alignment of HIV-1 genomes, such as Clustal X. With W-curve heuristic alignment, it may be possible to obtain clinically useful results in a short time-short enough to affect clinical choices for acute treatment. A description of the W-curve generation process, including a comparison technique of

  15. Enhanced annotations and features for comparing thousands of Pseudomonas genomes in the Pseudomonas genome database.

    Science.gov (United States)

    Winsor, Geoffrey L; Griffiths, Emma J; Lo, Raymond; Dhillon, Bhavjinder K; Shay, Julie A; Brinkman, Fiona S L

    2016-01-04

    The Pseudomonas Genome Database (http://www.pseudomonas.com) is well known for the application of community-based annotation approaches for producing a high-quality Pseudomonas aeruginosa PAO1 genome annotation, and facilitating whole-genome comparative analyses with other Pseudomonas strains. To aid analysis of potentially thousands of complete and draft genome assemblies, this database and analysis platform was upgraded to integrate curated genome annotations and isolate metadata with enhanced tools for larger scale comparative analysis and visualization. Manually curated gene annotations are supplemented with improved computational analyses that help identify putative drug targets and vaccine candidates or assist with evolutionary studies by identifying orthologs, pathogen-associated genes and genomic islands. The database schema has been updated to integrate isolate metadata that will facilitate more powerful analysis of genomes across datasets in the future. We continue to place an emphasis on providing high-quality updates to gene annotations through regular review of the scientific literature and using community-based approaches including a major new Pseudomonas community initiative for the assignment of high-quality gene ontology terms to genes. As we further expand from thousands of genomes, we plan to provide enhancements that will aid data visualization and analysis arising from whole-genome comparative studies including more pan-genome and population-based approaches. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  16. Unlimited Thirst for Genome Sequencing, Data Interpretation, and Database Usage in Genomic Era: The Road towards Fast-Track Crop Plant Improvement

    Directory of Open Access Journals (Sweden)

    Arun Prabhu Dhanapal

    2015-01-01

    Full Text Available The number of sequenced crop genomes and associated genomic resources is growing rapidly with the advent of inexpensive next generation sequencing methods. Databases have become an integral part of all aspects of science research, including basic and applied plant and animal sciences. The importance of databases keeps increasing as the volume of datasets from direct and indirect genomics, as well as other omics approaches, keeps expanding in recent years. The databases and associated web portals provide at a minimum a uniform set of tools and automated analysis across a wide range of crop plant genomes. This paper reviews some basic terms and considerations in dealing with crop plant databases utilization in advancing genomic era. The utilization of databases for variation analysis with other comparative genomics tools, and data interpretation platforms are well described. The major focus of this review is to provide knowledge on platforms and databases for genome-based investigations of agriculturally important crop plants. The utilization of these databases in applied crop improvement program is still being achieved widely; otherwise, the end for sequencing is not far away.

  17. Genic intolerance to functional variation and the interpretation of personal genomes.

    Directory of Open Access Journals (Sweden)

    Slavé Petrovski

    Full Text Available A central challenge in interpreting personal genomes is determining which mutations most likely influence disease. Although progress has been made in scoring the functional impact of individual mutations, the characteristics of the genes in which those mutations are found remain largely unexplored. For example, genes known to carry few common functional variants in healthy individuals may be judged more likely to cause certain kinds of disease than genes known to carry many such variants. Until now, however, it has not been possible to develop a quantitative assessment of how well genes tolerate functional genetic variation on a genome-wide scale. Here we describe an effort that uses sequence data from 6503 whole exome sequences made available by the NHLBI Exome Sequencing Project (ESP. Specifically, we develop an intolerance scoring system that assesses whether genes have relatively more or less functional genetic variation than expected based on the apparently neutral variation found in the gene. To illustrate the utility of this intolerance score, we show that genes responsible for Mendelian diseases are significantly more intolerant to functional genetic variation than genes that do not cause any known disease, but with striking variation in intolerance among genes causing different classes of genetic disease. We conclude by showing that use of an intolerance ranking system can aid in interpreting personal genomes and identifying pathogenic mutations.

  18. Introducing Platform Interactions Model for Studying Multi-Sided Platforms

    DEFF Research Database (Denmark)

    Staykova, Kalina; Damsgaard, Jan

    2018-01-01

    Multi-Sided Platforms (MSPs) function as socio-technical entities that facilitate direct interactions between various affiliated to them constituencies through developing and managing IT architecture. In this paper, we aim to explain the nature of the platform interactions as key characteristic o...

  19. Genome Wide Re-Annotation of Caldicellulosiruptor saccharolyticus with New Insights into Genes Involved in Biomass Degradation and Hydrogen Production.

    Science.gov (United States)

    Chowdhary, Nupoor; Selvaraj, Ashok; KrishnaKumaar, Lakshmi; Kumar, Gopal Ramesh

    2015-01-01

    Caldicellulosiruptor saccharolyticus has proven itself to be an excellent candidate for biological hydrogen (H2) production, but still it has major drawbacks like sensitivity to high osmotic pressure and low volumetric H2 productivity, which should be considered before it can be used industrially. A whole genome re-annotation work has been carried out as an attempt to update the incomplete genome information that causes gap in the knowledge especially in the area of metabolic engineering, to improve the H2 producing capabilities of C. saccharolyticus. Whole genome re-annotation was performed through manual means for 2,682 Coding Sequences (CDSs). Bioinformatics tools based on sequence similarity, motif search, phylogenetic analysis and fold recognition were employed for re-annotation. Our methodology could successfully add functions for 409 hypothetical proteins (HPs), 46 proteins previously annotated as putative and assigned more accurate functions for the known protein sequences. Homology based gene annotation has been used as a standard method for assigning function to novel proteins, but over the past few years many non-homology based methods such as genomic context approaches for protein function prediction have been developed. Using non-homology based functional prediction methods, we were able to assign cellular processes or physical complexes for 249 hypothetical sequences. Our re-annotation pipeline highlights the addition of 231 new CDSs generated from MicroScope Platform, to the original genome with functional prediction for 49 of them. The re-annotation of HPs and new CDSs is stored in the relational database that is available on the MicroScope web-based platform. In parallel, a comparative genome analyses were performed among the members of genus Caldicellulosiruptor to understand the function and evolutionary processes. Further, with results from integrated re-annotation studies (homology and genomic context approach), we strongly suggest that Csac

  20. Genome Wide Re-Annotation of Caldicellulosiruptor saccharolyticus with New Insights into Genes Involved in Biomass Degradation and Hydrogen Production.

    Directory of Open Access Journals (Sweden)

    Nupoor Chowdhary

    Full Text Available Caldicellulosiruptor saccharolyticus has proven itself to be an excellent candidate for biological hydrogen (H2 production, but still it has major drawbacks like sensitivity to high osmotic pressure and low volumetric H2 productivity, which should be considered before it can be used industrially. A whole genome re-annotation work has been carried out as an attempt to update the incomplete genome information that causes gap in the knowledge especially in the area of metabolic engineering, to improve the H2 producing capabilities of C. saccharolyticus. Whole genome re-annotation was performed through manual means for 2,682 Coding Sequences (CDSs. Bioinformatics tools based on sequence similarity, motif search, phylogenetic analysis and fold recognition were employed for re-annotation. Our methodology could successfully add functions for 409 hypothetical proteins (HPs, 46 proteins previously annotated as putative and assigned more accurate functions for the known protein sequences. Homology based gene annotation has been used as a standard method for assigning function to novel proteins, but over the past few years many non-homology based methods such as genomic context approaches for protein function prediction have been developed. Using non-homology based functional prediction methods, we were able to assign cellular processes or physical complexes for 249 hypothetical sequences. Our re-annotation pipeline highlights the addition of 231 new CDSs generated from MicroScope Platform, to the original genome with functional prediction for 49 of them. The re-annotation of HPs and new CDSs is stored in the relational database that is available on the MicroScope web-based platform. In parallel, a comparative genome analyses were performed among the members of genus Caldicellulosiruptor to understand the function and evolutionary processes. Further, with results from integrated re-annotation studies (homology and genomic context approach, we strongly

  1. [Compartmentalization of the cell nucleus and spatial organization of the genome].

    Science.gov (United States)

    Gavrilov, A A; Razin, S V

    2015-01-01

    The eukaryotic cell nucleus is one of the most complex cell organelles. Despite the absence of membranes, the nuclear space is divided into numerous compartments where different processes in- volved in the genome activity take place. The most important nuclear compartments include nucleoli, nuclear speckles, PML bodies, Cajal bodies, histone locus bodies, Polycomb bodies, insulator bodies, transcription and replication factories. The structural basis for the nuclear compartmentalization is provided by genomic DNA that occupies most of the nuclear volume. Nuclear compartments, in turn, guide the chromosome folding by providing a platform for the spatial interaction of individual genomic loci. In this review, we discuss fundamental principles of higher order genome organization with a focus on chromosome territories and chromosome domains, as well as consider the structure and function of the key nuclear compartments. We show that the func- tional compartmentalization of the cell nucleus and genome spatial organization are tightly interconnected, and that this form of organization is highly dynamic and is based on stochastic processes.

  2. Significance of functional disease-causal/susceptible variants identified by whole-genome analyses for the understanding of human diseases.

    Science.gov (United States)

    Hitomi, Yuki; Tokunaga, Katsushi

    2017-01-01

    Human genome variation may cause differences in traits and disease risks. Disease-causal/susceptible genes and variants for both common and rare diseases can be detected by comprehensive whole-genome analyses, such as whole-genome sequencing (WGS), using next-generation sequencing (NGS) technology and genome-wide association studies (GWAS). Here, in addition to the application of an NGS as a whole-genome analysis method, we summarize approaches for the identification of functional disease-causal/susceptible variants from abundant genetic variants in the human genome and methods for evaluating their functional effects in human diseases, using an NGS and in silico and in vitro functional analyses. We also discuss the clinical applications of the functional disease causal/susceptible variants to personalized medicine.

  3. Use of four next-generation sequencing platforms to determine HIV-1 coreceptor tropism.

    Science.gov (United States)

    Archer, John; Weber, Jan; Henry, Kenneth; Winner, Dane; Gibson, Richard; Lee, Lawrence; Paxinos, Ellen; Arts, Eric J; Robertson, David L; Mimms, Larry; Quiñones-Mateu, Miguel E

    2012-01-01

    HIV-1 coreceptor tropism assays are required to rule out the presence of CXCR4-tropic (non-R5) viruses prior treatment with CCR5 antagonists. Phenotypic (e.g., Trofile™, Monogram Biosciences) and genotypic (e.g., population sequencing linked to bioinformatic algorithms) assays are the most widely used. Although several next-generation sequencing (NGS) platforms are available, to date all published deep sequencing HIV-1 tropism studies have used the 454™ Life Sciences/Roche platform. In this study, HIV-1 co-receptor usage was predicted for twelve patients scheduled to start a maraviroc-based antiretroviral regimen. The V3 region of the HIV-1 env gene was sequenced using four NGS platforms: 454™, PacBio® RS (Pacific Biosciences), Illumina®, and Ion Torrent™ (Life Technologies). Cross-platform variation was evaluated, including number of reads, read length and error rates. HIV-1 tropism was inferred using Geno2Pheno, Web PSSM, and the 11/24/25 rule and compared with Trofile™ and virologic response to antiretroviral therapy. Error rates related to insertions/deletions (indels) and nucleotide substitutions introduced by the four NGS platforms were low compared to the actual HIV-1 sequence variation. Each platform detected all major virus variants within the HIV-1 population with similar frequencies. Identification of non-R5 viruses was comparable among the four platforms, with minor differences attributable to the algorithms used to infer HIV-1 tropism. All NGS platforms showed similar concordance with virologic response to the maraviroc-based regimen (75% to 80% range depending on the algorithm used), compared to Trofile (80%) and population sequencing (70%). In conclusion, all four NGS platforms were able to detect minority non-R5 variants at comparable levels suggesting that any NGS-based method can be used to predict HIV-1 coreceptor usage.

  4. Use of four next-generation sequencing platforms to determine HIV-1 coreceptor tropism.

    Directory of Open Access Journals (Sweden)

    John Archer

    Full Text Available HIV-1 coreceptor tropism assays are required to rule out the presence of CXCR4-tropic (non-R5 viruses prior treatment with CCR5 antagonists. Phenotypic (e.g., Trofile™, Monogram Biosciences and genotypic (e.g., population sequencing linked to bioinformatic algorithms assays are the most widely used. Although several next-generation sequencing (NGS platforms are available, to date all published deep sequencing HIV-1 tropism studies have used the 454™ Life Sciences/Roche platform. In this study, HIV-1 co-receptor usage was predicted for twelve patients scheduled to start a maraviroc-based antiretroviral regimen. The V3 region of the HIV-1 env gene was sequenced using four NGS platforms: 454™, PacBio® RS (Pacific Biosciences, Illumina®, and Ion Torrent™ (Life Technologies. Cross-platform variation was evaluated, including number of reads, read length and error rates. HIV-1 tropism was inferred using Geno2Pheno, Web PSSM, and the 11/24/25 rule and compared with Trofile™ and virologic response to antiretroviral therapy. Error rates related to insertions/deletions (indels and nucleotide substitutions introduced by the four NGS platforms were low compared to the actual HIV-1 sequence variation. Each platform detected all major virus variants within the HIV-1 population with similar frequencies. Identification of non-R5 viruses was comparable among the four platforms, with minor differences attributable to the algorithms used to infer HIV-1 tropism. All NGS platforms showed similar concordance with virologic response to the maraviroc-based regimen (75% to 80% range depending on the algorithm used, compared to Trofile (80% and population sequencing (70%. In conclusion, all four NGS platforms were able to detect minority non-R5 variants at comparable levels suggesting that any NGS-based method can be used to predict HIV-1 coreceptor usage.

  5. Murasaki: a fast, parallelizable algorithm to find anchors from multiple genomes.

    Directory of Open Access Journals (Sweden)

    Kris Popendorf

    Full Text Available BACKGROUND: With the number of available genome sequences increasing rapidly, the magnitude of sequence data required for multiple-genome analyses is a challenging problem. When large-scale rearrangements break the collinearity of gene orders among genomes, genome comparison algorithms must first identify sets of short well-conserved sequences present in each genome, termed anchors. Previously, anchor identification among multiple genomes has been achieved using pairwise alignment tools like BLASTZ through progressive alignment tools like TBA, but the computational requirements for sequence comparisons of multiple genomes quickly becomes a limiting factor as the number and scale of genomes grows. METHODOLOGY/PRINCIPAL FINDINGS: Our algorithm, named Murasaki, makes it possible to identify anchors within multiple large sequences on the scale of several hundred megabases in few minutes using a single CPU. Two advanced features of Murasaki are (1 adaptive hash function generation, which enables efficient use of arbitrary mismatch patterns (spaced seeds and therefore the comparison of multiple mammalian genomes in a practical amount of computation time, and (2 parallelizable execution that decreases the required wall-clock and CPU times. Murasaki can perform a sensitive anchoring of eight mammalian genomes (human, chimp, rhesus, orangutan, mouse, rat, dog, and cow in 21 hours CPU time (42 minutes wall time. This is the first single-pass in-core anchoring of multiple mammalian genomes. We evaluated Murasaki by comparing it with the genome alignment programs BLASTZ and TBA. We show that Murasaki can anchor multiple genomes in near linear time, compared to the quadratic time requirements of BLASTZ and TBA, while improving overall accuracy. CONCLUSIONS/SIGNIFICANCE: Murasaki provides an open source platform to take advantage of long patterns, cluster computing, and novel hash algorithms to produce accurate anchors across multiple genomes with

  6. Bat biology, genomes, and the Bat1K project

    DEFF Research Database (Denmark)

    Teeling, Emma C; Vernes, Sonja C; Dávalos, Liliana M

    2018-01-01

    and endangered. Here we announce Bat1K, an initiative to sequence the genomes of all living bat species (n∼1,300) to chromosome-level assembly. The Bat1K genome consortium unites bat biologists (>148 members as of writing), computational scientists, conservation organizations, genome technologists, and any...

  7. Event-based text mining for biology and functional genomics

    Science.gov (United States)

    Thompson, Paul; Nawaz, Raheel; McNaught, John; Kell, Douglas B.

    2015-01-01

    The assessment of genome function requires a mapping between genome-derived entities and biochemical reactions, and the biomedical literature represents a rich source of information about reactions between biological components. However, the increasingly rapid growth in the volume of literature provides both a challenge and an opportunity for researchers to isolate information about reactions of interest in a timely and efficient manner. In response, recent text mining research in the biology domain has been largely focused on the identification and extraction of ‘events’, i.e. categorised, structured representations of relationships between biochemical entities, from the literature. Functional genomics analyses necessarily encompass events as so defined. Automatic event extraction systems facilitate the development of sophisticated semantic search applications, allowing researchers to formulate structured queries over extracted events, so as to specify the exact types of reactions to be retrieved. This article provides an overview of recent research into event extraction. We cover annotated corpora on which systems are trained, systems that achieve state-of-the-art performance and details of the community shared tasks that have been instrumental in increasing the quality, coverage and scalability of recent systems. Finally, several concrete applications of event extraction are covered, together with emerging directions of research. PMID:24907365

  8. The integrated microbial genome resource of analysis.

    Science.gov (United States)

    Checcucci, Alice; Mengoni, Alessio

    2015-01-01

    Integrated Microbial Genomes and Metagenomes (IMG) is a biocomputational system that allows to provide information and support for annotation and comparative analysis of microbial genomes and metagenomes. IMG has been developed by the US Department of Energy (DOE)-Joint Genome Institute (JGI). IMG platform contains both draft and complete genomes, sequenced by Joint Genome Institute and other public and available genomes. Genomes of strains belonging to Archaea, Bacteria, and Eukarya domains are present as well as those of viruses and plasmids. Here, we provide some essential features of IMG system and case study for pangenome analysis.

  9. SeqHound: biological sequence and structure database as a platform for bioinformatics research

    Directory of Open Access Journals (Sweden)

    Dumontier Michel

    2002-10-01

    Full Text Available Abstract Background SeqHound has been developed as an integrated biological sequence, taxonomy, annotation and 3-D structure database system. It provides a high-performance server platform for bioinformatics research in a locally-hosted environment. Results SeqHound is based on the National Center for Biotechnology Information data model and programming tools. It offers daily updated contents of all Entrez sequence databases in addition to 3-D structural data and information about sequence redundancies, sequence neighbours, taxonomy, complete genomes, functional annotation including Gene Ontology terms and literature links to PubMed. SeqHound is accessible via a web server through a Perl, C or C++ remote API or an optimized local API. It provides functionality necessary to retrieve specialized subsets of sequences, structures and structural domains. Sequences may be retrieved in FASTA, GenBank, ASN.1 and XML formats. Structures are available in ASN.1, XML and PDB formats. Emphasis has been placed on complete genomes, taxonomy, domain and functional annotation as well as 3-D structural functionality in the API, while fielded text indexing functionality remains under development. SeqHound also offers a streamlined WWW interface for simple web-user queries. Conclusions The system has proven useful in several published bioinformatics projects such as the BIND database and offers a cost-effective infrastructure for research. SeqHound will continue to develop and be provided as a service of the Blueprint Initiative at the Samuel Lunenfeld Research Institute. The source code and examples are available under the terms of the GNU public license at the Sourceforge site http://sourceforge.net/projects/slritools/ in the SLRI Toolkit.

  10. Brain function in carriers of a genome-wide supported bipolar disorder variant.

    Science.gov (United States)

    Erk, Susanne; Meyer-Lindenberg, Andreas; Schnell, Knut; Opitz von Boberfeld, Carola; Esslinger, Christine; Kirsch, Peter; Grimm, Oliver; Arnold, Claudia; Haddad, Leila; Witt, Stephanie H; Cichon, Sven; Nöthen, Markus M; Rietschel, Marcella; Walter, Henrik

    2010-08-01

    The neural abnormalities underlying genetic risk for bipolar disorder, a severe, common, and highly heritable psychiatric condition, are largely unknown. An opportunity to define these mechanisms is provided by the recent discovery, through genome-wide association, of a single-nucleotide polymorphism (rs1006737) strongly associated with bipolar disorder within the CACNA1C gene, encoding the alpha subunit of the L-type voltage-dependent calcium channel Ca(v)1.2. To determine whether the genetic risk associated with rs1006737 is mediated through hippocampal function. Functional magnetic resonance imaging study. University hospital. A total of 110 healthy volunteers of both sexes and of German descent in the Hardy-Weinberg equilibrium for rs1006737. Blood oxygen level-dependent signal during an episodic memory task and behavioral and psychopathological measures. Using an intermediate phenotype approach, we show that healthy carriers of the CACNA1C risk variant exhibit a pronounced reduction of bilateral hippocampal activation during episodic memory recall and diminished functional coupling between left and right hippocampal regions. Furthermore, risk allele carriers exhibit activation deficits of the subgenual anterior cingulate cortex, a region repeatedly associated with affective disorders and the mediation of adaptive stress-related responses. The relevance of these findings for affective disorders is supported by significantly higher psychopathology scores for depression, anxiety, obsessive-compulsive thoughts, interpersonal sensitivity, and neuroticism in risk allele carriers, correlating negatively with the observed regional brain activation. Our data demonstrate that rs1006737 or genetic variants in linkage disequilibrium with it are functional in the human brain and provide a neurogenetic risk mechanism for bipolar disorder backed by genome-wide evidence.

  11. Development of Highly Informative Genome-Wide Single Sequence Repeat Markers for Breeding Applications in Sesame and Construction of a Web Resource: SisatBase

    Directory of Open Access Journals (Sweden)

    Komivi Dossa

    2017-08-01

    Full Text Available The sequencing of the full nuclear genome of sesame (Sesamum indicum L. provides the platform for functional analyses of genome components and their application in breeding programs. Although the importance of microsatellites markers or simple sequence repeats (SSR in crop genotyping, genetics, and breeding applications is well established, only a little information exist concerning SSRs at the whole genome level in sesame. In addition, SSRs represent a suitable marker type for sesame molecular breeding in developing countries where it is mainly grown. In this study, we identified 138,194 genome-wide SSRs of which 76.5% were physically mapped onto the 13 pseudo-chromosomes. Among these SSRs, up to three primers pairs were supplied for 101,930 SSRs and used to in silico amplify the reference genome together with two newly sequenced sesame accessions. A total of 79,957 SSRs (78% were polymorphic between the three genomes thereby suggesting their promising use in different genomics-assisted breeding applications. From these polymorphic SSRs, 23 were selected and validated to have high polymorphic potential in 48 sesame accessions from different growing areas of Africa. Furthermore, we have developed an online user-friendly database, SisatBase (http://www.sesame-bioinfo.org/SisatBase/, which provides free access to SSRs data as well as an integrated platform for functional analyses. Altogether, the reference SSR and SisatBase would serve as useful resources for genetic assessment, genomic studies, and breeding advancement in sesame, especially in developing countries.

  12. Conservation of Repeats at the Mammalian KCNQ1OT1-CDKN1C Region Suggests a Role in Genomic Imprinting

    Directory of Open Access Journals (Sweden)

    Marcos De Donato

    2017-06-01

    Full Text Available KCNQ1OT1 is located in the region with the highest number of genes showing genomic imprinting, but the mechanisms controlling the genes under its influence have not been fully elucidated. Therefore, we conducted a comparative analysis of the KCNQ1/KCNQ1OT1-CDKN1C region to study its conservation across the best assembled eutherian mammalian genomes sequenced to date and analyzed potential elements that may be implicated in the control of genomic imprinting in this region. The genomic features in these regions from human, mouse, cattle, and dog show a higher number of genes and CpG islands (detected using cpgplot from EMBOSS, but lower number of repetitive elements (including short interspersed nuclear elements and long interspersed nuclear elements, compared with their whole chromosomes (detected by RepeatMasker. The KCNQ1OT1-CDKN1C region contains the highest number of conserved noncoding sequences (CNS among mammals, where we found 16 regions containing about 38 different highly conserved repetitive elements (using mVista, such as LINE1 elements: L1M4, L1MB7, HAL1, L1M4a, L1Med, and an LTR element: MLT1H. From these elements, we found 74 CNS showing high sequence identity (>70% between human, cattle, and mouse, from which we identified 13 motifs (using Multiple Em for Motif Elicitation/Motif Alignment and Search Tool with a significant probability of occurrence, 3 of which were the most frequent and were used to find transcription factor–binding sites. We detected several transcription factors (using JASPAR suite from the families SOX, FOX, and GATA. A phylogenetic analysis of these CNS from human, marmoset, mouse, rat, cattle, dog, horse, and elephant shows branches with high levels of support and very similar phylogenetic relationships among these groups, confirming previous reports. Our results suggest that functional DNA elements identified by comparative genomics in a region densely populated with imprinted mammalian genes may be

  13. MBGD update 2013: the microbial genome database for exploring the diversity of microbial world.

    Science.gov (United States)

    Uchiyama, Ikuo; Mihara, Motohiro; Nishide, Hiroyo; Chiba, Hirokazu

    2013-01-01

    The microbial genome database for comparative analysis (MBGD, available at http://mbgd.genome.ad.jp/) is a platform for microbial genome comparison based on orthology analysis. As its unique feature, MBGD allows users to conduct orthology analysis among any specified set of organisms; this flexibility allows MBGD to adapt to a variety of microbial genomic study. Reflecting the huge diversity of microbial world, the number of microbial genome projects now becomes several thousands. To efficiently explore the diversity of the entire microbial genomic data, MBGD now provides summary pages for pre-calculated ortholog tables among various taxonomic groups. For some closely related taxa, MBGD also provides the conserved synteny information (core genome alignment) pre-calculated using the CoreAligner program. In addition, efficient incremental updating procedure can create extended ortholog table by adding additional genomes to the default ortholog table generated from the representative set of genomes. Combining with the functionalities of the dynamic orthology calculation of any specified set of organisms, MBGD is an efficient and flexible tool for exploring the microbial genome diversity.

  14. Bordetella pertussis evolution in the (functional) genomics era

    Science.gov (United States)

    Belcher, Thomas; Preston, Andrew

    2015-01-01

    The incidence of whooping cough caused by Bordetella pertussis in many developed countries has risen dramatically in recent years. This has been linked to the use of an acellular pertussis vaccine. In addition, it is thought that B. pertussis is adapting under acellular vaccine mediated immune selection pressure, towards vaccine escape. Genomics-based approaches have revolutionized the ability to resolve the fine structure of the global B. pertussis population and its evolution during the era of vaccination. Here, we discuss the current picture of B. pertussis evolution and diversity in the light of the current resurgence, highlight import questions raised by recent studies in this area and discuss the role that functional genomics can play in addressing current knowledge gaps. PMID:26297914

  15. Multi-function microfluidic platform for sensor integration

    DEFF Research Database (Denmark)

    Fernandes, Ana C.; Semenova, Daria; Panjan, Peter

    2018-01-01

    The limited availability of metabolite-specific sensors for continuous sampling and monitoring is one of the main bottlenecks contributing to failures in bioprocess development. Furthermore, only a limited number of approaches exist to connect currently available measurement systems with high...... throughput reactor units. This is especially relevant in the biocatalyst screening and characterization stage of process development. In this work, a strategy for sensor integration in microfluidic platforms is demonstrated, to address the need for rapid, cost-effective and high-throughput screening...... of the sample solution up to 10 times. In order to highlight the features of the proposed platform, inline monitoring of glucose levels is presented and discussed. Glucose was chosen due to its importance in biotechnology as a relevant substrate. The platform demonstrated continuous measurement of substrate...

  16. From prediction to function using evolutionary genomics: human-specific ecotypes of Lactobacillus reuteri have diverse probiotic functions.

    Science.gov (United States)

    Spinler, Jennifer K; Sontakke, Amrita; Hollister, Emily B; Venable, Susan F; Oh, Phaik Lyn; Balderas, Miriam A; Saulnier, Delphine M A; Mistretta, Toni-Ann; Devaraj, Sridevi; Walter, Jens; Versalovic, James; Highlander, Sarah K

    2014-06-19

    The vertebrate gut symbiont Lactobacillus reuteri has diversified into separate clades reflecting host origin. Strains show evidence of host adaptation, but how host-microbe coevolution influences microbial-derived effects on hosts is poorly understood. Emphasizing human-derived strains of L. reuteri, we combined comparative genomic analyses with functional assays to examine variations in host interaction among genetically distinct ecotypes. Within clade II or VI, the genomes of human-derived L. reuteri strains are highly conserved in gene content and at the nucleotide level. Nevertheless, they share only 70-90% of total gene content, indicating differences in functional capacity. Human-associated lineages are distinguished by genes related to bacteriophages, vitamin biosynthesis, antimicrobial production, and immunomodulation. Differential production of reuterin, histamine, and folate by 23 clade II and VI strains was demonstrated. These strains also differed with respect to their ability to modulate human cytokine production (tumor necrosis factor, monocyte chemoattractant protein-1, interleukin [IL]-1β, IL-5, IL-7, IL-12, and IL-13) by myeloid cells. Microarray analysis of representative clade II and clade VI strains revealed global regulation of genes within the reuterin, vitamin B12, folate, and arginine catabolism gene clusters by the AraC family transcriptional regulator, PocR. Thus, human-derived L. reuteri clade II and VI strains are genetically distinct and their differences affect their functional repertoires and probiotic features. These findings highlight the biological impact of microbe:host coevolution and illustrate the functional significance of subspecies differences in the human microbiome. Consideration of host origin and functional differences at the subspecies level may have major impacts on probiotic strain selection and considerations of microbial ecology in mammalian species. © The Author(s) 2014. Published by Oxford University Press on

  17. p21/Cyclin E pathway modulates anticlastogenic function of Bmi-1 in cancer cells

    Science.gov (United States)

    Deng, Wen; Zhou, Yuan; Tiwari, Agnes FY; Su, Hang; Yang, Jie; Zhu, Dandan; Lau, Victoria Ming Yi; Hau, Pok Man; Yip, Yim Ling; Cheung, Annie LM; Guan, Xin-Yuan; Tsao, Sai Wah

    2015-01-01

    Apart from regulating stem cell self-renewal, embryonic development and proliferation, Bmi-1 has been recently reported to be critical in the maintenance of genome integrity. In searching for novel mechanisms underlying the anticlastogenic function of Bmi-1, we observed, for the first time, that Bmi-1 positively regulates p21 expression. We extended the finding that Bmi-1 deficiency induced chromosome breaks in multiple cancer cell models. Interestingly, we further demonstrated that knockdown of cyclin E or ectopic overexpression of p21 rescued Bmi-1 deficiency-induced chromosome breaks. We therefore conclude that p21/cyclin E pathway is crucial in modulating the anticlastogenic function of Bmi-1. As it is well established that the overexpression of cyclin E potently induces genome instability and p21 suppresses the function of cyclin E, the novel and important implication from our findings is that Bmi-1 plays an important role in limiting genomic instability in cylin E-overexpressing cancer cells by positive regulation of p21. PMID:25131797

  18. Genomic clone encoding the α chain of the OKM1, LFA-1, and platelet glycoprotein IIb-IIIa molecules

    International Nuclear Information System (INIS)

    Cosgrove, L.J.; Sandrin, M.S.; Rajasekariah, P.; McKenzie, I.F.C.

    1986-01-01

    LFA-1, an antigen involved in cytolytic T lymphocyte-mediated killing, and Mac-1, the receptor for complement component C3bi, constitute a family of structurally and functionally related cell surface glycoproteins involved in cellular interactions. In both mouse and man, Mac-1 (OKM1) and LFA-1 share a common 95-kDa β subunit but are distinguished by their α chains, which have different cellular distributions, apparent molecular masses (165 and 177 kDa, respectively), and peptide maps. The authors report the isolation of a genomic clone from a human genomic library that on transfection into mouse fibroblasts produced a molecule(s) reactive with monoclonal antibodies to OKM1, to LFA-1, and to platelet glycoprotein IIb-IIIa. This gene was cloned by several cycles of transfection of L cells with a human genomic library cloned in λ phase Charon 4A and subsequent rescue of the λ phage. Transfection with the purified recombinant λ DNA yielded a transfectant that expressed the three human α chains of OKM1, LFA-1, and glycoprotein IIb-IIIa, presumably in association with the murine β chain

  19. Neocortical development as an evolutionary platform for intragenomic conflict

    Directory of Open Access Journals (Sweden)

    Eric eLewitus

    2013-04-01

    Full Text Available Embryonic development in mammals has evolved a platform for genomic conflict between mothers and embryos and, by extension, between maternal and paternal genomes. The evolutionary interests of the mother and embryo may be maximized through the promotion of sex-chromosome genes and imprinted alleles, resulting in the rapid evolution of postzygotic phenotypes preferential to either the maternal or paternal genome. In eutherian mammals, extraordinary in utero maternal investment in the brain, and neocortex especially, suggests that convergent evolution of an expanded mammalian neocortex along divergent lineages may be explained, in part, by parent-of-origin-linked gene expression arising from parent-offspring conflict. The influence of this conflict on neocortical development and evolution, however, has not been investigated at the genomic level. In this hypothesis and theory article, we provide preliminary evidence for positive selection in humans in the regions of two platforms of intragenomic conflict – chromosomes 15q11-q13 and X – and explore the potential relevance of cis-regulated imprinted domains to neocortical expansion in mammalian evolution. We present the hypothesis that maternal- and paternal-specific pressures on the developing neocortex compete intragenomically to influence neocortical expansion in mammalian evolution.

  20. Parallel processing of genomics data

    Science.gov (United States)

    Agapito, Giuseppe; Guzzi, Pietro Hiram; Cannataro, Mario

    2016-10-01

    The availability of high-throughput experimental platforms for the analysis of biological samples, such as mass spectrometry, microarrays and Next Generation Sequencing, have made possible to analyze a whole genome in a single experiment. Such platforms produce an enormous volume of data per single experiment, thus the analysis of this enormous flow of data poses several challenges in term of data storage, preprocessing, and analysis. To face those issues, efficient, possibly parallel, bioinformatics software needs to be used to preprocess and analyze data, for instance to highlight genetic variation associated with complex diseases. In this paper we present a parallel algorithm for the parallel preprocessing and statistical analysis of genomics data, able to face high dimension of data and resulting in good response time. The proposed system is able to find statistically significant biological markers able to discriminate classes of patients that respond to drugs in different ways. Experiments performed on real and synthetic genomic datasets show good speed-up and scalability.

  1. CTDB: An Integrated Chickpea Transcriptome Database for Functional and Applied Genomics.

    Directory of Open Access Journals (Sweden)

    Mohit Verma

    Full Text Available Chickpea is an important grain legume used as a rich source of protein in human diet. The narrow genetic diversity and limited availability of genomic resources are the major constraints in implementing breeding strategies and biotechnological interventions for genetic enhancement of chickpea. We developed an integrated Chickpea Transcriptome Database (CTDB, which provides the comprehensive web interface for visualization and easy retrieval of transcriptome data in chickpea. The database features many tools for similarity search, functional annotation (putative function, PFAM domain and gene ontology search and comparative gene expression analysis. The current release of CTDB (v2.0 hosts transcriptome datasets with high quality functional annotation from cultivated (desi and kabuli types and wild chickpea. A catalog of transcription factor families and their expression profiles in chickpea are available in the database. The gene expression data have been integrated to study the expression profiles of chickpea transcripts in major tissues/organs and various stages of flower development. The utilities, such as similarity search, ortholog identification and comparative gene expression have also been implemented in the database to facilitate comparative genomic studies among different legumes and Arabidopsis. Furthermore, the CTDB represents a resource for the discovery of functional molecular markers (microsatellites and single nucleotide polymorphisms between different chickpea types. We anticipate that integrated information content of this database will accelerate the functional and applied genomic research for improvement of chickpea. The CTDB web service is freely available at http://nipgr.res.in/ctdb.html.

  2. Functional Associations by Response Overlap (FARO), a functional genomics approach matching gene expression phenotypes

    DEFF Research Database (Denmark)

    Nielsen, Henrik Bjørn; Mundy, J.; Willenbrock, Hanni

    2007-01-01

    The systematic comparison of transcriptional responses of organisms is a powerful tool in functional genomics. For example, mutants may be characterized by comparing their transcript profiles to those obtained in other experiments querying the effects on gene expression of many experimental facto...

  3. A map of human genome variation from population-scale sequencing.

    Science.gov (United States)

    Abecasis, Gonçalo R; Altshuler, David; Auton, Adam; Brooks, Lisa D; Durbin, Richard M; Gibbs, Richard A; Hurles, Matt E; McVean, Gil A

    2010-10-28

    The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10(-8) per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.

  4. Direct chloroplast sequencing: comparison of sequencing platforms and analysis tools for whole chloroplast barcoding.

    Directory of Open Access Journals (Sweden)

    Marta Brozynska

    Full Text Available Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina and Ion Torrent (Life Technology sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare. Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis.

  5. Genome-wide association study of cognitive functions and educational attainment in UK Biobank (N=112 151)

    Science.gov (United States)

    Davies, G; Marioni, R E; Liewald, D C; Hill, W D; Hagenaars, S P; Harris, S E; Ritchie, S J; Luciano, M; Fawns-Ritchie, C; Lyall, D; Cullen, B; Cox, S R; Hayward, C; Porteous, D J; Evans, J; McIntosh, A M; Gallacher, J; Craddock, N; Pell, J P; Smith, D J; Gale, C R; Deary, I J

    2016-01-01

    People's differences in cognitive functions are partly heritable and are associated with important life outcomes. Previous genome-wide association (GWA) studies of cognitive functions have found evidence for polygenic effects yet, to date, there are few replicated genetic associations. Here we use data from the UK Biobank sample to investigate the genetic contributions to variation in tests of three cognitive functions and in educational attainment. GWA analyses were performed for verbal–numerical reasoning (N=36 035), memory (N=112 067), reaction time (N=111 483) and for the attainment of a college or a university degree (N=111 114). We report genome-wide significant single-nucleotide polymorphism (SNP)-based associations in 20 genomic regions, and significant gene-based findings in 46 regions. These include findings in the ATXN2, CYP2DG, APBA1 and CADM2 genes. We report replication of these hits in published GWA studies of cognitive function, educational attainment and childhood intelligence. There is also replication, in UK Biobank, of SNP hits reported previously in GWA studies of educational attainment and cognitive function. GCTA-GREML analyses, using common SNPs (minor allele frequency>0.01), indicated significant SNP-based heritabilities of 31% (s.e.m.=1.8%) for verbal–numerical reasoning, 5% (s.e.m.=0.6%) for memory, 11% (s.e.m.=0.6%) for reaction time and 21% (s.e.m.=0.6%) for educational attainment. Polygenic score analyses indicate that up to 5% of the variance in cognitive test scores can be predicted in an independent cohort. The genomic regions identified include several novel loci, some of which have been associated with intracranial volume, neurodegeneration, Alzheimer's disease and schizophrenia. PMID:27046643

  6. Functional Genome Mining for Metabolites Encoded by Large Gene Clusters through Heterologous Expression of a Whole-Genome Bacterial Artificial Chromosome Library in Streptomyces spp.

    Science.gov (United States)

    Xu, Min; Wang, Yemin; Zhao, Zhilong; Gao, Guixi; Huang, Sheng-Xiong; Kang, Qianjin; He, Xinyi; Lin, Shuangjun; Pang, Xiuhua; Deng, Zixin

    2016-01-01

    ABSTRACT Genome sequencing projects in the last decade revealed numerous cryptic biosynthetic pathways for unknown secondary metabolites in microbes, revitalizing drug discovery from microbial metabolites by approaches called genome mining. In this work, we developed a heterologous expression and functional screening approach for genome mining from genomic bacterial artificial chromosome (BAC) libraries in Streptomyces spp. We demonstrate mining from a strain of Streptomyces rochei, which is known to produce streptothricins and borrelidin, by expressing its BAC library in the surrogate host Streptomyces lividans SBT5, and screening for antimicrobial activity. In addition to the successful capture of the streptothricin and borrelidin biosynthetic gene clusters, we discovered two novel linear lipopeptides and their corresponding biosynthetic gene cluster, as well as a novel cryptic gene cluster for an unknown antibiotic from S. rochei. This high-throughput functional genome mining approach can be easily applied to other streptomycetes, and it is very suitable for the large-scale screening of genomic BAC libraries for bioactive natural products and the corresponding biosynthetic pathways. IMPORTANCE Microbial genomes encode numerous cryptic biosynthetic gene clusters for unknown small metabolites with potential biological activities. Several genome mining approaches have been developed to activate and bring these cryptic metabolites to biological tests for future drug discovery. Previous sequence-guided procedures relied on bioinformatic analysis to predict potentially interesting biosynthetic gene clusters. In this study, we describe an efficient approach based on heterologous expression and functional screening of a whole-genome library for the mining of bioactive metabolites from Streptomyces. The usefulness of this function-driven approach was demonstrated by the capture of four large biosynthetic gene clusters for metabolites of various chemical types, including

  7. OpenADAM: an open source genome-wide association data management system for Affymetrix SNP arrays

    Directory of Open Access Journals (Sweden)

    Sham P C

    2008-12-01

    Full Text Available Abstract Background Large scale genome-wide association studies have become popular since the introduction of high throughput genotyping platforms. Efficient management of the vast array of data generated poses many challenges. Description We have developed an open source web-based data management system for the large amount of genotype data generated from the Affymetrix GeneChip® Mapping Array and Affymetrix Genome-Wide Human SNP Array platforms. The database supports genotype calling using DM, BRLMM, BRLMM-P or Birdseed algorithms provided by the Affymetrix Power Tools. The genotype and corresponding pedigree data are stored in a relational database for efficient downstream data manipulation and analysis, such as calculation of allele and genotype frequencies, sample identity checking, and export of genotype data in various file formats for analysis using commonly-available software. A novel method for genotyping error estimation is implemented using linkage disequilibrium information from the HapMap project. All functionalities are accessible via a web-based user interface. Conclusion OpenADAM provides an open source database system for management of Affymetrix genome-wide association SNP data.

  8. Endozoicomonas genomes reveal functional adaptation and plasticity in bacterial strains symbiotically associated with diverse marine hosts

    KAUST Repository

    Neave, Matthew J.

    2017-01-17

    Endozoicomonas bacteria are globally distributed and often abundantly associated with diverse marine hosts including reef-building corals, yet their function remains unknown. In this study we generated novel Endozoicomonas genomes from single cells and metagenomes obtained directly from the corals Stylophora pistillata, Pocillopora verrucosa, and Acropora humilis. We then compared these culture-independent genomes to existing genomes of bacterial isolates acquired from a sponge, sea slug, and coral to examine the functional landscape of this enigmatic genus. Sequencing and analysis of single cells and metagenomes resulted in four novel genomes with 60–76% and 81–90% genome completeness, respectively. These data also confirmed that Endozoicomonas genomes are large and are not streamlined for an obligate endosymbiotic lifestyle, implying that they have free-living stages. All genomes show an enrichment of genes associated with carbon sugar transport and utilization and protein secretion, potentially indicating that Endozoicomonas contribute to the cycling of carbohydrates and the provision of proteins to their respective hosts. Importantly, besides these commonalities, the genomes showed evidence for differential functional specificity and diversification, including genes for the production of amino acids. Given this metabolic diversity of Endozoicomonas we propose that different genotypes play disparate roles and have diversified in concert with their hosts.

  9. Endozoicomonas genomes reveal functional adaptation and plasticity in bacterial strains symbiotically associated with diverse marine hosts

    KAUST Repository

    Neave, Matthew J.; Michell, Craig; Apprill, Amy; Voolstra, Christian R.

    2017-01-01

    Endozoicomonas bacteria are globally distributed and often abundantly associated with diverse marine hosts including reef-building corals, yet their function remains unknown. In this study we generated novel Endozoicomonas genomes from single cells and metagenomes obtained directly from the corals Stylophora pistillata, Pocillopora verrucosa, and Acropora humilis. We then compared these culture-independent genomes to existing genomes of bacterial isolates acquired from a sponge, sea slug, and coral to examine the functional landscape of this enigmatic genus. Sequencing and analysis of single cells and metagenomes resulted in four novel genomes with 60–76% and 81–90% genome completeness, respectively. These data also confirmed that Endozoicomonas genomes are large and are not streamlined for an obligate endosymbiotic lifestyle, implying that they have free-living stages. All genomes show an enrichment of genes associated with carbon sugar transport and utilization and protein secretion, potentially indicating that Endozoicomonas contribute to the cycling of carbohydrates and the provision of proteins to their respective hosts. Importantly, besides these commonalities, the genomes showed evidence for differential functional specificity and diversification, including genes for the production of amino acids. Given this metabolic diversity of Endozoicomonas we propose that different genotypes play disparate roles and have diversified in concert with their hosts.

  10. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    Directory of Open Access Journals (Sweden)

    Sathishkumar Natarajan

    Full Text Available Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L. and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs, 1.9 million InDels, and 182,398 putative structural variations (SVs. Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  11. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    Science.gov (United States)

    Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup

    2016-01-01

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  12. A microcontroller platform for the rapid prototyping of functional electrical stimulation-based gait neuroprostheses.

    Science.gov (United States)

    Luzio de Melo, Paulo; da Silva, Miguel Tavares; Martins, Jorge; Newman, Dava

    2015-05-01

    Functional electrical stimulation (FES) has been used over the last decades as a method to rehabilitate lost motor functions of individuals with spinal cord injury, multiple sclerosis, and post-stroke hemiparesis. Within this field, researchers in need of developing FES-based control solutions for specific disabilities often have to choose between either the acquisition and integration of high-performance industry-level systems, which are rather expensive and hardly portable, or develop custom-made portable solutions, which despite their lower cost, usually require expert-level electronic skills. Here, a flexible low-cost microcontroller-based platform for rapid prototyping of FES neuroprostheses is presented, designed for reduced execution complexity, development time, and production cost. For this reason, the Arduino open-source microcontroller platform was used, together with off-the-shelf components whenever possible. The developed system enables the rapid deployment of portable FES-based gait neuroprostheses, being flexible enough to allow simple open-loop strategies but also more complex closed-loop solutions. The system is based on a modular architecture that allows the development of optimized solutions depending on the desired FES applications, even though the design and testing of the platform were focused toward drop foot correction. The flexibility of the system was demonstrated using two algorithms targeting drop foot condition within different experimental setups. Successful bench testing of the device in healthy subjects demonstrated these neuroprosthesis platform capabilities to correct drop foot. Copyright © 2015 International Center for Artificial Organs and Transplantation and Wiley Periodicals, Inc.

  13. Functional assessment of human enhancer activities using whole-genome STARR-sequencing.

    Science.gov (United States)

    Liu, Yuwen; Yu, Shan; Dhiman, Vineet K; Brunetti, Tonya; Eckart, Heather; White, Kevin P

    2017-11-20

    Genome-wide quantification of enhancer activity in the human genome has proven to be a challenging problem. Recent efforts have led to the development of powerful tools for enhancer quantification. However, because of genome size and complexity, these tools have yet to be applied to the whole human genome.  In the current study, we use a human prostate cancer cell line, LNCaP as a model to perform whole human genome STARR-seq (WHG-STARR-seq) to reliably obtain an assessment of enhancer activity. This approach builds upon previously developed STARR-seq in the fly genome and CapSTARR-seq techniques in targeted human genomic regions. With an improved library preparation strategy, our approach greatly increases the library complexity per unit of starting material, which makes it feasible and cost-effective to explore the landscape of regulatory activity in the much larger human genome. In addition to our ability to identify active, accessible enhancers located in open chromatin regions, we can also detect sequences with the potential for enhancer activity that are located in inaccessible, closed chromatin regions. When treated with the histone deacetylase inhibitor, Trichostatin A, genes nearby this latter class of enhancers are up-regulated, demonstrating the potential for endogenous functionality of these regulatory elements. WHG-STARR-seq provides an improved approach to current pipelines for analysis of high complexity genomes to gain a better understanding of the intricacies of transcriptional regulation.

  14. Bluejay 1.0: genome browsing and comparison with rich customization provision and dynamic resource linking

    Directory of Open Access Journals (Sweden)

    Turinsky Andrei L

    2008-10-01

    Full Text Available Abstract Background The Bluejay genome browser has been developed over several years to address the challenges posed by the ever increasing number of data types as well as the increasing volume of data in genome research. Beginning with a browser capable of rendering views of XML-based genomic information and providing scalable vector graphics output, we have now completed version 1.0 of the system with many additional features. Our development efforts were guided by our observation that biologists who use both gene expression profiling and comparative genomics gain functional insights above and beyond those provided by traditional per-gene analyses. Results Bluejay 1.0 is a genome viewer integrating genome annotation with: (i gene expression information; and (ii comparative analysis with an unlimited number of other genomes in the same view. This allows the biologist to see a gene not just in the context of its genome, but also its regulation and its evolution. Bluejay now has rich provision for personalization by users: (i numerous display customization features; (ii the availability of waypoints for marking multiple points of interest on a genome and subsequently utilizing them; and (iii the ability to take user relevance feedback of annotated genes or textual items to offer personalized recommendations. Bluejay 1.0 also embeds the Seahawk browser for the Moby protocol, enabling users to seamlessly invoke hundreds of Web Services on genomic data of interest without any hard-coding. Conclusion Bluejay offers a unique set of customizable genome-browsing features, with the goal of allowing biologists to quickly focus on, analyze, compare, and retrieve related information on the parts of the genomic data they are most interested in. We expect these capabilities of Bluejay to benefit the many biologists who want to answer complex questions using the information available from completely sequenced genomes.

  15. Functional Genomics Approaches to Studying Symbioses between Legumes and Nitrogen-Fixing Rhizobia.

    Science.gov (United States)

    Lardi, Martina; Pessi, Gabriella

    2018-05-18

    Biological nitrogen fixation gives legumes a pronounced growth advantage in nitrogen-deprived soils and is of considerable ecological and economic interest. In exchange for reduced atmospheric nitrogen, typically given to the plant in the form of amides or ureides, the legume provides nitrogen-fixing rhizobia with nutrients and highly specialised root structures called nodules. To elucidate the molecular basis underlying physiological adaptations on a genome-wide scale, functional genomics approaches, such as transcriptomics, proteomics, and metabolomics, have been used. This review presents an overview of the different functional genomics approaches that have been performed on rhizobial symbiosis, with a focus on studies investigating the molecular mechanisms used by the bacterial partner to interact with the legume. While rhizobia belonging to the alpha-proteobacterial group (alpha-rhizobia) have been well studied, few studies to date have investigated this process in beta-proteobacteria (beta-rhizobia).

  16. CoryneCenter – An online resource for the integrated analysis of corynebacterial genome and transcriptome data

    Directory of Open Access Journals (Sweden)

    Hüser Andrea T

    2007-11-01

    Full Text Available Abstract Background The introduction of high-throughput genome sequencing and post-genome analysis technologies, e.g. DNA microarray approaches, has created the potential to unravel and scrutinize complex gene-regulatory networks on a large scale. The discovery of transcriptional regulatory interactions has become a major topic in modern functional genomics. Results To facilitate the analysis of gene-regulatory networks, we have developed CoryneCenter, a web-based resource for the systematic integration and analysis of genome, transcriptome, and gene regulatory information for prokaryotes, especially corynebacteria. For this purpose, we extended and combined the following systems into a common platform: (1 GenDB, an open source genome annotation system, (2 EMMA, a MAGE compliant application for high-throughput transcriptome data storage and analysis, and (3 CoryneRegNet, an ontology-based data warehouse designed to facilitate the reconstruction and analysis of gene regulatory interactions. We demonstrate the potential of CoryneCenter by means of an application example. Using microarray hybridization data, we compare the gene expression of Corynebacterium glutamicum under acetate and glucose feeding conditions: Known regulatory networks are confirmed, but moreover CoryneCenter points out additional regulatory interactions. Conclusion CoryneCenter provides more than the sum of its parts. Its novel analysis and visualization features significantly simplify the process of obtaining new biological insights into complex regulatory systems. Although the platform currently focusses on corynebacteria, the integrated tools are by no means restricted to these species, and the presented approach offers a general strategy for the analysis and verification of gene regulatory networks. CoryneCenter provides freely accessible projects with the underlying genome annotation, gene expression, and gene regulation data. The system is publicly available at http://www.CoryneCenter.de.

  17. Single Amplified Genomes as Source for Novel Extremozymes: Annotation, Expression and Functional Assessment

    KAUST Repository

    Grötzinger, Stefan

    2017-12-01

    Enzymes, as nature’s catalysts, show remarkable abilities that can revolutionize the chemical, biotechnological, bioremediation, agricultural and pharmaceutical industries. However, the narrow range of stability of the majority of described biocatalysts limits their use for many applications. To overcome these restrictions, extremozymes derived from microorganisms thriving under harsh conditions can be used. Extremophiles living in high salinity are especially interesting as they operate at low water activity, which is similar to conditions used in standard chemical applications. Because only about 0.1 % of all microorganisms can be cultured, the traditional way of culture-based enzyme function determination needs to be overcome. The rise of high-throughput next-generation-sequencing technologies allows for deep insight into nature’s variety. Single amplified genomes (SAGs) specifically allow for whole genome assemblies from small sample volumes with low cell yields, as are typical for extreme environments. Although these technologies have been available for years, the expected boost in biotechnology has held off. One of the main reasons is the lack of reliable functional annotation of the genomic data, which is caused by the low amount (0.15 %) of experimentally described genes. Here, we present a novel annotation algorithm, designed to annotate the enzymatic function of genomes from microorganisms with low homologies to described microorganisms. The algorithm was established on SAGs from the extreme environment of selected hypersaline Red Sea brine pools with 4.3 M salinity and temperatures up to 68°C. Additionally, a novel consensus pattern for the identification of γ-carbonic anhydrases was created and applied in the algorithm. To verify the annotation, selected genes were expressed in the hypersaline expression system Halobacterium salinarum. This expression system was established and optimized in a continuously stirred tank reactor, leading to

  18. A statistical framework to predict functional non-coding regions in the human genome through integrated analysis of annotation data.

    Science.gov (United States)

    Lu, Qiongshi; Hu, Yiming; Sun, Jiehuan; Cheng, Yuwei; Cheung, Kei-Hoi; Zhao, Hongyu

    2015-05-27

    Identifying functional regions in the human genome is a major goal in human genetics. Great efforts have been made to functionally annotate the human genome either through computational predictions, such as genomic conservation, or high-throughput experiments, such as the ENCODE project. These efforts have resulted in a rich collection of functional annotation data of diverse types that need to be jointly analyzed for integrated interpretation and annotation. Here we present GenoCanyon, a whole-genome annotation method that performs unsupervised statistical learning using 22 computational and experimental annotations thereby inferring the functional potential of each position in the human genome. With GenoCanyon, we are able to predict many of the known functional regions. The ability of predicting functional regions as well as its generalizable statistical framework makes GenoCanyon a unique and powerful tool for whole-genome annotation. The GenoCanyon web server is available at http://genocanyon.med.yale.edu.

  19. The Arabidopsis Rho of Plants GTPase AtROP6 Functions in Developmental and Pathogen Response Pathways1[C][W][OA

    Science.gov (United States)

    Poraty-Gavra, Limor; Zimmermann, Philip; Haigis, Sabine; Bednarek, Paweł; Hazak, Ora; Stelmakh, Oksana Rogovoy; Sadot, Einat; Schulze-Lefert, Paul; Gruissem, Wilhelm; Yalovsky, Shaul

    2013-01-01

    How plants coordinate developmental processes and environmental stress responses is a pressing question. Here, we show that Arabidopsis (Arabidopsis thaliana) Rho of Plants6 (AtROP6) integrates developmental and pathogen response signaling. AtROP6 expression is induced by auxin and detected in the root meristem, lateral root initials, and leaf hydathodes. Plants expressing a dominant negative AtROP6 (rop6DN) under the regulation of its endogenous promoter are small and have multiple inflorescence stems, twisted leaves, deformed leaf epidermis pavement cells, and differentially organized cytoskeleton. Microarray analyses of rop6DN plants revealed that major changes in gene expression are associated with constitutive salicylic acid (SA)-mediated defense responses. In agreement, their free and total SA levels resembled those of wild-type plants inoculated with a virulent powdery mildew pathogen. The constitutive SA-associated response in rop6DN was suppressed in mutant backgrounds defective in SA signaling (nonexpresser of PR genes1 [npr1]) or biosynthesis (salicylic acid induction deficient2 [sid2]). However, the rop6DN npr1 and rop6DN sid2 double mutants retained the aberrant developmental phenotypes, indicating that the constitutive SA response can be uncoupled from ROP function(s) in development. rop6DN plants exhibited enhanced preinvasive defense responses to a host-adapted virulent powdery mildew fungus but were impaired in preinvasive defenses upon inoculation with a nonadapted powdery mildew. The host-adapted powdery mildew had a reduced reproductive fitness on rop6DN plants, which was retained in mutant backgrounds defective in SA biosynthesis or signaling. Our findings indicate that both the morphological aberrations and altered sensitivity to powdery mildews of rop6DN plants result from perturbations that are independent from the SA-associated response. These perturbations uncouple SA-dependent defense signaling from disease resistance execution. PMID

  20. Type 1 diabetes genome-wide association studies

    DEFF Research Database (Denmark)

    Pociot, Flemming

    2017-01-01

    Genetic studies have identified >60 loci associated with the risk of developing type 1 diabetes (T1D). The vast majority of these are identified by genome-wide association studies (GWAS) using large case-control cohorts of European ancestry. More than 80% of the heritability of T1D can be explained...... by GWAS data in this population group. However, with few exceptions, their individual contribution to T1D risk is low and understanding their function in disease biology remains a huge challenge. GWAS on its own does not inform us in detail on disease mechanisms, but the combination of GWAS data...... with other omics-data is beginning to advance our understanding of T1D etiology and pathogenesis. Current knowledge supports the notion that genetic variation in both pancreatic β cells and in immune cells is central in mediating T1D risk. Advances, perspectives and limitations of GWAS are discussed...

  1. Functional validation of candidate genes detected by genomic feature models

    DEFF Research Database (Denmark)

    Rohde, Palle Duun; Østergaard, Solveig; Kristensen, Torsten Nygaard

    2018-01-01

    to investigate locomotor activity, and applied genomic feature prediction models to identify gene ontology (GO) cate- gories predictive of this phenotype. Next, we applied the covariance association test to partition the genomic variance of the predictive GO terms to the genes within these terms. We...... then functionally assessed whether the identified candidate genes affected locomotor activity by reducing gene expression using RNA interference. In five of the seven candidate genes tested, reduced gene expression altered the phenotype. The ranking of genes within the predictive GO term was highly correlated......Understanding the genetic underpinnings of complex traits requires knowledge of the genetic variants that contribute to phenotypic variability. Reliable statistical approaches are needed to obtain such knowledge. In genome-wide association studies, variants are tested for association with trait...

  2. An Ontology-Based GIS for Genomic Data Management of Rumen Microbes.

    Science.gov (United States)

    Jelokhani-Niaraki, Saber; Tahmoorespur, Mojtaba; Minuchehr, Zarrin; Nassiri, Mohammad Reza

    2015-03-01

    During recent years, there has been exponential growth in biological information. With the emergence of large datasets in biology, life scientists are encountering bottlenecks in handling the biological data. This study presents an integrated geographic information system (GIS)-ontology application for handling microbial genome data. The application uses a linear referencing technique as one of the GIS functionalities to represent genes as linear events on the genome layer, where users can define/change the attributes of genes in an event table and interactively see the gene events on a genome layer. Our application adopted ontology to portray and store genomic data in a semantic framework, which facilitates data-sharing among biology domains, applications, and experts. The application was developed in two steps. In the first step, the genome annotated data were prepared and stored in a MySQL database. The second step involved the connection of the database to both ArcGIS and Protégé as the GIS engine and ontology platform, respectively. We have designed this application specifically to manage the genome-annotated data of rumen microbial populations. Such a GIS-ontology application offers powerful capabilities for visualizing, managing, reusing, sharing, and querying genome-related data.

  3. Annotating the Function of the Human Genome with Gene Ontology and Disease Ontology.

    Science.gov (United States)

    Hu, Yang; Zhou, Wenyang; Ren, Jun; Dong, Lixiang; Wang, Yadong; Jin, Shuilin; Cheng, Liang

    2016-01-01

    Increasing evidences indicated that function annotation of human genome in molecular level and phenotype level is very important for systematic analysis of genes. In this study, we presented a framework named Gene2Function to annotate Gene Reference into Functions (GeneRIFs), in which each functional description of GeneRIFs could be annotated by a text mining tool Open Biomedical Annotator (OBA), and each Entrez gene could be mapped to Human Genome Organisation Gene Nomenclature Committee (HGNC) gene symbol. After annotating all the records about human genes of GeneRIFs, 288,869 associations between 13,148 mRNAs and 7,182 terms, 9,496 associations between 948 microRNAs and 533 terms, and 901 associations between 139 long noncoding RNAs (lncRNAs) and 297 terms were obtained as a comprehensive annotation resource of human genome. High consistency of term frequency of individual gene (Pearson correlation = 0.6401, p = 2.2e - 16) and gene frequency of individual term (Pearson correlation = 0.1298, p = 3.686e - 14) in GeneRIFs and GOA shows our annotation resource is very reliable.

  4. The eukaryotic genome is structurally and functionally more like a social insect colony than a book.

    Science.gov (United States)

    Qiu, Guo-Hua; Yang, Xiaoyan; Zheng, Xintian; Huang, Cuiqin

    2017-11-01

    Traditionally, the genome has been described as the 'book of life'. However, the metaphor of a book may not reflect the dynamic nature of the structure and function of the genome. In the eukaryotic genome, the number of centrally located protein-coding sequences is relatively constant across species, but the amount of noncoding DNA increases considerably with the increase of organismal evolutional complexity. Therefore, it has been hypothesized that the abundant peripheral noncoding DNA protects the genome and the central protein-coding sequences in the eukaryotic genome. Upon comparison with the habitation, sociality and defense mechanisms of a social insect colony, it is found that the genome is similar to a social insect colony in various aspects. A social insect colony may thus be a better metaphor than a book to describe the spatial organization and physical functions of the genome. The potential implications of the metaphor are also discussed.

  5. Gene clusters of Hafnia alvei strain FB1 important in survival and pathogenesis: a draft genome perspective.

    Science.gov (United States)

    Tan, Jia-Yi; Yin, Wai-Fong; Chan, Kok-Gan

    2014-01-01

    Hafnia alvei is an opportunistic pathogen involved in various types of nosocomical infections. The species has been found to inhabit food and mammalian guts. However, its status as an enteropathogen, and whether the food-inhabiting strains could be a source of gastrointestinal infection remains obscure. In this report we present a draft genome of H. alvei strain FB1 isolated from fish paste meatball, a food popular among Malaysian and Chinese populations. The data was generated on the Illumina MiSeq platform. A comparative study was carried out on FB1 against two other previously sequenced H. alvei genomes. Several gene clusters putatively involved in survival and pathogenesis of H. alvei FB1 in food and gut environment were characterised in this study. These include the widespread colonisation island (WCI), the tad locus that is known to play an essential role in biofilm formation, a eut operon that might contribute to advantage in nutrient acquisition in gut environment, and genes responsible for siderophore production This features enable the bacteria to successful colonise in the host gut environment. With the whole genome data of H. alvei FB1 presented in this study, we hope to provide an insight into future studies on this candidate of enteropathogen by looking into the possible mechanisms employed to survive stresses and gain advantage in competitions, which eventually leads to successful colonisation and pathogenesis. This is to serve as the basis for more effective clinical diagnosis and treatment.

  6. Genetic recombination is directed away from functional genomic elements in mice.

    Science.gov (United States)

    Brick, Kevin; Smagulova, Fatima; Khil, Pavel; Camerini-Otero, R Daniel; Petukhova, Galina V

    2012-05-13

    Genetic recombination occurs during meiosis, the key developmental programme of gametogenesis. Recombination in mammals has been recently linked to the activity of a histone H3 methyltransferase, PR domain containing 9 (PRDM9), the product of the only known speciation-associated gene in mammals. PRDM9 is thought to determine the preferred recombination sites--recombination hotspots--through sequence-specific binding of its highly polymorphic multi-Zn-finger domain. Nevertheless, Prdm9 knockout mice are proficient at initiating recombination. Here we map and analyse the genome-wide distribution of recombination initiation sites in Prdm9 knockout mice and in two mouse strains with different Prdm9 alleles and their F(1) hybrid. We show that PRDM9 determines the positions of practically all hotspots in the mouse genome, with the exception of the pseudo-autosomal region (PAR)--the only area of the genome that undergoes recombination in 100% of cells. Surprisingly, hotspots are still observed in Prdm9 knockout mice, and as in wild type, these hotspots are found at H3 lysine 4 (H3K4) trimethylation marks. However, in the absence of PRDM9, most recombination is initiated at promoters and at other sites of PRDM9-independent H3K4 trimethylation. Such sites are rarely targeted in wild-type mice, indicating an unexpected role of the PRDM9 protein in sequestering the recombination machinery away from gene-promoter regions and other functional genomic elements.

  7. Towards an environment interface standard for agent platforms

    OpenAIRE

    Behrens, T.M.; Hindriks, K.V.; Dix, J.

    2010-01-01

    We introduce an interface for connecting agent platforms to environments. This interface provides generic functionality for executing actions and for perceiving changes in an agent’s environment. It also provides support for managing an environment, e.g., for starting, pausing and terminating it. Among the benefits of such an interface are (1) standard functionality is provided by the interface implementation itself, and (2) agent platforms that support the interface can connect to any enviro...

  8. Vitamin D and the brain: Genomic and non-genomic actions.

    Science.gov (United States)

    Cui, Xiaoying; Gooch, Helen; Petty, Alice; McGrath, John J; Eyles, Darryl

    2017-09-15

    1,25(OH) 2 D 3 (vitamin D) is well-recognized as a neurosteroid that modulates multiple brain functions. A growing body of evidence indicates that vitamin D plays a pivotal role in brain development, neurotransmission, neuroprotection and immunomodulation. However, the precise molecular mechanisms by which vitamin D exerts these functions in the brain are still unclear. Vitamin D signalling occurs via the vitamin D receptor (VDR), a zinc-finger protein in the nuclear receptor superfamily. Like other nuclear steroids, vitamin D has both genomic and non-genomic actions. The transcriptional activity of vitamin D occurs via the nuclear VDR. Its faster, non-genomic actions can occur when the VDR is distributed outside the nucleus. The VDR is present in the developing and adult brain where it mediates the effects of vitamin D on brain development and function. The purpose of this review is to summarise the in vitro and in vivo work that has been conducted to characterise the genomic and non-genomic actions of vitamin D in the brain. Additionally we link these processes to functional neurochemical and behavioural outcomes. Elucidation of the precise molecular mechanisms underpinning vitamin D signalling in the brain may prove useful in understanding the role this steroid plays in brain ontogeny and function. Copyright © 2017 Elsevier B.V. All rights reserved.

  9. Whole genome complete resequencing of Bacillus subtilis natto by combining long reads with high-quality short reads.

    Directory of Open Access Journals (Sweden)

    Mayumi Kamada

    Full Text Available De novo microbial genome sequencing reached a turning point with third-generation sequencing (TGS platforms, and several microbial genomes have been improved by TGS long reads. Bacillus subtilis natto is closely related to the laboratory standard strain B. subtilis Marburg 168, and it has a function in the production of the traditional Japanese fermented food "natto." The B. subtilis natto BEST195 genome was previously sequenced with short reads, but it included some incomplete regions. We resequenced the BEST195 genome using a PacBio RS sequencer, and we successfully obtained a complete genome sequence from one scaffold without any gaps, and we also applied Illumina MiSeq short reads to enhance quality. Compared with the previous BEST195 draft genome and Marburg 168 genome, we found that incomplete regions in the previous genome sequence were attributed to GC-bias and repetitive sequences, and we also identified some novel genes that are found only in the new genome.

  10. Introns: The Functional Benefits of Introns in Genomes

    Directory of Open Access Journals (Sweden)

    Bong-Seok Jo

    2015-12-01

    Full Text Available The intron has been a big biological mystery since it was first discovered in several aspects. First, all of the completely sequenced eukaryotes harbor introns in the genomic structure, whereas no prokaryotes identified so far carry introns. Second, the amount of total introns varies in different species. Third, the length and number of introns vary in different genes, even within the same species genome. Fourth, all introns are copied into RNAs by transcription and DNAs by replication processes, but intron sequences do not participate in protein-coding sequences. The existence of introns in the genome should be a burden to some cells, because cells have to consume a great deal of energy to copy and excise them exactly at the correct positions with the help of complicated spliceosomal machineries. The existence throughout the long evolutionary history is explained, only if selective advantages of carrying introns are assumed to be given to cells to overcome the negative effect of introns. In that regard, we summarize previous research about the functional roles or benefits of introns. Additionally, several other studies strongly suggesting that introns should not be junk will be introduced.

  11. Introns: The Functional Benefits of Introns in Genomes.

    Science.gov (United States)

    Jo, Bong-Seok; Choi, Sun Shim

    2015-12-01

    The intron has been a big biological mystery since it was first discovered in several aspects. First, all of the completely sequenced eukaryotes harbor introns in the genomic structure, whereas no prokaryotes identified so far carry introns. Second, the amount of total introns varies in different species. Third, the length and number of introns vary in different genes, even within the same species genome. Fourth, all introns are copied into RNAs by transcription and DNAs by replication processes, but intron sequences do not participate in protein-coding sequences. The existence of introns in the genome should be a burden to some cells, because cells have to consume a great deal of energy to copy and excise them exactly at the correct positions with the help of complicated spliceosomal machineries. The existence throughout the long evolutionary history is explained, only if selective advantages of carrying introns are assumed to be given to cells to overcome the negative effect of introns. In that regard, we summarize previous research about the functional roles or benefits of introns. Additionally, several other studies strongly suggesting that introns should not be junk will be introduced.

  12. SBRC-Nottingham: sustainable routes to platform chemicals from C1 waste gases.

    Science.gov (United States)

    Burbidge, Alan; Minton, Nigel P

    2016-06-15

    Synthetic Biology Research Centre (SBRC)-Nottingham (www.sbrc-nottingham.ac.uk) was one of the first three U.K. university-based SBRCs to be funded by the Biotechnology and Biological Sciences Research Council (BBSRC) and Engineering and Physical Sciences Research Council (EPSRC) as part of the recommendations made in the U.K.'s Synthetic Biology Roadmap. It was established in 2014 and builds on the pioneering work of the Clostridia Research Group (CRG) who have previously developed a range of gene tools for the modification of clostridial genomes. The SBRC is primarily focussed on the conversion of single carbon waste gases into platform chemicals with a particular emphasis on the use of the aerobic chassis Cupriavidus necator. © 2016 The Author(s).

  13. Genome sequence of the Bacteroides fragilis phage ATCC 51477-B1

    Directory of Open Access Journals (Sweden)

    Hawkins Shawn A

    2008-08-01

    Full Text Available Abstract The genome of a fecal pollution indicator phage, Bacteroides fragilis ATCC 51477-B1, was sequenced and consisted of 44,929 bases with a G+C content of 38.7%. Forty-six putative open reading frames were identified and genes were organized into functional clusters for host specificity, lysis, replication and regulation, and packaging and structural proteins.

  14. Genome 3D-architecture: Its plasticity in relation to function

    Indian Academy of Sciences (India)

    Kundan Sengupta

    Mini-Review. Genome 3D-architecture: Its plasticity in relation to function. KUNDAN ... MS received 23 October 2017; accepted 14 February 2018; published online 7 April 2018 .... moter Communication and T Cell Fate. Cell 171 103–119.

  15. Snf2 family gene distribution in higher plant genomes reveals DRD1 expansion and diversification in the tomato genome.

    Science.gov (United States)

    Bargsten, Joachim W; Folta, Adam; Mlynárová, Ludmila; Nap, Jan-Peter

    2013-01-01

    As part of large protein complexes, Snf2 family ATPases are responsible for energy supply during chromatin remodeling, but the precise mechanism of action of many of these proteins is largely unknown. They influence many processes in plants, such as the response to environmental stress. This analysis is the first comprehensive study of Snf2 family ATPases in plants. We here present a comparative analysis of 1159 candidate plant Snf2 genes in 33 complete and annotated plant genomes, including two green algae. The number of Snf2 ATPases shows considerable variation across plant genomes (17-63 genes). The DRD1, Rad5/16 and Snf2 subfamily members occur most often. Detailed analysis of the plant-specific DRD1 subfamily in related plant genomes shows the occurrence of a complex series of evolutionary events. Notably tomato carries unexpected gene expansions of DRD1 gene members. Most of these genes are expressed in tomato, although at low levels and with distinct tissue or organ specificity. In contrast, the Snf2 subfamily genes tend to be expressed constitutively in tomato. The results underpin and extend the Snf2 subfamily classification, which could help to determine the various functional roles of Snf2 ATPases and to target environmental stress tolerance and yield in future breeding.

  16. The mitochondrial genome of an aquatic plant, Spirodela polyrhiza.

    Directory of Open Access Journals (Sweden)

    Wenqin Wang

    Full Text Available BACKGROUND: Spirodela polyrhiza is a species of the order Alismatales, which represent the basal lineage of monocots with more ancestral features than the Poales. Its complete sequence of the mitochondrial (mt genome could provide clues for the understanding of the evolution of mt genomes in plant. METHODS: Spirodela polyrhiza mt genome was sequenced from total genomic DNA without physical separation of chloroplast and nuclear DNA using the SOLiD platform. Using a genome copy number sensitive assembly algorithm, the mt genome was successfully assembled. Gap closure and accuracy was determined with PCR products sequenced with the dideoxy method. CONCLUSIONS: This is the most compact monocot mitochondrial genome with 228,493 bp. A total of 57 genes encode 35 known proteins, 3 ribosomal RNAs, and 19 tRNAs that recognize 15 amino acids. There are about 600 RNA editing sites predicted and three lineage specific protein-coding-gene losses. The mitochondrial genes, pseudogenes, and other hypothetical genes (ORFs cover 71,783 bp (31.0% of the genome. Imported plastid DNA accounts for an additional 9,295 bp (4.1% of the mitochondrial DNA. Absence of transposable element sequences suggests that very few nuclear sequences have migrated into Spirodela mtDNA. Phylogenetic analysis of conserved protein-coding genes suggests that Spirodela shares the common ancestor with other monocots, but there is no obvious synteny between Spirodela and rice mtDNAs. After eliminating genes, introns, ORFs, and plastid-derived DNA, nearly four-fifths of the Spirodela mitochondrial genome is of unknown origin and function. Although it contains a similar chloroplast DNA content and range of RNA editing as other monocots, it is void of nuclear insertions, active gene loss, and comprises large regions of sequences of unknown origin in non-coding regions. Moreover, the lack of synteny with known mitochondrial genomic sequences shed new light on the early evolution of monocot

  17. Genome sequence of Ensifer adhaerens OV14 provides insights into its ability as a novel vector for the genetic transformation of plant genomes.

    Science.gov (United States)

    Rudder, Steven; Doohan, Fiona; Creevey, Christopher J; Wendt, Toni; Mullins, Ewen

    2014-04-07

    Recently it has been shown that Ensifer adhaerens can be used as a plant transformation technology, transferring genes into several plant genomes when equipped with a Ti plasmid. For this study, we have sequenced the genome of Ensifer adhaerens OV14 (OV14) and compared it with those of Agrobacterium tumefaciens C58 (C58) and Sinorhizobium meliloti 1021 (1021); the latter of which has also demonstrated a capacity to genetically transform crop genomes, albeit at significantly reduced frequencies. The 7.7 Mb OV14 genome comprises two chromosomes and two plasmids. All protein coding regions in the OV14 genome were functionally grouped based on an eggNOG database. No genes homologous to the A. tumefaciens Ti plasmid vir genes appeared to be present in the OV14 genome. Unexpectedly, OV14 and 1021 were found to possess homologs to chromosomal based genes cited as essential to A. tumefaciens T-DNA transfer. Of significance, genes that are non-essential but exert a positive influence on virulence and the ability to genetically transform host genomes were identified in OV14 but were absent from the 1021 genome. This study reveals the presence of homologs to chromosomally based Agrobacterium genes that support T-DNA transfer within the genome of OV14 and other alphaproteobacteria. The sequencing and analysis of the OV14 genome increases our understanding of T-DNA transfer by non-Agrobacterium species and creates a platform for the continued improvement of Ensifer-mediated transformation (EMT).

  18. Minos as a novel Tc1/mariner-type transposable element for functional genomic analysis in Aspergillus nidulans.

    Science.gov (United States)

    Evangelinos, Minoas; Anagnostopoulos, Gerasimos; Karvela-Kalogeraki, Iliana; Stathopoulou, Panagiota M; Scazzocchio, Claudio; Diallinas, George

    2015-08-01

    Transposons constitute powerful genetic tools for gene inactivation, exon or promoter trapping and genome analyses. The Minos element from Drosophila hydei, a Tc1/mariner-like transposon, has proved as a very efficient tool for heterologous transposition in several metazoa. In filamentous fungi, only a handful of fungal-specific transposable elements have been exploited as genetic tools, with the impala Tc1/mariner element from Fusarium oxysporum being the most successful. Here, we developed a two-component transposition system to manipulate Minos transposition in Aspergillus nidulans (AnMinos). Our system allows direct selection of transposition events based on re-activation of niaD, a gene necessary for growth on nitrate as a nitrogen source. On average, among 10(8) conidiospores, we obtain up to ∼0.8×10(2) transposition events leading to the expected revertant phenotype (niaD(+)), while ∼16% of excision events lead to AnMinos loss. Characterized excision footprints consisted of the four terminal bases of the transposon flanked by the TA target duplication and led to no major DNA rearrangements. AnMinos transposition depends on the presence of its homologous transposase. Its frequency was not significantly affected by temperature, UV irradiation or the transcription status of the original integration locus (niaD). Importantly, transposition is dependent on nkuA, encoding an enzyme essential for non-homologous end joining of DNA in double-strand break repair. AnMinos proved to be an efficient tool for functional analysis as it seems to transpose in different genomic loci positions in all chromosomes, including a high proportion of integration events within or close to genes. We have used Minos to obtain morphological and toxic analogue resistant mutants. Interestingly, among morphological mutants some seem to be due to Minos-elicited over-expression of specific genes, rather than gene inactivation. Copyright © 2015 Elsevier Inc. All rights reserved.

  19. Cross-Platform Technologies

    Directory of Open Access Journals (Sweden)

    Maria Cristina ENACHE

    2017-04-01

    Full Text Available Cross-platform - a concept becoming increasingly used in recent years especially in the development of mobile apps, but this consistently over time and in the development of conventional desktop applications. The notion of cross-platform software (multi-platform or platform-independent refers to a software application that can run on more than one operating system or computing architecture. Thus, a cross-platform application can operate independent of software or hardware platform on which it is execute. As a generic definition presents a wide range of meanings for purposes of this paper we individualize this definition as follows: we will reduce the horizon of meaning and we use functionally following definition: a cross-platform application is a software application that can run on more than one operating system (desktop or mobile identical or in a similar way.

  20. High-throughput SHAPE analysis reveals structures in HIV-1 genomic RNA strongly conserved across distinct biological states.

    Directory of Open Access Journals (Sweden)

    Kevin A Wilkinson

    2008-04-01

    Full Text Available Replication and pathogenesis of the human immunodeficiency virus (HIV is tightly linked to the structure of its RNA genome, but genome structure in infectious virions is poorly understood. We invent high-throughput SHAPE (selective 2'-hydroxyl acylation analyzed by primer extension technology, which uses many of the same tools as DNA sequencing, to quantify RNA backbone flexibility at single-nucleotide resolution and from which robust structural information can be immediately derived. We analyze the structure of HIV-1 genomic RNA in four biologically instructive states, including the authentic viral genome inside native particles. Remarkably, given the large number of plausible local structures, the first 10% of the HIV-1 genome exists in a single, predominant conformation in all four states. We also discover that noncoding regions functioning in a regulatory role have significantly lower (p-value < 0.0001 SHAPE reactivities, and hence more structure, than do viral coding regions that function as the template for protein synthesis. By directly monitoring protein binding inside virions, we identify the RNA recognition motif for the viral nucleocapsid protein. Seven structurally homologous binding sites occur in a well-defined domain in the genome, consistent with a role in directing specific packaging of genomic RNA into nascent virions. In addition, we identify two distinct motifs that are targets for the duplex destabilizing activity of this same protein. The nucleocapsid protein destabilizes local HIV-1 RNA structure in ways likely to facilitate initial movement both of the retroviral reverse transcriptase from its tRNA primer and of the ribosome in coding regions. Each of the three nucleocapsid interaction motifs falls in a specific genome domain, indicating that local protein interactions can be organized by the long-range architecture of an RNA. High-throughput SHAPE reveals a comprehensive view of HIV-1 RNA genome structure, and further

  1. GWATCH: a web platform for automated gene association discovery analysis

    Science.gov (United States)

    2014-01-01

    Background As genome-wide sequence analyses for complex human disease determinants are expanding, it is increasingly necessary to develop strategies to promote discovery and validation of potential disease-gene associations. Findings Here we present a dynamic web-based platform – GWATCH – that automates and facilitates four steps in genetic epidemiological discovery: 1) Rapid gene association search and discovery analysis of large genome-wide datasets; 2) Expanded visual display of gene associations for genome-wide variants (SNPs, indels, CNVs), including Manhattan plots, 2D and 3D snapshots of any gene region, and a dynamic genome browser illustrating gene association chromosomal regions; 3) Real-time validation/replication of candidate or putative genes suggested from other sources, limiting Bonferroni genome-wide association study (GWAS) penalties; 4) Open data release and sharing by eliminating privacy constraints (The National Human Genome Research Institute (NHGRI) Institutional Review Board (IRB), informed consent, The Health Insurance Portability and Accountability Act (HIPAA) of 1996 etc.) on unabridged results, which allows for open access comparative and meta-analysis. Conclusions GWATCH is suitable for both GWAS and whole genome sequence association datasets. We illustrate the utility of GWATCH with three large genome-wide association studies for HIV-AIDS resistance genes screened in large multicenter cohorts; however, association datasets from any study can be uploaded and analyzed by GWATCH. PMID:25374661

  2. GeneViTo: Visualizing gene-product functional and structural features in genomic datasets

    Directory of Open Access Journals (Sweden)

    Promponas Vasilis J

    2003-10-01

    Full Text Available Abstract Background The availability of increasing amounts of sequence data from completely sequenced genomes boosts the development of new computational methods for automated genome annotation and comparative genomics. Therefore, there is a need for tools that facilitate the visualization of raw data and results produced by bioinformatics analysis, providing new means for interactive genome exploration. Visual inspection can be used as a basis to assess the quality of various analysis algorithms and to aid in-depth genomic studies. Results GeneViTo is a JAVA-based computer application that serves as a workbench for genome-wide analysis through visual interaction. The application deals with various experimental information concerning both DNA and protein sequences (derived from public sequence databases or proprietary data sources and meta-data obtained by various prediction algorithms, classification schemes or user-defined features. Interaction with a Graphical User Interface (GUI allows easy extraction of genomic and proteomic data referring to the sequence itself, sequence features, or general structural and functional features. Emphasis is laid on the potential comparison between annotation and prediction data in order to offer a supplement to the provided information, especially in cases of "poor" annotation, or an evaluation of available predictions. Moreover, desired information can be output in high quality JPEG image files for further elaboration and scientific use. A compilation of properly formatted GeneViTo input data for demonstration is available to interested readers for two completely sequenced prokaryotes, Chlamydia trachomatis and Methanococcus jannaschii. Conclusions GeneViTo offers an inspectional view of genomic functional elements, concerning data stemming both from database annotation and analysis tools for an overall analysis of existing genomes. The application is compatible with Linux or Windows ME-2000-XP operating

  3. GenoSets: visual analytic methods for comparative genomics.

    Directory of Open Access Journals (Sweden)

    Aurora A Cain

    Full Text Available Many important questions in biology are, fundamentally, comparative, and this extends to our analysis of a growing number of sequenced genomes. Existing genomic analysis tools are often organized around literal views of genomes as linear strings. Even when information is highly condensed, these views grow cumbersome as larger numbers of genomes are added. Data aggregation and summarization methods from the field of visual analytics can provide abstracted comparative views, suitable for sifting large multi-genome datasets to identify critical similarities and differences. We introduce a software system for visual analysis of comparative genomics data. The system automates the process of data integration, and provides the analysis platform to identify and explore features of interest within these large datasets. GenoSets borrows techniques from business intelligence and visual analytics to provide a rich interface of interactive visualizations supported by a multi-dimensional data warehouse. In GenoSets, visual analytic approaches are used to enable querying based on orthology, functional assignment, and taxonomic or user-defined groupings of genomes. GenoSets links this information together with coordinated, interactive visualizations for both detailed and high-level categorical analysis of summarized data. GenoSets has been designed to simplify the exploration of multiple genome datasets and to facilitate reasoning about genomic comparisons. Case examples are included showing the use of this system in the analysis of 12 Brucella genomes. GenoSets software and the case study dataset are freely available at http://genosets.uncc.edu. We demonstrate that the integration of genomic data using a coordinated multiple view approach can simplify the exploration of large comparative genomic data sets, and facilitate reasoning about comparisons and features of interest.

  4. Germplasm Management in the Post-genomics Era-a case study with lettuce

    Science.gov (United States)

    High-throughput genotyping platforms and next-generation sequencing technologies revolutionized our ways in germplasm characterization. In collaboration with UC Davis Genome Center, we completed a project of genotyping the entire cultivated lettuce (Lactuca sativa L.) collection of 1,066 accessions ...

  5. Whole genome sequencing and bioinformatics analysis of two Egyptian genomes.

    Science.gov (United States)

    ElHefnawi, Mahmoud; Jeon, Sungwon; Bhak, Youngjune; ElFiky, Asmaa; Horaiz, Ahmed; Jun, JeHoon; Kim, Hyunho; Bhak, Jong

    2018-05-15

    We report two Egyptian male genomes (EGP1 and EGP2) sequenced at ~ 30× sequencing depths. EGP1 had 4.7 million variants, where 198,877 were novel variants while EGP2 had 209,109 novel variants out of 4.8 million variants. The mitochondrial haplogroup of the two individuals were identified to be H7b1 and L2a1c, respectively. We also identified the Y haplogroup of EGP1 (R1b) and EGP2 (J1a2a1a2 > P58 > FGC11). EGP1 had a mutation in the NADH gene of the mitochondrial genome ND4 (m.11778 G > A) that causes Leber's hereditary optic neuropathy. Some SNPs shared by the two genomes were associated with an increased level of cholesterol and triglycerides, probably related with Egyptians obesity. Comparison of these genomes with African and Western-Asian genomes can provide insights on Egyptian ancestry and genetic history. This resource can be used to further understand genomic diversity and functional classification of variants as well as human migration and evolution across Africa and Western-Asia. Copyright © 2017. Published by Elsevier B.V.

  6. Functional Annotation of All Salmonid Genomes (FAASG): an international initiative supporting future salmonid research, conservation and aquaculture.

    Science.gov (United States)

    Macqueen, Daniel J; Primmer, Craig R; Houston, Ross D; Nowak, Barbara F; Bernatchez, Louis; Bergseth, Steinar; Davidson, William S; Gallardo-Escárate, Cristian; Goldammer, Tom; Guiguen, Yann; Iturra, Patricia; Kijas, James W; Koop, Ben F; Lien, Sigbjørn; Maass, Alejandro; Martin, Samuel A M; McGinnity, Philip; Montecino, Martin; Naish, Kerry A; Nichols, Krista M; Ólafsson, Kristinn; Omholt, Stig W; Palti, Yniv; Plastow, Graham S; Rexroad, Caird E; Rise, Matthew L; Ritchie, Rachael J; Sandve, Simen R; Schulte, Patricia M; Tello, Alfredo; Vidal, Rodrigo; Vik, Jon Olav; Wargelius, Anna; Yáñez, José Manuel

    2017-06-27

    We describe an emerging initiative - the 'Functional Annotation of All Salmonid Genomes' (FAASG), which will leverage the extensive trait diversity that has evolved since a whole genome duplication event in the salmonid ancestor, to develop an integrative understanding of the functional genomic basis of phenotypic variation. The outcomes of FAASG will have diverse applications, ranging from improved understanding of genome evolution, to improving the efficiency and sustainability of aquaculture production, supporting the future of fundamental and applied research in an iconic fish lineage of major societal importance.

  7. Advanced Whole-Genome Sequencing and Analysis of Fetal Genomes from Amniotic Fluid.

    Science.gov (United States)

    Mao, Qing; Chin, Robert; Xie, Weiwei; Deng, Yuqing; Zhang, Wenwei; Xu, Huixin; Zhang, Rebecca Yu; Shi, Quan; Peters, Erin E; Gulbahce, Natali; Li, Zhenyu; Chen, Fang; Drmanac, Radoje; Peters, Brock A

    2018-04-01

    Amniocentesis is a common procedure, the primary purpose of which is to collect cells from the fetus to allow testing for abnormal chromosomes, altered chromosomal copy number, or a small number of genes that have small single- to multibase defects. Here we demonstrate the feasibility of generating an accurate whole-genome sequence of a fetus from either the cellular or cell-free DNA (cfDNA) of an amniotic sample. cfDNA and DNA isolated from the cell pellet of 31 amniocenteses were sequenced to approximately 50× genome coverage by use of the Complete Genomics nanoarray platform. In a subset of the samples, long fragment read libraries were generated from DNA isolated from cells and sequenced to approximately 100× genome coverage. Concordance of variant calls between the 2 DNA sources and with parental libraries was >96%. Two fetal genomes were found to harbor potentially detrimental variants in chromodomain helicase DNA binding protein 8 ( CHD8 ) and LDL receptor-related protein 1 ( LRP1 ), variations of which have been associated with autism spectrum disorder and keratosis pilaris atrophicans, respectively. We also discovered drug sensitivities and carrier information of fetuses for a variety of diseases. We were able to elucidate the complete genome sequence of 31 fetuses from amniotic fluid and demonstrate that the cfDNA or DNA from the cell pellet can be analyzed with little difference in quality. We believe that current technologies could analyze this material in a highly accurate and complete manner and that analyses like these should be considered for addition to current amniocentesis procedures. © 2018 American Association for Clinical Chemistry.

  8. Visualization, analysis, and design of COMBO-FISH probes in the grid-based GLOBE 3D genome platform

    NARCIS (Netherlands)

    F.N. Kepper (Nick); E. Schmitt (Eberhard); M. Lesnussa (Michael); Y. Weiland (Yanina); H.J.F.M.M. Eussen (Bert); F.G. Grosveld (Frank); M. Hausmann (Michael); T.A. Knoch (Tobias)

    2010-01-01

    textabstractThe genome architecture in cell nuclei plays an important role in modern microscopy for the monitoring of medical diagnosis and therapy since changes of function and dynamics of genes are interlinked with changing geometrical parameters. The planning of corresponding diagnostic

  9. A framework for annotating human genome in disease context.

    Science.gov (United States)

    Xu, Wei; Wang, Huisong; Cheng, Wenqing; Fu, Dong; Xia, Tian; Kibbe, Warren A; Lin, Simon M

    2012-01-01

    Identification of gene-disease association is crucial to understanding disease mechanism. A rapid increase in biomedical literatures, led by advances of genome-scale technologies, poses challenge for manually-curated-based annotation databases to characterize gene-disease associations effectively and timely. We propose an automatic method-The Disease Ontology Annotation Framework (DOAF) to provide a comprehensive annotation of the human genome using the computable Disease Ontology (DO), the NCBO Annotator service and NCBI Gene Reference Into Function (GeneRIF). DOAF can keep the resulting knowledgebase current by periodically executing automatic pipeline to re-annotate the human genome using the latest DO and GeneRIF releases at any frequency such as daily or monthly. Further, DOAF provides a computable and programmable environment which enables large-scale and integrative analysis by working with external analytic software or online service platforms. A user-friendly web interface (doa.nubic.northwestern.edu) is implemented to allow users to efficiently query, download, and view disease annotations and the underlying evidences.

  10. Functional annotation from the genome sequence of the giant panda

    OpenAIRE

    Huo, Tong; Zhang, Yinjie; Lin, Jianping

    2012-01-01

    The giant panda is one of the most critically endangered species due to the fragmentation and loss of its habitat. Studying the functions of proteins in this animal, especially specific trait-related proteins, is therefore necessary to protect the species. In this work, the functions of these proteins were investigated using the genome sequence of the giant panda. Data on 21,001 proteins and their functions were stored in the Giant Panda Protein Database, in which the proteins were divided in...

  11. SCK-CEN Genomic Platform: the microarray technology

    International Nuclear Information System (INIS)

    Benotmane, R.

    2006-01-01

    The human body contains approximately 10 14 cells, wherein each one is a nucleus. The nucleus contains 2x23 chromosomes, or two complete sets of the human genome, one set coming from the mother and the other from the father. In principle each set includes 30.000-40.000 genes. If the genome was a book, it would be twenty-three chapters, called chromosomes,each chapter with several thousand stories, called genes. Each story made up of paragraphs, called exons and introns. Each paragraph made up of 3 letter words, called codons. Each word is written with letters called bases (AGCT). But the whole is written in a single very long sentence, which is the DNA molecule or deoxy nucleic acid. The usual state of DNA is two complementary strands intertwined forming a double helix. In the cell, DNA is duplicated during each cell division to ensure the transmission of the genome to the daughter cells. For expression, the DNA is transcribed to messenger RNA. The RNA is edited and finally translated to a protein, each three bases coding for one amino acid. When the whole message is translated, the chain of amino acids folds itself up into a distinctive shape that depends on its sequence. Proteins are the effectors of the genes, and are responsible for all metabolic, hormonal and enzymatic reactions in the cells. The expressed RNA determines the amount of proteins to be produced and subsequently the desired effect (strong or weak) in the cell. The microarray technology aims at quantifying the amount of RNA present in the cell from each expressed gene, and at evaluating the changes of these amounts after exposure of the cell to toxic chemicals, ionising radiation or other stress components. The global picture of expressed genes helps to understand the affected genetic pathways in the cell at the molecular level. The microarray technology is used in the Radiobiology and Microbiology topics to study the effect of ionising radiation on human cells and mouse tissue, as well as the

  12. wANNOVAR: annotating genetic variants for personal genomes via the web.

    Science.gov (United States)

    Chang, Xiao; Wang, Kai

    2012-07-01

    High-throughput DNA sequencing platforms have become widely available. As a result, personal genomes are increasingly being sequenced in research and clinical settings. However, the resulting massive amounts of variants data pose significant challenges to the average biologists and clinicians without bioinformatics skills. We developed a web server called wANNOVAR to address the critical needs for functional annotation of genetic variants from personal genomes. The server provides simple and intuitive interface to help users determine the functional significance of variants. These include annotating single nucleotide variants and insertions/deletions for their effects on genes, reporting their conservation levels (such as PhyloP and GERP++ scores), calculating their predicted functional importance scores (such as SIFT and PolyPhen scores), retrieving allele frequencies in public databases (such as the 1000 Genomes Project and NHLBI-ESP 5400 exomes), and implementing a 'variants reduction' protocol to identify a subset of potentially deleterious variants/genes. We illustrated how wANNOVAR can help draw biological insights from sequencing data, by analysing genetic variants generated on two Mendelian diseases. We conclude that wANNOVAR will help biologists and clinicians take advantage of the personal genome information to expedite scientific discoveries. The wANNOVAR server is available at http://wannovar.usc.edu, and will be continuously updated to reflect the latest annotation information.

  13. The first genome sequence of a metatherian herpesvirus: Macropodid herpesvirus 1.

    Science.gov (United States)

    Vaz, Paola K; Mahony, Timothy J; Hartley, Carol A; Fowler, Elizabeth V; Ficorilli, Nino; Lee, Sang W; Gilkerson, James R; Browning, Glenn F; Devlin, Joanne M

    2016-01-22

    While many placental herpesvirus genomes have been fully sequenced, the complete genome of a marsupial herpesvirus has not been described. Here we present the first genome sequence of a metatherian herpesvirus, Macropodid herpesvirus 1 (MaHV-1). The MaHV-1 viral genome was sequenced using an Illumina MiSeq sequencer, de novo assembly was performed and the genome was annotated. The MaHV-1 genome was 140 kbp in length and clustered phylogenetically with the primate simplexviruses, sharing 67% nucleotide sequence identity with Human herpesviruses 1 and 2. The MaHV-1 genome contained 66 predicted open reading frames (ORFs) homologous to those in other herpesvirus genomes, but lacked homologues of UL3, UL4, UL56 and glycoprotein J. This is the first alphaherpesvirus genome that has been found to lack the UL3 and UL4 homologues. We identified six novel ORFs and confirmed their transcription by RT-PCR. This is the first genome sequence of a herpesvirus that infects metatherians, a taxonomically unique mammalian clade. Members of the Simplexvirus genus are remarkably conserved, so the absence of ORFs otherwise retained in eutherian and avian alphaherpesviruses contributes to our understanding of the Alphaherpesvirinae. Further study of metatherian herpesvirus genetics and pathogenesis provides a unique approach to understanding herpesvirus-mammalian interactions.

  14. THE NEED FOR FUNCTION PLATFORMS IN ENGINEER TO ORDER INDUSTRIES

    NARCIS (Netherlands)

    Alblas, Alex; Wortmann, Hans; Bergendahl, MN; Grimheden, M; Leifer, L; Skogstad, P; Cantamessa, M

    2009-01-01

    Many industries base their innovations on product platforms. Platforms have predefined modularity with standardized interfaces. However, product platforms provide significant challenges to engineering industries that rely heavily on R&D, such as microlithography systems. Developing these systems

  15. Solving the Problem of Comparing Whole Bacterial Genomes across Different Sequencing Platforms

    DEFF Research Database (Denmark)

    Kaas, Rolf Sommer; Leekitcharoenphon, Pimlapas; Aarestrup, Frank Møller

    2014-01-01

    technology because each technology has a systematic bias making integration of data generated from different platforms difficult. We developed two different procedures for identifying variable sites and inferring phylogenies in WGS data across multiple platforms. The methods were evaluated on three bacterial...

  16. Functional genomics of lipid metabolism in the oleaginous yeast Rhodosporidium toruloides

    Science.gov (United States)

    Geiselman, Gina M; Ito, Masakazu; Mondo, Stephen J; Reilly, Morgann C; Cheng, Ya-Fang; Bauer, Stefan; Grigoriev, Igor V; Gladden, John M; Simmons, Blake A; Brem, Rachel B

    2018-01-01

    The basidiomycete yeast Rhodosporidium toruloides (also known as Rhodotorula toruloides) accumulates high concentrations of lipids and carotenoids from diverse carbon sources. It has great potential as a model for the cellular biology of lipid droplets and for sustainable chemical production. We developed a method for high-throughput genetics (RB-TDNAseq), using sequence-barcoded Agrobacterium tumefaciens T-DNA insertions. We identified 1,337 putative essential genes with low T-DNA insertion rates. We functionally profiled genes required for fatty acid catabolism and lipid accumulation, validating results with 35 targeted deletion strains. We identified a high-confidence set of 150 genes affecting lipid accumulation, including genes with predicted function in signaling cascades, gene expression, protein modification and vesicular trafficking, autophagy, amino acid synthesis and tRNA modification, and genes of unknown function. These results greatly advance our understanding of lipid metabolism in this oleaginous species and demonstrate a general approach for barcoded mutagenesis that should enable functional genomics in diverse fungi. PMID:29521624

  17. Fast and robust methods for full genome sequencing of Porcine Reproductive and Respiratory Syndrome Virus (PRRSV) Type 1 and Type 2

    DEFF Research Database (Denmark)

    Kvisgaard, Lise Kirstine; Hjulsager, Charlotte Kristiane; Fahnøe, Ulrik

    . In the present study, fast and robust methods for long range RT-PCR amplification and subsequent next generation sequencing (NGS) of PRRSV Type 1 and Type 2 viruses were developed and validated on nine Type 1 and nine Type 2 PRRSV viruses. The methods were shown to generate robust and reliable sequences both...... on primary material and cell culture adapted viruses and the protocols were shown to perform well on all three NGS platforms tested (Roche 454 FLX, Illumina HiSeq 2000, and Ion Torrent PGM™ Sequencer). To complete the sequences at the 5’ end, 5’ Rapid Amplification of cDNA Ends (5’ RACE) was conducted...... followed by cycle sequencing of clones. The genome lengths were determined to be 14,876-15,098 and 15,342-15,408 nucleotides long for the Type 1 and Type 2 strains, respectively. These methods will greatly facilitate the generation of more complete genome PRRSV sequences globally which in turn may lead...

  18. FR-like EBNA1 binding repeats in the human genome

    International Nuclear Information System (INIS)

    D'Herouel, Aymeric Fouquier; Birgersdotter, Anna; Werner, Maria

    2010-01-01

    Epstein-Barr virus (EBV) is widely spread in the human population. EBV nuclear antigen 1 (EBNA1) is a transcription factor that activates viral genes and is necessary for viral replication and partitioning, which binds the EBV genome cooperatively. We identify similar EBNA1 repeat binding sites in the human genome using a nearest-neighbor positional weight matrix. Previously experimentally verified EBNA1 sites in the human genome are successfully recovered by our approach. Most importantly, 40 novel regions are identified in the human genome, constituted of tandemly repeated binding sites for EBNA1. Genes located in the vicinity of these regions are presented as possible targets for EBNA1-mediated regulation. Among these, four are discussed in more detail: IQCB1, IMPG1, IRF2BP2 and TPO. Incorporating the cooperative actions of EBNA1 is essential when identifying regulatory regions in the human genome and we believe the findings presented here are highly valuable for the understanding of EBV-induced phenotypic changes.

  19. The genome draft of coconut (Cocos nucifera).

    Science.gov (United States)

    Xiao, Yong; Xu, Pengwei; Fan, Haikuo; Baudouin, Luc; Xia, Wei; Bocs, Stéphanie; Xu, Junyang; Li, Qiong; Guo, Anping; Zhou, Lixia; Li, Jing; Wu, Yi; Ma, Zilong; Armero, Alix; Issali, Auguste Emmanuel; Liu, Na; Peng, Ming; Yang, Yaodong

    2017-11-01

    Coconut palm (Cocos nucifera,2n = 32), a member of genus Cocos and family Arecaceae (Palmaceae), is an important tropical fruit and oil crop. Currently, coconut palm is cultivated in 93 countries, including Central and South America, East and West Africa, Southeast Asia and the Pacific Islands, with a total growth area of more than 12 million hectares [1]. Coconut palm is generally classified into 2 main categories: "Tall" (flowering 8-10 years after planting) and "Dwarf" (flowering 4-6 years after planting), based on morphological characteristics and breeding habits. This Palmae species has a long growth period before reproductive years, which hinders conventional breeding progress. In spite of initial successes, improvements made by conventional breeding have been very slow. In the present study, we obtained de novo sequences of the Cocos nucifera genome: a major genomic resource that could be used to facilitate molecular breeding in Cocos nucifera and accelerate the breeding process in this important crop. A total of 419.67 gigabases (Gb) of raw reads were generated by the Illumina HiSeq 2000 platform using a series of paired-end and mate-pair libraries, covering the predicted Cocos nucifera genome length (2.42 Gb, variety "Hainan Tall") to an estimated ×173.32 read depth. A total scaffold length of 2.20 Gb was generated (N50 = 418 Kb), representing 90.91% of the genome. The coconut genome was predicted to harbor 28 039 protein-coding genes, which is less than in Phoenix dactylifera (PDK30: 28 889), Phoenix dactylifera (DPV01: 41 660), and Elaeis guineensis (EG5: 34 802). BUSCO evaluation demonstrated that the obtained scaffold sequences covered 90.8% of the coconut genome and that the genome annotation was 74.1% complete. Genome annotation results revealed that 72.75% of the coconut genome consisted of transposable elements, of which long-terminal repeat retrotransposons elements (LTRs) accounted for the largest proportion (92.23%). Comparative analysis of the

  20. Complete genome sequence of the Pectobacterium carotovorum subsp. carotovorum virulent bacteriophage PM1.

    Science.gov (United States)

    Lim, Jeong-A; Shin, Hakdong; Lee, Dong Hwan; Han, Sang-Wook; Lee, Ju-Hoon; Ryu, Sangryeol; Heu, Sunggi

    2014-08-01

    PM1, a novel virulent bacteriophage that infects Pectobacterium carotovorum subsp. carotovorum, was isolated. Its morphological features were examined by electron microscopy, which indicated that this phage belongs to the family Myoviridae. It has a 55,098-bp genome, including a 2,665-bp terminal repeat. A total of 63 open reading frames (ORFs) were predicted, but only 20 ORFs possessed homology with functional proteins. There is one tRNA coding region, and the GC-content of the genome is 44.9 %. Most ORFs in bacteriophage PM1 showed high homology to enterobacteria phage ΦEcoM-GJ1 and Erwinia phage νB EamM-Y2. Like these bacteriophages, PM1 encodes an RNA polymerase, which is a hallmark of T7-like phages. There is no integrase or repressor, suggesting that PM1 is a virulent bacteriophage.

  1. Brain Genomics Superstruct Project initial data release with structural, functional, and behavioral measures.

    Science.gov (United States)

    Holmes, Avram J; Hollinshead, Marisa O; O'Keefe, Timothy M; Petrov, Victor I; Fariello, Gabriele R; Wald, Lawrence L; Fischl, Bruce; Rosen, Bruce R; Mair, Ross W; Roffman, Joshua L; Smoller, Jordan W; Buckner, Randy L

    2015-01-01

    The goal of the Brain Genomics Superstruct Project (GSP) is to enable large-scale exploration of the links between brain function, behavior, and ultimately genetic variation. To provide the broader scientific community data to probe these associations, a repository of structural and functional magnetic resonance imaging (MRI) scans linked to genetic information was constructed from a sample of healthy individuals. The initial release, detailed in the present manuscript, encompasses quality screened cross-sectional data from 1,570 participants ages 18 to 35 years who were scanned with MRI and completed demographic and health questionnaires. Personality and cognitive measures were obtained on a subset of participants. Each dataset contains a T1-weighted structural MRI scan and either one (n=1,570) or two (n=1,139) resting state functional MRI scans. Test-retest reliability datasets are included from 69 participants scanned within six months of their initial visit. For the majority of participants self-report behavioral and cognitive measures are included (n=926 and n=892 respectively). Analyses of data quality, structure, function, personality, and cognition are presented to demonstrate the dataset's utility.

  2. Genomics Virtual Laboratory: A Practical Bioinformatics Workbench for the Cloud.

    Directory of Open Access Journals (Sweden)

    Enis Afgan

    Full Text Available Analyzing high throughput genomics data is a complex and compute intensive task, generally requiring numerous software tools and large reference data sets, tied together in successive stages of data transformation and visualisation. A computational platform enabling best practice genomics analysis ideally meets a number of requirements, including: a wide range of analysis and visualisation tools, closely linked to large user and reference data sets; workflow platform(s enabling accessible, reproducible, portable analyses, through a flexible set of interfaces; highly available, scalable computational resources; and flexibility and versatility in the use of these resources to meet demands and expertise of a variety of users. Access to an appropriate computational platform can be a significant barrier to researchers, as establishing such a platform requires a large upfront investment in hardware, experience, and expertise.We designed and implemented the Genomics Virtual Laboratory (GVL as a middleware layer of machine images, cloud management tools, and online services that enable researchers to build arbitrarily sized compute clusters on demand, pre-populated with fully configured bioinformatics tools, reference datasets and workflow and visualisation options. The platform is flexible in that users can conduct analyses through web-based (Galaxy, RStudio, IPython Notebook or command-line interfaces, and add/remove compute nodes and data resources as required. Best-practice tutorials and protocols provide a path from introductory training to practice. The GVL is available on the OpenStack-based Australian Research Cloud (http://nectar.org.au and the Amazon Web Services cloud. The principles, implementation and build process are designed to be cloud-agnostic.This paper provides a blueprint for the design and implementation of a cloud-based Genomics Virtual Laboratory. We discuss scope, design considerations and technical and logistical constraints

  3. Genomics Virtual Laboratory: A Practical Bioinformatics Workbench for the Cloud.

    Science.gov (United States)

    Afgan, Enis; Sloggett, Clare; Goonasekera, Nuwan; Makunin, Igor; Benson, Derek; Crowe, Mark; Gladman, Simon; Kowsar, Yousef; Pheasant, Michael; Horst, Ron; Lonie, Andrew

    2015-01-01

    Analyzing high throughput genomics data is a complex and compute intensive task, generally requiring numerous software tools and large reference data sets, tied together in successive stages of data transformation and visualisation. A computational platform enabling best practice genomics analysis ideally meets a number of requirements, including: a wide range of analysis and visualisation tools, closely linked to large user and reference data sets; workflow platform(s) enabling accessible, reproducible, portable analyses, through a flexible set of interfaces; highly available, scalable computational resources; and flexibility and versatility in the use of these resources to meet demands and expertise of a variety of users. Access to an appropriate computational platform can be a significant barrier to researchers, as establishing such a platform requires a large upfront investment in hardware, experience, and expertise. We designed and implemented the Genomics Virtual Laboratory (GVL) as a middleware layer of machine images, cloud management tools, and online services that enable researchers to build arbitrarily sized compute clusters on demand, pre-populated with fully configured bioinformatics tools, reference datasets and workflow and visualisation options. The platform is flexible in that users can conduct analyses through web-based (Galaxy, RStudio, IPython Notebook) or command-line interfaces, and add/remove compute nodes and data resources as required. Best-practice tutorials and protocols provide a path from introductory training to practice. The GVL is available on the OpenStack-based Australian Research Cloud (http://nectar.org.au) and the Amazon Web Services cloud. The principles, implementation and build process are designed to be cloud-agnostic. This paper provides a blueprint for the design and implementation of a cloud-based Genomics Virtual Laboratory. We discuss scope, design considerations and technical and logistical constraints, and explore the

  4. Evolutionary Fates and Dynamic Functionalization of Young Duplicate Genes in Arabidopsis Genomes.

    Science.gov (United States)

    Wang, Jun; Tao, Feng; Marowsky, Nicholas C; Fan, Chuanzhu

    2016-09-01

    Gene duplication is a primary means to generate genomic novelties, playing an essential role in speciation and adaptation. Particularly in plants, a high abundance of duplicate genes has been maintained for significantly long periods of evolutionary time. To address the manner in which young duplicate genes were derived primarily from small-scale gene duplication and preserved in plant genomes and to determine the underlying driving mechanisms, we generated transcriptomes to produce the expression profiles of five tissues in Arabidopsis thaliana and the closely related species Arabidopsis lyrata and Capsella rubella Based on the quantitative analysis metrics, we investigated the evolutionary processes of young duplicate genes in Arabidopsis. We determined that conservation, neofunctionalization, and specialization are three main evolutionary processes for Arabidopsis young duplicate genes. We explicitly demonstrated the dynamic functionalization of duplicate genes along the evolutionary time scale. Upon origination, duplicates tend to maintain their ancestral functions; but as they survive longer, they might be likely to develop distinct and novel functions. The temporal evolutionary processes and functionalization of plant duplicate genes are associated with their ancestral functions, dynamic DNA methylation levels, and histone modification abundances. Furthermore, duplicate genes tend to be initially expressed in pollen and then to gain more interaction partners over time. Altogether, our study provides novel insights into the dynamic retention processes of young duplicate genes in plant genomes. © 2016 American Society of Plant Biologists. All rights reserved.

  5. GenomeRunner web server: regulatory similarity and differences define the functional impact of SNP sets.

    Science.gov (United States)

    Dozmorov, Mikhail G; Cara, Lukas R; Giles, Cory B; Wren, Jonathan D

    2016-08-01

    The growing amount of regulatory data from the ENCODE, Roadmap Epigenomics and other consortia provides a wealth of opportunities to investigate the functional impact of single nucleotide polymorphisms (SNPs). Yet, given the large number of regulatory datasets, researchers are posed with a challenge of how to efficiently utilize them to interpret the functional impact of SNP sets. We developed the GenomeRunner web server to automate systematic statistical analysis of SNP sets within a regulatory context. Besides defining the functional impact of SNP sets, GenomeRunner implements novel regulatory similarity/differential analyses, and cell type-specific regulatory enrichment analysis. Validated against literature- and disease ontology-based approaches, analysis of 39 disease/trait-associated SNP sets demonstrated that the functional impact of SNP sets corresponds to known disease relationships. We identified a group of autoimmune diseases with SNPs distinctly enriched in the enhancers of T helper cell subpopulations, and demonstrated relevant cell type-specificity of the functional impact of other SNP sets. In summary, we show how systematic analysis of genomic data within a regulatory context can help interpreting the functional impact of SNP sets. GenomeRunner web server is freely available at http://www.integrativegenomics.org/ mikhail.dozmorov@gmail.com Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  6. Genome Sequence of the Freshwater Yangtze Finless Porpoise.

    Science.gov (United States)

    Yuan, Yuan; Zhang, Peijun; Wang, Kun; Liu, Mingzhong; Li, Jing; Zheng, Jingsong; Wang, Ding; Xu, Wenjie; Lin, Mingli; Dong, Lijun; Zhu, Chenglong; Qiu, Qiang; Li, Songhai

    2018-04-16

    The Yangtze finless porpoise ( Neophocaena asiaeorientalis ssp. asiaeorientalis ) is a subspecies of the narrow-ridged finless porpoise ( N. asiaeorientalis ). In total, 714.28 gigabases (Gb) of raw reads were generated by whole-genome sequencing of the Yangtze finless porpoise, using an Illumina HiSeq 2000 platform. After filtering the low-quality and duplicated reads, we assembled a draft genome of 2.22 Gb, with contig N50 and scaffold N50 values of 46.69 kilobases (kb) and 1.71 megabases (Mb), respectively. We identified 887.63 Mb of repetitive sequences and predicted 18,479 protein-coding genes in the assembled genome. The phylogenetic tree showed a relationship between the Yangtze finless porpoise and the Yangtze River dolphin, which diverged approximately 20.84 million years ago. In comparisons with the genomes of 10 other mammals, we detected 44 species-specific gene families, 164 expanded gene families, and 313 positively selected genes in the Yangtze finless porpoise genome. The assembled genome sequence and underlying sequence data are available at the National Center for Biotechnology Information under BioProject accession number PRJNA433603.

  7. Platinum nanoparticles functionalized nitrogen doped graphene platform for sensitive electrochemical glucose biosensing

    International Nuclear Information System (INIS)

    Yang, Zhanjun; Cao, Yue; Li, Juan; Jian, Zhiqin; Zhang, Yongcai; Hu, Xiaoya

    2015-01-01

    Highlights: • An efficient PtNPs@NG nanocomposite was prepared for the immobilization of enzyme. • A novel electrochemical glucose biosensor was constructed based on this PtNPs@NG. • The proposed glucose biosensor showed high sensitivity and low detection limit. • The PtNPs@NG composite provided a promising platform for biosensing applications. - Abstract: In this work, we reported an efficient platinum nanoparticles functionalized nitrogen doped graphene (PtNPs@NG) nanocomposite for devising novel electrochemical glucose biosensor for the first time. The fabricated PtNPs@NG and biosensor were characterized using transmission electron microscopy, high-resolution transmission electron microscopy, X-ray photoelectron spectroscopy, static water contact angle, UV–vis spectroscopy, electrochemical impedance spectra and cyclic voltammetry, respectively. PtNPs@NG showed large surface area and excellent biocompatibility, and enhanced the direct electron transfer between enzyme molecules and electrode surface. The glucose oxidase (GOx) immobilized on PtNPs@NG nanocomposite retained its bioactivity, and exhibited a surface controlled, quasi-reversible and fast electron transfer process. The constructed glucose biosensor showed wide linear range from 0.005 to 1.1 mM with high sensitivity of 20.31 mA M −1 cm −2 . The detection limit was calculated to be 0.002 mM at signal-to-noise of 3, which showed 20-fold decrease in comparison with single NG-based electrochemical biosensor for glucose. The proposed glucose biosensor also demonstrated excellent selectivity, good reproducibility, acceptable stability, and could be successfully applied in the detection of glucose in serum samples at the applied potential of −0.33 V. This research provided a promising biosensing platform for the development of excellent electrochemical biosensors

  8. Platinum nanoparticles functionalized nitrogen doped graphene platform for sensitive electrochemical glucose biosensing

    Energy Technology Data Exchange (ETDEWEB)

    Yang, Zhanjun, E-mail: zjyang@yzu.edu.cn; Cao, Yue; Li, Juan; Jian, Zhiqin; Zhang, Yongcai; Hu, Xiaoya

    2015-04-29

    Highlights: • An efficient PtNPs@NG nanocomposite was prepared for the immobilization of enzyme. • A novel electrochemical glucose biosensor was constructed based on this PtNPs@NG. • The proposed glucose biosensor showed high sensitivity and low detection limit. • The PtNPs@NG composite provided a promising platform for biosensing applications. - Abstract: In this work, we reported an efficient platinum nanoparticles functionalized nitrogen doped graphene (PtNPs@NG) nanocomposite for devising novel electrochemical glucose biosensor for the first time. The fabricated PtNPs@NG and biosensor were characterized using transmission electron microscopy, high-resolution transmission electron microscopy, X-ray photoelectron spectroscopy, static water contact angle, UV–vis spectroscopy, electrochemical impedance spectra and cyclic voltammetry, respectively. PtNPs@NG showed large surface area and excellent biocompatibility, and enhanced the direct electron transfer between enzyme molecules and electrode surface. The glucose oxidase (GOx) immobilized on PtNPs@NG nanocomposite retained its bioactivity, and exhibited a surface controlled, quasi-reversible and fast electron transfer process. The constructed glucose biosensor showed wide linear range from 0.005 to 1.1 mM with high sensitivity of 20.31 mA M{sup −1} cm{sup −2}. The detection limit was calculated to be 0.002 mM at signal-to-noise of 3, which showed 20-fold decrease in comparison with single NG-based electrochemical biosensor for glucose. The proposed glucose biosensor also demonstrated excellent selectivity, good reproducibility, acceptable stability, and could be successfully applied in the detection of glucose in serum samples at the applied potential of −0.33 V. This research provided a promising biosensing platform for the development of excellent electrochemical biosensors.

  9. CNV Workshop: an integrated platform for high-throughput copy number variation discovery and clinical diagnostics.

    Science.gov (United States)

    Gai, Xiaowu; Perin, Juan C; Murphy, Kevin; O'Hara, Ryan; D'arcy, Monica; Wenocur, Adam; Xie, Hongbo M; Rappaport, Eric F; Shaikh, Tamim H; White, Peter S

    2010-02-04

    Recent studies have shown that copy number variations (CNVs) are frequent in higher eukaryotes and associated with a substantial portion of inherited and acquired risk for various human diseases. The increasing availability of high-resolution genome surveillance platforms provides opportunity for rapidly assessing research and clinical samples for CNV content, as well as for determining the potential pathogenicity of identified variants. However, few informatics tools for accurate and efficient CNV detection and assessment currently exist. We developed a suite of software tools and resources (CNV Workshop) for automated, genome-wide CNV detection from a variety of SNP array platforms. CNV Workshop includes three major components: detection, annotation, and presentation of structural variants from genome array data. CNV detection utilizes a robust and genotype-specific extension of the Circular Binary Segmentation algorithm, and the use of additional detection algorithms is supported. Predicted CNVs are captured in a MySQL database that supports cohort-based projects and incorporates a secure user authentication layer and user/admin roles. To assist with determination of pathogenicity, detected CNVs are also annotated automatically for gene content, known disease loci, and gene-based literature references. Results are easily queried, sorted, filtered, and visualized via a web-based presentation layer that includes a GBrowse-based graphical representation of CNV content and relevant public data, integration with the UCSC Genome Browser, and tabular displays of genomic attributes for each CNV. To our knowledge, CNV Workshop represents the first cohesive and convenient platform for detection, annotation, and assessment of the biological and clinical significance of structural variants. CNV Workshop has been successfully utilized for assessment of genomic variation in healthy individuals and disease cohorts and is an ideal platform for coordinating multiple associated

  10. CNV Workshop: an integrated platform for high-throughput copy number variation discovery and clinical diagnostics

    Directory of Open Access Journals (Sweden)

    Rappaport Eric F

    2010-02-01

    Full Text Available Abstract Background Recent studies have shown that copy number variations (CNVs are frequent in higher eukaryotes and associated with a substantial portion of inherited and acquired risk for various human diseases. The increasing availability of high-resolution genome surveillance platforms provides opportunity for rapidly assessing research and clinical samples for CNV content, as well as for determining the potential pathogenicity of identified variants. However, few informatics tools for accurate and efficient CNV detection and assessment currently exist. Results We developed a suite of software tools and resources (CNV Workshop for automated, genome-wide CNV detection from a variety of SNP array platforms. CNV Workshop includes three major components: detection, annotation, and presentation of structural variants from genome array data. CNV detection utilizes a robust and genotype-specific extension of the Circular Binary Segmentation algorithm, and the use of additional detection algorithms is supported. Predicted CNVs are captured in a MySQL database that supports cohort-based projects and incorporates a secure user authentication layer and user/admin roles. To assist with determination of pathogenicity, detected CNVs are also annotated automatically for gene content, known disease loci, and gene-based literature references. Results are easily queried, sorted, filtered, and visualized via a web-based presentation layer that includes a GBrowse-based graphical representation of CNV content and relevant public data, integration with the UCSC Genome Browser, and tabular displays of genomic attributes for each CNV. Conclusions To our knowledge, CNV Workshop represents the first cohesive and convenient platform for detection, annotation, and assessment of the biological and clinical significance of structural variants. CNV Workshop has been successfully utilized for assessment of genomic variation in healthy individuals and disease cohorts and

  11. Genome-wide screening identifies a KCNIP1 copy number variant as a genetic predictor for atrial fibrillation

    Science.gov (United States)

    Tsai, Chia-Ti; Hsieh, Chia-Shan; Chang, Sheng-Nan; Chuang, Eric Y.; Ueng, Kwo-Chang; Tsai, Chin-Feng; Lin, Tsung-Hsien; Wu, Cho-Kai; Lee, Jen-Kuang; Lin, Lian-Yu; Wang, Yi-Chih; Yu, Chih-Chieh; Lai, Ling-Ping; Tseng, Chuen-Den; Hwang, Juey-Jen; Chiang, Fu-Tien; Lin, Jiunn-Lee

    2016-01-01

    Atrial fibrillation (AF) is the most common sustained cardiac arrhythmia. Previous genome-wide association studies had identified single-nucleotide polymorphisms in several genomic regions to be associated with AF. In human genome, copy number variations (CNVs) are known to contribute to disease susceptibility. Using a genome-wide multistage approach to identify AF susceptibility CNVs, we here show a common 4,470-bp diallelic CNV in the first intron of potassium interacting channel 1 gene (KCNIP1) is strongly associated with AF in Taiwanese populations (odds ratio=2.27 for insertion allele; P=6.23 × 10−24). KCNIP1 insertion is associated with higher KCNIP1 mRNA expression. KCNIP1-encoded protein potassium interacting channel 1 (KCHIP1) is physically associated with potassium Kv channels and modulates atrial transient outward current in cardiac myocytes. Overexpression of KCNIP1 results in inducible AF in zebrafish. In conclusions, a common CNV in KCNIP1 gene is a genetic predictor of AF risk possibly pointing to a functional pathway. PMID:26831368

  12. CrusView: a Java-based visualization platform for comparative genomics analyses in Brassicaceae species.

    Science.gov (United States)

    Chen, Hao; Wang, Xiangfeng

    2013-09-01

    In plants and animals, chromosomal breakage and fusion events based on conserved syntenic genomic blocks lead to conserved patterns of karyotype evolution among species of the same family. However, karyotype information has not been well utilized in genomic comparison studies. We present CrusView, a Java-based bioinformatic application utilizing Standard Widget Toolkit/Swing graphics libraries and a SQLite database for performing visualized analyses of comparative genomics data in Brassicaceae (crucifer) plants. Compared with similar software and databases, one of the unique features of CrusView is its integration of karyotype information when comparing two genomes. This feature allows users to perform karyotype-based genome assembly and karyotype-assisted genome synteny analyses with preset karyotype patterns of the Brassicaceae genomes. Additionally, CrusView is a local program, which gives its users high flexibility when analyzing unpublished genomes and allows users to upload self-defined genomic information so that they can visually study the associations between genome structural variations and genetic elements, including chromosomal rearrangements, genomic macrosynteny, gene families, high-frequency recombination sites, and tandem and segmental duplications between related species. This tool will greatly facilitate karyotype, chromosome, and genome evolution studies using visualized comparative genomics approaches in Brassicaceae species. CrusView is freely available at http://www.cmbb.arizona.edu/CrusView/.

  13. Impact of marker ascertainment bias on genomic selection accuracy and estimates of genetic diversity.

    Directory of Open Access Journals (Sweden)

    Nicolas Heslot

    Full Text Available Genome-wide molecular markers are often being used to evaluate genetic diversity in germplasm collections and for making genomic selections in breeding programs. To accurately predict phenotypes and assay genetic diversity, molecular markers should assay a representative sample of the polymorphisms in the population under study. Ascertainment bias arises when marker data is not obtained from a random sample of the polymorphisms in the population of interest. Genotyping-by-sequencing (GBS is rapidly emerging as a low-cost genotyping platform, even for the large, complex, and polyploid wheat (Triticum aestivum L. genome. With GBS, marker discovery and genotyping occur simultaneously, resulting in minimal ascertainment bias. The previous platform of choice for whole-genome genotyping in many species such as wheat was DArT (Diversity Array Technology and has formed the basis of most of our knowledge about cereals genetic diversity. This study compared GBS and DArT marker platforms for measuring genetic diversity and genomic selection (GS accuracy in elite U.S. soft winter wheat. From a set of 365 breeding lines, 38,412 single nucleotide polymorphism GBS markers were discovered and genotyped. The GBS SNPs gave a higher GS accuracy than 1,544 DArT markers on the same lines, despite 43.9% missing data. Using a bootstrap approach, we observed significantly more clustering of markers and ascertainment bias with DArT relative to GBS. The minor allele frequency distribution of GBS markers had a deficit of rare variants compared to DArT markers. Despite the ascertainment bias of the DArT markers, GS accuracy for three traits out of four was not significantly different when an equal number of markers were used for each platform. This suggests that the gain in accuracy observed using GBS compared to DArT markers was mainly due to a large increase in the number of markers available for the analysis.

  14. Impact of Marker Ascertainment Bias on Genomic Selection Accuracy and Estimates of Genetic Diversity

    Science.gov (United States)

    Heslot, Nicolas; Rutkoski, Jessica; Poland, Jesse; Jannink, Jean-Luc; Sorrells, Mark E.

    2013-01-01

    Genome-wide molecular markers are often being used to evaluate genetic diversity in germplasm collections and for making genomic selections in breeding programs. To accurately predict phenotypes and assay genetic diversity, molecular markers should assay a representative sample of the polymorphisms in the population under study. Ascertainment bias arises when marker data is not obtained from a random sample of the polymorphisms in the population of interest. Genotyping-by-sequencing (GBS) is rapidly emerging as a low-cost genotyping platform, even for the large, complex, and polyploid wheat (Triticum aestivum L.) genome. With GBS, marker discovery and genotyping occur simultaneously, resulting in minimal ascertainment bias. The previous platform of choice for whole-genome genotyping in many species such as wheat was DArT (Diversity Array Technology) and has formed the basis of most of our knowledge about cereals genetic diversity. This study compared GBS and DArT marker platforms for measuring genetic diversity and genomic selection (GS) accuracy in elite U.S. soft winter wheat. From a set of 365 breeding lines, 38,412 single nucleotide polymorphism GBS markers were discovered and genotyped. The GBS SNPs gave a higher GS accuracy than 1,544 DArT markers on the same lines, despite 43.9% missing data. Using a bootstrap approach, we observed significantly more clustering of markers and ascertainment bias with DArT relative to GBS. The minor allele frequency distribution of GBS markers had a deficit of rare variants compared to DArT markers. Despite the ascertainment bias of the DArT markers, GS accuracy for three traits out of four was not significantly different when an equal number of markers were used for each platform. This suggests that the gain in accuracy observed using GBS compared to DArT markers was mainly due to a large increase in the number of markers available for the analysis. PMID:24040295

  15. Zebrafish models for the functional genomics of neurogenetic disorders.

    Science.gov (United States)

    Kabashi, Edor; Brustein, Edna; Champagne, Nathalie; Drapeau, Pierre

    2011-03-01

    In this review, we consider recent work using zebrafish to validate and study the functional consequences of mutations of human genes implicated in a broad range of degenerative and developmental disorders of the brain and spinal cord. Also we present technical considerations for those wishing to study their own genes of interest by taking advantage of this easily manipulated and clinically relevant model organism. Zebrafish permit mutational analyses of genetic function (gain or loss of function) and the rapid validation of human variants as pathological mutations. In particular, neural degeneration can be characterized at genetic, cellular, functional, and behavioral levels. Zebrafish have been used to knock down or express mutations in zebrafish homologs of human genes and to directly express human genes bearing mutations related to neurodegenerative disorders such as spinal muscular atrophy, ataxia, hereditary spastic paraplegia, amyotrophic lateral sclerosis (ALS), epilepsy, Huntington's disease, Parkinson's disease, fronto-temporal dementia, and Alzheimer's disease. More recently, we have been using zebrafish to validate mutations of synaptic genes discovered by large-scale genomic approaches in developmental disorders such as autism, schizophrenia, and non-syndromic mental retardation. Advances in zebrafish genetics such as multigenic analyses and chemical genetics now offer a unique potential for disease research. Thus, zebrafish hold much promise for advancing the functional genomics of human diseases, the understanding of the genetics and cell biology of degenerative and developmental disorders, and the discovery of therapeutics. This article is part of a Special Issue entitled Zebrafish Models of Neurological Diseases. Copyright © 2010 Elsevier B.V. All rights reserved.

  16. MC-GenomeKey: a multicloud system for the detection and annotation of genomic variants.

    Science.gov (United States)

    Elshazly, Hatem; Souilmi, Yassine; Tonellato, Peter J; Wall, Dennis P; Abouelhoda, Mohamed

    2017-01-20

    Next Generation Genome sequencing techniques became affordable for massive sequencing efforts devoted to clinical characterization of human diseases. However, the cost of providing cloud-based data analysis of the mounting datasets remains a concerning bottleneck for providing cost-effective clinical services. To address this computational problem, it is important to optimize the variant analysis workflow and the used analysis tools to reduce the overall computational processing time, and concomitantly reduce the processing cost. Furthermore, it is important to capitalize on the use of the recent development in the cloud computing market, which have witnessed more providers competing in terms of products and prices. In this paper, we present a new package called MC-GenomeKey (Multi-Cloud GenomeKey) that efficiently executes the variant analysis workflow for detecting and annotating mutations using cloud resources from different commercial cloud providers. Our package supports Amazon, Google, and Azure clouds, as well as, any other cloud platform based on OpenStack. Our package allows different scenarios of execution with different levels of sophistication, up to the one where a workflow can be executed using a cluster whose nodes come from different clouds. MC-GenomeKey also supports scenarios to exploit the spot instance model of Amazon in combination with the use of other cloud platforms to provide significant cost reduction. To the best of our knowledge, this is the first solution that optimizes the execution of the workflow using computational resources from different cloud providers. MC-GenomeKey provides an efficient multicloud based solution to detect and annotate mutations. The package can run in different commercial cloud platforms, which enables the user to seize the best offers. The package also provides a reliable means to make use of the low-cost spot instance model of Amazon, as it provides an efficient solution to the sudden termination of spot

  17. Genome editing: the road of CRISPR/Cas9 from bench to clinic

    KAUST Repository

    Eid, Ayman

    2016-10-14

    Molecular scissors engineered for site-specific modification of the genome hold great promise for effective functional analyses of genes, genomes and epigenomes and could improve our understanding of the molecular underpinnings of disease states and facilitate novel therapeutic applications. Several platforms for molecular scissors that enable targeted genome engineering have been developed, including zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and, most recently, clustered regularly interspaced palindromic repeats (CRISPR)/CRISPR-associated-9 (Cas9). The CRISPR/Cas9 system\\'s simplicity, facile engineering and amenability to multiplexing make it the system of choice for many applications. CRISPR/Cas9 has been used to generate disease models to study genetic diseases. Improvements are urgently needed for various aspects of the CRISPR/Cas9 system, including the system\\'s precision, delivery and control over the outcome of the repair process. Here, we discuss the current status of genome engineering and its implications for the future of biological research and gene therapy.

  18. Robust and rapid algorithms facilitate large-scale whole genome sequencing downstream analysis in an integrative framework.

    Science.gov (United States)

    Li, Miaoxin; Li, Jiang; Li, Mulin Jun; Pan, Zhicheng; Hsu, Jacob Shujui; Liu, Dajiang J; Zhan, Xiaowei; Wang, Junwen; Song, Youqiang; Sham, Pak Chung

    2017-05-19

    Whole genome sequencing (WGS) is a promising strategy to unravel variants or genes responsible for human diseases and traits. However, there is a lack of robust platforms for a comprehensive downstream analysis. In the present study, we first proposed three novel algorithms, sequence gap-filled gene feature annotation, bit-block encoded genotypes and sectional fast access to text lines to address three fundamental problems. The three algorithms then formed the infrastructure of a robust parallel computing framework, KGGSeq, for integrating downstream analysis functions for whole genome sequencing data. KGGSeq has been equipped with a comprehensive set of analysis functions for quality control, filtration, annotation, pathogenic prediction and statistical tests. In the tests with whole genome sequencing data from 1000 Genomes Project, KGGSeq annotated several thousand more reliable non-synonymous variants than other widely used tools (e.g. ANNOVAR and SNPEff). It took only around half an hour on a small server with 10 CPUs to access genotypes of ∼60 million variants of 2504 subjects, while a popular alternative tool required around one day. KGGSeq's bit-block genotype format used 1.5% or less space to flexibly represent phased or unphased genotypes with multiple alleles and achieved a speed of over 1000 times faster to calculate genotypic correlation. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. A novel ATM-dependent checkpoint defect distinct from loss of function mutation promotes genomic instability in melanoma.

    Science.gov (United States)

    Spoerri, Loredana; Brooks, Kelly; Chia, KeeMing; Grossman, Gavriel; Ellis, Jonathan J; Dahmer-Heath, Mareike; Škalamera, Dubravka; Pavey, Sandra; Burmeister, Bryan; Gabrielli, Brian

    2016-05-01

    Melanomas have high levels of genomic instability that can contribute to poor disease prognosis. Here, we report a novel defect of the ATM-dependent cell cycle checkpoint in melanoma cell lines that promotes genomic instability. In defective cells, ATM signalling to CHK2 is intact, but the cells are unable to maintain the cell cycle arrest due to elevated PLK1 driving recovery from the arrest. Reducing PLK1 activity recovered the ATM-dependent checkpoint arrest, and over-expressing PLK1 was sufficient to overcome the checkpoint arrest and increase genomic instability. Loss of the ATM-dependent checkpoint did not affect sensitivity to ionizing radiation demonstrating that this defect is distinct from ATM loss of function mutations. The checkpoint defective melanoma cell lines over-express PLK1, and a significant proportion of melanomas have high levels of PLK1 over-expression suggesting this defect is a common feature of melanomas. The inability of ATM to impose a cell cycle arrest in response to DNA damage increases genomic instability. This work also suggests that the ATM-dependent checkpoint arrest is likely to be defective in a higher proportion of cancers than previously expected. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  20. Exploiting the functional and taxonomic structure of genomic data by probabilistic topic modeling.

    Science.gov (United States)

    Chen, Xin; Hu, Xiaohua; Lim, Tze Y; Shen, Xiajiong; Park, E K; Rosen, Gail L

    2012-01-01

    In this paper, we present a method that enable both homology-based approach and composition-based approach to further study the functional core (i.e., microbial core and gene core, correspondingly). In the proposed method, the identification of major functionality groups is achieved by generative topic modeling, which is able to extract useful information from unlabeled data. We first show that generative topic model can be used to model the taxon abundance information obtained by homology-based approach and study the microbial core. The model considers each sample as a “document,” which has a mixture of functional groups, while each functional group (also known as a “latent topic”) is a weight mixture of species. Therefore, estimating the generative topic model for taxon abundance data will uncover the distribution over latent functions (latent topic) in each sample. Second, we show that, generative topic model can also be used to study the genome-level composition of “N-mer” features (DNA subreads obtained by composition-based approaches). The model consider each genome as a mixture of latten genetic patterns (latent topics), while each functional pattern is a weighted mixture of the “N-mer” features, thus the existence of core genomes can be indicated by a set of common N-mer features. After studying the mutual information between latent topics and gene regions, we provide an explanation of the functional roles of uncovered latten genetic patterns. The experimental results demonstrate the effectiveness of proposed method.

  1. Latency Entry of Herpes Simplex Virus 1 Is Determined by the Interaction of Its Genome with the Nuclear Environment

    Science.gov (United States)

    Cohen, Camille; Streichenberger, Nathalie; Texier, Pascale; Takissian, Julie; Rousseau, Antoine; Poccardi, Nolwenn; Welsch, Jérémy; Corpet, Armelle; Schaeffer, Laurent; Labetoulle, Marc; Lomonte, Patrick

    2016-01-01

    Herpes simplex virus 1 (HSV-1) establishes latency in trigeminal ganglia (TG) sensory neurons of infected individuals. The commitment of infected neurons toward the viral lytic or latent transcriptional program is likely to depend on both viral and cellular factors, and to differ among individual neurons. In this study, we used a mouse model of HSV-1 infection to investigate the relationship between viral genomes and the nuclear environment in terms of the establishment of latency. During acute infection, viral genomes show two major patterns: replication compartments or multiple spots distributed in the nucleoplasm (namely “multiple-acute”). Viral genomes in the “multiple-acute” pattern are systematically associated with the promyelocytic leukemia (PML) protein in structures designated viral DNA-containing PML nuclear bodies (vDCP-NBs). To investigate the viral and cellular features that favor the acquisition of the latency-associated viral genome patterns, we infected mouse primary TG neurons from wild type (wt) mice or knock-out mice for type 1 interferon (IFN) receptor with wt or a mutant HSV-1, which is unable to replicate due to the synthesis of a non-functional ICP4, the major virus transactivator. We found that the inability of the virus to initiate the lytic program combined to its inability to synthesize a functional ICP0, are the two viral features leading to the formation of vDCP-NBs. The formation of the “multiple-latency” pattern is favored by the type 1 IFN signaling pathway in the context of neurons infected by a virus able to replicate through the expression of a functional ICP4 but unable to express functional VP16 and ICP0. Analyses of TGs harvested from HSV-1 latently infected humans showed that viral genomes and PML occupy similar nuclear areas in infected neurons, eventually forming vDCP-NB-like structures. Overall our study designates PML protein and PML-NBs to be major cellular components involved in the control of HSV-1 latency

  2. Comparison of Comparative Genomic Hybridization Technologies across Microarray Platforms

    Science.gov (United States)

    In the 2007 Association of Biomolecular Resource Facilities (ABRF) Microarray Research Group (MARG) project, we analyzed HL-60 DNA with five platforms: Agilent, Affymetrix 500K, Affymetrix U133 Plus 2.0, Illumina, and RPCI 19K BAC arrays. Copy number variation (CNV) was analyzed ...

  3. Exploring a Tomato Landraces Collection for Fruit-Related Traits by the Aid of a High-Throughput Genomic Platform.

    Science.gov (United States)

    Sacco, Adriana; Ruggieri, Valentino; Parisi, Mario; Festa, Giovanna; Rigano, Maria Manuela; Picarella, Maurizio Enea; Mazzucato, Andrea; Barone, Amalia

    2015-01-01

    During its evolution and domestication Solanum lycopersicum has undergone various genetic 'bottlenecks' and extreme inbreeding of limited genotypes. In Europe the tomato found a secondary centre for diversification, which resulted in a wide array of fruit shape variation given rise to a range of landraces that have been cultivated for centuries. Landraces represent a reservoir of genetic diversity especially for traits such as abiotic stress resistance and high fruit quality. Information about the variation present among tomato landrace populations is still limited. A collection of 123 genotypes from different geographical areas was established with the aim of capturing a wide diversity. Eighteen morphological traits were evaluated, mainly related to the fruit. About 45% of morphological variation was attributed to fruit shape, as estimated by the principal component analysis, and the dendrogram of relatedness divided the population in subgroups mainly on the basis of fruit weight and locule number. Genotyping was carried out using the tomato array platform SolCAP able to interrogate 7,720 SNPs. In the whole collection 87.1% markers were polymorphic but they decreased to 44-54% when considering groups of genotypes with different origin. The neighbour-joining tree analysis clustered the 123 genotypes into two main branches. The STRUCTURE analysis with K = 3 also divided the population on the basis of fruit size. A genomic-wide association strategy revealed 36 novel markers associated to the variation of 15 traits. The markers were mapped on the tomato chromosomes together with 98 candidate genes for the traits analyzed. Six regions were evidenced in which candidate genes co-localized with 19 associated SNPs. In addition, 17 associated SNPs were localized in genomic regions lacking candidate genes. The identification of these markers demonstrated that novel variability was captured in our germoplasm collection. They might also provide a viable indirect selection tool

  4. Exploring a Tomato Landraces Collection for Fruit-Related Traits by the Aid of a High-Throughput Genomic Platform.

    Directory of Open Access Journals (Sweden)

    Adriana Sacco

    Full Text Available During its evolution and domestication Solanum lycopersicum has undergone various genetic 'bottlenecks' and extreme inbreeding of limited genotypes. In Europe the tomato found a secondary centre for diversification, which resulted in a wide array of fruit shape variation given rise to a range of landraces that have been cultivated for centuries. Landraces represent a reservoir of genetic diversity especially for traits such as abiotic stress resistance and high fruit quality. Information about the variation present among tomato landrace populations is still limited. A collection of 123 genotypes from different geographical areas was established with the aim of capturing a wide diversity. Eighteen morphological traits were evaluated, mainly related to the fruit. About 45% of morphological variation was attributed to fruit shape, as estimated by the principal component analysis, and the dendrogram of relatedness divided the population in subgroups mainly on the basis of fruit weight and locule number. Genotyping was carried out using the tomato array platform SolCAP able to interrogate 7,720 SNPs. In the whole collection 87.1% markers were polymorphic but they decreased to 44-54% when considering groups of genotypes with different origin. The neighbour-joining tree analysis clustered the 123 genotypes into two main branches. The STRUCTURE analysis with K = 3 also divided the population on the basis of fruit size. A genomic-wide association strategy revealed 36 novel markers associated to the variation of 15 traits. The markers were mapped on the tomato chromosomes together with 98 candidate genes for the traits analyzed. Six regions were evidenced in which candidate genes co-localized with 19 associated SNPs. In addition, 17 associated SNPs were localized in genomic regions lacking candidate genes. The identification of these markers demonstrated that novel variability was captured in our germoplasm collection. They might also provide a viable

  5. Fluidic Logic Used in a Systems Approach to Enable Integrated Single-cell Functional Analysis

    Directory of Open Access Journals (Sweden)

    Naveen Ramalingam

    2016-09-01

    Full Text Available The study of single cells has evolved over the past several years to include expression and genomic analysis of an increasing number of single cells. Several studies have demonstrated wide-spread variation and heterogeneity within cell populations of similar phenotype. While the characterization of these populations will likely set the foundation for our understanding of genomic- and expression-based diversity, it will not be able to link the functional differences of a single cell to its underlying genomic structure and activity. Currently, it is difficult to perturb single cells in a controlled environment, monitor and measure the response due to perturbation, and link these response measurements to downstream genomic and transcriptomic analysis. In order to address this challenge, we developed a platform to integrate and miniaturize many of the experimental steps required to study single-cell function. The heart of this platform is an elastomer-based Integrated Fluidic Circuit (IFC that uses fluidic logic to select and sequester specific single cells based on a phenotypic trait for downstream experimentation. Experiments with sequestered cells that have been performed include on-chip culture, exposure to a variety of stimulants, and post-exposure image-based response analysis, followed by preparation of the mRNA transcriptome for massively parallel sequencing analysis. The flexible system embodies experimental design and execution that enable routine functional studies of single cells.

  6. Snf2 family gene distribution in higher plant genomes reveals DRD1 expansion and diversification in the tomato genome.

    Directory of Open Access Journals (Sweden)

    Joachim W Bargsten

    Full Text Available As part of large protein complexes, Snf2 family ATPases are responsible for energy supply during chromatin remodeling, but the precise mechanism of action of many of these proteins is largely unknown. They influence many processes in plants, such as the response to environmental stress. This analysis is the first comprehensive study of Snf2 family ATPases in plants. We here present a comparative analysis of 1159 candidate plant Snf2 genes in 33 complete and annotated plant genomes, including two green algae. The number of Snf2 ATPases shows considerable variation across plant genomes (17-63 genes. The DRD1, Rad5/16 and Snf2 subfamily members occur most often. Detailed analysis of the plant-specific DRD1 subfamily in related plant genomes shows the occurrence of a complex series of evolutionary events. Notably tomato carries unexpected gene expansions of DRD1 gene members. Most of these genes are expressed in tomato, although at low levels and with distinct tissue or organ specificity. In contrast, the Snf2 subfamily genes tend to be expressed constitutively in tomato. The results underpin and extend the Snf2 subfamily classification, which could help to determine the various functional roles of Snf2 ATPases and to target environmental stress tolerance and yield in future breeding.

  7. The Harvest suite for rapid core-genome alignment and visualization of thousands of intraspecific microbial genomes.

    Science.gov (United States)

    Treangen, Todd J; Ondov, Brian D; Koren, Sergey; Phillippy, Adam M

    2014-01-01

    Whole-genome sequences are now available for many microbial species and clades, however existing whole-genome alignment methods are limited in their ability to perform sequence comparisons of multiple sequences simultaneously. Here we present the Harvest suite of core-genome alignment and visualization tools for the rapid and simultaneous analysis of thousands of intraspecific microbial strains. Harvest includes Parsnp, a fast core-genome multi-aligner, and Gingr, a dynamic visual platform. Together they provide interactive core-genome alignments, variant calls, recombination detection, and phylogenetic trees. Using simulated and real data we demonstrate that our approach exhibits unrivaled speed while maintaining the accuracy of existing methods. The Harvest suite is open-source and freely available from: http://github.com/marbl/harvest.

  8. A Multiplexed Single-Cell CRISPR Screening Platform Enables Systematic Dissection of the Unfolded Protein Response. | Office of Cancer Genomics

    Science.gov (United States)

    Functional genomics efforts face tradeoffs between number of perturbations examined and complexity of phenotypes measured. We bridge this gap with Perturb-seq, which combines droplet-based single-cell RNA-seq with a strategy for barcoding CRISPR-mediated perturbations, allowing many perturbations to be profiled in pooled format. We applied Perturb-seq to dissect the mammalian unfolded protein response (UPR) using single and combinatorial CRISPR perturbations. Two genome-scale CRISPR interference (CRISPRi) screens identified genes whose repression perturbs ER homeostasis.

  9. Genome-Wide Analysis of PAPS1-Dependent Polyadenylation Identifies Novel Roles for Functionally Specialized Poly(A Polymerases in Arabidopsis thaliana.

    Directory of Open Access Journals (Sweden)

    Christian Kappel

    2015-08-01

    Full Text Available The poly(A tail at 3' ends of eukaryotic mRNAs promotes their nuclear export, stability and translational efficiency, and changes in its length can strongly impact gene expression. The Arabidopsis thaliana genome encodes three canonical nuclear poly(A polymerases, PAPS1, PAPS2 and PAPS4. As shown by their different mutant phenotypes, these three isoforms are functionally specialized, with PAPS1 modifying organ growth and suppressing a constitutive immune response. However, the molecular basis of this specialization is largely unknown. Here, we have estimated poly(A-tail lengths on a transcriptome-wide scale in wild-type and paps1 mutants. This identified categories of genes as particularly strongly affected in paps1 mutants, including genes encoding ribosomal proteins, cell-division factors and major carbohydrate-metabolic proteins. We experimentally verified two novel functions of PAPS1 in ribosome biogenesis and redox homoeostasis that were predicted based on the analysis of poly(A-tail length changes in paps1 mutants. When overlaying the PAPS1-dependent effects observed here with coexpression analysis based on independent microarray data, the two clusters of transcripts that are most closely coexpressed with PAPS1 show the strongest change in poly(A-tail length and transcript abundance in paps1 mutants in our analysis. This suggests that their coexpression reflects at least partly the preferential polyadenylation of these transcripts by PAPS1 versus the other two poly(A-polymerase isoforms. Thus, transcriptome-wide analysis of poly(A-tail lengths identifies novel biological functions and likely target transcripts for polyadenylation by PAPS1. Data integration with large-scale co-expression data suggests that changes in the relative activities of the isoforms are used as an endogenous mechanism to co-ordinately modulate plant gene expression.

  10. Systems Biological Determination of the Epi-Genomic Structure Function Relation: : Nucleosomal Association Changes, Intra/Inter Chromosomal Architecture, Transcriptional Structure Relationship, Simulations of Nucleosomal/Chromatin Fiber/Chromosome Architecture and Dynamics, System Biological/Medical Result Integration via the GLOBE 3D Genome Platform.

    NARCIS (Netherlands)

    T.A. Knoch (Tobias); P.R. Cook (Peter); K. Rippe (Karsten); Gernot Längst; G. Wedemann (Gero); F.G. Grosveld (Frank)

    2010-01-01

    textabstractDespite our knowledge of the sequence of the human genome, the relation of its three-dimensional dynamic architecture with its function – the storage and expression of genetic information – remains one of the central unresolved issues of our age. It became very clear meanwhile that this

  11. Systematic Functional Interrogation of Rare Cancer Variants Identifies Oncogenic Alleles | Office of Cancer Genomics

    Science.gov (United States)

    Cancer genome characterization efforts now provide an initial view of the somatic alterations in primary tumors. However, most point mutations occur at low frequency, and the function of these alleles remains undefined. We have developed a scalable systematic approach to interrogate the function of cancer-associated gene variants. We subjected 474 mutant alleles curated from 5,338 tumors to pooled in vivo tumor formation assays and gene expression profiling. We identified 12 transforming alleles, including two in genes (PIK3CB, POT1) that have not been shown to be tumorigenic.

  12. Genome sequence of herpes simplex virus 1 strain KOS.

    Science.gov (United States)

    Macdonald, Stuart J; Mostafa, Heba H; Morrison, Lynda A; Davido, David J

    2012-06-01

    Herpes simplex virus type 1 (HSV-1) strain KOS has been extensively used in many studies to examine HSV-1 replication, gene expression, and pathogenesis. Notably, strain KOS is known to be less pathogenic than the first sequenced genome of HSV-1, strain 17. To understand the genotypic differences between KOS and other phenotypically distinct strains of HSV-1, we sequenced the viral genome of strain KOS. When comparing strain KOS to strain 17, there are at least 1,024 small nucleotide polymorphisms (SNPs) and 172 insertions/deletions (indels). The polymorphisms observed in the KOS genome will likely provide insights into the genes, their protein products, and the cis elements that regulate the biology of this HSV-1 strain.

  13. An Ontology-Based GIS for Genomic Data Management of Rumen Microbes

    Directory of Open Access Journals (Sweden)

    Saber Jelokhani-Niaraki

    2015-03-01

    Full Text Available During recent years, there has been exponential growth in biological information. With the emergence of large datasets in biology, life scientists are encountering bottlenecks in handling the biological data. This study presents an integrated geographic information system (GIS-ontology application for handling microbial genome data. The application uses a linear referencing technique as one of the GIS functionalities to represent genes as linear events on the genome layer, where users can define/change the attributes of genes in an event table and interactively see the gene events on a genome layer. Our application adopted ontology to portray and store genomic data in a semantic framework, which facilitates data-sharing among biology domains, applications, and experts. The application was developed in two steps. In the first step, the genome annotated data were prepared and stored in a MySQL database. The second step involved the connection of the database to both ArcGIS and Protégé as the GIS engine and ontology platform, respectively. We have designed this application specifically to manage the genome-annotated data of rumen microbial populations. Such a GIS-ontology application offers powerful capabilities for visualizing, managing, reusing, sharing, and querying genome-related data.

  14. An Ontology-Based GIS for Genomic Data Management of Rumen Microbes

    Science.gov (United States)

    Jelokhani-Niaraki, Saber; Minuchehr, Zarrin; Nassiri, Mohammad Reza

    2015-01-01

    During recent years, there has been exponential growth in biological information. With the emergence of large datasets in biology, life scientists are encountering bottlenecks in handling the biological data. This study presents an integrated geographic information system (GIS)-ontology application for handling microbial genome data. The application uses a linear referencing technique as one of the GIS functionalities to represent genes as linear events on the genome layer, where users can define/change the attributes of genes in an event table and interactively see the gene events on a genome layer. Our application adopted ontology to portray and store genomic data in a semantic framework, which facilitates data-sharing among biology domains, applications, and experts. The application was developed in two steps. In the first step, the genome annotated data were prepared and stored in a MySQL database. The second step involved the connection of the database to both ArcGIS and Protégé as the GIS engine and ontology platform, respectively. We have designed this application specifically to manage the genome-annotated data of rumen microbial populations. Such a GIS-ontology application offers powerful capabilities for visualizing, managing, reusing, sharing, and querying genome-related data. PMID:25873847

  15. Cloud-based interactive analytics for terabytes of genomic variants data.

    Science.gov (United States)

    Pan, Cuiping; McInnes, Gregory; Deflaux, Nicole; Snyder, Michael; Bingham, Jonathan; Datta, Somalee; Tsao, Philip S

    2017-12-01

    Large scale genomic sequencing is now widely used to decipher questions in diverse realms such as biological function, human diseases, evolution, ecosystems, and agriculture. With the quantity and diversity these data harbor, a robust and scalable data handling and analysis solution is desired. We present interactive analytics using a cloud-based columnar database built on Dremel to perform information compression, comprehensive quality controls, and biological information retrieval in large volumes of genomic data. We demonstrate such Big Data computing paradigms can provide orders of magnitude faster turnaround for common genomic analyses, transforming long-running batch jobs submitted via a Linux shell into questions that can be asked from a web browser in seconds. Using this method, we assessed a study population of 475 deeply sequenced human genomes for genomic call rate, genotype and allele frequency distribution, variant density across the genome, and pharmacogenomic information. Our analysis framework is implemented in Google Cloud Platform and BigQuery. Codes are available at https://github.com/StanfordBioinformatics/mvp_aaa_codelabs. cuiping@stanford.edu or ptsao@stanford.edu. Supplementary data are available at Bioinformatics online. Published by Oxford University Press 2017. This work is written by US Government employees and are in the public domain in the US.

  16. A Thousand Fly Genomes: An Expanded Drosophila Genome Nexus.

    Science.gov (United States)

    Lack, Justin B; Lange, Jeremy D; Tang, Alison D; Corbett-Detig, Russell B; Pool, John E

    2016-12-01

    The Drosophila Genome Nexus is a population genomic resource that provides D. melanogaster genomes from multiple sources. To facilitate comparisons across data sets, genomes are aligned using a common reference alignment pipeline which involves two rounds of mapping. Regions of residual heterozygosity, identity-by-descent, and recent population admixture are annotated to enable data filtering based on the user's needs. Here, we present a significant expansion of the Drosophila Genome Nexus, which brings the current data object to a total of 1,121 wild-derived genomes. New additions include 305 previously unpublished genomes from inbred lines representing six population samples in Egypt, Ethiopia, France, and South Africa, along with another 193 genomes added from recently-published data sets. We also provide an aligned D. simulans genome to facilitate divergence comparisons. This improved resource will broaden the range of population genomic questions that can addressed from multi-population allele frequencies and haplotypes in this model species. The larger set of genomes will also enhance the discovery of functionally relevant natural variation that exists within and between populations. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  17. The Genomic Code: Genome Evolution and Potential Applications

    KAUST Repository

    Bernardi, Giorgio

    2016-01-25

    The genome of metazoans is organized according to a genomic code which comprises three laws: 1) Compositional correlations hold between contiguous coding and non-coding sequences, as well as among the three codon positions of protein-coding genes; these correlations are the consequence of the fact that the genomes under consideration consist of fairly homogeneous, long (≥200Kb) sequences, the isochores; 2) Although isochores are defined on the basis of purely compositional properties, GC levels of isochores are correlated with all tested structural and functional properties of the genome; 3) GC levels of isochores are correlated with chromosome architecture from interphase to metaphase; in the case of interphase the correlation concerns isochores and the three-dimensional “topological associated domains” (TADs); in the case of mitotic chromosomes, the correlation concerns isochores and chromosomal bands. Finally, the genomic code is the fourth and last pillar of molecular biology, the first three pillars being 1) the double helix structure of DNA; 2) the regulation of gene expression in prokaryotes; and 3) the genetic code.

  18. Functional genomic characterization of virulence factors from necrotizing fasciitis-causing strains of Aeromonas hydrophila.

    Science.gov (United States)

    Grim, Christopher J; Kozlova, Elena V; Ponnusamy, Duraisamy; Fitts, Eric C; Sha, Jian; Kirtley, Michelle L; van Lier, Christina J; Tiner, Bethany L; Erova, Tatiana E; Joseph, Sandeep J; Read, Timothy D; Shak, Joshua R; Joseph, Sam W; Singletary, Ed; Felland, Tracy; Baze, Wallace B; Horneman, Amy J; Chopra, Ashok K

    2014-07-01

    The genomes of 10 Aeromonas isolates identified and designated Aeromonas hydrophila WI, Riv3, and NF1 to NF4; A. dhakensis SSU; A. jandaei Riv2; and A. caviae NM22 and NM33 were sequenced and annotated. Isolates NF1 to NF4 were from a patient with necrotizing fasciitis (NF). Two environmental isolates (Riv2 and -3) were from the river water from which the NF patient acquired the infection. While isolates NF2 to NF4 were clonal, NF1 was genetically distinct. Outside the conserved core genomes of these 10 isolates, several unique genomic features were identified. The most virulent strains possessed one of the following four virulence factors or a combination of them: cytotoxic enterotoxin, exotoxin A, and type 3 and 6 secretion system effectors AexU and Hcp. In a septicemic-mouse model, SSU, NF1, and Riv2 were the most virulent, while NF2 was moderately virulent. These data correlated with high motility and biofilm formation by the former three isolates. Conversely, in a mouse model of intramuscular infection, NF2 was much more virulent than NF1. Isolates NF2, SSU, and Riv2 disseminated in high numbers from the muscular tissue to the visceral organs of mice, while NF1 reached the liver and spleen in relatively lower numbers on the basis of colony counting and tracking of bioluminescent strains in real time by in vivo imaging. Histopathologically, degeneration of myofibers with significant infiltration of polymorphonuclear cells due to the highly virulent strains was noted. Functional genomic analysis provided data that allowed us to correlate the highly infectious nature of Aeromonas pathotypes belonging to several different species with virulence signatures and their potential ability to cause NF. Copyright © 2014, American Society for Microbiology. All Rights Reserved.

  19. Analysis of CR1 Repeats in the Zebra Finch Genome

    Directory of Open Access Journals (Sweden)

    George E. Liu

    2013-06-01

    Full Text Available Most bird species have smaller genomes and fewer repeats than mammals. Chicken Repeat 1 (CR1 repeat is one of the most abundant families of repeats, ranging from ~133,000 to ~187,000 copies accounting for ~50 to ~80% of the interspersed repeats in the zebra finch and chicken genomes, respectively. CR1 repeats are believed to have arisen from the retrotransposition of a small number of master elements, which gave rise to multiple CR1 subfamilies in the chicken. In this study, we performed a global assessment of the divergence distributions, phylogenies, and consensus sequences of CR1 repeats in the zebra finch genome. We identified and validated 34 CR1 subfamilies and further analyzed the correlation between these subfamilies. We also discovered 4 novel lineage-specific CR1 subfamilies in the zebra finch when compared to the chicken genome. We built various evolutionary trees of these subfamilies and concluded that CR1 repeats may play an important role in reshaping the structure of bird genomes.

  20. New genomic resources for switchgrass: a BAC library and comparative analysis of homoeologous genomic regions harboring bioenergy traits

    Directory of Open Access Journals (Sweden)

    Feltus Frank A

    2011-07-01

    homoeologous harboring OsBRI1 orthologs present a glimpse into the switchgrass genome structure and complexity. Data obtained demonstrate the feasibility of using HICF fingerprinting to resolve the homoeologous chromosomes of the two distinct genomes in switchgrass, providing a robust and accurate BAC-based physical platform for this species. The genomic resources and sequence data generated will lay the foundation for deciphering the switchgrass genome and lead the way for an accurate genome sequencing strategy.

  1. Sequencing and annotation of mitochondrial genomes from individual parasitic helminths.

    Science.gov (United States)

    Jex, Aaron R; Littlewood, D Timothy; Gasser, Robin B

    2015-01-01

    Mitochondrial (mt) genomics has significant implications in a range of fundamental areas of parasitology, including evolution, systematics, and population genetics as well as explorations of mt biochemistry, physiology, and function. Mt genomes also provide a rich source of markers to aid molecular epidemiological and ecological studies of key parasites. However, there is still a paucity of information on mt genomes for many metazoan organisms, particularly parasitic helminths, which has often related to challenges linked to sequencing from tiny amounts of material. The advent of next-generation sequencing (NGS) technologies has paved the way for low cost, high-throughput mt genomic research, but there have been obstacles, particularly in relation to post-sequencing assembly and analyses of large datasets. In this chapter, we describe protocols for the efficient amplification and sequencing of mt genomes from small portions of individual helminths, and highlight the utility of NGS platforms to expedite mt genomics. In addition, we recommend approaches for manual or semi-automated bioinformatic annotation and analyses to overcome the bioinformatic "bottleneck" to research in this area. Taken together, these approaches have demonstrated applicability to a range of parasites and provide prospects for using complete mt genomic sequence datasets for large-scale molecular systematic and epidemiological studies. In addition, these methods have broader utility and might be readily adapted to a range of other medium-sized molecular regions (i.e., 10-100 kb), including large genomic operons, and other organellar (e.g., plastid) and viral genomes.

  2. Single-cell genomics reveals pyrrolysine-encoding potential in members of uncultivated archaeal candidate division MSBL1

    KAUST Repository

    Guan, Yue

    2017-05-11

    Pyrrolysine (Pyl), the 22nd canonical amino acid, is only decoded and synthesized by a limited number of organisms in the domains Archaea and Bacteria. Pyl is encoded by the amber codon UAG, typically a stop codon. To date, all known Pyl-decoding archaea are able to carry out methylotrophic methanogenesis. The functionality of methylamine methyltransferases, an important component of corrinoid-dependent methyltransfer reactions, depends on the presence of Pyl. Here, we present a putative pyl gene cluster obtained from single-cell genomes of the archaeal Mediterranean Sea Brine Lakes group 1 (MSBL1) from the Red Sea. Functional annotation of the MSBL1 single cell amplified genomes (SAGs) also revealed a complete corrinoid-dependent methyl-transfer pathway suggesting that members of MSBL1 may possibly be capable of synthesizing Pyl and metabolizing methylated amines. This article is protected by copyright. All rights reserved.

  3. Single-cell genomics reveals pyrrolysine-encoding potential in members of uncultivated archaeal candidate division MSBL1

    KAUST Repository

    Guan, Yue; Haroon, Mohamed; Alam, Intikhab; Ferry, James G.; Stingl, Ulrich

    2017-01-01

    Pyrrolysine (Pyl), the 22nd canonical amino acid, is only decoded and synthesized by a limited number of organisms in the domains Archaea and Bacteria. Pyl is encoded by the amber codon UAG, typically a stop codon. To date, all known Pyl-decoding archaea are able to carry out methylotrophic methanogenesis. The functionality of methylamine methyltransferases, an important component of corrinoid-dependent methyltransfer reactions, depends on the presence of Pyl. Here, we present a putative pyl gene cluster obtained from single-cell genomes of the archaeal Mediterranean Sea Brine Lakes group 1 (MSBL1) from the Red Sea. Functional annotation of the MSBL1 single cell amplified genomes (SAGs) also revealed a complete corrinoid-dependent methyl-transfer pathway suggesting that members of MSBL1 may possibly be capable of synthesizing Pyl and metabolizing methylated amines. This article is protected by copyright. All rights reserved.

  4. Genome editing: progress and challenges for medical applications

    Directory of Open Access Journals (Sweden)

    Dana Carroll

    2016-11-01

    Full Text Available Editorial summary The development of the CRISPR-Cas platform for genome editing has greatly simplified the process of making targeted genetic modifications. Applications of genome editing are expected to have a substantial impact on human therapies through the development of better animal models, new target discovery, and direct therapeutic intervention.

  5. Product Platform Replacements

    DEFF Research Database (Denmark)

    Sköld, Martin; Karlsson, Christer

    2012-01-01

    . To shed light on this unexplored and growing managerial concern, the purpose of this explorative study is to identify operational challenges to management when product platforms are replaced. Design/methodology/approach – The study uses a longitudinal field-study approach. Two companies, Gamma and Omega...... replacement was chosen in each company. Findings – The study shows that platform replacements primarily challenge managers' existing knowledge about platform architectures. A distinction can be made between “width” and “height” in platform replacements, and it is crucial that managers observe this in order...... to challenge their existing knowledge about platform architectures. Issues on technologies, architectures, components and processes as well as on segments, applications and functions are identified. Practical implications – Practical implications are summarized and discussed in relation to a framework...

  6. Isolate, functionalize, and release: the ISOFURE platform for the functionalization of nanoparticles

    International Nuclear Information System (INIS)

    Chirra, Hariharasudhan D.; Spencer, David; Hilt, J. Zach

    2012-01-01

    Polymer-coated inorganic nanoparticles are widely studied as carrier systems for various biomedical applications. For their translation into clinical applications, it is critical that they remain in a stable, non-agglomerated state and are also produced at a high yield with ease. Herein, we introduce a novel methodology called the isolate-functionalize-release (ISOFURE) strategy to synthesize stable hydrogel-coated nanoparticles in the absence of a stabilizing reagent. As a model platform, we isolated and functionalized gold nanoparticles (GNPs) in a degradable poly(beta amino ester) (PBAE) hydrogel composite. Isolation was done either by adding GNPs to the PBAE hydrogel precursor solution or via in situ precipitation of GNPs inside the pre-synthesized PBAE hydrogel. Atom transfer radical polymerization (ATRP) was used as the model reaction to grow a temperature responsive poly(ethylene glycol) 600 dimethacrylate crosslinked poly(N-isopropyl acrylamide) hydrogel shell over the GNP core. Release of the temperature responsive hydrogel-coated GNPs was done by degrading the PBAE ISOFURE matrix in water. Characterization of the various nanoparticles using UV–Vis spectroscopy confirmed in situ precipitation of GNPs as well as ATRP over the GNPs. Light scattering confirmed the presence of temperature sensitive hydrogel shell over the GNP surface. Aging studies showed that the ISOFURE method based crosslinked hydrogel-coated GNPs showed higher stability and yield than that of the solution-based functionalized GNPs. The use of ISOFURE strategy shows promise as a simple tool for enhanced functionalization and stabilization over various nanocarriers under flexible experimental conditions.

  7. Insights into Conifer Giga-Genomes1

    Science.gov (United States)

    De La Torre, Amanda R.; Birol, Inanc; Bousquet, Jean; Ingvarsson, Pär K.; Jansson, Stefan; Jones, Steven J.M.; Keeling, Christopher I.; MacKay, John; Nilsson, Ove; Ritland, Kermit; Street, Nathaniel; Yanchuk, Alvin; Zerbe, Philipp; Bohlmann, Jörg

    2014-01-01

    Insights from sequenced genomes of major land plant lineages have advanced research in almost every aspect of plant biology. Until recently, however, assembled genome sequences of gymnosperms have been missing from this picture. Conifers of the pine family (Pinaceae) are a group of gymnosperms that dominate large parts of the world’s forests. Despite their ecological and economic importance, conifers seemed long out of reach for complete genome sequencing, due in part to their enormous genome size (20–30 Gb) and the highly repetitive nature of their genomes. Technological advances in genome sequencing and assembly enabled the recent publication of three conifer genomes: white spruce (Picea glauca), Norway spruce (Picea abies), and loblolly pine (Pinus taeda). These genome sequences revealed distinctive features compared with other plant genomes and may represent a window into the past of seed plant genomes. This Update highlights recent advances, remaining challenges, and opportunities in light of the publication of the first conifer and gymnosperm genomes. PMID:25349325

  8. CRISPR/Cas9 based genome editing of Penicillium chrysogenum

    NARCIS (Netherlands)

    Pohl, Carsten; Kiel, Jan A K W; Driessen, Arnold J M; Bovenberg, Roel A L; Nygård, Yvonne

    2016-01-01

    CRISPR/Cas9 based systems have emerged as versatile platforms for precision genome editing in a wide range of organisms. Here we have developed powerful CRISPR/Cas9 tools for marker-based and marker-free genome modifications in Penicillium chrysogenum, a model filamentous fungus and industrially

  9. STINGRAY: system for integrated genomic resources and analysis.

    Science.gov (United States)

    Wagner, Glauber; Jardim, Rodrigo; Tschoeke, Diogo A; Loureiro, Daniel R; Ocaña, Kary A C S; Ribeiro, Antonio C B; Emmel, Vanessa E; Probst, Christian M; Pitaluga, André N; Grisard, Edmundo C; Cavalcanti, Maria C; Campos, Maria L M; Mattoso, Marta; Dávila, Alberto M R

    2014-03-07

    The STINGRAY system has been conceived to ease the tasks of integrating, analyzing, annotating and presenting genomic and expression data from Sanger and Next Generation Sequencing (NGS) platforms. STINGRAY includes: (a) a complete and integrated workflow (more than 20 bioinformatics tools) ranging from functional annotation to phylogeny; (b) a MySQL database schema, suitable for data integration and user access control; and (c) a user-friendly graphical web-based interface that makes the system intuitive, facilitating the tasks of data analysis and annotation. STINGRAY showed to be an easy to use and complete system for analyzing sequencing data. While both Sanger and NGS platforms are supported, the system could be faster using Sanger data, since the large NGS datasets could potentially slow down the MySQL database usage. STINGRAY is available at http://stingray.biowebdb.org and the open source code at http://sourceforge.net/projects/stingray-biowebdb/.

  10. A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers

    Directory of Open Access Journals (Sweden)

    Quail Michael A

    2012-07-01

    Full Text Available Abstract Background Next generation sequencing (NGS technology has revolutionized genomic and genetic research. The pace of change in this area is rapid with three major new sequencing platforms having been released in 2011: Ion Torrent’s PGM, Pacific Biosciences’ RS and the Illumina MiSeq. Here we compare the results obtained with those platforms to the performance of the Illumina HiSeq, the current market leader. In order to compare these platforms, and get sufficient coverage depth to allow meaningful analysis, we have sequenced a set of 4 microbial genomes with mean GC content ranging from 19.3 to 67.7%. Together, these represent a comprehensive range of genome content. Here we report our analysis of that sequence data in terms of coverage distribution, bias, GC distribution, variant detection and accuracy. Results Sequence generated by Ion Torrent, MiSeq and Pacific Biosciences technologies displays near perfect coverage behaviour on GC-rich, neutral and moderately AT-rich genomes, but a profound bias was observed upon sequencing the extremely AT-rich genome of Plasmodium falciparum on the PGM, resulting in no coverage for approximately 30% of the genome. We analysed the ability to call variants from each platform and found that we could call slightly more variants from Ion Torrent data compared to MiSeq data, but at the expense of a higher false positive rate. Variant calling from Pacific Biosciences data was possible but higher coverage depth was required. Context specific errors were observed in both PGM and MiSeq data, but not in that from the Pacific Biosciences platform. Conclusions All three fast turnaround sequencers evaluated here were able to generate usable sequence. However there are key differences between the quality of that data and the applications it will support.

  11. Universal platform for quantitative analysis of DNA transposition

    Directory of Open Access Journals (Sweden)

    Pajunen Maria I

    2010-11-01

    Full Text Available Abstract Background Completed genome projects have revealed an astonishing diversity of transposable genetic elements, implying the existence of novel element families yet to be discovered from diverse life forms. Concurrently, several better understood transposon systems have been exploited as efficient tools in molecular biology and genomics applications. Characterization of new mobile elements and improvement of the existing transposition technology platforms warrant easy-to-use assays for the quantitative analysis of DNA transposition. Results Here we developed a universal in vivo platform for the analysis of transposition frequency with class II mobile elements, i.e., DNA transposons. For each particular transposon system, cloning of the transposon ends and the cognate transposase gene, in three consecutive steps, generates a multifunctional plasmid, which drives inducible expression of the transposase gene and includes a mobilisable lacZ-containing reporter transposon. The assay scores transposition events as blue microcolonies, papillae, growing within otherwise whitish Escherichia coli colonies on indicator plates. We developed the assay using phage Mu transposition as a test model and validated the platform using various MuA transposase mutants. For further validation and to illustrate universality, we introduced IS903 transposition system components into the assay. The developed assay is adjustable to a desired level of initial transposition via the control of a plasmid-borne E. coli arabinose promoter. In practice, the transposition frequency is modulated by varying the concentration of arabinose or glucose in the growth medium. We show that variable levels of transpositional activity can be analysed, thus enabling straightforward screens for hyper- or hypoactive transposase mutants, regardless of the original wild-type activity level. Conclusions The established universal papillation assay platform should be widely applicable to a

  12. CPSS: a computational platform for the analysis of small RNA deep sequencing data.

    Science.gov (United States)

    Zhang, Yuanwei; Xu, Bo; Yang, Yifan; Ban, Rongjun; Zhang, Huan; Jiang, Xiaohua; Cooke, Howard J; Xue, Yu; Shi, Qinghua

    2012-07-15

    Next generation sequencing (NGS) techniques have been widely used to document the small ribonucleic acids (RNAs) implicated in a variety of biological, physiological and pathological processes. An integrated computational tool is needed for handling and analysing the enormous datasets from small RNA deep sequencing approach. Herein, we present a novel web server, CPSS (a computational platform for the analysis of small RNA deep sequencing data), designed to completely annotate and functionally analyse microRNAs (miRNAs) from NGS data on one platform with a single data submission. Small RNA NGS data can be submitted to this server with analysis results being returned in two parts: (i) annotation analysis, which provides the most comprehensive analysis for small RNA transcriptome, including length distribution and genome mapping of sequencing reads, small RNA quantification, prediction of novel miRNAs, identification of differentially expressed miRNAs, piwi-interacting RNAs and other non-coding small RNAs between paired samples and detection of miRNA editing and modifications and (ii) functional analysis, including prediction of miRNA targeted genes by multiple tools, enrichment of gene ontology terms, signalling pathway involvement and protein-protein interaction analysis for the predicted genes. CPSS, a ready-to-use web server that integrates most functions of currently available bioinformatics tools, provides all the information wanted by the majority of users from small RNA deep sequencing datasets. CPSS is implemented in PHP/PERL+MySQL+R and can be freely accessed at http://mcg.ustc.edu.cn/db/cpss/index.html or http://mcg.ustc.edu.cn/sdap1/cpss/index.html.

  13. Comparative genome analysis of non-toxigenic non-O1 versus toxigenic O1 Vibrio cholerae

    Science.gov (United States)

    Mukherjee, Munmun; Kakarla, Prathusha; Kumar, Sanath; Gonzalez, Esmeralda; Floyd, Jared T.; Inupakutika, Madhuri; Devireddy, Amith Reddy; Tirrell, Selena R.; Bruns, Merissa; He, Guixin; Lindquist, Ingrid E.; Sundararajan, Anitha; Schilkey, Faye D.; Mudge, Joann; Varela, Manuel F.

    2015-01-01

    Pathogenic strains of Vibrio cholerae are responsible for endemic and pandemic outbreaks of the disease cholera. The complete toxigenic mechanisms underlying virulence in Vibrio strains are poorly understood. The hypothesis of this work was that virulent versus non-virulent strains of V. cholerae harbor distinctive genomic elements that encode virulence. The purpose of this study was to elucidate genomic differences between the O1 serotypes and non-O1 V. cholerae PS15, a non-toxigenic strain, in order to identify novel genes potentially responsible for virulence. In this study, we compared the whole genome of the non-O1 PS15 strain to the whole genomes of toxigenic serotypes at the phylogenetic level, and found that the PS15 genome was distantly related to those of toxigenic V. cholerae. Thus we focused on a detailed gene comparison between PS15 and the distantly related O1 V. cholerae N16961. Based on sequence alignment we tentatively assigned chromosome numbers 1 and 2 to elements within the genome of non-O1 V. cholerae PS15. Further, we found that PS15 and O1 V. cholerae N16961 shared 98% identity and 766 genes, but of the genes present in N16961 that were missing in the non-O1 V. cholerae PS15 genome, 56 were predicted to encode not only for virulence–related genes (colonization, antimicrobial resistance, and regulation of persister cells) but also genes involved in the metabolic biosynthesis of lipids, nucleosides and sulfur compounds. Additionally, we found 113 genes unique to PS15 that were predicted to encode other properties related to virulence, disease, defense, membrane transport, and DNA metabolism. Here, we identified distinctive and novel genomic elements between O1 and non-O1 V. cholerae genomes as potential virulence factors and, thus, targets for future therapeutics. Modulation of such novel targets may eventually enhance eradication efforts of endemic and pandemic disease cholera in afflicted nations. PMID:25722857

  14. Genomic and functional features of the biosurfactant producing Bacillus sp. AM13.

    Science.gov (United States)

    Shaligram, Shraddha; Kumbhare, Shreyas V; Dhotre, Dhiraj P; Muddeshwar, Manohar G; Kapley, Atya; Joseph, Neetha; Purohit, Hemant P; Shouche, Yogesh S; Pawar, Shrikant P

    2016-09-01

    Genomic studies provide deeper insights into secondary metabolites produced by diverse bacterial communities, residing in various environmental niches. This study aims to understand the potential of a biosurfactant producing Bacillus sp. AM13, isolated from soil. An integrated approach of genomic and chemical analysis was employed to characterize the antibacterial lipopeptide produced by the strain AM13. Genome analysis revealed that strain AM13 harbors a nonribosomal peptide synthetase (NRPS) cluster; highly similar with known biosynthetic gene clusters from surfactin family: lichenysin (85 %) and surfactin (78 %). These findings were substantiated with supplementary experiments of oil displacement assay and surface tension measurements, confirming the biosurfactant production. Further investigation using LCMS approach exhibited similarity of the biomolecule with biosurfactants of the surfactin family. Our consolidated effort of functional genomics provided chemical as well as genetic leads for understanding the biochemical characteristics of the bioactive compound.

  15. Genome Editing in Penicillium chrysogenum Using Cas9 Ribonucleoprotein Particles

    NARCIS (Netherlands)

    Pohl, Carsten; Mózsik, László; Driessen, Arnold J M; Bovenberg, Roel A L; Nygård, Yvonne I; Braman, Jeffrey Carl

    Several CRISPR/Cas9 tools have been recently established for precise genome editing in a wide range of filamentous fungi. This genome editing platform offers high flexibility in target selection and the possibility of introducing genetic deletions without the introduction of transgenic sequences .

  16. High throughput platforms for structural genomics of integral membrane proteins.

    Science.gov (United States)

    Mancia, Filippo; Love, James

    2011-08-01

    Structural genomics approaches on integral membrane proteins have been postulated for over a decade, yet specific efforts are lagging years behind their soluble counterparts. Indeed, high throughput methodologies for production and characterization of prokaryotic integral membrane proteins are only now emerging, while large-scale efforts for eukaryotic ones are still in their infancy. Presented here is a review of recent literature on actively ongoing structural genomics of membrane protein initiatives, with a focus on those aimed at implementing interesting techniques aimed at increasing our rate of success for this class of macromolecules. Copyright © 2011 Elsevier Ltd. All rights reserved.

  17. History and genomic sequence analysis of the herpes simplex virus 1 KOS and KOS1.1 sub-strains.

    Science.gov (United States)

    Colgrove, Robert C; Liu, Xueqiao; Griffiths, Anthony; Raja, Priya; Deluca, Neal A; Newman, Ruchi M; Coen, Donald M; Knipe, David M

    2016-01-01

    A collection of genomic DNA sequences of herpes simplex virus (HSV) strains has been defined and analyzed, and some information is available about genomic stability upon limited passage of viruses in culture. The nature of genomic change upon extensive laboratory passage remains to be determined. In this report we review the history of the HSV-1 KOS laboratory strain and the related KOS1.1 laboratory sub-strain, also called KOS (M), and determine the complete genomic sequence of an early passage stock of the KOS laboratory sub-strain and a laboratory stock of the KOS1.1 sub-strain. The genomes of the two sub-strains are highly similar with only five coding changes, 20 non-coding changes, and about twenty non-ORF sequence changes. The coding changes could potentially explain the KOS1.1 phenotypic properties of increased replication at high temperature and reduced neuroinvasiveness. The study also provides sequence markers to define the provenance of specific laboratory KOS virus stocks. Copyright © 2015 Elsevier Inc. All rights reserved.

  18. EvoCor: a platform for predicting functionally related genes using phylogenetic and expression profiles.

    Science.gov (United States)

    Dittmar, W James; McIver, Lauren; Michalak, Pawel; Garner, Harold R; Valdez, Gregorio

    2014-07-01

    The wealth of publicly available gene expression and genomic data provides unique opportunities for computational inference to discover groups of genes that function to control specific cellular processes. Such genes are likely to have co-evolved and be expressed in the same tissues and cells. Unfortunately, the expertise and computational resources required to compare tens of genomes and gene expression data sets make this type of analysis difficult for the average end-user. Here, we describe the implementation of a web server that predicts genes involved in affecting specific cellular processes together with a gene of interest. We termed the server 'EvoCor', to denote that it detects functional relationships among genes through evolutionary analysis and gene expression correlation. This web server integrates profiles of sequence divergence derived by a Hidden Markov Model (HMM) and tissue-wide gene expression patterns to determine putative functional linkages between pairs of genes. This server is easy to use and freely available at http://pilot-hmm.vbi.vt.edu/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. A comparison between genotyping-by-sequencing and array-based scoring of SNPs for genomic prediction accuracy in winter wheat.

    Science.gov (United States)

    Elbasyoni, Ibrahim S; Lorenz, A J; Guttieri, M; Frels, K; Baenziger, P S; Poland, J; Akhunov, E

    2018-05-01

    The utilization of DNA molecular markers in plant breeding to maximize selection response via marker-assisted selection (MAS) and genomic selection (GS) has revolutionized plant breeding. A key factor affecting GS applicability is the choice of molecular marker platform. Genotyping-by-sequencing scored SNPs (GBS-scored SNPs) provides a large number of markers, albeit with high rates of missing data. Array scored SNPs are of high quality, but the cost per sample is substantially higher. The objectives of this study were 1) compare GBS-scored SNPs, and array scored SNPs for genomic selection applications, and 2) compare estimates of genomic kinship and population structure calculated using the two marker platforms. SNPs were compared in a diversity panel consisting of 299 hard winter wheat (Triticum aestivum L.) accessions that were part of a multi-year, multi-environments association mapping study. The panel was phenotyped in Ithaca, Nebraska for heading date, plant height, days to physiological maturity and grain yield in 2012 and 2013. The panel was genotyped using GBS-scored SNPs, and array scored SNPs. Results indicate that GBS-scored SNPs is comparable to or better than Array-scored SNPs for