WorldWideScience

Sample records for gene ontology reveal

  1. Gene Ontology

    Directory of Open Access Journals (Sweden)

    Gaston K. Mazandu

    2012-01-01

    The wide coverage and biological relevance of the Gene Ontology (GO), confirmed through its successful use in protein function prediction, have led to the growth in its popularity. In order to exploit the extent of biological knowledge that GO offers in describing genes or groups of genes, there is a need for an efficient, scalable similarity measure for GO terms and GO-annotated proteins. While several GO similarity measures exist, none adequately addresses all issues surrounding the design and usage of the ontology. We introduce a new metric for measuring the distance between two GO terms using the intrinsic topology of the GO-DAG, thus enabling the measurement of functional similarities between proteins based on their GO annotations. We assess the performance of this metric using a ROC analysis on human protein-protein interaction datasets and correlation coefficient analysis on the selected set of protein pairs from the CESSM online tool. This metric achieves good performance compared to the existing annotation-based GO measures. We used this new metric to assess functional similarity between orthologues, and show that it is effective at determining whether orthologues are annotated with similar functions and identifying cases where annotation is inconsistent between orthologues.
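
The abstract above does not give the formula of its topology-based metric, so the following is only a minimal sketch of the general idea of scoring two terms from the shape of the DAG alone. The toy graph, the term names and the function names are all hypothetical, and the score used here (Jaccard overlap of ancestor sets) is a simple stand-in, not the metric proposed by the authors.

```python
from typing import Dict, Set

# Hypothetical toy DAG (not real GO data): term -> set of direct parents.
PARENTS: Dict[str, Set[str]] = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "C": {"A"},
    "D": {"A", "B"},
}


def ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return the term itself plus every term reachable via parent links."""
    seen = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def topo_similarity(t1: str, t2: str, parents: Dict[str, Set[str]]) -> float:
    """Jaccard overlap of the two ancestor sets (a stand-in score in [0, 1])."""
    a1, a2 = ancestors(t1, parents), ancestors(t2, parents)
    return len(a1 & a2) / len(a1 | a2)


if __name__ == "__main__":
    print(topo_similarity("C", "D", PARENTS))
    print(topo_similarity("C", "B", PARENTS))
```

Because the score depends only on the parent links, it needs no annotation corpus, which is the property the abstract emphasises when it contrasts its approach with annotation-based measures.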

  2. Gene Ontology Consortium: going forward.

    Science.gov (United States)

    2015-01-01

    The Gene Ontology (GO; http://www.geneontology.org) is a community-based bioinformatics resource that supplies information about gene product function using ontologies to represent biological knowledge. Here we describe improvements and expansions to several branches of the ontology, as well as updates that have allowed us to more efficiently disseminate the GO and capture feedback from the research community. The Gene Ontology Consortium (GOC) has expanded areas of the ontology such as cilia-related terms, cell-cycle terms and multicellular organism processes. We have also implemented new tools for generating ontology terms based on a set of logical rules making use of templates, and we have made efforts to increase our use of logical definitions. The GOC has a new and improved web site summarizing new developments and documentation, serving as a portal to GO data. Users can perform GO enrichment analysis, and search the GO for terms, annotations to gene products, and associated metadata across multiple species using the all-new AmiGO 2 browser. We encourage and welcome the input of the research community in all biological areas in our continued effort to improve the Gene Ontology. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  3. Exploring autophagy with Gene Ontology

    Science.gov (United States)

    2018-01-01

    Autophagy is a fundamental cellular process that is well conserved among eukaryotes. It is one of the strategies that cells use to catabolize substances in a controlled way. Autophagy is used for recycling cellular components, responding to cellular stresses and ridding cells of foreign material. Perturbations in autophagy have been implicated in a number of pathological conditions such as neurodegeneration, cardiac disease and cancer. The growing knowledge about autophagic mechanisms needs to be collected in a computable and shareable format to allow its use in data representation and interpretation. The Gene Ontology (GO) is a freely available resource that describes how and where gene products function in biological systems. It consists of 3 interrelated structured vocabularies that outline what gene products do at the biochemical level, where they act in a cell and the overall biological objectives to which their actions contribute. It also consists of ‘annotations’ that associate gene products with the terms. Here we describe how we represent autophagy in GO, how we create and define terms relevant to autophagy researchers and how we interrelate those terms to generate a coherent view of the process, therefore allowing an interoperable description of its biological aspects. We also describe how annotation of gene products with GO terms improves data analysis and interpretation, hence bringing a significant benefit to this field of study. PMID:29455577

  4. Defining functional distances over Gene Ontology

    Directory of Open Access Journals (Sweden)

    del Pozo Angela

    2008-01-01

    Background: A fundamental problem when trying to define the functional relationships between proteins is the difficulty in quantifying functional similarities, even when well-structured ontologies exist regarding the activity of proteins (i.e. the 'gene ontology', GO). However, functional metrics can overcome the problems in comparing and evaluating functional assignments and predictions. As a reference of proximity, previous approaches to compare GO terms considered linkage in terms of ontology weighted by a probability distribution that balances the non-uniform 'richness' of different parts of the Directed Acyclic Graph. Here, we have followed a different approach to quantify functional similarities between GO terms. Results: We propose a new method to derive 'functional distances' between GO terms that is based on the simultaneous occurrence of terms in the same set of Interpro entries, instead of relying on the structure of the GO. The coincidence of GO terms reveals natural biological links between the GO functions and defines a distance model Df which fulfils the properties of a Metric Space. The distances obtained in this way can be represented as a hierarchical 'Functional Tree'. Conclusion: The method proposed provides a new definition of distance that enables the similarity between GO terms to be quantified. Additionally, the 'Functional Tree' defines groups with biological meaning, enhancing its utility for protein function comparison and prediction. Finally, this approach could be used for function-based protein searches in databases, and for analysing the gene clusters produced by DNA array experiments.
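
The abstract above does not define its distance model Df, so the sketch below only illustrates the underlying idea of a co-occurrence-based distance that satisfies the metric axioms. It uses the Jaccard distance between the sets of entries in which two terms occur, which is a known metric but is an assumption here and should not be read as the authors' Df. The term and entry identifiers are hypothetical placeholders.

```python
from typing import Dict, Set

# Hypothetical data: GO term -> set of InterPro-like entries annotated with it.
TERM_TO_ENTRIES: Dict[str, Set[str]] = {
    "term_a": {"entry1", "entry2", "entry3"},
    "term_b": {"entry2", "entry3", "entry4"},
    "term_c": {"entry5"},
}


def cooccurrence_distance(
    t1: str, t2: str, term_to_entries: Dict[str, Set[str]]
) -> float:
    """Jaccard distance between the entry sets of two terms.

    0.0 means the terms always occur together; 1.0 means they never do.
    """
    e1, e2 = term_to_entries[t1], term_to_entries[t2]
    union = e1 | e2
    if not union:
        # Two terms with no entries at all are treated as identical.
        return 0.0
    return 1.0 - len(e1 & e2) / len(union)


if __name__ == "__main__":
    print(cooccurrence_distance("term_a", "term_b", TERM_TO_ENTRIES))
    print(cooccurrence_distance("term_a", "term_c", TERM_TO_ENTRIES))
```

A pairwise matrix of such distances could then be passed to any standard hierarchical clustering routine to obtain a tree of terms, in the spirit of the 'Functional Tree' the abstract mentions.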

  5. Gene Ontology-Based Analysis of Zebrafish Omics Data Using the Web Tool Comparative Gene Ontology.

    Science.gov (United States)

    Ebrahimie, Esmaeil; Fruzangohar, Mario; Moussavi Nik, Seyyed Hani; Newman, Morgan

    2017-10-01

    Gene Ontology (GO) analysis is a powerful tool in systems biology, which uses a defined nomenclature to annotate genes/proteins within three categories: "Molecular Function," "Biological Process," and "Cellular Component." GO analysis can assist in revealing functional mechanisms underlying observed patterns in transcriptomic, genomic, and proteomic data. The already extensive and increasing use of zebrafish for modeling genetic and other diseases highlights the need to develop a GO analytical tool for this organism. The web tool Comparative GO was originally developed for GO analysis of bacterial data in 2013 (www.comparativego.com). We have now upgraded and elaborated this web tool for analysis of zebrafish genetic data using GOs and annotations from the Gene Ontology Consortium.
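
The abstract above does not say which statistic Comparative GO uses. As an illustration of what a GO analysis of a gene list commonly involves, here is a minimal sketch of a one-sided hypergeometric over-representation test for a single GO term. The function name and the example counts are hypothetical, and the choice of test is an assumption rather than a description of the tool.

```python
from math import comb


def enrichment_p_value(
    population: int, annotated: int, sample: int, hits: int
) -> float:
    """P(X >= hits) for X ~ Hypergeometric(population, annotated, sample).

    population: number of genes in the background set
    annotated:  background genes annotated with the GO term
    sample:     number of genes in the list of interest
    hits:       genes in the list that carry the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 10 background genes, 3 carry the term,
    # and all 3 genes in the list of interest carry it.
    print(enrichment_p_value(population=10, annotated=3, sample=3, hits=3))
```

In practice one such p-value is computed per GO term, so a multiple-testing correction would normally follow; that step is omitted from this sketch.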

  6. Transcriptome and Gene Ontology (GO) Enrichment Analysis Reveals Genes Involved in Biotin Metabolism That Affect L-Lysine Production in Corynebacterium glutamicum.

    Science.gov (United States)

    Kim, Hong-Il; Kim, Jong-Hyeon; Park, Young-Jin

    2016-03-09

    Corynebacterium glutamicum is widely used for amino acid production. In the present study, 543 genes showed a significant change in their mRNA expression levels in L-lysine-producing C. glutamicum ATCC21300 compared with the wild-type C. glutamicum ATCC13032. Among these 543 differentially expressed genes (DEGs), 28 genes were up- or downregulated. In addition, 454 DEGs were functionally enriched and categorized based on BLAST sequence homologies and gene ontology (GO) annotations using the Blast2GO software. Interestingly, NCgl0071 (bioB, encoding biotin synthase) was expressed at levels ~20-fold higher in the L-lysine-producing ATCC21300 strain than in the wild-type ATCC13032 strain. Five other genes involved in biotin metabolism or transport--NCgl2515 (bioA, encoding adenosylmethionine-8-amino-7-oxononanoate aminotransferase), NCgl2516 (bioD, encoding dithiobiotin synthetase), NCgl1883, NCgl1884, and NCgl1885--were also expressed at significantly higher levels in the L-lysine-producing ATCC21300 strain than in the wild-type ATCC13032 strain, which we determined using both next-generation RNA sequencing and quantitative real-time PCR analysis. When we disrupted the bioB gene in C. glutamicum ATCC21300, L-lysine production decreased by approximately 76%, and the three genes involved in biotin transport (NCgl1883, NCgl1884, and NCgl1885) were significantly downregulated. These results will be helpful to improve our understanding of C. glutamicum for industrial amino acid production.

  7. Transcriptome Analysis of Porcine PBMCs Reveals the Immune Cascade Response and Gene Ontology Terms Related to Cell Death and Fibrosis in the Progression of Liver Failure

    Directory of Open Access Journals (Sweden)

    YiMin Zhang

    2018-01-01

    Background. The key gene sets involved in the progression of acute liver failure (ALF), which has a high mortality rate, remain unclear. This study aims to gain a deeper understanding of the transcriptional response of peripheral blood mononuclear cells (PBMCs) following ALF. Methods. ALF was induced by D-galactosamine (D-gal) in a porcine model. PBMCs were separated at time zero (baseline group), 36 h (failure group), and 60 h (dying group) after D-gal injection. Transcriptional profiling was performed using RNA sequencing and analysed using DAVID bioinformatics resources. Results. Compared with the baseline group, 816 and 1,845 differentially expressed genes (DEGs) were identified in the failure and dying groups, respectively. A total of five and two gene ontology (GO) term clusters were enriched in 107 GO terms in the failure group and 154 GO terms in the dying group. These GO clusters were primarily immune-related, including genes regulating the inflammasome complex and toll-like receptor signalling pathways. Specifically, GO terms related to cell death, including apoptosis, pyroptosis, and autophagy, and those related to fibrosis, coagulation dysfunction, and hepatic encephalopathy were enriched. Seven Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, cytokine-cytokine receptor interaction, hematopoietic cell lineage, lysosome, rheumatoid arthritis, malaria, and phagosome and pertussis pathways were mapped for DEGs in the failure group. All of these seven KEGG pathways were involved in the 19 KEGG pathways mapped in the dying group. Conclusion. We found that the dramatic PBMC transcriptome changes triggered by ALF progression were predominantly related to immune responses. The enriched GO terms related to cell death, fibrosis, and so on, as indicated by PBMC transcriptome analysis, seem to be useful in elucidating potential key gene sets in the progression of ALF. A better understanding of these gene sets might be of preventive or

  8. The Gene Ontology (GO) Cellular Component Ontology: integration with SAO (Subcellular Anatomy Ontology) and other recent developments

    Science.gov (United States)

    2013-01-01

    Background: The Gene Ontology (GO) (http://www.geneontology.org/) contains a set of terms for describing the activity and actions of gene products across all kingdoms of life. Each of these activities is executed in a location within a cell or in the vicinity of a cell. In order to capture this context, the GO includes a sub-ontology called the Cellular Component (CC) ontology (GO-CCO). The primary use of this ontology is for GO annotation, but it has also been used for phenotype annotation, and for the annotation of images. Another ontology with similar scope to the GO-CCO is the Subcellular Anatomy Ontology (SAO), part of the Neuroscience Information Framework Standard (NIFSTD) suite of ontologies. The SAO also covers cell components, but in the domain of neuroscience. Description: Recently, the GO-CCO was enriched in content and links to the Biological Process and Molecular Function branches of GO as well as to other ontologies. This was achieved in several ways. We carried out an amalgamation of SAO terms with GO-CCO ones; as a result, nearly 100 new neuroscience-related terms were added to the GO. The GO-CCO also contains relationships to GO Biological Process and Molecular Function terms, as well as connecting to external ontologies such as the Cell Ontology (CL). Terms representing protein complexes in the Protein Ontology (PRO) reference GO-CCO terms for their species-generic counterparts. GO-CCO terms can also be used to search a variety of databases. Conclusions: In this publication we provide an overview of the GO-CCO, its overall design, and some recent extensions that make use of additional spatial information. One of the most recent developments of the GO-CCO was the merging in of the SAO, resulting in a single unified ontology designed to serve the needs of GO annotators as well as the specific needs of the neuroscience community. PMID:24093723

  9. Gene function prediction based on Gene Ontology Hierarchy Preserving Hashing.

    Science.gov (United States)

    Zhao, Yingwen; Fu, Guangyuan; Wang, Jun; Guo, Maozu; Yu, Guoxian

    2018-02-23

    Gene Ontology (GO) uses structured vocabularies (or terms) to describe the molecular functions, biological roles, and cellular locations of gene products in a hierarchical ontology. GO annotations associate genes with GO terms and indicate that the given gene products carry out the biological functions described by the relevant terms. However, predicting correct GO annotations for genes from the massive set of terms defined by GO is a difficult challenge. To address this challenge, we introduce a Gene Ontology Hierarchy Preserving Hashing (HPHash) based semantic method for gene function prediction. HPHash first measures the taxonomic similarity between GO terms. It then uses a hierarchy preserving hashing technique to keep the hierarchical order between GO terms, and to optimize a series of hashing functions to encode massive GO terms via compact binary codes. After that, HPHash utilizes these hashing functions to project the gene-term association matrix into a low-dimensional one and performs semantic similarity based gene function prediction in the low-dimensional space. Experimental results on three model species (Homo sapiens, Mus musculus and Rattus norvegicus) for interspecies gene function prediction show that HPHash performs better than other related approaches and that it is robust to the number of hash functions. In addition, we also take HPHash as a plugin for BLAST based gene function prediction. From the experimental results, HPHash again significantly improves the prediction performance. The codes of HPHash are available at: http://mlda.swu.edu.cn/codes.php?name=HPHash. Copyright © 2018 Elsevier Inc. All rights reserved.

  10. Extracting Cross-Ontology Weighted Association Rules from Gene Ontology Annotations.

    Science.gov (United States)

    Agapito, Giuseppe; Milano, Marianna; Guzzi, Pietro Hiram; Cannataro, Mario

    2016-01-01

    Gene Ontology (GO) is a structured repository of concepts (GO Terms) that are associated with one or more gene products through a process referred to as annotation. The analysis of annotated data is an important opportunity for bioinformatics. Among the different approaches to such analysis, the use of association rules (AR) provides useful knowledge by discovering biologically relevant associations between GO terms that were not previously known. In a previous work, we introduced GO-WAR (Gene Ontology-based Weighted Association Rules), a methodology for extracting weighted association rules from ontology-based annotated datasets. We here adapt the GO-WAR algorithm to mine cross-ontology association rules, i.e., rules that involve GO terms present in the three sub-ontologies of GO. We conduct a deep performance evaluation of GO-WAR by mining publicly available GO annotated datasets, showing how GO-WAR outperforms current state of the art approaches.
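
To make the notion of an association rule between GO terms concrete, the sketch below computes the two classical rule statistics, support and confidence, over a toy annotation table. It is deliberately unweighted and is not the GO-WAR algorithm, whose weighting scheme the abstract does not describe. The gene names and term labels are hypothetical, with prefixes that merely suggest the three sub-ontologies.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical annotations: gene -> set of term labels.
ANNOTATIONS: Dict[str, Set[str]] = {
    "gene1": {"BP:x", "MF:y"},
    "gene2": {"BP:x", "MF:y", "CC:z"},
    "gene3": {"BP:x"},
    "gene4": {"CC:z"},
}


def support(itemset: FrozenSet[str], annotations: Dict[str, Set[str]]) -> float:
    """Fraction of genes annotated with every term in the itemset."""
    matching = sum(1 for terms in annotations.values() if itemset <= terms)
    return matching / len(annotations)


def confidence(
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
    annotations: Dict[str, Set[str]],
) -> float:
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(antecedent, annotations)
    if base == 0.0:
        # The rule never fires, so its confidence is reported as zero.
        return 0.0
    return support(antecedent | consequent, annotations) / base


if __name__ == "__main__":
    rule_from = frozenset({"BP:x"})
    rule_to = frozenset({"MF:y"})
    print(support(rule_from | rule_to, ANNOTATIONS))
    print(confidence(rule_from, rule_to, ANNOTATIONS))
```

Here the rule from a process term to a function term crosses sub-ontologies, which is the kind of rule the abstract calls cross-ontology; a weighted variant would replace the plain counts with per-term weights.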

  11. Fast gene ontology based clustering for microarray experiments.

    Science.gov (United States)

    Ovaska, Kristian; Laakso, Marko; Hautaniemi, Sampsa

    2008-11-21

    Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.

  12. Integrating Ontological Knowledge and Textual Evidence in Estimating Gene and Gene Product Similarity

    Energy Technology Data Exchange (ETDEWEB)

    Sanfilippo, Antonio P.; Posse, Christian; Gopalan, Banu; Tratz, Stephen C.; Gregory, Michelle L.

    2006-06-08

    With the rising influence of the Gene Ontology, new approaches have emerged where the similarity between genes or gene products is obtained by comparing Gene Ontology code annotations associated with them. So far, these approaches have solely relied on the knowledge encoded in the Gene Ontology and the gene annotations associated with the Gene Ontology database. The goal of this paper is to demonstrate that improvements to these approaches can be obtained by integrating textual evidence extracted from relevant biomedical literature.

  13. Bayesian assignment of gene ontology terms to gene expression experiments.

    Science.gov (United States)

    Sykacek, P

    2012-09-15

    Gene expression assays allow for genome scale analyses of molecular biological mechanisms. State-of-the-art data analysis provides lists of involved genes, either by calculating significance levels of mRNA abundance or by Bayesian assessments of gene activity. A common problem of such approaches is the difficulty of interpreting the biological implication of the resulting gene lists. This has led to an increased interest in methods for inferring high-level biological information. A common approach for representing high level information is by inferring gene ontology (GO) terms which may be attributed to the expression data experiment. This article proposes a probabilistic model for GO term inference. Modelling assumes that gene annotations to GO terms are available and gene involvement in an experiment is represented by posterior probabilities over gene-specific indicator variables. Such probability measures result from many Bayesian approaches for expression data analysis. The proposed model combines these indicator probabilities in a probabilistic fashion and provides a probabilistic GO term assignment as a result. Experiments on synthetic and microarray data suggest that advantages of the proposed probabilistic GO term inference over statistical test-based approaches are in particular evident for sparsely annotated GO terms and in situations of large uncertainty about gene activity. Provided that appropriate annotations exist, the proposed approach is easily applied to inferring other high level assignments like pathways. Source code under GPL license is available from the author. peter.sykacek@boku.ac.at.

  14. Bayesian assignment of gene ontology terms to gene expression experiments

    Science.gov (United States)

    Sykacek, P.

    2012-01-01

    Motivation: Gene expression assays allow for genome scale analyses of molecular biological mechanisms. State-of-the-art data analysis provides lists of involved genes, either by calculating significance levels of mRNA abundance or by Bayesian assessments of gene activity. A common problem of such approaches is the difficulty of interpreting the biological implication of the resulting gene lists. This has led to an increased interest in methods for inferring high-level biological information. A common approach for representing high level information is by inferring gene ontology (GO) terms which may be attributed to the expression data experiment. Results: This article proposes a probabilistic model for GO term inference. Modelling assumes that gene annotations to GO terms are available and gene involvement in an experiment is represented by posterior probabilities over gene-specific indicator variables. Such probability measures result from many Bayesian approaches for expression data analysis. The proposed model combines these indicator probabilities in a probabilistic fashion and provides a probabilistic GO term assignment as a result. Experiments on synthetic and microarray data suggest that advantages of the proposed probabilistic GO term inference over statistical test-based approaches are in particular evident for sparsely annotated GO terms and in situations of large uncertainty about gene activity. Provided that appropriate annotations exist, the proposed approach is easily applied to inferring other high level assignments like pathways. Availability: Source code under GPL license is available from the author. Contact: peter.sykacek@boku.ac.at PMID:22962488

  15. Genetically based location from triploid populations and gene ontology of a 3.3-mb genome region linked to Alternaria brown spot resistance in citrus reveal clusters of resistance genes.

    Directory of Open Access Journals (Sweden)

    José Cuenca

    Genetic analysis of phenotypical traits and marker-trait association in polyploid species is generally considered as a challenge. In the present work, different approaches were combined taking advantage of the particular genetic structures of 2n gametes resulting from second division restitution (SDR) to map a genome region linked to Alternaria brown spot (ABS) resistance in triploid citrus progeny. ABS in citrus is a serious disease caused by the tangerine pathotype of the fungus Alternaria alternata. This pathogen produces ACT-toxin, which induces necrotic lesions on fruit and young leaves, defoliation and fruit drop in susceptible genotypes. It is a strong concern for triploid breeding programs aiming to produce seedless mandarin cultivars. The monolocus dominant inheritance of susceptibility, proposed on the basis of diploid population studies, was corroborated in triploid progeny. Bulk segregant analysis coupled with genome scan using a large set of genetically mapped SNP markers and targeted genetic mapping by half tetrad analysis, using SSR and SNP markers, allowed locating a 3.3 Mb genomic region linked to ABS resistance near the centromere of chromosome III. Clusters of resistance genes were identified by gene ontology analysis of this genomic region. Some of these genes are good candidates to control the dominant susceptibility to the ACT-toxin. SSR and SNP markers were developed for efficient early marker-assisted selection of ABS resistant hybrids.

  16. Genetically based location from triploid populations and gene ontology of a 3.3-mb genome region linked to Alternaria brown spot resistance in citrus reveal clusters of resistance genes.

    Science.gov (United States)

    Cuenca, José; Aleza, Pablo; Vicent, Antonio; Brunel, Dominique; Ollitrault, Patrick; Navarro, Luis

    2013-01-01

    Genetic analysis of phenotypical traits and marker-trait association in polyploid species is generally considered as a challenge. In the present work, different approaches were combined taking advantage of the particular genetic structures of 2n gametes resulting from second division restitution (SDR) to map a genome region linked to Alternaria brown spot (ABS) resistance in triploid citrus progeny. ABS in citrus is a serious disease caused by the tangerine pathotype of the fungus Alternaria alternata. This pathogen produces ACT-toxin, which induces necrotic lesions on fruit and young leaves, defoliation and fruit drop in susceptible genotypes. It is a strong concern for triploid breeding programs aiming to produce seedless mandarin cultivars. The monolocus dominant inheritance of susceptibility, proposed on the basis of diploid population studies, was corroborated in triploid progeny. Bulk segregant analysis coupled with genome scan using a large set of genetically mapped SNP markers and targeted genetic mapping by half tetrad analysis, using SSR and SNP markers, allowed locating a 3.3 Mb genomic region linked to ABS resistance near the centromere of chromosome III. Clusters of resistance genes were identified by gene ontology analysis of this genomic region. Some of these genes are good candidates to control the dominant susceptibility to the ACT-toxin. SSR and SNP markers were developed for efficient early marker-assisted selection of ABS resistant hybrids.

  17. Fast Gene Ontology based clustering for microarray experiments

    Directory of Open Access Journals (Sweden)

    Ovaska Kristian

    2008-11-01

    Background: Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. Results: We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Conclusion: Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.

  18. Prediction of human protein function according to Gene Ontology categories

    DEFF Research Database (Denmark)

    Jensen, Lars Juhl; Gupta, Ramneek; Stærfeldt, Hans Henrik

    2003-01-01

    developed a method for prediction of protein function for a subset of classes from the Gene Ontology classification scheme. This subset includes several pharmaceutically interesting categories-transcription factors, receptors, ion channels, stress and immune response proteins, hormones and growth factors...

  19. Dovetailing biology and chemistry: integrating the Gene Ontology with the ChEBI chemical ontology

    Science.gov (United States)

    2013-01-01

    Background: The Gene Ontology (GO) facilitates the description of the action of gene products in a biological context. Many GO terms refer to chemical entities that participate in biological processes. To facilitate accurate and consistent systems-wide biological representation, it is necessary to integrate the chemical view of these entities with the biological view of GO functions and processes. We describe a collaborative effort between the GO and the Chemical Entities of Biological Interest (ChEBI) ontology developers to ensure that the representation of chemicals in the GO is both internally consistent and in alignment with the chemical expertise captured in ChEBI. Results: We have examined and integrated the ChEBI structural hierarchy into the GO resource through computationally-assisted manual curation of both GO and ChEBI. Our work has resulted in the creation of computable definitions of GO terms that contain fully defined semantic relationships to corresponding chemical terms in ChEBI. Conclusions: The set of logical definitions using both the GO and ChEBI has already been used to automate aspects of GO development and has the potential to allow the integration of data across the domains of biology and chemistry. These logical definitions are available as an extended version of the ontology from http://purl.obolibrary.org/obo/go/extensions/go-plus.owl. PMID:23895341

  20. Classifying genes to the correct Gene Ontology Slim term in Saccharomyces cerevisiae using neighbouring genes with classification learning

    Directory of Open Access Journals (Sweden)

    Tsatsoulis Costas

    2010-05-01

    Background: There is increasing evidence that gene location and surrounding genes influence the functionality of genes in the eukaryotic genome. Knowing the Gene Ontology Slim terms associated with a gene gives us insight into a gene's functionality by informing us how its gene product behaves in a cellular context using three different ontologies: molecular function, biological process, and cellular component. In this study, we analyzed whether we could classify a gene in Saccharomyces cerevisiae to its correct Gene Ontology Slim term using information about its location in the genome and information from its nearest-neighbouring genes using classification learning. Results: We performed experiments to establish that the MultiBoostAB algorithm using the J48 classifier could correctly classify Gene Ontology Slim terms of a gene given information regarding the gene's location and information from its nearest-neighbouring genes for training. Different neighbourhood sizes were examined to determine how many nearest neighbours should be included around each gene to provide better classification rules. Our results show that by just incorporating neighbour information from each gene's two-nearest neighbours, the percentage of correctly classified genes to their correct Gene Ontology Slim term for each ontology reaches over 80% with high accuracy (reflected in F-measures over 0.80) of the classification rules produced. Conclusions: We confirmed that in classifying genes to their correct Gene Ontology Slim term, the inclusion of neighbour information from those genes is beneficial. Knowing the location of a gene and the Gene Ontology Slim information from neighbouring genes gives us insight into that gene's functionality. This benefit is seen by just including information from a gene's two-nearest neighbouring genes.

  1. Approaching the axiomatic enrichment of the Gene Ontology from a lexical perspective.

    Science.gov (United States)

    Quesada-Martínez, Manuel; Mikroyannidi, Eleni; Fernández-Breis, Jesualdo Tomás; Stevens, Robert

    2015-09-01

    The main goal of this work is to measure how lexical regularities in biomedical ontology labels can be used for the automatic creation of formal relationships between classes, and to evaluate the results of applying our approach to the Gene Ontology (GO). In recent years, we have developed a method for the lexical analysis of regularities in biomedical ontology labels, and we showed that the labels can present a high degree of regularity. In this work, we extend our method with a cross-products extension (CPE) metric, which estimates the potential interest of a specific regularity for axiomatic enrichment in the lexical analysis, using information on exact matches in external ontologies. The GO consortium recently enriched the GO by using so-called cross-product extensions. Cross-products are generated by establishing axioms that relate a given GO class with classes from the GO or other biomedical ontologies. We apply our method to the GO and study how its lexical analysis can identify and reconstruct the cross-products that are defined by the GO consortium. The labels of the classes of the GO are highly regular in lexical terms, and the exact matches with labels of external ontologies affect 80% of the GO classes. The CPE metric reveals that 31.48% of the classes that exhibit regularities have fragments that are classes in the two external ontologies selected for our experiment, namely, the Cell Ontology and the Chemical Entities of Biological Interest ontology, and 18.90% of them are fully decomposable into smaller parts. Our results show that the CPE metric permits our method to detect GO cross-product extensions with a mean recall of 62% and a mean precision of 28%. The study is completed with an analysis of false positives to explain this precision value. We think that our results support the claim that our lexical approach can contribute to the axiomatic enrichment of biomedical ontologies and that it can provide new insights into the engineering of

  2. Protein Annotation from Protein Interaction Networks and Gene Ontology

    OpenAIRE

    Nguyen, Cao D.; Gardiner, Katheleen J.; Cios, Krzysztof J.

    2011-01-01

    We introduce a novel method for annotating protein function that combines Naïve Bayes and association rules, and takes advantage of the underlying topology in protein interaction networks and the structure of graphs in the Gene Ontology. We apply our method to proteins from the Human Protein Reference Database (HPRD) and show that, in comparison with other approaches, it predicts protein functions with significantly higher recall with no loss of precision. Specifically, it achieves 51% precis...

  3. [Key effect genes responding to nerve injury identified by gene ontology and computer pattern recognition].

    Science.gov (United States)

    Pan, Qian; Peng, Jin; Zhou, Xue; Yang, Hao; Zhang, Wei

    2012-07-01

    To screen important genes from large microarray datasets obtained after nerve injury, we combined the gene ontology (GO) method with computer pattern recognition to find key genes responding to nerve injury, and then verified one of the screened-out genes. Data mining and gene ontology analysis of the gene chip dataset GSE26350 were carried out in MATLAB. Cd44 was selected from the screened-out key gene molecular spectrum by comparing the genes' GO terms and their positions on the principal component score map. Functional interference was used to disrupt the normal binding of Cd44 to one of its ligands, chondroitin sulfate C (CSC), and neurite extension was observed. Gene ontology analysis showed that the first genes on the score map (marked by red *) were mainly distributed in molecular function GO terms such as molecular transducer activity, receptor activity, and protein binding. Cd44 is one of six effector protein genes, and it attracted our attention because of its functional diversity. After different reagents were added to the medium to interfere with the normal binding of CSC and Cd44, the inhibition of neurite extension by CSC was relieved to varying degrees. CSC can inhibit neurite extension by binding Cd44 on the neuron membrane. This verifies that important genes in given physiological processes can be identified by gene ontology analysis of gene chip data.

  4. GOseek: a gene ontology search engine using enhanced keywords.

    Science.gov (United States)

    Taha, Kamal

    2013-01-01

    We propose in this paper a biological search engine called GOseek, which overcomes the limitation of current gene similarity tools. Given a set of genes, GOseek returns the most significant genes that are semantically related to the given genes. These returned genes are usually annotated to one of the Lowest Common Ancestors (LCA) of the Gene Ontology (GO) terms annotating the given genes. Most genes have several annotation GO terms. Therefore, there may be more than one LCA for the GO terms annotating the given genes. The LCA annotating the genes that are most semantically related to the given gene is the one that receives the most aggregate semantic contribution from the GO terms annotating the given genes. To identify this LCA, GOseek quantifies the contribution of the GO terms annotating the given genes to the semantics of their LCAs. That is, it encodes the semantic contribution into a numeric format. GOseek uses microarray experiment data to rank result genes based on their significance. We evaluated GOseek experimentally and compared it with a comparable gene prediction tool. Results showed marked improvement over the tool.
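
    The notion of lowest common ancestors (LCAs) used above can be sketched in a few lines of Python. This is only an illustration on a hypothetical toy DAG: the term names, the `PARENTS` table and both helper functions are assumptions made for the example, and GOseek's semantic-contribution scoring and microarray-based ranking are not reproduced here.

```python
from typing import Dict, Set

# Hypothetical toy DAG (not real GO identifiers): child term -> parent terms.
PARENTS: Dict[str, Set[str]] = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "C": {"A"},
    "D": {"A", "B"},
    "F": {"A", "B"},
}


def ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return the term itself plus every term reachable via parent links."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def lowest_common_ancestors(t1: str, t2: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Common ancestors that have no other common ancestor below them."""
    common = ancestors(t1, parents) & ancestors(t2, parents)
    return {
        c for c in common
        if not any(o != c and c in ancestors(o, parents) for o in common)
    }


print(lowest_common_ancestors("C", "D", PARENTS))
print(lowest_common_ancestors("D", "F", PARENTS))
```

    In this toy graph the pair D and F shares two LCAs, which mirrors the remark that the GO terms annotating a gene set may have more than one LCA.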

  5. Comparative GO: a web application for comparative gene ontology and gene ontology-based gene selection in bacteria.

    Directory of Open Access Journals (Sweden)

    Mario Fruzangohar

    Full Text Available The primary means of classifying new functions for genes and proteins relies on Gene Ontology (GO), which defines genes/proteins using a controlled vocabulary in terms of their Molecular Function, Biological Process and Cellular Component. The challenge is to present this information to researchers to compare and discover patterns in multiple datasets using visually comprehensible and user-friendly statistical reports. Importantly, while there are many GO resources available for eukaryotes, there are none suitable for simultaneous, graphical and statistical comparison between multiple datasets. In addition, none of them supports comprehensive resources for bacteria. By using Streptococcus pneumoniae as a model, we identified and collected GO resources including genes, proteins, taxonomy and GO relationships from NCBI, UniProt and GO organisations. Then, we designed database tables in a PostgreSQL database server and developed a Java application to extract data from source files and load it into the database automatically. We developed a PHP web application based on Model-View-Control architecture, and used a specific data structure as well as current and novel algorithms to estimate GO graph parameters. We designed different navigation and visualization methods on the graphs and integrated these into graphical reports. This tool is particularly significant when comparing GO groups between multiple samples (including those of pathogenic bacteria) from different sources simultaneously. Comparing GO protein distribution among up- or down-regulated genes from different samples can improve understanding of biological pathways and mechanism(s) of infection. It can also aid in the discovery of genes associated with specific function(s) for investigation as novel vaccine or therapeutic targets. http://turing.ersa.edu.au/BacteriaGO.

  6. Gene ontology based transfer learning for protein subcellular localization

    Directory of Open Access Journals (Sweden)

    Zhou Shuigeng

    2011-02-01

    Full Text Available Abstract Background Prediction of protein subcellular localization generally involves many complex factors, and using only one or two aspects of data information may not tell the true story. For this reason, some recent predictive models are deliberately designed to integrate multiple heterogeneous data sources for exploiting multi-aspect protein feature information. Gene ontology, hereinafter referred to as GO, uses a controlled vocabulary to depict biological molecules or gene products in terms of biological process, molecular function and cellular component. With the rapid expansion of annotated protein sequences, gene ontology has become a general protein feature that can be used to construct predictive models in computational biology. Existing models generally either concatenated the GO terms into a flat binary vector or applied majority-vote based ensemble learning for protein subcellular localization, neither of which can estimate the individual discriminative abilities of the three aspects of gene ontology. Results In this paper, we propose a Gene Ontology Based Transfer Learning Model (GO-TLM) for large-scale protein subcellular localization. The model transfers the signature-based homologous GO terms to the target proteins, and further constructs a reliable learning system to reduce the adverse effect of the potential false GO terms that result from evolutionary divergence. We derive three GO kernels from the three aspects of gene ontology to measure the GO similarity of two proteins, and derive two other spectrum kernels to measure the similarity of two protein sequences. We use simple non-parametric cross validation to explicitly weigh the discriminative abilities of the five kernels, such that the time & space computational complexities are greatly reduced when compared to the complicated semi-definite programming and semi-indefinite linear programming. The five kernels are then linearly merged into one single kernel for

  7. Correlating Information Contents of Gene Ontology Terms to Infer Semantic Similarity of Gene Products

    Directory of Open Access Journals (Sweden)

    Mingxin Gan

    2014-01-01

    Full Text Available Successful applications of the gene ontology to the inference of functional relationships between gene products in recent years have raised the need for computational methods to automatically calculate semantic similarity between gene products based on semantic similarity of gene ontology terms. Nevertheless, existing methods, though having been widely used in a variety of applications, may significantly overestimate semantic similarity between genes that are actually not functionally related, thereby yielding misleading results in applications. To overcome this limitation, we propose to represent a gene product as a vector that is composed of information contents of gene ontology terms annotated for the gene product, and we suggest calculating similarity between two gene products as the relatedness of their corresponding vectors using three measures: Pearson’s correlation coefficient, cosine similarity, and the Jaccard index. We focus on the biological process domain of the gene ontology and annotations of yeast proteins to study the effectiveness of the proposed measures. Results show that semantic similarity scores calculated using the proposed measures are more consistent with known biological knowledge than those derived using a list of existing methods, suggesting the effectiveness of our method in characterizing functional relationships between gene products.
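
    The vector representation described above can be illustrated with a small Python sketch. Everything concrete here is an assumption made for the example: the term names, the annotation frequencies in `TERM_FREQ`, the use of the negative base-2 logarithm of frequency as information content, and the weighted (min/max) form of the Jaccard index. The abstract does not specify these details, so the authors' exact definitions may differ.

```python
import math
from typing import List, Set

# Hypothetical annotation frequencies for four terms (not real GO data).
TERM_FREQ = {"t1": 0.5, "t2": 0.25, "t3": 0.125, "t4": 0.0625}
TERMS = sorted(TERM_FREQ)


def information_content(term: str) -> float:
    """Rarer terms carry more information: IC = -log2(frequency)."""
    return -math.log2(TERM_FREQ[term])


def ic_vector(annotated: Set[str]) -> List[float]:
    """One component per term: its IC if the gene is annotated, else 0."""
    return [information_content(t) if t in annotated else 0.0 for t in TERMS]


def cosine(u: List[float], v: List[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pearson(u: List[float], v: List[float]) -> float:
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    du = [a - mu for a in u]
    dv = [b - mv for b in v]
    denom = math.sqrt(sum(a * a for a in du)) * math.sqrt(sum(b * b for b in dv))
    return sum(a * b for a, b in zip(du, dv)) / denom if denom else 0.0


def weighted_jaccard(u: List[float], v: List[float]) -> float:
    """Generalized Jaccard index: sum of minima over sum of maxima."""
    total_max = sum(max(a, b) for a, b in zip(u, v))
    return sum(min(a, b) for a, b in zip(u, v)) / total_max if total_max else 0.0


gene_a = ic_vector({"t1", "t2"})
gene_b = ic_vector({"t2", "t3"})
print(cosine(gene_a, gene_b), pearson(gene_a, gene_b), weighted_jaccard(gene_a, gene_b))
```

    Because each component is weighted by information content, a shared rare term contributes more to the score than a shared common term.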

  8. Protein annotation from protein interaction networks and Gene Ontology.

    Science.gov (United States)

    Nguyen, Cao D; Gardiner, Katheleen J; Cios, Krzysztof J

    2011-10-01

    We introduce a novel method for annotating protein function that combines Naïve Bayes and association rules, and takes advantage of the underlying topology in protein interaction networks and the structure of graphs in the Gene Ontology. We apply our method to proteins from the Human Protein Reference Database (HPRD) and show that, in comparison with other approaches, it predicts protein functions with significantly higher recall with no loss of precision. Specifically, it achieves 51% precision and 60% recall versus 45% and 26% for Majority and 24% and 61% for χ²-statistics, respectively. Copyright © 2011 Elsevier Inc. All rights reserved.

  9. Text Mining to Support Gene Ontology Curation and Vice Versa.

    Science.gov (United States)

    Ruch, Patrick

    2017-01-01

    In this chapter, we explain how text mining can support the curation of molecular biology databases dealing with protein functions. We also show how curated data can play a disruptive role in the development of text mining methods. We review a decade of efforts to improve the automatic assignment of Gene Ontology (GO) descriptors, the reference ontology for the characterization of genes and gene products. To illustrate the high potential of this approach, we compare the performances of an automatic text categorizer and show a large improvement of +225% in both precision and recall on benchmarked data. We argue that automatic text categorization functions can ultimately be embedded into a Question-Answering (QA) system to answer questions related to protein functions. Because GO descriptors can be relatively long and specific, traditional QA systems cannot answer such questions. A new type of QA system, so-called Deep QA, which uses machine learning methods trained with curated contents, is thus emerging. Finally, future advances of text mining instruments are directly dependent on the availability of high-quality annotated contents at every curation step. Database workflows must start recording explicitly all the data they curate and ideally also some of the data they do not curate.

  10. Determining the semantic similarities among Gene Ontology terms.

    Science.gov (United States)

    Taha, Kamal

    2013-05-01

    We present in this paper novel techniques that determine the semantic relationships among Gene Ontology (GO) terms. We implemented these techniques in a prototype system called GoSE, which resides between user application and GO database. Given a set S of GO terms, GoSE would return another set S' of GO terms, where each term in S' is semantically related to each term in S. Most current research is focused on determining the semantic similarities among GO ontology terms based solely on their IDs and proximity to one another in the GO graph structure, while overlooking the contexts of the terms, which may lead to erroneous results. The context of a GO term T is the set of other terms, whose existence in the GO graph structure is dependent on T. We propose novel techniques that determine the contexts of terms based on the concept of existence dependency. We present a stack-based sort-merge algorithm employing these techniques for determining the semantic similarities among GO terms. We evaluated GoSE experimentally and compared it with three existing methods. The results of measuring the semantic similarities among genes in KEGG and Pfam pathways retrieved from the DBGET and Sanger Pfam databases, respectively, have shown that our method outperforms the other three methods in recall and precision.

  11. GOPET: A tool for automated predictions of Gene Ontology terms

    Directory of Open Access Journals (Sweden)

    Glatting Karl-Heinz

    2006-03-01

    Full Text Available Abstract Background Vast progress in sequencing projects has called for annotation on a large scale. A number of methods have been developed to address this challenging task. These methods, however, either apply to specific subsets, or their predictions are not formalised, or they do not provide precise confidence values for their predictions. Description We recently established a learning system for automated annotation, trained with a broad variety of different organisms to predict the standardised annotation terms from Gene Ontology (GO). Now, this method has been made available to the public via our web-service GOPET (Gene Ontology term Prediction and Evaluation Tool). It supplies annotation for sequences of any organism. For each predicted term an appropriate confidence value is provided. The basic method had been developed for predicting molecular function GO-terms. It is now expanded to predict biological process terms. This web service is available via http://genius.embnet.dkfz-heidelberg.de/menu/biounit/open-husar Conclusion Our web service gives experimental researchers as well as the bioinformatics community a valuable sequence annotation device. Additionally, GOPET also provides less significant annotation data which may serve as an extended discovery platform for the user.

  12. The representation of heart development in the gene ontology.

    Science.gov (United States)

    Khodiyar, Varsha K; Hill, David P; Howe, Doug; Berardini, Tanya Z; Tweedie, Susan; Talmud, Philippa J; Breckenridge, Ross; Bhattarcharya, Shoumo; Riley, Paul; Scambler, Peter; Lovering, Ruth C

    2011-06-01

    An understanding of heart development is critical in any systems biology approach to cardiovascular disease. The interpretation of data generated from high-throughput technologies (such as microarray and proteomics) is also essential to this approach. However, characterizing the role of genes in the processes underlying heart development and cardiovascular disease involves the non-trivial task of data analysis and integration of previous knowledge. The Gene Ontology (GO) Consortium provides structured controlled biological vocabularies that are used to summarize previous functional knowledge for gene products across all species. One aspect of GO describes biological processes, such as development and signaling. In order to support high-throughput cardiovascular research, we have initiated an effort to fully describe heart development in GO; expanding the number of GO terms describing heart development from 12 to over 280. This new ontology describes heart morphogenesis, the differentiation of specific cardiac cell types, and the involvement of signaling pathways in heart development. This work also aligns GO with the current views of the heart development research community and its representation in the literature. This extension of GO allows gene product annotators to comprehensively capture the genetic program leading to the developmental progression of the heart. This will enable users to integrate heart development data across species, resulting in the comprehensive retrieval of information about this subject. The revised GO structure, combined with gene product annotations, should improve the interpretation of data from high-throughput methods in a variety of cardiovascular research areas, including heart development, congenital cardiac disease, and cardiac stem cell research. Additionally, we invite the heart development community to contribute to the expansion of this important dataset for the benefit of future research in this area. Copyright © 2011

  13. The Representation of Heart Development in the Gene Ontology

    Science.gov (United States)

    Khodiyar, Varsha K.; Hill, David P.; Howe, Doug; Berardini, Tanya Z.; Tweedie, Susan; Talmud, Philippa J.; Breckenridge, Ross; Bhattarcharya, Shoumo; Riley, Paul; Scambler, Peter; Lovering, Ruth C.

    2012-01-01

    An understanding of heart development is critical in any systems biology approach to cardiovascular disease. The interpretation of data generated from high-throughput technologies (such as microarray and proteomics) is also essential to this approach. However, characterizing the role of genes in the processes underlying heart development and cardiovascular disease involves the non-trivial task of data analysis and integration of previous knowledge. The Gene Ontology (GO) Consortium provides structured controlled biological vocabularies that are used to summarize previous functional knowledge for gene products across all species. One aspect of GO describes biological processes, such as development and signaling. In order to support high-throughput cardiovascular research, we have initiated an effort to fully describe heart development in GO; expanding the number of GO terms describing heart development from 12 to over 280. This new ontology describes heart morphogenesis, the differentiation of specific cardiac cell types, and the involvement of signaling pathways in heart development and aligns GO with the current views of the heart development research community and its representation in the literature. This extension of GO allows gene product annotators to comprehensively capture the genetic program leading to the developmental progression of the heart. This will enable users to integrate heart development data across species, resulting in the comprehensive retrieval of information about this subject. The revised GO structure, combined with gene product annotations, should improve the interpretation of data from high-throughput methods in a variety of cardiovascular research areas, including heart development, congenital cardiac disease, and cardiac stem cell research. Additionally, we invite the heart development community to contribute to the expansion of this important dataset for the benefit of future research in this area. PMID:21419760

  14. Interestingness measures and strategies for mining multi-ontology multi-level association rules from gene ontology annotations for the discovery of new GO relationships.

    Science.gov (United States)

    Manda, Prashanti; McCarthy, Fiona; Bridges, Susan M

    2013-10-01

    The Gene Ontology (GO), a set of three sub-ontologies, is one of the most popular bio-ontologies used for describing gene product characteristics. GO annotation data containing terms from multiple sub-ontologies and at different levels in the ontologies is an important source of implicit relationships between terms from the three sub-ontologies. Data mining techniques such as association rule mining that are tailored to mine from multiple ontologies at multiple levels of abstraction are required for effective knowledge discovery from GO annotation data. We present a data mining approach, Multi-ontology data mining at All Levels (MOAL), that uses the structure and relationships of the GO to mine multi-ontology multi-level association rules. We introduce two interestingness measures, Multi-ontology Support (MOSupport) and Multi-ontology Confidence (MOConfidence), customized to evaluate multi-ontology multi-level association rules. We also describe a variety of post-processing strategies for pruning uninteresting rules. We use publicly available GO annotation data to demonstrate our methods with respect to two applications: (1) the discovery of co-annotation suggestions and (2) the discovery of new cross-ontology relationships. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
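
    For readers unfamiliar with association rule mining, the classical support and confidence measures can be sketched as follows. This shows only the standard textbook definitions on hypothetical data: the gene names, the term labels in `ANNOTATIONS` and both functions are assumptions made for the example, and the multi-ontology variants MOSupport and MOConfidence introduced in the paper are not reproduced here.

```python
from typing import Dict, Set

# Hypothetical gene -> annotation terms, with terms drawn from different
# sub-ontologies (the BP/MF/CC prefixes are illustrative labels only).
ANNOTATIONS: Dict[str, Set[str]] = {
    "g1": {"BP:x", "MF:y"},
    "g2": {"BP:x", "MF:y", "CC:z"},
    "g3": {"BP:x"},
    "g4": {"MF:y", "CC:z"},
}


def support(itemset: Set[str], annotations: Dict[str, Set[str]]) -> float:
    """Fraction of genes annotated with every term in the itemset."""
    hits = sum(1 for terms in annotations.values() if itemset <= terms)
    return hits / len(annotations)


def confidence(antecedent: Set[str], consequent: Set[str],
               annotations: Dict[str, Set[str]]) -> float:
    """Classical confidence of the rule antecedent -> consequent."""
    base = support(antecedent, annotations)
    return support(antecedent | consequent, annotations) / base if base else 0.0


print(support({"BP:x", "MF:y"}, ANNOTATIONS))
print(confidence({"BP:x"}, {"MF:y"}, ANNOTATIONS))
```

    A rule such as BP:x -> MF:y with high confidence would be a candidate co-annotation suggestion in the sense of application (1).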

  15. Improving Interpretation of Cardiac Phenotypes and Enhancing Discovery With Expanded Knowledge in the Gene Ontology.

    Science.gov (United States)

    Lovering, Ruth C; Roncaglia, Paola; Howe, Douglas G; Laulederkind, Stanley J F; Khodiyar, Varsha K; Berardini, Tanya Z; Tweedie, Susan; Foulger, Rebecca E; Osumi-Sutherland, David; Campbell, Nancy H; Huntley, Rachael P; Talmud, Philippa J; Blake, Judith A; Breckenridge, Ross; Riley, Paul R; Lambiase, Pier D; Elliott, Perry M; Clapp, Lucie; Tinker, Andrew; Hill, David P

    2018-02-01

    A systems biology approach to cardiac physiology requires a comprehensive representation of how coordinated processes operate in the heart, as well as the ability to interpret relevant transcriptomic and proteomic experiments. The Gene Ontology (GO) Consortium provides structured, controlled vocabularies of biological terms that can be used to summarize and analyze functional knowledge for gene products. In this study, we created a computational resource to facilitate genetic studies of cardiac physiology by integrating literature curation with attention to an improved and expanded ontological representation of heart processes in the Gene Ontology. As a result, the Gene Ontology now contains terms that comprehensively describe the roles of proteins in cardiac muscle cell action potential, electrical coupling, and the transmission of the electrical impulse from the sinoatrial node to the ventricles. Evaluating the effectiveness of this approach to inform data analysis demonstrated that Gene Ontology annotations, analyzed within an expanded ontological context of heart processes, can help to identify candidate genes associated with arrhythmic disease risk loci. We determined that a combination of curation and ontology development for heart-specific genes and processes supports the identification and downstream analysis of genes responsible for the spread of the cardiac action potential through the heart. Annotating these genes and processes in a structured format facilitates data analysis and supports effective retrieval of gene-centric information about cardiac defects. © 2018 The Authors.

  16. Codon bias and gene ontology in holometabolous and hemimetabolous insects.

    Science.gov (United States)

    Carlini, David B; Makowski, Matthew

    2015-12-01

    The relationship between preferred codon use (PCU), developmental mode, and gene ontology (GO) was investigated in a sample of nine insect species with sequenced genomes. These species were selected to represent two distinct modes of insect development, holometabolism and hemimetabolism, with an aim toward determining whether the differences in developmental timing concomitant with developmental mode would be mirrored by differences in PCU in their developmental genes. We hypothesized that the developmental genes of holometabolous insects should be under greater selective pressure for efficient translation, manifest as increased PCU, than those of hemimetabolous insects because holometabolism requires abundant protein expression over shorter time intervals than hemimetabolism, where proteins are required more uniformly in time. Preferred codon sets were defined for each species, from which the frequency of PCU for each gene was obtained. Although there were substantial differences in the genomic base composition of holometabolous and hemimetabolous insects, both groups exhibited a general preference for GC-ending codons, with the former group having higher PCU averaged across all genes. For each species, the biological process GO term for each gene was assigned that of its Drosophila homolog(s), and PCU was calculated for each GO term category. The top two GO term categories for PCU enrichment in the holometabolous insects were anatomical structure development and cell differentiation. The increased PCU in the developmental genes of holometabolous insects may reflect a general strategy to maximize the protein production of genes expressed in bursts over short time periods, e.g., heat shock proteins. J. Exp. Zool. (Mol. Dev. Evol.) 324B: 686-698, 2015. © 2015 Wiley Periodicals, Inc.

  17. Networks in biological systems: An investigation of the Gene Ontology as an evolving network

    International Nuclear Information System (INIS)

    Coronnello, C; Tumminello, M; Micciche, S; Mantegna, R.N.

    2009-01-01

    Many biological systems can be described as networks where different elements interact, in order to perform biological processes. We introduce a network associated with the Gene Ontology. Specifically, we construct a correlation-based network where the vertices are the terms of the Gene Ontology and the link between each two terms is weighted on the basis of the number of genes that they have in common. We analyze a filtered network obtained from the correlation-based network and we characterize its evolution over different releases of the Gene Ontology.
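
    The network construction described above can be sketched in Python. This is a simplified illustration under stated assumptions: the term and gene names in `TERM_GENES` are hypothetical, and the edge weight is taken to be the raw count of shared genes. The authors' actual correlation-based weighting and their filtering procedure are not specified in the abstract and are not reproduced.

```python
from itertools import combinations
from typing import Dict, Set, Tuple

# Hypothetical term -> annotated genes (not real GO annotations).
TERM_GENES: Dict[str, Set[str]] = {
    "T1": {"g1", "g2", "g3"},
    "T2": {"g2", "g3"},
    "T3": {"g4"},
}


def co_annotation_edges(term_genes: Dict[str, Set[str]]) -> Dict[Tuple[str, str], int]:
    """Weighted edges between term pairs that share at least one gene."""
    edges: Dict[Tuple[str, str], int] = {}
    for a, b in combinations(sorted(term_genes), 2):
        shared = len(term_genes[a] & term_genes[b])
        if shared:
            edges[(a, b)] = shared
    return edges


print(co_annotation_edges(TERM_GENES))
```

    Running the same construction on annotation snapshots from successive GO releases would give one network per release, which is the kind of comparison over time that the abstract describes.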

  18. Representing virus-host interactions and other multi-organism processes in the Gene Ontology.

    Science.gov (United States)

    Foulger, R E; Osumi-Sutherland, D; McIntosh, B K; Hulo, C; Masson, P; Poux, S; Le Mercier, P; Lomax, J

    2015-07-28

    The Gene Ontology project is a collaborative effort to provide descriptions of gene products in a consistent and computable language, and in a species-independent manner. The Gene Ontology is designed to be applicable to all organisms but up to now has been largely under-utilized for prokaryotes and viruses, in part because of a lack of appropriate ontology terms. To address this issue, we have developed a set of Gene Ontology classes that are applicable to microbes and their hosts, improving both coverage and quality in this area of the Gene Ontology. Describing microbial and viral gene products brings with it the additional challenge of capturing both the host and the microbe. Recognising this, we have worked closely with annotation groups to test and optimize the GO classes, and we describe here a set of annotation guidelines that allow the controlled description of two interacting organisms. Building on the microbial resources already in existence such as ViralZone, UniProtKB keywords and MeGO, this project provides an integrated ontology to describe interactions between microbial species and their hosts, with mappings to the external resources above. Housing this information within the freely-accessible Gene Ontology project allows the classes and annotation structure to be utilized by a large community of biologists and users.

  19. OAHG: an integrated resource for annotating human genes with multi-level ontologies.

    Science.gov (United States)

    Cheng, Liang; Sun, Jie; Xu, Wanying; Dong, Lixiang; Hu, Yang; Zhou, Meng

    2016-10-05

    OAHG, an integrated resource, aims to establish a comprehensive functional annotation resource for human protein-coding genes (PCGs), miRNAs, and lncRNAs by multi-level ontologies involving Gene Ontology (GO), Disease Ontology (DO), and Human Phenotype Ontology (HPO). Many previous studies have focused on inferring putative properties and biological functions of PCGs and non-coding RNA genes from different perspectives. During the past several decades, a few databases have been designed to annotate the functions of PCGs, miRNAs, and lncRNAs, respectively. Some of the functional descriptions in these databases were mapped to standardized terminologies, such as GO, which can help further analysis. Despite these developments, there is no comprehensive resource recording the function of these three important types of genes. The current version of OAHG, release 1.0 (Jun 2016), integrates three ontologies (GO, DO, and HPO), six gene functional databases and two interaction databases. Currently, OAHG contains 1,434,694 entries involving 16,929 PCGs, 637 miRNAs, 193 lncRNAs, and 24,894 terms of ontologies. During the performance evaluation, OAHG shows consistency with existing gene interactions and the structure of ontology. For example, terms with more similar structure could be associated with more associated genes (Pearson correlation γ² = 0.2428, p < 2.2e-16).

  20. Gene Ontology annotation of the rice blast fungus, Magnaporthe oryzae

    Directory of Open Access Journals (Sweden)

    Deng Jixin

    2009-02-01

    Full Text Available Abstract Background Magnaporthe oryzae is the causal agent of rice blast, the most destructive disease of rice worldwide. The genome of this fungal pathogen has been sequenced and an automated annotation has recently been updated to Version 6 http://www.broad.mit.edu/annotation/genome/magnaporthe_grisea/MultiDownloads.html. However, a comprehensive manual curation remains to be performed. Gene Ontology (GO) annotation is a valuable means of assigning functional information using standardized vocabulary. We report an overview of the GO annotation for Version 5 of M. oryzae genome assembly. Methods A similarity-based (i.e., computational) GO annotation with manual review was conducted, which was then integrated with a literature-based GO annotation with computational assistance. For similarity-based GO annotation a stringent reciprocal best hits method was used to identify similarity between predicted proteins of M. oryzae and GO proteins from multiple organisms with published associations to GO terms. Significant alignment pairs were manually reviewed. Functional assignments were further cross-validated with manually reviewed data, conserved domains, or data determined by wet lab experiments. Additionally, biological appropriateness of the functional assignments was manually checked. Results In total, 6,286 proteins received GO term assignment via the homology-based annotation, including 2,870 hypothetical proteins. Literature-based experimental evidence, such as microarray, MPSS, T-DNA insertion mutation, or gene knockout mutation, resulted in 2,810 proteins being annotated with GO terms. Of these, 1,673 proteins were annotated with new terms developed for Plant-Associated Microbe Gene Ontology (PAMGO). In addition, 67 experiment-determined secreted proteins were annotated with PAMGO terms. Integration of the two data sets resulted in 7,412 proteins (57%) being annotated with 1,957 distinct and specific GO terms. Unannotated proteins
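
    The reciprocal best hits idea mentioned in the Methods can be sketched in Python as follows. The protein names and alignment scores in `A_VS_B` and `B_VS_A` are hypothetical, and the sketch simply picks the highest-scoring hit in each direction. The stringency criteria and alignment tooling used in the study are not given in the abstract, so no thresholds are applied here.

```python
from typing import Dict, List, Tuple

Scores = Dict[str, Dict[str, float]]

# Hypothetical alignment scores: query protein -> {subject protein: score}.
A_VS_B: Scores = {
    "a1": {"b1": 200.0, "b2": 50.0},
    "a2": {"b1": 180.0, "b2": 90.0},
}
B_VS_A: Scores = {
    "b1": {"a1": 210.0, "a2": 170.0},
    "b2": {"a2": 95.0},
}


def best_hits(scores: Scores) -> Dict[str, str]:
    """Map each query to its single highest-scoring subject."""
    return {q: max(hits, key=hits.get) for q, hits in scores.items() if hits}


def reciprocal_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> List[Tuple[str, str]]:
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


print(reciprocal_best_hits(A_VS_B, B_VS_A))
```

    Here a2 is dropped because its best hit b1 prefers a1, which is the property that makes the method conservative when transferring annotations between organisms.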

  1. A methodology to migrate the gene ontology to a description logic environment using DAML+OIL.

    Science.gov (United States)

    Wroe, C J; Stevens, R; Goble, C A; Ashburner, M

    2003-01-01

    The Gene Ontology Next Generation Project (GONG) is developing a staged methodology to evolve the current representation of the Gene Ontology into DAML+OIL in order to take advantage of the richer formal expressiveness and the reasoning capabilities of the underlying description logic. Each stage provides a step level increase in formal explicit semantic content with a view to supporting validation, extension and multiple classification of the Gene Ontology. The paper introduces DAML+OIL and demonstrates the activity within each stage of the methodology and the functionality gained.

  2. Evaluating Functional Annotations of Enzymes Using the Gene Ontology.

    Science.gov (United States)

    Holliday, Gemma L; Davidson, Rebecca; Akiva, Eyal; Babbitt, Patricia C

    2017-01-01

    The Gene Ontology (GO) (Ashburner et al., Nat Genet 25(1):25-29, 2000) is a powerful tool in the informatics arsenal of methods for evaluating annotations in a protein dataset. Whether identifying the nearest well-annotated homologue of a protein of interest or predicting where misannotation has occurred, knowing how confident you can be in the annotations assigned to those proteins is critical. In this chapter we explore what makes an enzyme unique and how we can use GO to infer aspects of protein function based on sequence similarity. These can range from identification of misannotation or other errors in a predicted function to accurate function prediction for an enzyme of entirely unknown function. Although GO annotation applies to any gene products, we focus here on describing our approach for hierarchical classification of enzymes in the Structure-Function Linkage Database (SFLD) (Akiva et al., Nucleic Acids Res 42(Database issue):D521-530, 2014) as a guide for informed utilisation of annotation transfer based on GO terms.

  3. Prediction of regulatory gene pairs using dynamic time warping and gene ontology.

    Science.gov (United States)

    Yang, Andy C; Hsu, Hui-Huang; Lu, Ming-Da; Tseng, Vincent S; Shih, Timothy K

    2014-01-01

    Selecting informative genes is the most important task for data analysis on microarray gene expression data. In this work, we aim at identifying regulatory gene pairs from microarray gene expression data. However, microarray data often contain multiple missing expression values. Missing value imputation is thus needed before further processing for regulatory gene pairs becomes possible. We develop a novel approach to first impute missing values in microarray time series data by combining k-Nearest Neighbour (KNN), Dynamic Time Warping (DTW) and Gene Ontology (GO). After missing values are imputed, we then perform gene regulation prediction based on our proposed DTW-GO distance measurement of gene pairs. Experimental results show that our approach is more accurate when compared with existing missing value imputation methods on real microarray data sets. Furthermore, our approach can also discover more regulatory gene pairs that are known in the literature than other methods.
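
    For readers unfamiliar with Dynamic Time Warping, the following is a minimal sketch of the classic dynamic-programming DTW distance between two expression time series. It is plain DTW with an absolute-difference local cost, not the authors' combined DTW-GO distance, and the function name is chosen here for illustration only.

```python
from math import inf


def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences.

    cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    """
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# A series and a time-stretched copy of it align at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

    Because DTW allows one time point to align with several, a stretched copy of a profile is treated as identical, which is the property that makes it attractive for time series sampled at different rates.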

  4. The prediction of candidate genes for cervix related cancer through gene ontology and graph theoretical approach.

    Science.gov (United States)

    Hindumathi, V; Kranthi, T; Rao, S B; Manimaran, P

    2014-06-01

    With rapidly changing technology, prediction of candidate genes has become an indispensable task in recent years, mainly in the field of biological research. The empirical methods for candidate gene prioritization, which help to explore the potential pathway between genetic determinants and complex diseases, are highly cumbersome and labor intensive. In such a scenario, predicting potential targets for a disease state through in silico approaches is of interest to researchers. The abundant availability of protein interaction data, coupled with gene annotation, eases the accurate determination of disease-specific candidate genes. In our work we have prioritized cervix-related cancer candidate genes by employing the approach of Csaba Ortutay and co-workers, which identifies candidate genes through graph theoretical centrality measures and gene ontology. Using human protein interaction data, cervical cancer gene sets and the ontological terms, we were able to predict 15 novel candidates for cervical carcinogenesis. The disease relevance of the predicted candidate genes was corroborated through a literature survey. The presence of drugs for these candidates was also detected through the Therapeutic Target Database (TTD) and DrugMap Central (DMC), which suggests that they may serve as potential drug targets for cervical cancer.

  5. Multi-label literature classification based on the Gene Ontology graph

    Directory of Open Access Journals (Sweden)

    Lu Xinghua

    2008-12-01

    Full Text Available Abstract Background The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in literature, which urges the development of text mining approaches to facilitate the process by automatically extracting the Gene Ontology annotation from literature. The task is usually cast as a text classification problem, and contemporary methods are confronted with unbalanced training data and the difficulties associated with multi-label classification. Results In this research, we investigated the methods of enhancing automatic multi-label classification of biomedical literature by utilizing the structure of the Gene Ontology graph. We have studied three graph-based multi-label classification algorithms, including a novel stochastic algorithm and two top-down hierarchical classification methods for multi-label literature classification. We systematically evaluated and compared these graph-based classification algorithms to a conventional flat multi-label algorithm. The results indicate that, through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods can significantly improve predictions of the Gene Ontology terms implied by the analyzed text. Furthermore, the graph-based multi-label classifiers are capable of suggesting Gene Ontology annotations (to curators) that are closely related to the true annotations even if they fail to predict the true ones directly. A software package implementing the studied algorithms is available for the research community.
Conclusion Through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods have better potential than the conventional flat multi-label classification approach to facilitate

  6. Using the gene ontology to scan multilevel gene sets for associations in genome wide association studies.

    Science.gov (United States)

    Schaid, Daniel J; Sinnwell, Jason P; Jenkins, Gregory D; McDonnell, Shannon K; Ingle, James N; Kubo, Michiaki; Goss, Paul E; Costantino, Joseph P; Wickerham, D Lawrence; Weinshilboum, Richard M

    2012-01-01

    Gene-set analyses have been widely used in gene expression studies, and some of the developed methods have been extended to genome wide association studies (GWAS). Yet, complications due to linkage disequilibrium (LD) among single nucleotide polymorphisms (SNPs), and variable numbers of SNPs per gene and genes per gene-set, have plagued current approaches, often leading to ad hoc "fixes." To overcome some of the current limitations, we developed a general approach to scan GWAS SNP data for both gene-level and gene-set analyses, building on score statistics for generalized linear models, and taking advantage of the directed acyclic graph structure of the gene ontology when creating gene-sets. However, other types of gene-set structures can be used, such as the popular Kyoto Encyclopedia of Genes and Genomes (KEGG). Our approach combines SNPs into genes, and genes into gene-sets, but assures that positive and negative effects of genes on a trait do not cancel. To control for multiple testing of many gene-sets, we use an efficient computational strategy that accounts for LD and provides accurate step-down adjusted P-values for each gene-set. Application of our methods to two different GWAS provides guidance on the potential strengths and weaknesses of our proposed gene-set analyses. © 2011 Wiley Periodicals, Inc.

  7. Protein-Protein Interaction Network and Gene Ontology

    Science.gov (United States)

    Choi, Yunkyu; Kim, Seok; Yi, Gwan-Su; Park, Jinah

    Evolution of computer technologies makes it possible to access a large amount and various kinds of biological data via the internet, such as DNA sequences, proteomics data and information discovered about them. It is expected that the combination of various data could help researchers find further knowledge about them. Roles of a visualization system are to invoke human abilities to integrate information and to recognize certain patterns in the data. Thus, when the various kinds of data are examined and analyzed manually, an effective visualization system is an essential part. One instance of these integrated visualizations is the combination of protein-protein interaction (PPI) data and Gene Ontology (GO), which could help enhance the analysis of PPI networks. We introduce a simple but comprehensive visualization system that integrates GO and PPI data, where GO and PPI graphs are visualized side-by-side, and supports quick reference functions between them. Furthermore, the proposed system provides several interactive visualization methods for efficiently analyzing the PPI network and the GO directed acyclic graph, such as context-based browsing and common ancestors finding.
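
    The "common ancestors finding" operation on a directed acyclic graph can be sketched as below. This is a generic illustration, not the described system's implementation; the term identifiers (T1 to T5) and the child-to-parents dictionary layout are assumptions made for the example.

```python
def ancestors(term, parents):
    """Return all ancestors of a term (excluding the term itself).

    parents maps each term to an iterable of its direct parents in the DAG.
    """
    seen = set()
    stack = [term]
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def common_ancestors(t1, t2, parents):
    """Terms that subsume both t1 and t2, counting each term as its own subsumer."""
    return (ancestors(t1, parents) | {t1}) & (ancestors(t2, parents) | {t2})


# Hypothetical toy DAG: T1 is the root; T4 has two parents, as GO terms may.
toy_parents = {"T2": ["T1"], "T3": ["T1"], "T4": ["T2", "T3"], "T5": ["T3"]}

shared = common_ancestors("T4", "T5", toy_parents)
print(sorted(shared))
```

    The visited set matters because a term in a DAG can be reached along several paths; without it, shared ancestors would be expanded repeatedly.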

  8. Gene Ontology and KEGG Enrichment Analyses of Genes Related to Age-Related Macular Degeneration

    Directory of Open Access Journals (Sweden)

    Jian Zhang

    2014-01-01

    Full Text Available Identifying disease genes is one of the most important topics in biomedicine and may facilitate studies on the mechanisms underlying disease. Age-related macular degeneration (AMD) is a serious eye disease; it typically affects older adults and results in a loss of vision due to retina damage. In this study, we attempt to develop an effective method for distinguishing AMD-related genes. Gene ontology and KEGG enrichment analyses of known AMD-related genes were performed, and a classification system was established. In detail, each gene was encoded into a vector by extracting enrichment scores of the gene set, including it and its direct neighbors in STRING, and gene ontology terms or KEGG pathways. Then certain feature-selection methods, including minimum redundancy maximum relevance and incremental feature selection, were adopted to extract key features for the classification system. As a result, 720 GO terms and 11 KEGG pathways were deemed the most important factors for predicting AMD-related genes.

  9. Automatic annotation of protein motif function with Gene Ontology terms

    Directory of Open Access Journals (Sweden)

    Gopalakrishnan Vanathi

    2004-09-01

    Full Text Available Abstract Background Conserved protein sequence motifs are short stretches of amino acid sequence patterns that potentially encode the function of proteins. Several sequence pattern searching algorithms and programs exist for identifying candidate protein motifs at the whole genome level. However, a much needed and important task is to determine the functions of the newly identified protein motifs. The Gene Ontology (GO) project is an endeavor to annotate the function of genes or protein sequences with terms from a dynamic, controlled vocabulary and these annotations serve well as a knowledge base. Results This paper presents methods to mine the GO knowledge base and use the association between the GO terms assigned to a sequence and the motifs matched by the same sequence as evidence for predicting the functions of novel protein motifs automatically. The task of assigning GO terms to protein motifs is viewed as both a binary classification and information retrieval problem, where PROSITE motifs are used as samples for model training and functional prediction. The mutual information of a motif and a GO term association is found to be a very useful feature. We take advantage of the known motifs to train a logistic regression classifier, which allows us to combine mutual information with other frequency-based features and obtain a probability of correct association. The trained logistic regression model has intuitively meaningful and logically plausible parameter values, and performs very well empirically according to our evaluation criteria. Conclusions In this research, different methods for automatic annotation of protein motifs have been investigated. Empirical results demonstrated that the methods have a great potential for detecting and augmenting information about the functions of newly discovered candidate protein motifs.

  10. A robust data-driven approach for gene ontology annotation.

    Science.gov (United States)

    Li, Yanpeng; Yu, Hong

    2014-01-01

    Gene ontology (GO) and GO annotation are important resources for biological information management and knowledge discovery, but the speed of manual annotation has become a major bottleneck of database curation. The BioCreative IV GO annotation task aims to evaluate the performance of systems that automatically assign GO terms to genes based on the narrative sentences in biomedical literature. This article presents our work in this task as well as the experimental results after the competition. For the evidence sentence extraction subtask, we built a binary classifier to identify evidence sentences using reference distance estimator (RDE), a recently proposed semi-supervised learning method that learns new features from around 10 million unlabeled sentences, achieving an F1 of 19.3% in exact match and 32.5% in relaxed match. In the post-submission experiment, we obtained 22.1% and 35.7% F1 performance by incorporating bigram features in RDE learning. In both development and test sets, the RDE-based method achieved over 20% relative improvement on F1 and AUC performance against classical supervised learning methods, e.g. support vector machine and logistic regression. For the GO term prediction subtask, we developed an information retrieval-based method to retrieve the GO term most relevant to each evidence sentence using a ranking function that combined cosine similarity and the frequency of GO terms in documents, and a filtering method based on high-level GO classes. The best performance of our submitted runs was 7.8% F1 and 22.2% hierarchy F1. We found that the incorporation of frequency information and hierarchy filtering substantially improved the performance. In the post-submission evaluation, we obtained a 10.6% F1 using a simpler setting. Overall, the experimental analysis showed our approaches were robust in both tasks. © The Author(s) 2014. Published by Oxford University Press.

  11. Length bias correction in gene ontology enrichment analysis using logistic regression.

    Science.gov (United States)

    Mi, Gu; Di, Yanming; Emerson, Sarah; Cumbie, Jason S; Chang, Jeff H

    2012-01-01

    When assessing differential gene expression from RNA sequencing data, commonly used statistical tests tend to have greater power to detect differential expression of genes encoding longer transcripts. This phenomenon, called "length bias", will influence subsequent analyses such as Gene Ontology enrichment analysis. In the presence of length bias, Gene Ontology categories that include longer genes are more likely to be identified as enriched. These categories, however, are not necessarily biologically more relevant. We show that one can effectively adjust for length bias in Gene Ontology analysis by including transcript length as a covariate in a logistic regression model. The logistic regression model makes the statistical issue underlying length bias more transparent: transcript length becomes a confounding factor when it correlates with both the Gene Ontology membership and the significance of the differential expression test. The inclusion of the transcript length as a covariate allows one to investigate the direct correlation between the Gene Ontology membership and the significance of testing differential expression, conditional on the transcript length. We present both real and simulated data examples to show that the logistic regression approach is simple, effective, and flexible.
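
    The confounding argument above can be made concrete with a small sketch. It fits a logistic regression by Newton-Raphson on a hand-built toy data set in which differential expression (DE) depends only on a binary "long transcript" indicator, while GO membership is more common among long genes. The data, the binary length indicator, and the choice of DE status as the response variable are all assumptions made for illustration; they are not the authors' data or necessarily their model specification.

```python
from math import exp, log


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def fit_logistic(X, y, iters=25):
    """Newton-Raphson fit; each row of X must already include the intercept 1.0."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            p = 1.0 / (1.0 + exp(-sum(b * v for b, v in zip(beta, xi))))
            w = p * (1.0 - p)
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for c in range(k):
                    hess[a][c] += w * xi[a] * xi[c]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta


# Toy cells: (in_go, is_long, n_genes, n_de). DE rate is 10% for short genes and
# 50% for long genes regardless of GO membership; GO genes are mostly long.
cells = [(1, 0, 20, 2), (0, 0, 80, 8), (1, 1, 80, 40), (0, 1, 20, 10)]
rows = [(g, l, 1 if i < d else 0) for g, l, n, d in cells for i in range(n)]
y = [r[2] for r in rows]

unadjusted = fit_logistic([[1.0, r[0]] for r in rows], y)        # DE ~ GO
adjusted = fit_logistic([[1.0, r[0], r[1]] for r in rows], y)    # DE ~ GO + length

# Without the covariate, the GO coefficient equals the marginal log odds ratio.
marginal_log_odds_ratio = log((42 / 58) / (18 / 82))
print(unadjusted[1], adjusted[1])
```

    In this constructed example the GO coefficient is clearly positive when length is ignored and vanishes once the length indicator enters the model, which is the pattern a length-driven spurious enrichment would produce.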

  12. Gene dosage, expression, and ontology analysis identifies driver genes in the carcinogenesis and chemoradioresistance of cervical cancer.

    Directory of Open Access Journals (Sweden)

    Malin Lando

    2009-11-01

    Full Text Available Integrative analysis of gene dosage, expression, and ontology (GO) data was performed to discover driver genes in the carcinogenesis and chemoradioresistance of cervical cancers. Gene dosage and expression profiles of 102 locally advanced cervical cancers were generated by microarray techniques. Fifty-two of these patients were also analyzed with the Illumina expression method to confirm the gene expression results. An independent cohort of 41 patients was used for validation of gene expressions associated with clinical outcome. Statistical analysis identified 29 recurrent gains and losses and 3 losses (on 3p, 13q, 21q) associated with poor outcome after chemoradiotherapy. The intratumor heterogeneity, assessed from the gene dosage profiles, was low for these alterations, showing that they had emerged prior to many other alterations and probably were early events in carcinogenesis. Integration of the alterations with gene expression and GO data identified genes that were regulated by the alterations and revealed five biological processes that were significantly overrepresented among the affected genes: apoptosis, metabolism, macromolecule localization, translation, and transcription. Four genes on 3p (RYBP, GBE1) and 13q (FAM48A, MED4) correlated with outcome at both the gene dosage and expression level and were satisfactorily validated in the independent cohort. These integrated analyses yielded 57 candidate drivers of 24 genetic events, including novel loci responsible for chemoradioresistance. Further mapping of the connections among genetic events, drivers, and biological processes suggested that each individual event stimulates specific processes in carcinogenesis through the coordinated control of multiple genes. The present results may provide novel therapeutic opportunities of both early and advanced stage cervical cancers.

  13. Aspergillus flavus Blast2GO gene ontology database: elevated growth temperature alters amino acid metabolism

    Science.gov (United States)

    The availability of a representative gene ontology (GO) database is a prerequisite for a successful functional genomics study. Using online Blast2GO resources we constructed a GO database of Aspergillus flavus. Of the predicted total 13,485 A. flavus genes 8,987 were annotated with GO terms. The mea...

  14. Annotating the Function of the Human Genome with Gene Ontology and Disease Ontology.

    Science.gov (United States)

    Hu, Yang; Zhou, Wenyang; Ren, Jun; Dong, Lixiang; Wang, Yadong; Jin, Shuilin; Cheng, Liang

    2016-01-01

    Increasing evidence indicates that function annotation of the human genome at the molecular level and the phenotype level is very important for systematic analysis of genes. In this study, we presented a framework named Gene2Function to annotate Gene Reference into Functions (GeneRIFs), in which each functional description of GeneRIFs could be annotated by a text mining tool, Open Biomedical Annotator (OBA), and each Entrez gene could be mapped to a Human Genome Organisation Gene Nomenclature Committee (HGNC) gene symbol. After annotating all the records about human genes of GeneRIFs, 288,869 associations between 13,148 mRNAs and 7,182 terms, 9,496 associations between 948 microRNAs and 533 terms, and 901 associations between 139 long noncoding RNAs (lncRNAs) and 297 terms were obtained as a comprehensive annotation resource of the human genome. High consistency of term frequency of individual gene (Pearson correlation = 0.6401, p = 2.2e-16) and gene frequency of individual term (Pearson correlation = 0.1298, p = 3.686e-14) in GeneRIFs and GOA shows our annotation resource is very reliable.

  15. Development and application of an interaction network ontology for literature mining of vaccine-associated gene-gene interactions.

    Science.gov (United States)

    Hur, Junguk; Özgür, Arzucan; Xiang, Zuoshuang; He, Yongqun

    2015-01-01

    Literature mining of gene-gene interactions has been enhanced by ontology-based name classifications. However, in biomedical literature mining, interaction keywords have not been carefully studied and used beyond a collection of keywords. In this study, we report the development of a new Interaction Network Ontology (INO) that classifies >800 interaction keywords and incorporates interaction terms from the PSI Molecular Interactions (PSI-MI) and Gene Ontology (GO). Using INO-based literature mining results, a modified Fisher's exact test was established to analyze significantly over- and under-represented enriched gene-gene interaction types within a specific area. Such a strategy was applied to study the vaccine-mediated gene-gene interactions using all PubMed abstracts. The Vaccine Ontology (VO) and INO were used to support the retrieval of vaccine terms and interaction keywords from the literature. INO is aligned with the Basic Formal Ontology (BFO) and imports terms from 10 other existing ontologies. Current INO includes 540 terms. In terms of interaction-related terms, INO imports and aligns PSI-MI and GO interaction terms and includes over 100 newly generated ontology terms with 'INO_' prefix. A new annotation property, 'has literature mining keywords', was generated to allow the listing of different keywords mapping to the interaction types in INO. Using all PubMed documents published as of 12/31/2013, approximately 266,000 vaccine-associated documents were identified, and a total of 6,116 gene-pairs were associated with at least one INO term. Out of 78 INO interaction terms associated with at least five gene-pairs of the vaccine-associated sub-network, 14 terms were significantly over-represented (i.e., more frequently used) and 17 under-represented based on our modified Fisher's exact test. These over-represented and under-represented terms share some common top-level terms but are distinct at the bottom levels of the INO hierarchy. 
The analysis of these
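
    As background for the over- and under-representation analysis, here is a minimal sketch of the standard one-sided Fisher's exact test, computed from the hypergeometric distribution. It is the textbook test, not the authors' modified version, and the function and parameter names are chosen here for illustration.

```python
from math import comb


def fisher_over(k, n, K, N):
    """P(X >= k): chance of at least k hits when n items are drawn from N, K of which are hits."""
    upper = min(n, K)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / comb(N, n)


def fisher_under(k, n, K, N):
    """P(X <= k): chance of at most k hits under the same sampling model."""
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(0, min(k, n, K) + 1)) / comb(N, n)


# Toy numbers: 10 items overall, 5 carry the term, 5 are sampled, all 5 carry it.
p_over = fisher_over(5, 5, 5, 10)
print(p_over)
```

    A small `fisher_over` value suggests the term is used more often in the sample than chance would predict, and a small `fisher_under` value suggests the opposite; any multiple-testing correction would be applied on top of these p-values.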

  16. A new measure for functional similarity of gene products based on Gene Ontology

    Directory of Open Access Journals (Sweden)

    Lengauer Thomas

    2006-06-01

    Full Text Available Abstract Background Gene Ontology (GO) is a standard vocabulary of functional terms and allows for coherent annotation of gene products. These annotations provide a basis for new methods that compare gene products regarding their molecular function and biological role. Results We present a new method for comparing sets of GO terms and for assessing the functional similarity of gene products. The method relies on two semantic similarity measures: simRel and funSim. One measure (simRel) is applied in the comparison of the biological processes found in different groups of organisms. The other measure (funSim) is used to find functionally related gene products within the same or between different genomes. Results indicate that the method, in addition to being in good agreement with established sequence similarity approaches, also provides a means for the identification of functionally related proteins independent of evolutionary relationships. The method is also applied to estimating functional similarity between all proteins in Saccharomyces cerevisiae and to visualizing the molecular function space of yeast in a map of the functional space. A similar approach is used to visualize the functional relationships between protein families. Conclusion The approach enables the comparison of the underlying molecular biology of different taxonomic groups and provides a new comparative genomics tool identifying functionally related gene products independent of homology. The proposed map of the functional space provides a new global view on the functional relationships between gene products or protein families.

  17. GO(vis), a gene ontology visualization tool based on multi-dimensional values.

    Science.gov (United States)

    Ning, Zi; Jiang, Zhenran

    2010-05-01

    Most gene product similarity measurements concentrate on the information content of Gene Ontology (GO) terms or use a path-based similarity between GO terms, which may ignore other important information contained in the structure of the ontology. In our study, we integrate different GO similarity measure approaches to analyze the functional relationship of genes and gene products with a new triangle-based visualization tool called GO(Vis). The purpose of this tool is to demonstrate the effect of three important information factors when measuring the similarity between gene products. One advantage of this tool is that its importance ratio can be adjusted to meet different measuring requirements according to the biological knowledge of each factor. The experimental results demonstrate that GO(Vis) can display diagrams of the functional relationship for gene products effectively.

  18. A multicolor panel of novel lentiviral "gene ontology" (LeGO) vectors for functional gene analysis.

    Science.gov (United States)

    Weber, Kristoffer; Bartsch, Udo; Stocking, Carol; Fehse, Boris

    2008-04-01

    Functional gene analysis requires the possibility of overexpression, as well as downregulation of one, or ideally several, potentially interacting genes. Lentiviral vectors are well suited for this purpose as they ensure stable expression of complementary DNAs (cDNAs), as well as short-hairpin RNAs (shRNAs), and can efficiently transduce a wide spectrum of cell targets when packaged within the coat proteins of other viruses. Here we introduce a multicolor panel of novel lentiviral "gene ontology" (LeGO) vectors designed according to the "building blocks" principle. Using a wide spectrum of different fluorescent markers, including drug-selectable enhanced green fluorescent protein (eGFP)- and dTomato-blasticidin-S resistance fusion proteins, LeGO vectors allow simultaneous analysis of multiple genes and shRNAs of interest within single, easily identifiable cells. Furthermore, each functional module is flanked by unique cloning sites, ensuring flexibility and individual optimization. The efficacy of these vectors for analyzing multiple genes in a single cell was demonstrated in several different cell types, including hematopoietic, endothelial, and neural stem and progenitor cells, as well as hepatocytes. LeGO vectors thus represent a valuable tool for investigating gene networks using conditional ectopic expression and knock-down approaches simultaneously.

  19. Muscle Research and Gene Ontology: New standards for improved data integration.

    Science.gov (United States)

    Feltrin, Erika; Campanaro, Stefano; Diehl, Alexander D; Ehler, Elisabeth; Faulkner, Georgine; Fordham, Jennifer; Gardin, Chiara; Harris, Midori; Hill, David; Knoell, Ralph; Laveder, Paolo; Mittempergher, Lorenza; Nori, Alessandra; Reggiani, Carlo; Sorrentino, Vincenzo; Volpe, Pompeo; Zara, Ivano; Valle, Giorgio; Deegan, Jennifer

    2009-01-29

    The Gene Ontology Project provides structured controlled vocabularies for molecular biology that can be used for the functional annotation of genes and gene products. In a collaboration between the Gene Ontology (GO) Consortium and the muscle biology community, we have made large-scale additions to the GO biological process and cellular component ontologies. The main focus of this ontology development work concerns skeletal muscle, with specific consideration given to the processes of muscle contraction, plasticity, development, and regeneration, and to the sarcomere and membrane-delimited compartments. Our aims were to update the existing structure to reflect current knowledge, and to resolve, in an accommodating manner, the ambiguity in the language used by the community. The updated muscle terminologies have been incorporated into the GO. There are now 159 new terms covering critical research areas, and 57 existing terms have been improved and reorganized to follow their usage in muscle literature. The revised GO structure should improve the interpretation of data from high-throughput (e.g. microarray and proteomic) experiments in the area of muscle science and muscle disease. We actively encourage community feedback on, and gene product annotation with these new terms. Please visit the Muscle Community Annotation Wiki http://wiki.geneontology.org/index.php/Muscle_Biology.

  20. Muscle Research and Gene Ontology: New standards for improved data integration

    Directory of Open Access Journals (Sweden)

    Nori Alessandra

    2009-01-01

    Full Text Available Abstract Background The Gene Ontology Project provides structured controlled vocabularies for molecular biology that can be used for the functional annotation of genes and gene products. In a collaboration between the Gene Ontology (GO) Consortium and the muscle biology community, we have made large-scale additions to the GO biological process and cellular component ontologies. The main focus of this ontology development work concerns skeletal muscle, with specific consideration given to the processes of muscle contraction, plasticity, development, and regeneration, and to the sarcomere and membrane-delimited compartments. Our aims were to update the existing structure to reflect current knowledge, and to resolve, in an accommodating manner, the ambiguity in the language used by the community. Results The updated muscle terminologies have been incorporated into the GO. There are now 159 new terms covering critical research areas, and 57 existing terms have been improved and reorganized to follow their usage in muscle literature. Conclusion The revised GO structure should improve the interpretation of data from high-throughput (e.g. microarray and proteomic) experiments in the area of muscle science and muscle disease. We actively encourage community feedback on, and gene product annotation with these new terms. Please visit the Muscle Community Annotation Wiki http://wiki.geneontology.org/index.php/Muscle_Biology.

  1. Reveal genes functionally associated with ACADS by a network study.

    Science.gov (United States)

    Chen, Yulong; Su, Zhiguang

    2015-09-15

    Establishing a systematic network is aimed at finding essential human gene-gene/gene-disease pathways by means of network inter-connecting patterns and functional annotation analysis. In the present study, we have analyzed functional gene interactions of the short-chain acyl-coenzyme A dehydrogenase gene (ACADS). ACADS plays a vital role in free fatty acid β-oxidation and regulates energy homeostasis. Modules of highly inter-connected genes in the disease-specific ACADS network are derived by integrating gene function and protein interaction data. Among the 8 genes in the ACADS web retrieved from both STRING and GeneMANIA, ACADS is effectively conjoined with 4 genes: HADHA, HADHB, ECHS1 and ACAT1. The functional analysis is done via ontological briefing and candidate disease identification. We observed that the most efficiently interlinked genes connected with ACADS are HADHA, HADHB, ECHS1 and ACAT1. Interestingly, the ontological aspect of genes in the ACADS network reveals that ACADS, HADHA and HADHB play equally vital roles in fatty acid metabolism. The gene ACAT1, together with ACADS, participates in ketone metabolism. Our computational gene web analysis also predicts potential candidate disease recognition, thus indicating the involvement of ACADS, HADHA, HADHB, ECHS1 and ACAT1 not only with lipid metabolism but also with infant death syndrome, skeletal myopathy, acute hepatic encephalopathy, Reye-like syndrome, episodic ketosis, and metabolic acidosis. The current study presents a comprehensible layout of the ACADS network, its functional strategies and the candidate disease approach associated with the ACADS network. Copyright © 2015 Elsevier B.V. All rights reserved.

  2. Ontology-based literature mining of E. coli vaccine-associated gene interaction networks.

    Science.gov (United States)

    Hur, Junguk; Özgür, Arzucan; He, Yongqun

    2017-03-14

    Pathogenic Escherichia coli infections cause various diseases in humans and many animal species. However, despite extensive E. coli vaccine research, we are still unable to fully protect ourselves against E. coli infections. To support more rational development of effective and safe E. coli vaccines, it is important to better understand E. coli vaccine-associated gene interaction networks. In this study, we first extended the Vaccine Ontology (VO) to semantically represent various E. coli vaccines and genes used in vaccine development. We also normalized E. coli gene names compiled from the annotations of various E. coli strains using a pan-genome-based annotation strategy. The Interaction Network Ontology (INO) includes a hierarchy of various interaction-related keywords useful for literature mining. Using VO, INO, and normalized E. coli gene names, we applied an ontology-based SciMiner literature mining strategy to mine all PubMed abstracts and retrieve E. coli vaccine-associated E. coli gene interactions. Four centrality metrics (i.e., degree, eigenvector, closeness, and betweenness) were calculated for identifying highly ranked genes and interaction types. Using vaccine-related PubMed abstracts, our study identified 11,350 sentences that contain 88 unique INO interaction types and 1,781 unique E. coli genes. Each sentence contained at least one interaction type and two unique E. coli genes. An E. coli gene interaction network of genes and INO interaction types was created. From this large network, a sub-network consisting of 5 E. coli vaccine genes (carA, carB, fimH, fepA, and vat), 62 other E. coli genes, and 25 INO interaction types was identified. While many interaction types represent direct interactions between two indicated genes, our study has also shown that many of these retrieved interaction types are indirect in that the two genes participated in the specified interaction process in a required but indirect process. Our centrality analysis of

  3. The mammalian adult neurogenesis gene ontology (MANGO) provides a structural framework for published information on genes regulating adult hippocampal neurogenesis.

    Directory of Open Access Journals (Sweden)

    Rupert W Overall

    Full Text Available BACKGROUND: Adult hippocampal neurogenesis is not a single phenotype, but consists of a number of sub-processes, each of which is under complex genetic control. Interpretation of gene expression studies using existing resources often does not lead to results that address the interrelatedness of these processes. Formal structure, such as provided by ontologies, is essential in any field for comprehensive interpretation of existing knowledge but, until now, such a structure has been lacking for adult neurogenesis. METHODOLOGY/PRINCIPAL FINDINGS: We have created a resource with three components: 1. A structured ontology describing the key stages in the development of adult hippocampal neural stem cells into functional granule cell neurons. 2. A comprehensive survey of the literature to annotate the results of all published reports on gene function in adult hippocampal neurogenesis (257 manuscripts covering 228 genes) to the appropriate terms in our ontology. 3. An easy-to-use searchable interface to the resulting database made freely available online. The manuscript presents an overview of the database highlighting global trends such as the current bias towards research on early proliferative stages, and an example gene set enrichment analysis. A limitation of the resource is the current scope of the literature which, however, is growing by around 100 publications per year. With the ontology and database in place, new findings can be rapidly annotated and regular updates of the database will be made publicly available. CONCLUSIONS/SIGNIFICANCE: The resource we present allows relevant interpretation of gene expression screens in terms of defined stages of postnatal neuronal development. Annotation of genes by hand from the adult neurogenesis literature ensures the data are directly applicable to the system under study. We believe this approach could also serve as an example to other fields in a 'bottom-up' community effort complementing the already

  4. Systematically characterizing and prioritizing chemosensitivity related gene based on Gene Ontology and protein interaction network

    Directory of Open Access Journals (Sweden)

    Chen Xin

    2012-10-01

    Full Text Available Abstract Background The identification of genes that predict in vitro cellular chemosensitivity of cancer cells is of great importance. Chemosensitivity related genes (CRGs) have been widely utilized to guide clinical and cancer chemotherapy decisions. In addition, CRGs potentially share functional characteristics and network features in protein interaction networks (PPIN). Methods In this study, we proposed a method to identify CRGs based on Gene Ontology (GO) and PPIN. Firstly, we documented 150 pairs of drug-CCRG (curated chemosensitivity related gene) from 492 published papers. Secondly, we characterized CCRGs from the perspective of GO and PPIN. Thirdly, we prioritized CRGs based on CCRGs’ GO and network characteristics. Lastly, we evaluated the performance of the proposed method. Results We found that CCRG enriched GO terms were most often related to chemosensitivity and exhibited higher similarity scores compared to randomly selected genes. Moreover, CCRGs played key roles in maintaining the connectivity and controlling the information flow of PPINs. We then prioritized CRGs using CCRG enriched GO terms and CCRG network characteristics in order to obtain a database of predicted drug-CRGs that included 53 CRGs, 32 of which have been reported to affect susceptibility to drugs. Our proposed method identifies a greater number of drug-CCRGs, and drug-CCRGs are much more significantly enriched in predicted drug-CRGs, compared to a method based on the correlation of gene expression and drug activity. The mean area under ROC curve (AUC) for our method is 65.2%, whereas that for the traditional method is 55.2%. Conclusions Our method not only identifies CRGs with expression patterns strongly correlated with drug activity, but also identifies CRGs in which expression is weakly correlated with drug activity. This study provides the framework for the identification of signatures that predict in vitro cellular chemosensitivity and offers a valuable

  5. GOssTo: a stand-alone application and a web tool for calculating semantic similarities on the Gene Ontology

    OpenAIRE

    Caniza, Horacio; Romero, Alfonso E.; Heron, Samuel; Yang, Haixuan; Devoto, Alessandra; Frasca, Marco; Mesiti, Marco; Valentini, Giorgio; Paccanaro, Alberto

    2014-01-01

    Summary: We present GOssTo, the Gene Ontology semantic similarity Tool, a user-friendly software system for calculating semantic similarities between gene products according to the Gene Ontology. GOssTo is bundled with six semantic similarity measures, including both term- and graph-based measures, and has extension capabilities to allow the user to add new similarities. Importantly, for any measure, GOssTo can also calculate the Random Walk Contribution that has been shown to greatly improve...

  6. Gene Ontology Terms and Automated Annotation for Energy-Related Microbial Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Mukhopadhyay, Biswarup [Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States); Tyler, Brett M. [Oregon State Univ., Corvallis, OR (United States); Setubal, Joao [Univ. of Sao Paulo (Brazil); Murali, T. M. [Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States)

    2017-11-03

    Gene Ontology (GO) is one of the more widely used functional ontologies for describing gene functions at various levels. The project developed 660 GO terms for describing energy-related microbial processes and filled the known gaps in this area of the GO system, and then used these terms to describe functions of 179 genes to showcase the utilities of the new resources. It hosted a series of workshops and made presentations at key meetings to inform and train scientific community members on these terms and to receive inputs from them for the GO term generation efforts. The project has developed a website for storing and displaying the resources (http://www.mengo.biochem.vt.edu/). The outcome of the project was further disseminated through peer-reviewed publications and poster and seminar presentations.

  7. Gene ontology analysis of pairwise genetic associations in two genome-wide studies of sporadic ALS

    Directory of Open Access Journals (Sweden)

    Kim Nora

    2012-07-01

    Full Text Available Abstract Background It is increasingly clear that common human diseases have a complex genetic architecture characterized by both additive and nonadditive genetic effects. The goal of the present study was to determine whether patterns of both additive and nonadditive genetic associations aggregate in specific functional groups as defined by the Gene Ontology (GO). Results We first estimated all pairwise additive and nonadditive genetic effects using the multifactor dimensionality reduction (MDR) method that makes few assumptions about the underlying genetic model. Statistical significance was evaluated using permutation testing in two genome-wide association studies of ALS. The detection data consisted of 276 subjects with ALS and 271 healthy controls while the replication data consisted of 221 subjects with ALS and 211 healthy controls. Both studies included genotypes from approximately 550,000 single-nucleotide polymorphisms (SNPs). Each SNP was mapped to a gene if it was within 500 kb of the start or end. Each SNP was assigned a p-value based on its strongest joint effect with the other SNPs. We then used the Exploratory Visual Analysis (EVA) method and software to assign a p-value to each gene based on the overabundance of significant SNPs at the α = 0.05 level in the gene. We also used EVA to assign p-values to each GO group based on the overabundance of significant genes at the α = 0.05 level. A GO category was determined to replicate if that category was significant at the α = 0.05 level in both studies. We found two GO categories that replicated in both studies. The first, ‘Regulation of Cellular Component Organization and Biogenesis’, a GO Biological Process, had p-values of 0.010 and 0.014 in the detection and replication studies, respectively. The second, ‘Actin Cytoskeleton’, a GO Cellular Component, had p-values of 0.040 and 0.046 in the detection and replication studies, respectively. Conclusions Pathway

  8. Protein-protein interaction inference based on semantic similarity of Gene Ontology terms.

    Science.gov (United States)

    Zhang, Shu-Bo; Tang, Qiang-Rong

    2016-07-21

    Identifying protein-protein interactions is important in molecular biology. Experimental methods for this task have their limitations, and computational approaches have attracted increasing attention from the biological community. The semantic similarity derived from Gene Ontology (GO) annotation has been regarded as one of the most powerful indicators of protein interaction. However, conventional methods based on GO similarity fail to take advantage of the specificity of GO terms in the ontology graph. We proposed a GO-based method to predict protein-protein interactions by integrating different kinds of similarity measures derived from the intrinsic structure of the GO graph. We extended five existing methods to derive semantic similarity measures from the descending part of two GO terms in the GO graph, then adopted a feature integration strategy that combines both the ascending and the descending similarity scores derived from the three sub-ontologies to construct various kinds of features characterizing each protein pair. Support vector machines (SVM) were employed as discriminative classifiers, and five-fold cross-validation experiments were conducted on both human and yeast protein-protein interaction datasets to evaluate the performance of the different kinds of integrated features. The experimental results suggest that the best performance is achieved by the feature that combines information from both the ascending and the descending parts of the three ontologies. Our method is appealing for effective prediction of protein-protein interactions. Copyright © 2016 Elsevier Ltd. All rights reserved.

  9. Is the crowd better as an assistant or a replacement in ontology engineering? An exploration through the lens of the Gene Ontology.

    Science.gov (United States)

    Mortensen, Jonathan M; Telis, Natalie; Hughey, Jacob J; Fan-Minogue, Hua; Van Auken, Kimberly; Dumontier, Michel; Musen, Mark A

    2016-04-01

    Biomedical ontologies contain errors. Crowdsourcing, defined as taking a job traditionally performed by a designated agent and outsourcing it to an undefined large group of people, provides scalable access to humans. Therefore, the crowd has the potential to overcome the limited accuracy and scalability found in current ontology quality assurance approaches. Crowd-based methods have identified errors in SNOMED CT, a large, clinical ontology, with an accuracy similar to that of experts, suggesting that crowdsourcing is indeed a feasible approach for identifying ontology errors. This work uses that same crowd-based methodology, as well as a panel of experts, to verify a subset of the Gene Ontology (200 relationships). Experts identified 16 errors, generally in relationships referencing acids and metals. The crowd performed poorly in identifying those errors, with an area under the receiver operating characteristic curve ranging from 0.44 to 0.73, depending on the methods configuration. However, when the crowd verified what experts considered to be easy relationships with useful definitions, they performed reasonably well. Notably, there are significantly fewer Google search results for Gene Ontology concepts than SNOMED CT concepts. This disparity may account for the difference in performance - fewer search results indicate a more difficult task for the worker. The number of Internet search results could serve as a method to assess which tasks are appropriate for the crowd. These results suggest that the crowd fits better as an expert assistant, helping experts with their verification by completing the easy tasks and allowing experts to focus on the difficult tasks, rather than an expert replacement. Copyright © 2016 Elsevier Inc. All rights reserved.
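
    The crowd's performance above is reported as an area under the ROC curve (AUC). As a minimal sketch of what that number measures, the following Python computes AUC through its rank interpretation: the probability that a randomly chosen positive item (here, a true error) receives a higher score than a randomly chosen negative item, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as P(score of random positive > score of random negative).

    labels: iterable of 0/1 (1 = positive, e.g. a relationship experts flagged as an error)
    scores: iterable of numbers (higher = more likely positive)
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: six relationships, three of them true errors.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

    On this toy data the function returns 5/9, or about 0.56. A value of 0.5 corresponds to chance-level ranking, which is why an AUC of 0.44 at the low end of the reported range indicates poor discrimination.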

  10. Genetic Resources for Advanced Biofuel Production Described with the Gene Ontology

    Directory of Open Access Journals (Sweden)

    Trudy Torto-Alalibo

    2014-10-01

    Full Text Available Dramatic increases in research in the area of microbial biofuel production, coupled with high-throughput data generation on bioenergy-related microbes, have led to a deluge of information in the scientific literature and in databases. Consolidating this information and making it easily accessible requires a unified vocabulary. The Gene Ontology (GO) fulfills that requirement, as it is a well-developed structured vocabulary that describes the activities and locations of gene products in a consistent manner across all kingdoms of life. The Microbial Energy Gene Ontology (MENGO: http://www.mengo.biochem.vt.edu) project is extending the GO to include new terms to describe microbial processes of interest to bioenergy production. Our effort has added over 600 bioenergy related terms to the Gene Ontology. These terms will aid in the comprehensive annotation of gene products from diverse energy-related microbial genomes. An area of microbial energy research that has received a lot of attention is microbial production of advanced biofuels. These include alcohols such as butanol, isopropanol, isobutanol, and fuels derived from fatty acids, isoprenoids, and polyhydroxyalkanoates. These fuels are superior to first generation biofuels (ethanol and biodiesel esterified from vegetable oil or animal fat), can be generated from non-food feedstock sources, can be used as supplements or substitutes for gasoline, diesel and jet fuels, and can be stored and distributed using existing infrastructure. Here we review the roles of genes associated with synthesis of advanced biofuels, and at the same time introduce the use of the GO to describe the functions of these genes in a standardized way.

  11. Zebrafish Expression Ontology of Gene Sets (ZEOGS): a tool to analyze enrichment of zebrafish anatomical terms in large gene sets.

    Science.gov (United States)

    Prykhozhij, Sergey V; Marsico, Annalisa; Meijsing, Sebastiaan H

    2013-09-01

    The zebrafish (Danio rerio) is an established model organism for developmental and biomedical research. It is frequently used for high-throughput functional genomics experiments, such as genome-wide gene expression measurements, to systematically analyze molecular mechanisms. However, the use of whole embryos or larvae in such experiments leads to a loss of the spatial information. To address this problem, we have developed a tool called Zebrafish Expression Ontology of Gene Sets (ZEOGS) to assess the enrichment of anatomical terms in large gene sets. ZEOGS uses gene expression pattern data from several sources: first, in situ hybridization experiments from the Zebrafish Model Organism Database (ZFIN); second, it uses the Zebrafish Anatomical Ontology, a controlled vocabulary that describes connected anatomical structures; and third, the available connections between expression patterns and anatomical terms contained in ZFIN. Upon input of a gene set, ZEOGS determines which anatomical structures are overrepresented in the input gene set. ZEOGS allows one for the first time to look at groups of genes and to describe them in terms of shared anatomical structures. To establish ZEOGS, we first tested it on random gene selections and on two public microarray datasets with known tissue-specific gene expression changes. These tests showed that ZEOGS could reliably identify the tissues affected, whereas only very few enriched terms to none were found in the random gene sets. Next we applied ZEOGS to microarray datasets of 24 and 72 h postfertilization zebrafish embryos treated with beclomethasone, a potent glucocorticoid. This analysis resulted in the identification of several anatomical terms related to glucocorticoid-responsive tissues, some of which were stage-specific. Our studies highlight the ability of ZEOGS to extract spatial information from datasets derived from whole embryos, indicating that ZEOGS could be a useful tool to automatically analyze gene expression

  12. Zebrafish Expression Ontology of Gene Sets (ZEOGS): A Tool to Analyze Enrichment of Zebrafish Anatomical Terms in Large Gene Sets

    Science.gov (United States)

    Marsico, Annalisa

    2013-01-01

    Abstract The zebrafish (Danio rerio) is an established model organism for developmental and biomedical research. It is frequently used for high-throughput functional genomics experiments, such as genome-wide gene expression measurements, to systematically analyze molecular mechanisms. However, the use of whole embryos or larvae in such experiments leads to a loss of the spatial information. To address this problem, we have developed a tool called Zebrafish Expression Ontology of Gene Sets (ZEOGS) to assess the enrichment of anatomical terms in large gene sets. ZEOGS uses gene expression pattern data from several sources: first, in situ hybridization experiments from the Zebrafish Model Organism Database (ZFIN); second, it uses the Zebrafish Anatomical Ontology, a controlled vocabulary that describes connected anatomical structures; and third, the available connections between expression patterns and anatomical terms contained in ZFIN. Upon input of a gene set, ZEOGS determines which anatomical structures are overrepresented in the input gene set. ZEOGS allows one for the first time to look at groups of genes and to describe them in terms of shared anatomical structures. To establish ZEOGS, we first tested it on random gene selections and on two public microarray datasets with known tissue-specific gene expression changes. These tests showed that ZEOGS could reliably identify the tissues affected, whereas only very few enriched terms to none were found in the random gene sets. Next we applied ZEOGS to microarray datasets of 24 and 72 h postfertilization zebrafish embryos treated with beclomethasone, a potent glucocorticoid. This analysis resulted in the identification of several anatomical terms related to glucocorticoid-responsive tissues, some of which were stage-specific. Our studies highlight the ability of ZEOGS to extract spatial information from datasets derived from whole embryos, indicating that ZEOGS could be a useful tool to automatically analyze gene

  13. GOexpress: an R/Bioconductor package for the identification and visualisation of robust gene ontology signatures through supervised learning of gene expression data.

    Science.gov (United States)

    Rue-Albrecht, Kévin; McGettigan, Paul A; Hernández, Belinda; Nalpas, Nicolas C; Magee, David A; Parnell, Andrew C; Gordon, Stephen V; MacHugh, David E

    2016-03-11

    Identification of gene expression profiles that differentiate experimental groups is critical for discovery and analysis of key molecular pathways and also for selection of robust diagnostic or prognostic biomarkers. While integration of differential expression statistics has been used to refine gene set enrichment analyses, such approaches are typically limited to single gene lists resulting from simple two-group comparisons or time-series analyses. In contrast, functional class scoring and machine learning approaches provide powerful alternative methods to leverage molecular measurements for pathway analyses, and to compare continuous and multi-level categorical factors. We introduce GOexpress, a software package for scoring and summarising the capacity of gene ontology features to simultaneously classify samples from multiple experimental groups. GOexpress integrates normalised gene expression data (e.g., from microarray and RNA-seq experiments) and phenotypic information of individual samples with gene ontology annotations to derive a ranking of genes and gene ontology terms using a supervised learning approach. The default random forest algorithm allows interactions between all experimental factors, and competitive scoring of expressed genes to evaluate their relative importance in classifying predefined groups of samples. GOexpress enables rapid identification and visualisation of ontology-related gene panels that robustly classify groups of samples and supports both categorical (e.g., infection status, treatment) and continuous (e.g., time-series, drug concentrations) experimental factors. The use of standard Bioconductor extension packages and publicly available gene ontology annotations facilitates straightforward integration of GOexpress within existing computational biology pipelines.

  14. Integration of the Gene Ontology into an object-oriented architecture

    Directory of Open Access Journals (Sweden)

    Zheng W Jim

    2005-05-01

    Full Text Available Abstract Background To standardize gene product descriptions, a formal vocabulary defined as the Gene Ontology (GO) has been developed. GO terms have been categorized into biological processes, molecular functions, and cellular components. However, there is no single representation that integrates all the terms into one cohesive model. Furthermore, GO definitions have little information explaining the underlying architecture that forms these terms, such as the dynamic and static events occurring in a process. In contrast, object-oriented models have been developed to show dynamic and static events. A portion of the TGF-beta signaling pathway, which is involved in numerous cellular events including cancer, differentiation and development, was used to demonstrate the feasibility of integrating the Gene Ontology into an object-oriented model. Results Using object-oriented models we have captured the static and dynamic events that occur during a representative GO process, "transforming growth factor-beta (TGF-beta) receptor complex assembly" (GO:0007181). Conclusion We demonstrate that the utility of GO terms can be enhanced by object-oriented technology, and that the GO terms can be integrated into an object-oriented model by serving as a basis for the generation of object functions and attributes.

  15. Clinical phenotype-based gene prioritization: an initial study using semantic similarity and the human phenotype ontology.

    Science.gov (United States)

    Masino, Aaron J; Dechene, Elizabeth T; Dulik, Matthew C; Wilkens, Alisha; Spinner, Nancy B; Krantz, Ian D; Pennington, Jeffrey W; Robinson, Peter N; White, Peter S

    2014-07-21

    Exome sequencing is a promising method for diagnosing patients with a complex phenotype. However, variant interpretation relative to patient phenotype can be challenging in some scenarios, particularly clinical assessment of rare complex phenotypes. Each patient's sequence reveals many possibly damaging variants that must be individually assessed to establish clear association with patient phenotype. To assist interpretation, we implemented an algorithm that ranks a given set of genes relative to patient phenotype. The algorithm orders genes by the semantic similarity computed between phenotypic descriptors associated with each gene and those describing the patient. Phenotypic descriptor terms are taken from the Human Phenotype Ontology (HPO) and semantic similarity is derived from each term's information content. Model validation was performed via simulation and with clinical data. We simulated 33 Mendelian diseases with 100 patients per disease. We modeled clinical conditions by adding noise and imprecision, i.e. phenotypic terms unrelated to the disease and terms less specific than the actual disease terms. We ranked the causative gene against all 2488 HPO annotated genes. The median causative gene rank was 1 for the optimal and noise cases, 12 for the imprecision case, and 60 for the imprecision with noise case. Additionally, we examined a clinical cohort of subjects with hearing impairment. The disease gene median rank was 22. However, when also considering the patient's exome data and filtering non-exomic and common variants, the median rank improved to 3. Semantic similarity can rank a causative gene highly within a gene list relative to patient phenotype characteristics, provided that imprecision is mitigated. The clinical case results suggest that phenotype rank combined with variant analysis provides significant improvement over the individual approaches. We expect that this combined prioritization approach may increase accuracy and decrease effort for

  16. MultiLoc2: integrating phylogeny and Gene Ontology terms improves subcellular protein localization prediction

    Directory of Open Access Journals (Sweden)

    Kohlbacher Oliver

    2009-09-01

    Full Text Available Abstract Background Knowledge of subcellular localization of proteins is crucial to proteomics, drug target discovery and systems biology since localization and biological function are highly correlated. In recent years, numerous computational prediction methods have been developed. Nevertheless, there is still a need for prediction methods that show more robustness and higher accuracy. Results We extended our previous MultiLoc predictor by incorporating phylogenetic profiles and Gene Ontology terms. Two different datasets were used for training the system, resulting in two versions of this high-accuracy prediction method. One version is specialized for globular proteins and predicts up to five localizations, whereas a second version covers all eleven main eukaryotic subcellular localizations. In a benchmark study with five localizations, MultiLoc2 performs considerably better than other methods for animal and plant proteins and comparably for fungal proteins. Furthermore, MultiLoc2 performs clearly better when using a second dataset that extends the benchmark study to all eleven main eukaryotic subcellular localizations. Conclusion MultiLoc2 is an extensive high-performance subcellular protein localization prediction system. By incorporating phylogenetic profiles and Gene Ontology terms, MultiLoc2 yields higher accuracies compared to its previous version. Moreover, it outperforms other prediction systems in two benchmark studies. MultiLoc2 is available as a user-friendly and free web service at: http://www-bs.informatik.uni-tuebingen.de/Services/MultiLoc2.

  17. A-DaGO-Fun: an adaptable Gene Ontology semantic similarity-based functional analysis tool.

    Science.gov (United States)

    Mazandu, Gaston K; Chimusa, Emile R; Mbiyavanga, Mamana; Mulder, Nicola J

    2016-02-01

    Gene Ontology (GO) semantic similarity measures are being used for biological knowledge discovery based on GO annotations by integrating biological information contained in the GO structure into data analyses. To empower users to quickly compute, manipulate and explore these measures, we introduce A-DaGO-Fun (ADaptable Gene Ontology semantic similarity-based Functional analysis). It is a portable software package integrating all known GO information content-based semantic similarity measures and relevant biological applications associated with these measures. A-DaGO-Fun has the advantage not only of handling datasets from the current high-throughput genome-wide applications, but also of allowing users to choose the most relevant semantic similarity approach for their biological applications and to adapt a given module to their needs. A-DaGO-Fun is freely available to the research community at http://web.cbio.uct.ac.za/ITGOM/adagofun. It is implemented in Python on Linux and released as free software under the GNU General Public Licence. Contact: gmazandu@cbio.uct.ac.za or Nicola.Mulder@uct.ac.za. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
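
    To illustrate what an information content-based semantic similarity measure computes, the sketch below implements the classical Resnik measure on a toy directed acyclic graph. It is not A-DaGO-Fun's API: the term names, annotation counts, and function names are all hypothetical, and counting raw annotations per term is a simplifying assumption. The information content of a term is the negative log of the fraction of annotations that fall on it or its descendants, and the Resnik similarity of two terms is the highest information content among their common ancestors.

```python
import math

# Hypothetical toy DAG (child -> set of parents); these are not real GO terms.
PARENTS = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "A1": {"A"},
    "A2": {"A"},
    "AB": {"A", "B"},  # a term with two parents, as allowed in a DAG
}

# Hypothetical number of annotations made directly to each term.
DIRECT_COUNTS = {"root": 0, "A": 2, "B": 4, "A1": 6, "A2": 2, "AB": 2}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = -log(p(t)), where p(t) is the share of annotations at or below t."""
    totals = {t: 0 for t in PARENTS}
    # Propagate each direct annotation to the term and every ancestor.
    for term, count in DIRECT_COUNTS.items():
        for anc in ancestors(term):
            totals[anc] += count
    root_total = totals["root"]
    return {t: -math.log(n / root_total) for t, n in totals.items() if n > 0}


IC = information_content()


def resnik(term_a, term_b):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(IC[t] for t in common)
```

    In this toy graph, `resnik("A1", "A2")` equals the information content of the shared parent `A`, while `resnik("A1", "B")` is zero because the two terms share only the root.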

  18. GO-Bayes: Gene Ontology-based overrepresentation analysis using a Bayesian approach.

    Science.gov (United States)

    Zhang, Song; Cao, Jing; Kong, Y Megan; Scheuermann, Richard H

    2010-04-01

    A typical approach for the interpretation of high-throughput experiments, such as gene expression microarrays, is to produce groups of genes based on certain criteria (e.g. genes that are differentially expressed). To gain more mechanistic insights into the underlying biology, overrepresentation analysis (ORA) is often conducted to investigate whether gene sets associated with particular biological functions, for example, as represented by Gene Ontology (GO) annotations, are statistically overrepresented in the identified gene groups. However, the standard ORA, which is based on the hypergeometric test, analyzes each GO term in isolation and does not take into account the dependence structure of the GO-term hierarchy. We have developed a Bayesian approach (GO-Bayes) to measure overrepresentation of GO terms that incorporates the GO dependence structure by taking into account evidence not only from individual GO terms, but also from their related terms (i.e. parents, children, siblings, etc.). The Bayesian framework borrows information across related GO terms to strengthen the detection of overrepresentation signals. As a result, this method tends to identify sets of closely related GO terms rather than individual isolated GO terms. The advantage of the GO-Bayes approach is demonstrated with a simulation study and an application example.
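
    The standard overrepresentation analysis that this abstract contrasts with GO-Bayes can be sketched in a few lines. The example below computes the one-sided hypergeometric tail probability for a single GO term in isolation, which is exactly the per-term test the authors describe as ignoring the GO hierarchy; it does not implement the Bayesian method. The function name, parameter names, and numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """P(X >= overlap) when drawing `selected` genes without replacement.

    population: total number of genes considered
    annotated:  genes in the population annotated with the GO term
    selected:   size of the gene group of interest (e.g. differentially expressed)
    overlap:    selected genes that carry the GO term
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy numbers: 20 genes, 5 annotated with the term,
# 4 genes selected, 2 of which carry the term.
p_value = hypergeom_enrichment_p(population=20, annotated=5, selected=4, overlap=2)
```

    For the toy numbers above the tail probability is 1205/4845, about 0.25, so the term would not be called overrepresented. Running this test independently for every term is what leaves related parent and child terms uncoupled.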

  19. Automatic extraction of gene ontology annotation and its correlation with clusters in protein networks

    Directory of Open Access Journals (Sweden)

    Mazo Ilya

    2007-07-01

    Full Text Available Abstract Background Uncovering cellular roles of a protein is a task of tremendous importance and complexity that requires dedicated experimental work as well as often sophisticated data mining and processing tools. Protein functions, often referred to as its annotations, are believed to manifest themselves through topology of the networks of inter-protein interactions. In particular, there is a growing body of evidence that proteins performing the same function are more likely to interact with each other than with proteins with other functions. However, since functional annotation and protein network topology are often studied separately, the direct relationship between them has not been comprehensively demonstrated. In addition to having the general biological significance, such demonstration would further validate the data extraction and processing methods used to compose protein annotation and protein-protein interactions datasets. Results We developed a method for automatic extraction of protein functional annotation from scientific text based on the Natural Language Processing (NLP) technology. For the protein annotation extracted from the entire PubMed, we evaluated the precision and recall rates, and compared the performance of the automatic extraction technology to that of manual curation used in public Gene Ontology (GO) annotation. In the second part of our presentation, we reported a large-scale investigation into the correspondence between communities in the literature-based protein networks and GO annotation groups of functionally related proteins. We found a comprehensive two-way match: proteins within biological annotation groups form significantly denser linked network clusters than expected by chance and, conversely, densely linked network communities exhibit a pronounced non-random overlap with GO groups. We also expanded the publicly available GO biological process annotation using the relations extracted by our NLP technology

  20. Ontology-based Brucella vaccine literature indexing and systematic analysis of gene-vaccine association network

    Science.gov (United States)

    2011-01-01

    Background Vaccine literature indexing is poorly performed in PubMed due to limited hierarchy of Medical Subject Headings (MeSH) annotation in the vaccine field. Vaccine Ontology (VO) is a community-based biomedical ontology that represents various vaccines and their relations. SciMiner is an in-house literature mining system that supports literature indexing and gene name tagging. We hypothesize that application of VO in SciMiner will aid vaccine literature indexing and mining of vaccine-gene interaction networks. As a test case, we have examined vaccines for Brucella, the causative agent of brucellosis in humans and animals. Results The VO-based SciMiner (VO-SciMiner) was developed to incorporate a total of 67 Brucella vaccine terms. A set of rules for term expansion of VO terms was learned from training data, consisting of 90 biomedical articles related to Brucella vaccine terms. VO-SciMiner demonstrated high recall (91%) and precision (99%) from testing a separate set of 100 manually selected biomedical articles. VO-SciMiner indexing exhibited superior performance in retrieving Brucella vaccine-related papers over that obtained with MeSH-based PubMed literature search. For example, a VO-SciMiner search of "live attenuated Brucella vaccine" returned 922 hits as of April 20, 2011, while a PubMed search of the same query resulted in only 74 hits. Using the abstracts of 14,947 Brucella-related papers, VO-SciMiner identified 140 Brucella genes associated with Brucella vaccines. These genes included known protective antigens, virulence factors, and genes closely related to Brucella vaccines. These VO-interacting Brucella genes were significantly over-represented in biological functional categories, including metabolite transport and metabolism, replication and repair, cell wall biogenesis, intracellular trafficking and secretion, posttranslational modification, and chaperones. Furthermore, a comprehensive interaction network of Brucella vaccines and genes were

  1. SoFoCles: feature filtering for microarray classification based on gene ontology.

    Science.gov (United States)

    Papachristoudis, Georgios; Diplaris, Sotiris; Mitkas, Pericles A

    2010-02-01

    Marker gene selection has been an important research topic in the classification analysis of gene expression data. Current methods try to reduce the "curse of dimensionality" by using statistical intra-feature set calculations, or classifiers that are based on the given dataset. In this paper, we present SoFoCles, an interactive tool that enables semantic feature filtering in microarray classification problems with the use of external, well-defined knowledge retrieved from the Gene Ontology. The notion of semantic similarity is used to derive genes that are involved in the same biological path during the microarray experiment, by enriching a feature set that has been initially produced with legacy methods. Among its other functionalities, SoFoCles offers a large repository of semantic similarity methods that are used in order to derive feature sets and marker genes. The structure and functionality of the tool are discussed in detail, as well as its ability to improve classification accuracy. Through experimental evaluation, SoFoCles is shown to outperform other classification schemes in terms of classification accuracy in two real datasets using different semantic similarity computation approaches.

  2. Human microRNA target analysis and gene ontology clustering by GOmir, a novel stand-alone application.

    Science.gov (United States)

    Roubelakis, Maria G; Zotos, Pantelis; Papachristoudis, Georgios; Michalopoulos, Ioannis; Pappa, Kalliopi I; Anagnou, Nicholas P; Kossida, Sophia

    2009-06-16

    microRNAs (miRNAs) are single-stranded RNA molecules of about 20-23 nucleotides in length found in a wide variety of organisms. miRNAs regulate gene expression by interacting with target mRNAs at specific sites in order to induce cleavage of the message or inhibit translation. Predicting or verifying mRNA targets of specific miRNAs is a difficult process of great importance. GOmir is a novel stand-alone application consisting of two separate tools: JTarget and TAGGO. JTarget integrates miRNA target prediction and functional analysis by combining the predicted target genes from the TargetScan, miRanda, RNAhybrid and PicTar computational tools with the experimentally supported targets from TarBase, and by providing a full gene description and functional analysis for each target gene. The TAGGO application, on the other hand, is designed to automatically group gene ontology annotations, taking advantage of the Gene Ontology (GO), in order to extract the main attributes of sets of proteins. GOmir represents a new tool incorporating two separate Java applications integrated into one stand-alone Java application. GOmir (by using up to five different databases) introduces miRNA predicted targets accompanied by (a) full gene description, (b) functional analysis and (c) detailed gene ontology clustering. Additionally, a reverse search initiated by a potential target can also be conducted. GOmir can be freely downloaded from BRFAA.

  3. Mapping between the OBO and OWL ontology languages.

    Science.gov (United States)

    Tirmizi, Syed Hamid; Aitken, Stuart; Moreira, Dilvan A; Mungall, Chris; Sequeda, Juan; Shah, Nigam H; Miranker, Daniel P

    2011-03-07

    Ontologies are commonly used in biomedicine to organize concepts to describe domains such as anatomies, environments, experiment, taxonomies etc. NCBO BioPortal currently hosts about 180 different biomedical ontologies. These ontologies have been mainly expressed in either the Open Biomedical Ontology (OBO) format or the Web Ontology Language (OWL). OBO emerged from the Gene Ontology, and supports most of the biomedical ontology content. In comparison, OWL is a Semantic Web language, and is supported by the World Wide Web consortium together with integral query languages, rule languages and distributed infrastructure for information interchange. These features are highly desirable for the OBO content as well. A convenient method for leveraging these features for OBO ontologies is by transforming OBO ontologies to OWL. We have developed a methodology for translating OBO ontologies to OWL using the organization of the Semantic Web itself to guide the work. The approach reveals that the constructs of OBO can be grouped together to form a similar layer cake. Thus we were able to decompose the problem into two parts. Most OBO constructs have easy and obvious equivalence to a construct in OWL. A small subset of OBO constructs requires deeper consideration. We have defined transformations for all constructs in an effort to foster a standard common mapping between OBO and OWL. Our mapping produces OWL-DL, a Description Logics based subset of OWL with desirable computational properties for efficiency and correctness. Our Java implementation of the mapping is part of the official Gene Ontology project source. Our transformation system provides a lossless roundtrip mapping for OBO ontologies, i.e. an OBO ontology may be translated to OWL and back without loss of knowledge. In addition, it provides a roadmap for bridging the gap between the two ontology languages in order to enable the use of ontology content in a language independent manner.
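
    To make the idea of an "easy and obvious" OBO-to-OWL equivalence concrete, here is a minimal sketch that turns one OBO [Term] stanza into OWL 2 functional-syntax axioms. It is not the authors' Java implementation or their official mapping. The EX: identifiers, the obo: prefix convention and the function names are assumptions for illustration, and only the id, name and is_a tags are handled.

```python
# Hypothetical OBO stanza; the EX: identifiers are made up for this sketch.
STANZA = """
[Term]
id: EX:0000002
name: child term
is_a: EX:0000001 ! parent term
"""


def parse_term_stanza(text):
    """Parse a single [Term] stanza into a dict; is_a may occur several times."""
    term = {"is_a": []}
    for line in text.strip().splitlines():
        line = line.split("!", 1)[0].strip()  # drop trailing "! comment"
        if not line or line == "[Term]":
            continue
        tag, _, value = line.partition(":")
        value = value.strip()
        if tag == "is_a":
            term["is_a"].append(value)
        else:
            term[tag] = value
    return term


def curie_to_owl(curie):
    """Assumed convention: EX:0000002 becomes the prefixed name obo:EX_0000002."""
    return "obo:" + curie.replace(":", "_")


def to_owl_axioms(term):
    """Emit OWL 2 functional-syntax axioms for the id, name and is_a tags."""
    cls = curie_to_owl(term["id"])
    axioms = ["Declaration(Class(%s))" % cls]
    if "name" in term:
        axioms.append('AnnotationAssertion(rdfs:label %s "%s")' % (cls, term["name"]))
    for parent in term["is_a"]:
        axioms.append("SubClassOf(%s %s)" % (cls, curie_to_owl(parent)))
    return axioms


for axiom in to_owl_axioms(parse_term_stanza(STANZA)):
    print(axiom)
```

    Tags such as relationship or intersection_of, which correspond to the constructs the abstract describes as needing deeper consideration, are deliberately left out of this sketch.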

  4. Large-scale inference of gene function through phylogenetic annotation of Gene Ontology terms: case study of the apoptosis and autophagy cellular processes.

    Science.gov (United States)

    Feuermann, Marc; Gaudet, Pascale; Mi, Huaiyu; Lewis, Suzanna E; Thomas, Paul D

    2016-01-01

    We previously reported a paradigm for large-scale phylogenomic analysis of gene families that takes advantage of the large corpus of experimentally supported Gene Ontology (GO) annotations. This 'GO Phylogenetic Annotation' approach integrates GO annotations from evolutionarily related genes across ∼100 different organisms in the context of a gene family tree, in which curators build an explicit model of the evolution of gene functions. GO Phylogenetic Annotation models the gain and loss of functions in a gene family tree, which is used to infer the functions of uncharacterized (or incompletely characterized) gene products, even for human proteins that are relatively well studied. Here, we report our results from applying this paradigm to two well-characterized cellular processes, apoptosis and autophagy. This revealed several important observations with respect to GO annotations and how they can be used for function inference. Notably, we applied only a small fraction of the experimentally supported GO annotations to infer function in other family members. The majority of other annotations describe indirect effects, phenotypes or results from high throughput experiments. In addition, we show here how feedback from phylogenetic annotation leads to significant improvements in the PANTHER trees, the GO annotations and GO itself. Thus GO phylogenetic annotation both increases the quantity and improves the accuracy of the GO annotations provided to the research community. We expect these phylogenetically based annotations to be of broad use in gene enrichment analysis as well as other applications of GO annotations. Database URL: http://amigo.geneontology.org/amigo. © The Author(s) 2016. Published by Oxford University Press.

  5. An ontology-driven semantic mashup of gene and biological pathway information: application to the domain of nicotine dependence.

    Science.gov (United States)

    Sahoo, Satya S; Bodenreider, Olivier; Rutter, Joni L; Skinner, Karen J; Sheth, Amit P

    2008-10-01

    This paper illustrates how Semantic Web technologies (especially RDF, OWL, and SPARQL) can support information integration and make it easy to create semantic mashups (semantically integrated resources). In the context of understanding the genetic basis of nicotine dependence, we integrate gene and pathway information and show how three complex biological queries can be answered by the integrated knowledge base. We use an ontology-driven approach to integrate two gene resources (Entrez Gene and HomoloGene) and three pathway resources (KEGG, Reactome and BioCyc), for five organisms, including humans. We created the Entrez Knowledge Model (EKoM), an information model in OWL for the gene resources, and integrated it with the extant BioPAX ontology designed for pathway resources. The integrated schema is populated with data from the pathway resources, publicly available in BioPAX-compatible format, and gene resources for which a population procedure was created. The SPARQL query language is used to formulate queries over the integrated knowledge base to answer the three biological queries. Simple SPARQL queries could easily identify hub genes, i.e., those genes whose gene products participate in many pathways or interact with many other gene products. The identification of the genes expressed in the brain turned out to be more difficult, due to the lack of a common identification scheme for proteins. Semantic Web technologies provide a valid framework for information integration in the life sciences. Ontology-driven integration represents a flexible, sustainable and extensible solution to the integration of large volumes of information. Additional resources, which enable the creation of mappings between information sources, are required to compensate for heterogeneity across namespaces. RESOURCE PAGE: http://knoesis.wright.edu/research/lifesci/integration/structured_data/JBI-2008/

  6. The Gene Ontology Differs in Bursa of Fabricius Between Two Breeds of Ducks Post Hatching by Enriching the Differentially Expressed Genes

    Directory of Open Access Journals (Sweden)

    H Liu

    Full Text Available ABSTRACT The bursa of Fabricius (BF) is the central humoral immune organ unique to birds. The present study investigated the possible difference on a molecular level between two duck breeds. The digital gene expression profiling (DGE) technology was used to enrich the differentially expressed genes (DEGs) in BF between the Jianchang and Nonghua-P strains of ducks. DGE data identified 195 DEGs in the bursa. Gene Ontology (GO) analysis suggested that DEGs were mainly enriched in the metabolic pathways and ribosome components. Pathways analysis identified the spliceosome, RNA transport, RNA degradation process, Jak-STAT signaling pathway, TNF signaling pathway and B cell receptor signaling pathway. The results indicated that the main difference in the BF between the two duck strains was in the capabilities of protein formation and B cell development. These data have revealed the main divergence in the BF on a molecular level between genetically different duck breeds and may help to perform molecular breeding programs in poultry in the future.

  7. False positive reduction in protein-protein interaction predictions using gene ontology annotations

    Directory of Open Access Journals (Sweden)

    Lin Yen-Han

    2007-07-01

    Full Text Available Abstract Background Many crucial cellular operations such as metabolism, signalling, and regulations are based on protein-protein interactions. However, the lack of robust protein-protein interaction information is a challenge. One reason for the lack of solid protein-protein interaction information is poor agreement between experimental findings and computational sets that, in turn, comes from huge false positive predictions in computational approaches. Reduction of false positive predictions and enhancing true positive fraction of computationally predicted protein-protein interaction datasets based on highly confident experimental results has not been adequately investigated. Results Gene Ontology (GO) annotations were used to reduce false positive protein-protein interaction (PPI) pairs resulting from computational predictions. Using experimentally obtained PPI pairs as a training dataset, eight top-ranking keywords were extracted from GO molecular function annotations. The sensitivity of these keywords is 64.21% in the yeast experimental dataset and 80.83% in the worm experimental dataset. The specificities, a measure of recovery power, of these keywords applied to four predicted PPI datasets for each studied organism, are 48.32% and 46.49% (by average of four datasets) in yeast and worm, respectively. Based on eight top-ranking keywords and co-localization of interacting proteins a set of two knowledge rules were deduced and applied to remove false positive protein pairs. The 'strength', a measure of improvement provided by the rules was defined based on the signal-to-noise ratio and implemented to measure the applicability of knowledge rules applying to the predicted PPI datasets. Depending on the employed PPI-predicting methods, the strength varies between two and ten-fold of randomly removing protein pairs from the datasets. Conclusion Gene Ontology annotations along with the deduced knowledge rules could be implemented to partially

  8. Mining and gene ontology based annotation of SSR markers from expressed sequence tags of Humulus lupulus

    Science.gov (United States)

    Singh, Swati; Gupta, Sanchita; Mani, Ashutosh; Chaturvedi, Anoop

    2012-01-01

    Humulus lupulus is commonly known as hops, a member of the family Moraceae. Currently many projects are underway leading to the accumulation of voluminous genomic and expressed sequence tag (EST) sequences in public databases. The genetically characterized domains in these databases are limited due to non-availability of reliable molecular markers. A large amount of EST sequence data is available for hops. In the present study, simple sequence repeat (SSR) markers extracted from EST data are used as molecular markers for genetic characterization. 25,495 EST sequences were examined and assembled to get full-length sequences. Maximum frequency distribution was shown by mononucleotide SSR motifs, i.e. 60.44% in contigs and 62.16% in singletons, whereas minimum frequencies were observed for hexanucleotide SSRs in contigs (0.09%) and pentanucleotide SSRs in singletons (0.12%). The most frequent trinucleotide motif (GAA) codes for glutamic acid, while AT/TA were the most frequent repeats among dinucleotide SSRs. Flanking primer pairs were designed in silico for the SSR-containing sequences. Functional categorization of SSR-containing sequences was done through gene ontology terms like biological process, cellular component and molecular function. PMID:22368382
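
    The kind of SSR scan described above can be sketched with a regular expression per motif length. This is an illustration rather than the tool or settings used in the study: the minimum repeat thresholds in MIN_REPEATS are assumed values, and the function name and test sequence are hypothetical.

```python
import re

# Assumed minimum repeat counts per motif length (mono- to hexanucleotide).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return (motif, start, repeat_count) for each perfect SSR found in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One captured unit followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" (mononucleotide) or "ATAT" (dinucleotide).
            if any(
                motif == motif[:d] * (unit_len // d)
                for d in range(1, unit_len)
                if unit_len % d == 0
            ):
                continue
            hits.append((motif, match.start(), len(match.group(0)) // unit_len))
    return hits


# Hypothetical EST fragment with one dinucleotide and one trinucleotide SSR.
est = "GGC" + "AT" * 7 + "CCG" + "GAA" * 5 + "TTC"
print(find_ssrs(est))
```

    Counting the hits by motif length over all contigs and singletons would give frequency distributions of the kind reported in the abstract. Primer design and GO categorization are separate downstream steps that this sketch does not cover.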

  9. Protein-Protein Interactions Prediction Based on Iterative Clique Extension with Gene Ontology Filtering

    Directory of Open Access Journals (Sweden)

    Lei Yang

    2014-01-01

    Full Text Available Cliques (maximal complete subnets) in a protein-protein interaction (PPI) network are an important resource used to analyze protein complexes and functional modules. Clique-based methods of predicting PPI compensate for data deficiencies in biological experiments. However, clique-based predicting methods depend only on the topology of the network. The false-positive and false-negative interactions in a network usually interfere with prediction. Therefore, we propose a method combining clique-based prediction with gene ontology (GO) annotations to overcome this shortcoming and improve the accuracy of predictions. According to different GO correcting rules, we generate two predicted interaction sets which guarantee the quality and quantity of predicted protein interactions. The proposed method is applied to the PPI network from the Database of Interacting Proteins (DIP) and most of the predicted interactions are verified by another biological database, BioGRID. The predicted protein interactions are appended to the original protein network, which leads to clique extension and shows the significance of biological meaning.

  10. GO Explorer: A gene-ontology tool to aid in the interpretation of shotgun proteomics data

    Directory of Open Access Journals (Sweden)

    Domont Gilberto B

    2009-02-01

    Full Text Available Abstract Background Spectral counting is a shotgun proteomics approach comprising the identification and relative quantitation of thousands of proteins in complex mixtures. However, this strategy generates bewildering amounts of data whose biological interpretation is a challenge. Results Here we present a new algorithm, termed GO Explorer (GOEx), that leverages the gene ontology (GO) to aid in the interpretation of proteomic data. GOEx stands out because it combines data from protein fold changes with GO over-representation statistics to help draw conclusions. Moreover, it is tightly integrated within the PatternLab for Proteomics project and, thus, lies within a complete computational environment that provides parsers and pattern recognition tools designed for spectral counting. GOEx offers three independent methods to query data: an interactive directed acyclic graph, a specialist mode where key words can be searched, and an automatic search. Its usefulness is demonstrated by applying it to help interpret the effects of perillyl alcohol, a natural chemotherapeutic agent, on glioblastoma multiform cell lines (A172). We used a new multi-surfactant shotgun proteomic strategy and identified more than 2600 proteins; GOEx pinpointed key sets of differentially expressed proteins related to cell cycle, alcohol catabolism, the Ras pathway, apoptosis, and stress response, to name a few. Conclusion GOEx facilitates organism-specific studies by leveraging GO and providing a rich graphical user interface. It is a simple to use tool, specialized for biologists who wish to analyze spectral counting data from shotgun proteomics. GOEx is available at http://pcarvalho.com/patternlab.

  11. GO Explorer: A gene-ontology tool to aid in the interpretation of shotgun proteomics data.

    Science.gov (United States)

    Carvalho, Paulo C; Fischer, Juliana Sg; Chen, Emily I; Domont, Gilberto B; Carvalho, Maria Gc; Degrave, Wim M; Yates, John R; Barbosa, Valmir C

    2009-02-24

    Spectral counting is a shotgun proteomics approach comprising the identification and relative quantitation of thousands of proteins in complex mixtures. However, this strategy generates bewildering amounts of data whose biological interpretation is a challenge. Here we present a new algorithm, termed GO Explorer (GOEx), that leverages the gene ontology (GO) to aid in the interpretation of proteomic data. GOEx stands out because it combines data from protein fold changes with GO over-representation statistics to help draw conclusions. Moreover, it is tightly integrated within the PatternLab for Proteomics project and, thus, lies within a complete computational environment that provides parsers and pattern recognition tools designed for spectral counting. GOEx offers three independent methods to query data: an interactive directed acyclic graph, a specialist mode where key words can be searched, and an automatic search. Its usefulness is demonstrated by applying it to help interpret the effects of perillyl alcohol, a natural chemotherapeutic agent, on glioblastoma multiform cell lines (A172). We used a new multi-surfactant shotgun proteomic strategy and identified more than 2600 proteins; GOEx pinpointed key sets of differentially expressed proteins related to cell cycle, alcohol catabolism, the Ras pathway, apoptosis, and stress response, to name a few. GOEx facilitates organism-specific studies by leveraging GO and providing a rich graphical user interface. It is a simple to use tool, specialized for biologists who wish to analyze spectral counting data from shotgun proteomics. GOEx is available at http://pcarvalho.com/patternlab.

  12. Annotating activation/inhibition relationships to protein-protein interactions using gene ontology relations.

    Science.gov (United States)

    Yim, Soorin; Yu, Hasun; Jang, Dongjin; Lee, Doheon

    2018-04-11

    Signaling pathways can be reconstructed by identifying 'effect types' (i.e. activation/inhibition) of protein-protein interactions (PPIs). Effect types are composed of 'directions' (i.e. upstream/downstream) and 'signs' (i.e. positive/negative), thereby requiring directions as well as signs of PPIs to predict signaling events from PPI networks. Here, we propose a computational method for systemically annotating effect types to PPIs using relations between functional information of proteins. We used regulates, positively regulates, and negatively regulates relations in Gene Ontology (GO) to predict directions and signs of PPIs. These relations indicate both directions and signs between GO terms so that we can project directions and signs between relevant GO terms to PPIs. Independent test results showed that our method is effective for predicting both directions and signs of PPIs. Moreover, our method outperformed a previous GO-based method that did not consider the relations between GO terms. We annotated effect types to human PPIs and validated several highly confident effect types against literature. The annotated human PPIs are available in Additional file 2 to aid signaling pathway reconstruction and network biology research. We annotated effect types to PPIs by using regulates, positively regulates, and negatively regulates relations in GO. We demonstrated that those relations are effective for predicting not only signs, but also directions of PPIs. The usefulness of those relations suggests their potential applications to other types of interactions such as protein-DNA interactions.

  13. Closing the loop: from paper to protein annotation using supervised Gene Ontology classification.

    Science.gov (United States)

    Gobeill, Julien; Pasche, Emilie; Vishnyakova, Dina; Ruch, Patrick

    2014-01-01

    Gene function curation of the literature with Gene Ontology (GO) concepts is one particularly time-consuming task in genomics, and the help from bioinformatics is highly requested to keep up with the flow of publications. In 2004, the first BioCreative challenge already designed a task of automatic GO concepts assignment from a full text. At this time, results were judged far from reaching the performances required by real curation workflows. In particular, supervised approaches produced the most disappointing results because of lack of training data. Ten years later, the available curation data have massively grown. In 2013, the BioCreative IV GO task revisited the automatic GO assignment task. For this issue, we investigated the power of our supervised classifier, GOCat. GOCat computes similarities between an input text and already curated instances contained in a knowledge base to infer GO concepts. The subtask A consisted in selecting GO evidence sentences for a relevant gene in a full text. For this, we designed a state-of-the-art supervised statistical approach, using a naïve Bayes classifier and the official training set, and obtained fair results. The subtask B consisted in predicting GO concepts from the previous output. For this, we applied GOCat and reached leading results, up to 65% for hierarchical recall in the top 20 outputted concepts. Contrary to previous competitions, machine learning has this time outperformed standard dictionary-based approaches. Thanks to BioCreative IV, we were able to design a complete workflow for curation: given a gene name and a full text, this system is able to select evidence sentences for curation and to deliver highly relevant GO concepts. Contrary to previous competitions, machine learning this time outperformed dictionary-based systems. Observed performances are sufficient for being used in a real semiautomatic curation workflow. GOCat is available at http://eagl.unige.ch/GOCat/. http://eagl.unige.ch/GOCat4FT/.

  14. PANTHER version 11: expanded annotation data from Gene Ontology and Reactome pathways, and data analysis tool enhancements.

    Science.gov (United States)

    Mi, Huaiyu; Huang, Xiaosong; Muruganujan, Anushya; Tang, Haiming; Mills, Caitlin; Kang, Diane; Thomas, Paul D

    2017-01-04

    The PANTHER database (Protein ANalysis THrough Evolutionary Relationships, http://pantherdb.org) contains comprehensive information on the evolution and function of protein-coding genes from 104 completely sequenced genomes. PANTHER software tools allow users to classify new protein sequences, and to analyze gene lists obtained from large-scale genomics experiments. In the past year, major improvements include a large expansion of classification information available in PANTHER, as well as significant enhancements to the analysis tools. Protein subfamily functional classifications have more than doubled due to progress of the Gene Ontology Phylogenetic Annotation Project. For human genes (as well as a few other organisms), PANTHER now also supports enrichment analysis using pathway classifications from the Reactome resource. The gene list enrichment tools include a new 'hierarchical view' of results, enabling users to leverage the structure of the classifications/ontologies; the tools also allow users to upload genetic variant data directly, rather than requiring prior conversion to a gene list. The updated coding single-nucleotide polymorphisms (SNP) scoring tool uses an improved algorithm. The hidden Markov model (HMM) search tools now use HMMER3, dramatically reducing search times and improving accuracy of E-value statistics. Finally, the PANTHER Tree-Attribute Viewer has been implemented in JavaScript, with new views for exploring protein sequence evolution. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  15. The Proteasix Ontology.

    Science.gov (United States)

    Arguello Casteleiro, Mercedes; Klein, Julie; Stevens, Robert

    2016-06-04

    The Proteasix Ontology (PxO) is an ontology that supports the Proteasix tool, an open-source peptide-centric tool that can be used to predict automatically and in a large-scale fashion in silico the proteases involved in the generation of proteolytic cleavage fragments (peptides). The PxO re-uses parts of the Protein Ontology, the three Gene Ontology sub-ontologies, the Chemical Entities of Biological Interest Ontology, the Sequence Ontology and bespoke extensions to the PxO in support of a series of roles: 1. To describe the known proteases and their target cleavage sites. 2. To enable the description of proteolytic cleavage fragments as the outputs of observed and predicted proteolysis. 3. To use knowledge about the function, species and cellular location of a protease and protein substrate to support the prioritisation of proteases in observed and predicted proteolysis. The PxO is designed to describe the biological underpinnings of the generation of peptides. The peptide-centric PxO seeks to support the Proteasix tool by separating domain knowledge from the operational knowledge used in protease prediction by Proteasix and to support the confirmation of its analyses and results. The Proteasix Ontology may be found at: http://bioportal.bioontology.org/ontologies/PXO . This ontology is free and open for use by everyone.

  16. Extracting gene expression patterns and identifying co-expressed genes from microarray data reveals biologically responsive processes

    Directory of Open Access Journals (Sweden)

    Paules Richard S

    2007-11-01

    Full Text Available Abstract Background A common observation in the analysis of gene expression data is that many genes display similarity in their expression patterns and therefore appear to be co-regulated. However, the variation associated with microarray data and the complexity of the experimental designs make the acquisition of co-expressed genes a challenge. We developed a novel method for Extracting microarray gene expression Patterns and Identifying co-expressed Genes, designated as EPIG. The approach utilizes the underlying structure of gene expression data to extract patterns and identify co-expressed genes that are responsive to experimental conditions. Results Through evaluation of the correlations among profiles, the magnitude of variation in gene expression profiles, and profile signal-to-noise ratios, EPIG extracts a set of patterns representing co-expressed genes. The method is shown to work well with a simulated data set and microarray data obtained from time-series studies of dauer recovery and L1 starvation in C. elegans and after ultraviolet (UV) or ionizing radiation (IR)-induced DNA damage in diploid human fibroblasts. With the simulated data set, EPIG extracted the appropriate number of patterns which were more stable and homogeneous than the set of patterns that were determined using the CLICK or CAST clustering algorithms. However, CLICK performed better than EPIG and CAST with respect to the average correlation between clusters/patterns of the simulated data. With real biological data, EPIG extracted more dauer-specific patterns than CLICK. Furthermore, analysis of the IR/UV data revealed 18 unique patterns and 2661 genes out of approximately 17,000 that were identified as significantly expressed and categorized to the patterns by EPIG. The time-dependent patterns displayed similar and dissimilar responses between IR and UV treatments. Gene Ontology analysis applied to each pattern-related subset of co-expressed genes revealed underlying

  17. Prioritising lexical patterns to increase axiomatisation in biomedical ontologies. The role of localisation and modularity.

    Science.gov (United States)

    Quesada-Martínez, M; Fernández-Breis, J T; Stevens, R; Mikroyannidi, E

    2015-01-01

    This article is part of the Focus Theme of METHODS of Information in Medicine on "Managing Interoperability and Complexity in Health Systems". In previous work, we have defined methods for the extraction of lexical patterns from labels as an initial step towards semi-automatic ontology enrichment methods. Our previous findings revealed that many biomedical ontologies could benefit from enrichment methods using lexical patterns as a starting point. Here, we aim to identify which lexical patterns are appropriate for ontology enrichment, driving the analysis with metrics to prioritise the patterns. We propose metrics for suggesting which lexical regularities should be the starting point to enrich complex ontologies. Our method determines the relevance of a lexical pattern by measuring its locality in the ontology, that is, the distance between the classes associated with the pattern, and the distribution of the pattern in a certain module of the ontology. The methods have been applied to four significant biomedical ontologies including the Gene Ontology and SNOMED CT. The metrics provide information about the engineering of the ontologies and the relevance of the patterns. Our method enables the suggestion of links between classes that are not made explicit in the ontology. We propose a prioritisation of the lexical patterns found in the analysed ontologies. The locality and distribution of lexical patterns offer insights into the further engineering of the ontology. Developers can use this information to improve the axiomatisation of their ontologies.

  18. Transcriptome analysis reveals key differentially expressed genes involved in wheat grain development

    Directory of Open Access Journals (Sweden)

    Yonglong Yu

    2016-04-01

    Full Text Available Wheat seed development is an important physiological process of seed maturation and directly affects wheat yield and quality. In this study, we performed dynamic transcriptome microarray analysis of an elite Chinese bread wheat cultivar (Jimai 20) during grain development using the GeneChip Wheat Genome Array. Grain morphology and scanning electron microscope observations showed that the period of 11–15 days post-anthesis (DPA) was a key stage for the synthesis and accumulation of seed starch. Genome-wide transcriptional profiling and significance analysis of microarrays revealed that the period from 11 to 15 DPA was more important than the 15–20 DPA stage for the synthesis and accumulation of nutritive reserves. Series test of cluster analysis of differential genes revealed five statistically significant gene expression profiles. Gene ontology annotation and enrichment analysis gave further information about differentially expressed genes, and MapMan analysis revealed expression changes within functional groups during seed development. Metabolic pathway network analysis showed that major and minor metabolic pathways regulate one another to ensure regular seed development and nutritive reserve accumulation. We performed gene co-expression network analysis to identify genes that play vital roles in seed development and identified several key genes involved in important metabolic pathways. The transcriptional expression of eight key genes involved in starch and protein synthesis and stress defense was further validated by qRT-PCR. Our results provide new insight into the molecular mechanisms of wheat seed development and the determinants of yield and quality.
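
    The abstract above does not say which statistical test was used for the GO enrichment step. As a minimal sketch, assuming the commonly used one-sided hypergeometric test, the enrichment p-value for a single GO term can be computed as follows. The function name and its parameters are illustrative and do not come from the paper.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """One-sided hypergeometric p-value, P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes annotated with the GO term
    sample:     number of differentially expressed genes
    hits:       differentially expressed genes annotated with the term
    """
    upper = min(annotated, sample)
    # Sum the probability mass of every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / comb(population, sample)
```

    In practice the p-values for all tested terms would also be corrected for multiple testing, which this sketch leaves out.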

  19. Gene-ontology enrichment analysis in two independent family-based samples highlights biologically plausible processes for autism spectrum disorders.

    LENUS (Irish Health Repository)

    Anney, Richard J L

    2012-02-01

    Recent genome-wide association studies (GWAS) have implicated a range of genes from discrete biological pathways in the aetiology of autism. However, despite the strong influence of genetic factors, association studies have yet to identify statistically robust, replicated major effect genes or SNPs. We apply the principle of the SNP ratio test methodology described by O'Dushlaine et al. to over 2100 families from the Autism Genome Project (AGP). Using a two-stage design we examine association enrichment in 5955 unique gene-ontology classifications across four groupings based on two phenotypic and two ancestral classifications. Based on estimates from simulation we identify excess of association enrichment across all analyses. We observe enrichment in association for sets of genes involved in diverse biological processes, including pyruvate metabolism, transcription factor activation, cell-signalling and cell-cycle regulation. Both genes and processes that show enrichment have previously been examined in autistic disorders and offer biological plausibility to these findings.

  20. PFP: Automated prediction of gene ontology functional annotations with confidence scores using protein sequence data.

    Science.gov (United States)

    Hawkins, Troy; Chitale, Meghana; Luban, Stanislav; Kihara, Daisuke

    2009-02-15

    Protein function prediction is a central problem in bioinformatics, increasing in importance recently due to the rapid accumulation of biological data awaiting interpretation. Sequence data represents the bulk of this new stock and is the obvious target for consideration as input, as newly sequenced organisms often lack any other type of biological characterization. We have previously introduced PFP (Protein Function Prediction) as our sequence-based predictor of Gene Ontology (GO) functional terms. PFP interprets the results of a PSI-BLAST search by extracting and scoring individual functional attributes, searching a wide range of E-value sequence matches, and utilizing conventional data mining techniques to fill in missing information. We have shown it to be effective in predicting both specific and low-resolution functional attributes when sufficient data is unavailable. Here we describe (1) significant improvements to the PFP infrastructure, including the addition of prediction significance and confidence scores, (2) a thorough benchmark of performance and comparisons to other related prediction methods, and (3) applications of PFP predictions to genome-scale data. We applied PFP predictions to uncharacterized protein sequences from 15 organisms. Among these sequences, 60-90% could be annotated with a GO molecular function term at high confidence (≥80%). We also applied our predictions to the protein-protein interaction network of the Malaria plasmodium (Plasmodium falciparum). High confidence GO biological process predictions (≥90%) from PFP increased the number of fully enriched interactions in this dataset from 23% of interactions to 94%. Our benchmark comparison shows significant performance improvement of PFP relative to GOtcha, InterProScan, and PSI-BLAST predictions. This is consistent with the performance of PFP as the overall best predictor in both the AFP-SIG '05 and CASP7 function (FN) assessments. PFP is available as a web service at http

  1. DaGO-Fun: tool for Gene Ontology-based functional analysis using term information content measures.

    Science.gov (United States)

    Mazandu, Gaston K; Mulder, Nicola J

    2013-09-25

    The use of Gene Ontology (GO) data in protein analyses has largely contributed to the improved outcomes of these analyses. Several GO semantic similarity measures have been proposed in recent years and provide tools that allow the integration of biological knowledge embedded in the GO structure into different biological analyses. There is a need for a unified tool that provides the scientific community with the opportunity to explore these different GO similarity measure approaches and their biological applications. We have developed DaGO-Fun, an online tool available at http://web.cbio.uct.ac.za/ITGOM, which incorporates many different GO similarity measures for exploring, analyzing and comparing GO terms and proteins within the context of GO. It uses GO data and UniProt proteins with their GO annotations as provided by the Gene Ontology Annotation (GOA) project to precompute GO term information content (IC), enabling rapid response to user queries. The DaGO-Fun online tool presents the advantage of integrating all the relevant IC-based GO similarity measures, including topology- and annotation-based approaches to facilitate effective exploration of these measures, thus enabling users to choose the most relevant approach for their application. Furthermore, this tool includes several biological applications related to GO semantic similarity scores, including the retrieval of genes based on their GO annotations, the clustering of functionally related genes within a set, and term enrichment analysis.
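
    To make the idea of annotation-based information content concrete, the following is a minimal sketch on a toy term graph. It is not DaGO-Fun code: the term identifiers, annotation counts and function names are hypothetical, and Resnik similarity is used only as one well-known example of an IC-based measure.

```python
import math

# Toy is-a DAG: term -> set of parent terms (hypothetical identifiers).
PARENTS = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "C": {"A"},
    "D": {"A", "B"},
}

# Hypothetical number of proteins annotated directly to each term.
DIRECT_COUNTS = {"root": 0, "A": 2, "B": 3, "C": 4, "D": 1}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = -log p(t); p(t) counts annotations to t or any descendant."""
    total = sum(DIRECT_COUNTS.values())
    cumulative = {term: 0 for term in PARENTS}
    for term, count in DIRECT_COUNTS.items():
        # An annotation to a term also counts for each of its ancestors.
        for anc in ancestors(term):
            cumulative[anc] += count
    return {t: -math.log(c / total) for t, c in cumulative.items() if c > 0}


def resnik(t1, t2, ic):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(t1) & ancestors(t2)
    return max(ic[t] for t in common if t in ic)
```

    Precomputing the IC table once, as the abstract describes, means each similarity query only needs an ancestor-set intersection and a lookup.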

  2. Ontology Design Patterns for Combining Pathology and Anatomy: Application to Study Aging and Longevity in Inbred Mouse Strains

    KAUST Repository

    Alghamdi, Sarah M.

    2018-05-13

    In biomedical research, ontologies are widely used to represent knowledge as well as to annotate datasets. Many of the existing ontologies cover a single type of phenomena, such as a process, cell type, gene, pathological entity or anatomical structure. Consequently, there is a requirement to use multiple ontologies to fully characterize the observations in the datasets. Although this allows precise annotation of different aspects of a given dataset, it limits our ability to use the ontologies in data analysis, as the ontologies are usually disconnected and their combinations cannot be exploited. Motivated by this, here we present novel ontology design methods for combining pathology and anatomy concepts. To this end, we use a dataset of mouse models which has been characterized through two ontologies: one of them is the mouse pathology ontology (MPATH) covering pathological lesions while the other is the mouse anatomy ontology (MA) covering the anatomical site of the lesions. We propose four novel ontology design patterns for combining these ontologies, and use these patterns to generate four ontologies in a data-driven way. To evaluate the generated ontologies, we utilize these in ontology-based data analysis, including ontology enrichment analysis and computation of semantic similarity. We demonstrate that there are significant differences between the four ontologies in different analysis approaches. In addition, when using semantic similarity to confirm the hypothesis that genetically identical mice should develop more similar diseases, the generated combined ontologies lead to significantly better analysis results compared to using each ontology individually. Our results reveal that using ontology design patterns to combine different facets characterizing a dataset can improve established analysis methods.

  3. GGDonto ontology as a knowledge-base for genetic diseases and disorders of glycan metabolism and their causative genes.

    Science.gov (United States)

    Solovieva, Elena; Shikanai, Toshihide; Fujita, Noriaki; Narimatsu, Hisashi

    2018-04-18

    Inherited mutations in glyco-related genes can affect the biosynthesis and degradation of glycans and result in severe genetic diseases and disorders. The Glyco-Disease Genes Database (GDGDB), which provides information about these diseases and disorders as well as their causative genes, was developed by the Research Center for Medical Glycoscience (RCMG) and released in April 2010. GDGDB currently provides information on about 80 genetic diseases and disorders caused by single-gene mutations in glyco-related genes. Many biomedical resources provide information about genetic disorders and genes involved in their pathogenesis, but resources focused on genetic disorders known to be related to glycan metabolism are lacking. With the aim of providing more comprehensive knowledge on genetic diseases and disorders of glycan biosynthesis and degradation, we enriched the content of the GDGDB database and improved the methods for data representation. We developed the Genetic Glyco-Diseases Ontology (GGDonto) and an RDF/SPARQL-based user interface using Semantic Web technologies. In particular, we represented the GGDonto content using Semantic Web languages, such as RDF, RDFS, SKOS, and OWL, and created an interactive user interface based on SPARQL queries. This user interface provides features to browse the hierarchy of the ontology, view detailed information on diseases and related genes, and find relevant background information. Moreover, it provides the ability to filter and search information by faceted and keyword searches. Focused on the molecular etiology, pathogenesis, and clinical manifestations of genetic diseases and disorders of glycan metabolism and developed as a knowledge-base for this scientific field, GGDonto provides comprehensive information on various topics, including links to aid the integration with other scientific resources. The availability and accessibility of this knowledge will help users better understand how genetic defects impact the

  4. GOssTo: a stand-alone application and a web tool for calculating semantic similarities on the Gene Ontology.

    Science.gov (United States)

    Caniza, Horacio; Romero, Alfonso E; Heron, Samuel; Yang, Haixuan; Devoto, Alessandra; Frasca, Marco; Mesiti, Marco; Valentini, Giorgio; Paccanaro, Alberto

    2014-08-01

    We present GOssTo, the Gene Ontology semantic similarity Tool, a user-friendly software system for calculating semantic similarities between gene products according to the Gene Ontology. GOssTo is bundled with six semantic similarity measures, including both term- and graph-based measures, and has extension capabilities to allow the user to add new similarities. Importantly, for any measure, GOssTo can also calculate the Random Walk Contribution that has been shown to greatly improve the accuracy of similarity measures. GOssTo is very fast, easy to use, and it allows the calculation of similarities on a genomic scale in a few minutes on a regular desktop machine. Contact: alberto@cs.rhul.ac.uk. Availability: GOssTo is available both as a stand-alone application running on GNU/Linux, Windows and MacOS from www.paccanarolab.org/gossto and as a web application from www.paccanarolab.org/gosstoweb. The stand-alone application features a simple and concise command line interface for easy integration into high-throughput data processing pipelines. © The Author 2014. Published by Oxford University Press.

  5. Global gene expression analysis of the zoonotic parasite Trichinella spiralis revealed novel genes in host parasite interaction.

    Directory of Open Access Journals (Sweden)

    Xiaolei Liu

    Full Text Available BACKGROUND: Trichinellosis is a typical food-borne zoonotic disease which is epidemic worldwide and the nematode Trichinella spiralis is the main pathogen. The life cycle of T. spiralis contains three developmental stages, i.e. adult worms, newborn larvae (newborn L1 larvae) and muscle larvae (infective L1 larvae). Stage-specific gene expression in the parasites has been investigated with various immunological and cDNA cloning approaches, whereas the genome-wide transcriptome and expression features of the parasite have been largely unknown. The availability of the genome sequence information of T. spiralis has made it possible to deeply dissect parasite biology in association with global gene expression and pathogenesis. METHODOLOGY AND PRINCIPAL FINDINGS: In this study, we analyzed the global gene expression patterns in the three developmental stages of T. spiralis using digital gene expression (DGE) analysis. Almost 15 million sequence tags were generated with the Illumina RNA-seq technology, producing expression data for more than 9,000 genes, covering 65% of the genome. The transcriptome analysis revealed thousands of differentially expressed genes within the genome, and importantly, a panel of genes encoding functional proteins associated with parasite invasion and immuno-modulation were identified. More than 45% of the genes were found to be transcribed from both strands, indicating the importance of RNA-mediated gene regulation in the development of the parasite. Further, based on gene ontological analysis, over 3000 genes were functionally categorized and biological pathways in the three life cycle stages were elucidated. CONCLUSIONS AND SIGNIFICANCE: The global transcriptome of T. spiralis in three developmental stages has been profiled, and most gene activity in the genome was found to be developmentally regulated. Many metabolic and biological pathways have been revealed. The findings of the differential expression of several protein

  6. Signalign: An Ontology of DNA as Signal for Comparative Gene Structure Prediction Using Information-Coding-and-Processing Techniques.

    Science.gov (United States)

    Yu, Ning; Guo, Xuan; Gu, Feng; Pan, Yi

    2016-03-01

    Conventional character-analysis-based techniques in genome analysis manifest three main shortcomings: inefficiency, inflexibility, and incompatibility. In our previous research, a general framework called DNA As X was proposed for character-analysis-free techniques to overcome these shortcomings, where X is an intermediate such as digit, code, signal, vector, tree, graph network, and so on. In this paper, we further implement an ontology of DNA As Signal by designing a tool named Signalign for comparative gene structure analysis, in which DNA sequences are converted into signal series, processed by a modified method of dynamic time warping and measured by signal-to-noise ratio (SNR). The ontology of DNA As Signal integrates the principles and concepts of other disciplines, including information coding theory and signal processing, into sequence analysis and processing. Compared with conventional character-analysis-based methods, Signalign not only achieves equivalent or superior performance, but also enriches the tools and the knowledge library of computational biology by extending the domain from character/string to diverse areas. The evaluation results validate the success of the character-analysis-free technique for improved performance in comparative gene structure prediction.
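
    The general idea of treating DNA as a signal and aligning it by dynamic time warping can be sketched as below. This is the classic textbook DTW recurrence, not the modified method used by Signalign, and the nucleotide-to-number encoding is a hypothetical placeholder because the abstract does not state the encoding.

```python
# Hypothetical nucleotide-to-number encoding, chosen only for illustration.
ENCODING = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}


def to_signal(sequence):
    """Convert a DNA string into a numeric signal series."""
    return [ENCODING[base] for base in sequence.upper()]


def dtw_distance(x, y):
    """Classic dynamic time warping with absolute-difference local cost."""
    inf = float("inf")
    n, m = len(x), len(y)
    # cost[i][j] is the best cumulative cost of aligning x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(x[i - 1] - y[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]
```

    Because warping lets one sample align with several, a repeated base such as the extra A in AACGT versus ACGT adds no cost under this local cost function.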

  7. Expression profiling and gene ontology analysis in fathead minnow (Pimephales promelas) liver following exposure to pulp and paper mill effluents

    Energy Technology Data Exchange (ETDEWEB)

    Costigan, Shannon L.; Werner, Julieta; Ouellet, Jacob D.; Hill, Lauren G. [Department of Biology, Lakehead University, 955 Oliver Road, Ontario P7B 5E1 (Canada)]; Law, R. David, E-mail: dlaw@lakeheadu.ca [Department of Biology, Lakehead University, 955 Oliver Road, Ontario P7B 5E1 (Canada)]

    2012-10-15

    Many studies link pulp and paper mill effluent (PPME) exposure to adverse effects in fish populations present in the mill receiving environments. These impacts are often characteristic of endocrine disruption and may include impaired reproduction, development and survival. While these physiological endpoints are well-characterized, the molecular mechanisms causing them are not yet understood. To investigate changes in gene transcription induced by exposure to a PPME at several stages of treatment, male and female fathead minnows (FHMs) were exposed for 6 days to 25% (v/v) secondary (biologically) treated kraft effluent (TK) or 100% (v/v) combined mill outfall (CMO) from a mill producing both kraft pulp and newsprint. The gene expression changes in the livers of these fish were analyzed using a 22 K oligonucleotide microarray. Exposure to TK or CMO resulted in significant changes in the expression levels of 105 and 238 targets in male FHMs and 296 and 133 targets in females, respectively. Targets were then functionally analyzed using gene ontology tools to identify the biological processes in fish hepatocytes that were affected by exposure to PPME after its secondary treatment. Proteolysis was affected in female FHMs exposed to both TK and CMO. In male FHMs, no processes were affected by TK exposure, while sterol, isoprenoid, steroid and cholesterol biosynthesis and electron transport were up-regulated by CMO exposure. The results presented in this study indicate that short-term exposure to PPMEs affects the expression of reproduction-related genes in the livers of both male and female FHMs, and that secondary treatment of PPMEs may not neutralize all of their metabolic effects in fish. Gene ontology analysis of microarray data may enable identification of biological processes altered by toxicant exposure and thus provide an additional tool for monitoring the impact of PPMEs on fish populations.

  8. Expression profiling and gene ontology analysis in fathead minnow (Pimephales promelas) liver following exposure to pulp and paper mill effluents

    International Nuclear Information System (INIS)

    Costigan, Shannon L.; Werner, Julieta; Ouellet, Jacob D.; Hill, Lauren G.; Law, R. David

    2012-01-01

    Many studies link pulp and paper mill effluent (PPME) exposure to adverse effects in fish populations present in the mill receiving environments. These impacts are often characteristic of endocrine disruption and may include impaired reproduction, development and survival. While these physiological endpoints are well-characterized, the molecular mechanisms causing them are not yet understood. To investigate changes in gene transcription induced by exposure to a PPME at several stages of treatment, male and female fathead minnows (FHMs) were exposed for 6 days to 25% (v/v) secondary (biologically) treated kraft effluent (TK) or 100% (v/v) combined mill outfall (CMO) from a mill producing both kraft pulp and newsprint. The gene expression changes in the livers of these fish were analyzed using a 22 K oligonucleotide microarray. Exposure to TK or CMO resulted in significant changes in the expression levels of 105 and 238 targets in male FHMs and 296 and 133 targets in females, respectively. Targets were then functionally analyzed using gene ontology tools to identify the biological processes in fish hepatocytes that were affected by exposure to PPME after its secondary treatment. Proteolysis was affected in female FHMs exposed to both TK and CMO. In male FHMs, no processes were affected by TK exposure, while sterol, isoprenoid, steroid and cholesterol biosynthesis and electron transport were up-regulated by CMO exposure. The results presented in this study indicate that short-term exposure to PPMEs affects the expression of reproduction-related genes in the livers of both male and female FHMs, and that secondary treatment of PPMEs may not neutralize all of their metabolic effects in fish. Gene ontology analysis of microarray data may enable identification of biological processes altered by toxicant exposure and thus provide an additional tool for monitoring the impact of PPMEs on fish populations.

  9. Systems-level analysis of risk genes reveals the modular nature of schizophrenia.

    Science.gov (United States)

    Liu, Jiewei; Li, Ming; Luo, Xiong-Jian; Su, Bing

    2018-05-19

    Schizophrenia (SCZ) is a complex mental disorder with high heritability. Genetic studies (especially recent genome-wide association studies) have identified many risk genes for schizophrenia. However, the physical interactions among the proteins encoded by schizophrenia risk genes remain elusive and it is not known whether the identified risk genes converge on common molecular networks or pathways. Here we systematically investigated the network characteristics of schizophrenia risk genes using the high-confidence protein-protein interactions (PPI) from the human interactome. We found that schizophrenia risk genes encode a densely interconnected PPI network (P = 4.15 × 10⁻³¹). Compared with the background genes, the schizophrenia risk genes in the interactome have significantly higher degree (P = 5.39 × 10⁻¹¹), closeness centrality (P = 7.56 × 10⁻¹¹), betweenness centrality (P = 1.29 × 10⁻¹¹), clustering coefficient (P = 2.22 × 10⁻²), and shorter average shortest path length (P = 7.56 × 10⁻¹¹). Based on the densely interconnected PPI network, we identified 48 hub genes and 4 modules formed by highly interconnected schizophrenia genes. We showed that the proteins encoded by schizophrenia hub genes have significantly more direct physical interactions. Gene ontology (GO) analysis revealed that cell adhesion, cell cycle, immune system response, and GABR-receptor complex categories were enriched in the modules formed by highly interconnected schizophrenia risk genes. Our study reveals that schizophrenia risk genes encode a densely interconnected molecular network and demonstrates the modular nature of schizophrenia. Copyright © 2018 Elsevier B.V. All rights reserved.
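
    Several of the network measures named above have simple definitions that are easy to check on a small graph. The sketch below computes degree, local clustering coefficient and closeness centrality for a toy undirected network. The edge list and protein names are hypothetical placeholders, and the code is an illustration of the standard definitions rather than the authors' pipeline.

```python
from collections import deque

# Hypothetical undirected PPI edges between placeholder proteins.
EDGES = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4")]


def build_adjacency(edges):
    """Build a node -> set-of-neighbours map from an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree(adj, node):
    """Number of direct interaction partners."""
    return len(adj[node])


def clustering_coefficient(adj, node):
    """Fraction of neighbour pairs that are themselves connected."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def closeness_centrality(adj, node):
    """(n - 1) / sum of shortest-path lengths; assumes a connected graph."""
    dist = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search gives unweighted shortest paths
        cur = queue.popleft()
        for nxt in adj[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0
```

    Comparing such per-node values between risk genes and background genes, for example with a rank-based test, would be one way to obtain P values of the kind reported in the abstract.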

  10. Quantum ontologies

    International Nuclear Information System (INIS)

    Stapp, H.P.

    1988-12-01

    Quantum ontologies are conceptions of the constitution of the universe that are compatible with quantum theory. The ontological orientation is contrasted to the pragmatic orientation of science, and reasons are given for considering quantum ontologies both within science, and in broader contexts. The principal quantum ontologies are described and evaluated. Invited paper at conference: Bell's Theorem, Quantum Theory, and Conceptions of the Universe, George Mason University, October 20-21, 1988. 16 refs

  11. Hum-mPLoc 3.0: prediction enhancement of human protein subcellular localization through modeling the hidden correlations of gene ontology and functional domain features.

    Science.gov (United States)

    Zhou, Hang; Yang, Yang; Shen, Hong-Bin

    2017-03-15

    Protein subcellular localization prediction has been an important research topic in computational biology over the last decade. Various automatic methods have been proposed to predict locations for large scale protein datasets, where statistical machine learning algorithms are widely used for model construction. A key step in these predictors is encoding the amino acid sequences into feature vectors. Many studies have shown that features extracted from biological domains, such as gene ontology and functional domains, can be very useful for improving the prediction accuracy. However, domain knowledge usually results in redundant features and high-dimensional feature spaces, which may degrade the performance of machine learning models. In this paper, we propose a new amino acid sequence-based human protein subcellular location prediction approach, Hum-mPLoc 3.0, which covers 12 human subcellular localizations. The sequences are represented by multi-view complementary features, i.e. context vocabulary annotation-based gene ontology (GO) terms, peptide-based functional domains, and residue-based statistical features. To systematically reflect the structural hierarchy of the domain knowledge bases, we propose a novel feature representation protocol denoted as HCM (Hidden Correlation Modeling), which will create more compact and discriminative feature vectors by modeling the hidden correlations between annotation terms. Experimental results on four benchmark datasets show that HCM improves prediction accuracy by 5-11% and F1 by 8-19% compared with conventional GO-based methods. A large-scale application of Hum-mPLoc 3.0 on the whole human proteome reveals protein co-localization preferences in the cell. Availability: www.csbio.sjtu.edu.cn/bioinf/Hum-mPLoc3/. Contact: hbshen@sjtu.edu.cn. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  12. Witnessing stressful events induces glutamatergic synapse pathway alterations and gene set enrichment of positive EPSP regulation within the VTA of adult mice: An ontology based approach

    Science.gov (United States)

    Brewer, Jacob S.

    It is well known that exposure to severe stress increases the risk for developing mood disorders. Currently, the neurobiological and genetic mechanisms underlying the functional effects of psychological stress are poorly understood. A major obstacle to the study of psychological stress is the inability of current animal models of stress to distinguish between physical and psychological stressors. A novel paradigm recently developed by Warren et al. is able to tease apart the effects of physical and psychological stress in adult mice by allowing these mice to "witness" the social defeat of another mouse, thus removing confounding variables associated with physical stressors. Using this "witness" model of stress and RNA-Seq technology, the current study aims to study the genetic effects of psychological stress. After witnessing the social defeat of another mouse, VTA tissue was extracted, sequenced, and analyzed for differential expression. Since genes often work together in complex networks, a pathway and gene ontology (GO) analysis was performed using data from the differential expression analysis. The pathway and GO analyses revealed a perturbation of the glutamatergic synapse pathway and an enrichment of positive excitatory post-synaptic potential regulation. This is consistent with the excitatory synapse theory of depression. Together these findings demonstrate a dysregulation of the mesolimbic reward pathway at the gene level as a result of psychological stress, potentially contributing to depressive-like behaviors.

  13. Generating Gene Ontology-Disease Inferences to Explore Mechanisms of Human Disease at the Comparative Toxicogenomics Database.

    Directory of Open Access Journals (Sweden)

    Allan Peter Davis

    Full Text Available Strategies for discovering common molecular events among disparate diseases hold promise for improving understanding of disease etiology and expanding treatment options. One technique is to leverage curated datasets found in the public domain. The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) manually curates chemical-gene, chemical-disease, and gene-disease interactions from the scientific literature. The use of official gene symbols in CTD interactions enables this information to be combined with the Gene Ontology (GO) file from NCBI Gene. By integrating these GO-gene annotations with CTD's gene-disease dataset, we produce 753,000 inferences between 15,700 GO terms and 4,200 diseases, providing opportunities to explore presumptive molecular underpinnings of diseases and identify biological similarities. Through a variety of applications, we demonstrate the utility of this novel resource. As a proof-of-concept, we first analyze known repositioned drugs (e.g., raloxifene and sildenafil) and see that their target diseases have a greater degree of similarity when comparing GO terms vs. genes. Next, a computational analysis predicts seemingly non-intuitive diseases (e.g., stomach ulcers and atherosclerosis) as being similar to bipolar disorder, and these are validated in the literature as reported co-diseases. Additionally, we leverage other CTD content to develop testable hypotheses about thalidomide-gene networks to treat seemingly disparate diseases. Finally, we illustrate how CTD tools can rank a series of drugs as potential candidates for repositioning against B-cell chronic lymphocytic leukemia and predict cisplatin and the small molecule inhibitor JQ1 as lead compounds. The CTD dataset is freely available for users to navigate pathologies within the context of extensive biological processes, molecular functions, and cellular components conferred by GO. This inference set should aid researchers, bioinformaticists, and
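
    The integration step described above, linking GO terms to diseases through shared gene symbols, amounts to a join on the gene. A minimal sketch follows, assuming both datasets have already been loaded into dictionaries keyed by gene symbol. The gene, GO term and disease names are hypothetical placeholders, and the function is not part of any CTD tool.

```python
from collections import defaultdict

# Hypothetical toy inputs; the real CTD and NCBI Gene files are far larger.
GENE_TO_GO = {
    "GENE1": {"GO:A", "GO:B"},
    "GENE2": {"GO:B"},
    "GENE3": {"GO:C"},
}
GENE_TO_DISEASE = {
    "GENE1": {"disease X"},
    "GENE2": {"disease X", "disease Y"},
}


def infer_go_disease(gene_to_go, gene_to_disease):
    """Map each (GO term, disease) pair to the genes that support it."""
    inferences = defaultdict(set)
    for gene, diseases in gene_to_disease.items():
        # Genes without GO annotations contribute no inferences.
        for go_term in gene_to_go.get(gene, ()):
            for disease in diseases:
                inferences[(go_term, disease)].add(gene)
    return dict(inferences)
```

    Keeping the supporting genes for each pair makes it possible to rank inferences by how many genes back them.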

  14. DOSE RESPONSE FROM HIGH THROUGHPUT GENE EXPRESSION STUDIES AND THE INFLUENCE OF TIME AND CELL LINE ON INFERRED MODE OF ACTION BY ONTOLOGIC ENRICHMENT (SOT)

    Science.gov (United States)

    Gene expression with ontologic enrichment and connectivity mapping tools is widely used to infer modes of action (MOA) for therapeutic drugs. Despite progress in high-throughput (HT) genomic systems, strategies suitable to identify industrial chemical MOA are needed. The L1000 is...

  15. Ontology-based Information Retrieval

    DEFF Research Database (Denmark)

    Styltsvig, Henrik Bulskov

    In this thesis, we will present methods for introducing ontologies in information retrieval. The main hypothesis is that the inclusion of conceptual knowledge such as ontologies in the information retrieval process can contribute to the solution of major problems currently found in information...... retrieval. This utilization of ontologies has a number of challenges. Our focus is on the use of similarity measures derived from the knowledge about relations between concepts in ontologies, the recognition of semantic information in texts and the mapping of this knowledge into the ontologies in use......, as well as how to fuse together the ideas of ontological similarity and ontological indexing into a realistic information retrieval scenario. To achieve the recognition of semantic knowledge in a text, shallow natural language processing is used during indexing that reveals knowledge to the level of noun...

  16. HybridGO-Loc: mining hybrid features on gene ontology for predicting subcellular localization of multi-location proteins.

    Science.gov (United States)

    Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan

    2014-01-01

    Protein subcellular localization prediction, as an essential step to elucidate the functions in vivo of proteins and identify drug targets, has been extensively studied in previous decades. Instead of only determining subcellular localization of single-label proteins, recent studies have focused on predicting both single- and multi-location proteins. Computational methods based on Gene Ontology (GO) have been demonstrated to be superior to methods based on other features. However, existing GO-based methods focus on the occurrences of GO terms and disregard their relationships. This paper proposes a multi-label subcellular-localization predictor, namely HybridGO-Loc, that leverages not only the GO term occurrences but also the inter-term relationships. This is achieved by hybridizing the GO frequencies of occurrences and the semantic similarity between GO terms. Given a protein, a set of GO terms is retrieved by searching against the gene ontology database, using the accession numbers of homologous proteins obtained via BLAST search as the keys. The frequency of GO occurrences and semantic similarity (SS) between GO terms are used to formulate frequency vectors and semantic similarity vectors, respectively, which are subsequently hybridized to construct fusion vectors. An adaptive-decision based multi-label support vector machine (SVM) classifier is proposed to classify the fusion vectors. Experimental results based on recent benchmark datasets and a new dataset containing novel proteins show that the proposed hybrid-feature predictor significantly outperforms predictors based on individual GO features as well as other state-of-the-art predictors. For readers' convenience, the HybridGO-Loc server, which is for predicting virus or plant proteins, is available online at http://bioinfo.eie.polyu.edu.hk/HybridGoServer/.
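
    The idea of hybridizing a frequency vector with a semantic similarity vector can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy ontology, the placeholder term identifiers, the Jaccard-style similarity (a stand-in for whichever SS measure the authors use), and the simple concatenation used as "fusion". It is not the HybridGO-Loc implementation.

```python
from typing import Dict, List, Set

# Hypothetical toy ontology: each term maps to its ancestor set (itself included).
# The identifiers are placeholders, not real GO accessions.
ANCESTORS: Dict[str, Set[str]] = {
    "GO:A": {"GO:A", "GO:ROOT"},
    "GO:B": {"GO:B", "GO:A", "GO:ROOT"},
    "GO:C": {"GO:C", "GO:ROOT"},
}
VOCAB: List[str] = sorted(ANCESTORS)  # fixed term order shared by all vectors


def term_similarity(t1: str, t2: str) -> float:
    """Jaccard overlap of ancestor sets; a stand-in for a real SS measure."""
    a, b = ANCESTORS[t1], ANCESTORS[t2]
    return len(a & b) / len(a | b)


def frequency_vector(hits: List[str]) -> List[float]:
    """Count how often each vocabulary term occurs among the retrieved terms."""
    return [float(hits.count(t)) for t in VOCAB]


def similarity_vector(hits: List[str]) -> List[float]:
    """For each vocabulary term, take its best similarity to any retrieved term."""
    distinct = set(hits)
    return [max((term_similarity(t, h) for h in distinct), default=0.0) for t in VOCAB]


def fusion_vector(hits: List[str]) -> List[float]:
    """Concatenate the two views (an assumed, simplest form of hybridization)."""
    return frequency_vector(hits) + similarity_vector(hits)


if __name__ == "__main__":
    print(fusion_vector(["GO:B", "GO:B"]))
```

    A vector built this way could then be passed to any multi-label classifier; the abstract names an SVM for that step, which is not sketched here.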

  17. Argot2: a large scale function prediction tool relying on semantic similarity of weighted Gene Ontology terms.

    Science.gov (United States)

    Falda, Marco; Toppo, Stefano; Pescarolo, Alessandro; Lavezzo, Enrico; Di Camillo, Barbara; Facchinetti, Andrea; Cilia, Elisa; Velasco, Riccardo; Fontana, Paolo

    2012-03-28

    Predicting protein function has become increasingly demanding in the era of next generation sequencing technology. The task to assign a curator-reviewed function to every single sequence is impracticable. Bioinformatics tools, easy to use and able to provide automatic and reliable annotations at a genomic scale, are necessary and urgent. In this scenario, the Gene Ontology has provided the means to standardize the annotation classification with a structured vocabulary which can be easily exploited by computational methods. Argot2 is a web-based function prediction tool able to annotate nucleic or protein sequences from small datasets up to entire genomes. It accepts as input a list of sequences in FASTA format, which are processed using BLAST and HMMER searches vs UniProtKB and Pfam databases respectively; these sequences are then annotated with GO terms retrieved from the UniProtKB-GOA database and the terms are weighted using the e-values from BLAST and HMMER. The weighted GO terms are processed according to both their semantic similarity relations described by the Gene Ontology and their associated score. The algorithm is based on the original idea developed in a previous tool called Argot. The entire engine has been completely rewritten to improve both accuracy and computational efficiency, thus allowing for the annotation of complete genomes. The revised algorithm has been already employed and successfully tested during in-house genome projects of grape and apple, and has proven to have a high precision and recall in all our benchmark conditions. It has also been successfully compared with Blast2GO, one of the methods most commonly employed for sequence annotation. The server is freely accessible at http://www.medcomp.medicina.unipd.it/Argot2.
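
    The e-value weighting step can be sketched as follows. The abstract does not give the weighting formula, so the -log10(e-value) transform and the summation over hits are assumptions chosen for illustration, and `weight_go_terms` is a hypothetical helper rather than part of Argot2.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def weight_go_terms(hits: Iterable[Tuple[str, float]],
                    min_evalue: float = 1e-300) -> Dict[str, float]:
    """Sum an assumed -log10(e-value) weight per GO term over all search hits.

    hits: (go_term, evalue) pairs, one per BLAST/HMMER hit annotated with that term.
    """
    weights: Dict[str, float] = defaultdict(float)
    for term, evalue in hits:
        # Clamp so that hits reported with an e-value of 0 do not break log10.
        weights[term] += -math.log10(max(evalue, min_evalue))
    return dict(weights)


if __name__ == "__main__":
    example_hits = [
        ("GO:0005524", 1e-10),
        ("GO:0005524", 1e-5),
        ("GO:0003677", 1e-2),
    ]
    print(weight_go_terms(example_hits))
```

    Under this assumed transform, a smaller e-value contributes a larger weight, and hits with e-values above 1 contribute negative weight, so a real pipeline would probably filter those out first.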

  18. On the Use of Gene Ontology Annotations to Assess Functional Similarity among Orthologs and Paralogs: A Short Report.

    Directory of Open Access Journals (Sweden)

    Paul D Thomas

    Full Text Available A recent paper (Nehrt et al., PLoS Comput. Biol. 7:e1002073, 2011) has proposed a metric for the "functional similarity" between two genes that uses only the Gene Ontology (GO) annotations directly derived from published experimental results. Applying this metric, the authors concluded that paralogous genes within the mouse genome or the human genome are more functionally similar on average than orthologous genes between these genomes, an unexpected result with broad implications if true. We suggest, based on both theoretical and empirical considerations, that this proposed metric should not be interpreted as a functional similarity, and therefore cannot be used to support any conclusions about the "ortholog conjecture" (or, more properly, the "ortholog functional conservation hypothesis"). First, we reexamine the case studies presented by Nehrt et al. as examples of orthologs with divergent functions, and come to a very different conclusion: they actually exemplify how GO annotations for orthologous genes provide complementary information about conserved biological functions. We then show that there is a global ascertainment bias in the experiment-based GO annotations for human and mouse genes: particular types of experiments tend to be performed in different model organisms. We conclude that the reported statistical differences in annotations between pairs of orthologous genes do not reflect differences in biological function, but rather complementarity in experimental approaches. Our results underscore two general considerations for researchers proposing novel types of analysis based on the GO: 1) that GO annotations are often incomplete, potentially in a biased manner, and subject to an "open world assumption" (absence of an annotation does not imply absence of a function), and 2) that conclusions drawn from a novel, large-scale GO analysis should whenever possible be supported by careful, in-depth examination of examples, to help ensure the

  19. Investigating Correlation between Protein Sequence Similarity and Semantic Similarity Using Gene Ontology Annotations.

    Science.gov (United States)

    Ikram, Najmul; Qadir, Muhammad Abdul; Afzal, Muhammad Tanvir

    2018-01-01

    Sequence similarity is a commonly used measure to compare proteins. With the increasing use of ontologies, semantic (function) similarity is gaining importance. The correlation between these measures has been applied in the evaluation of new semantic similarity methods, and in protein function prediction. In this research, we investigate the relationship between the two similarity methods. The results suggest the absence of a strong correlation between sequence and semantic similarities. There are a large number of proteins with low sequence similarity and high semantic similarity. We observe that Pearson's correlation coefficient is not sufficient to explain the nature of this relationship. Interestingly, the term semantic similarity values above 0 and below 1 do not seem to play a role in improving the correlation. That is, the correlation coefficient depends only on the number of common GO terms in proteins under comparison, and the semantic similarity measurement method does not influence it. Semantic similarity and sequence similarity behave distinctly. These findings have significant implications for future work on protein comparison and will help to better understand the semantic similarity between proteins.
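
    For reference, Pearson's correlation coefficient between paired sequence-similarity and semantic-similarity scores can be computed as in the sketch below. The function is a plain textbook implementation, and the sample scores in the usage line are invented for illustration, not data from the study.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson's r for two equal-length samples with non-zero variance."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented scores: several pairs with low sequence but high semantic similarity.
    sequence_sim = [0.10, 0.15, 0.20, 0.90, 0.95]
    semantic_sim = [0.90, 0.20, 0.85, 0.95, 0.90]
    print(round(pearson(sequence_sim, semantic_sim), 3))
```

    A single coefficient like this summarizes only the linear trend, which is one reason it may hide a cluster of protein pairs with low sequence similarity and high semantic similarity.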

  20. Knowledge retrieval from PubMed abstracts and electronic medical records with the Multiple Sclerosis Ontology.

    Science.gov (United States)

    Malhotra, Ashutosh; Gündel, Michaela; Rajput, Abdul Mateen; Mevissen, Heinz-Theodor; Saiz, Albert; Pastor, Xavier; Lozano-Rubi, Raimundo; Martinez-Lapiscina, Elena H; Martinez-Lapsicina, Elena H; Zubizarreta, Irati; Mueller, Bernd; Kotelnikova, Ekaterina; Toldo, Luca; Hofmann-Apitius, Martin; Villoslada, Pablo

    2015-01-01

    In order to retrieve useful information from scientific literature and electronic medical records (EMR) we developed an ontology specific for Multiple Sclerosis (MS). The MS Ontology was created using scientific literature and expert review under the Protégé OWL environment. We developed a dictionary with semantic synonyms and translations to different languages for mining EMR. The MS Ontology was integrated with other ontologies and dictionaries (diseases/comorbidities, gene/protein, pathways, drug) into the text-mining tool SCAIView. We analyzed the EMRs from 624 patients with MS using the MS ontology dictionary in order to identify drug usage and comorbidities in MS. Testing competency questions and functional evaluation using F statistics further validated the usefulness of MS ontology. Validation of the lexicalized ontology by means of named entity recognition-based methods showed an adequate performance (F score = 0.73). The MS Ontology retrieved 80% of the genes associated with MS from scientific abstracts and identified additional pathways targeted by approved disease-modifying drugs (e.g. apoptosis pathways associated with mitoxantrone, rituximab and fingolimod). The analysis of the EMR from patients with MS identified current usage of disease modifying drugs and symptomatic therapy as well as comorbidities, which are in agreement with recent reports. The MS Ontology provides a semantic framework that is able to automatically extract information from both scientific literature and EMR from patients with MS, revealing new pathogenesis insights as well as new clinical information.

  1. An ontology-driven semantic mash-up of gene and biological pathway information: Application to the domain of nicotine dependence

    Science.gov (United States)

    Sahoo, Satya S.; Bodenreider, Olivier; Rutter, Joni L.; Skinner, Karen J.; Sheth, Amit P.

    2008-01-01

    Objectives: This paper illustrates how Semantic Web technologies (especially RDF, OWL, and SPARQL) can support information integration and make it easy to create semantic mashups (semantically integrated resources). In the context of understanding the genetic basis of nicotine dependence, we integrate gene and pathway information and show how three complex biological queries can be answered by the integrated knowledge base. Methods: We use an ontology-driven approach to integrate two gene resources (Entrez Gene and HomoloGene) and three pathway resources (KEGG, Reactome and BioCyc), for five organisms, including humans. We created the Entrez Knowledge Model (EKoM), an information model in OWL for the gene resources, and integrated it with the extant BioPAX ontology designed for pathway resources. The integrated schema is populated with data from the pathway resources, publicly available in BioPAX-compatible format, and gene resources for which a population procedure was created. The SPARQL query language is used to formulate queries over the integrated knowledge base to answer the three biological queries. Results: Simple SPARQL queries could easily identify hub genes, i.e., those genes whose gene products participate in many pathways or interact with many other gene products. The identification of the genes expressed in the brain turned out to be more difficult, due to the lack of a common identification scheme for proteins. Conclusion: Semantic Web technologies provide a valid framework for information integration in the life sciences. Ontology-driven integration represents a flexible, sustainable and extensible solution to the integration of large volumes of information. Additional resources, which enable the creation of mappings between information sources, are required to compensate for heterogeneity across namespaces. Resource page: http://knoesis.wright.edu/research/lifesci/integration/structured_data/JBI-2008/ PMID: 18395495
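
    The hub-gene query described in the Results can be approximated outside a triple store. The sketch below swaps SPARQL for plain Python over (gene, pathway) pairs; the gene and pathway names and the `min_pathways` threshold are hypothetical, and the authors' actual queries are not reproduced here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def hub_genes(participations: Iterable[Tuple[str, str]],
              min_pathways: int = 2) -> List[Tuple[str, int]]:
    """Rank genes by the number of distinct pathways their products take part in."""
    pathways_by_gene: Dict[str, Set[str]] = defaultdict(set)
    for gene, pathway in participations:
        pathways_by_gene[gene].add(pathway)  # a set ignores duplicate statements
    hubs = [(gene, len(pathways))
            for gene, pathways in pathways_by_gene.items()
            if len(pathways) >= min_pathways]
    # Most-connected first; ties broken alphabetically for a stable order.
    return sorted(hubs, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    pairs = [
        ("geneA", "pathway1"), ("geneA", "pathway2"), ("geneA", "pathway3"),
        ("geneB", "pathway1"), ("geneB", "pathway2"),
        ("geneC", "pathway1"),
    ]
    print(hub_genes(pairs))
```

    In a SPARQL setting the same grouping would typically be expressed with GROUP BY and COUNT over the integrated graph.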

  2. OmniSearch: a semantic search system based on the Ontology for MIcroRNA Target (OMIT) for microRNA-target gene interaction data.

    Science.gov (United States)

    Huang, Jingshan; Gutierrez, Fernando; Strachan, Harrison J; Dou, Dejing; Huang, Weili; Smith, Barry; Blake, Judith A; Eilbeck, Karen; Natale, Darren A; Lin, Yu; Wu, Bin; Silva, Nisansa de; Wang, Xiaowei; Liu, Zixing; Borchert, Glen M; Tan, Ming; Ruttenberg, Alan

    2016-01-01

    As a special class of non-coding RNAs (ncRNAs), microRNAs (miRNAs) perform important roles in numerous biological and pathological processes. The realization of miRNA functions depends largely on how miRNAs regulate specific target genes. It is therefore critical to identify, analyze, and cross-reference miRNA-target interactions to better explore and delineate miRNA functions. Semantic technologies can help in this regard. We previously developed a miRNA domain-specific application ontology, Ontology for MIcroRNA Target (OMIT), whose goal was to serve as a foundation for semantic annotation, data integration, and semantic search in the miRNA field. In this paper we describe our continuing effort to develop the OMIT, and demonstrate its use within a semantic search system, OmniSearch, designed to facilitate knowledge capture of miRNA-target interaction data. Important changes in the current version OMIT are summarized as: (1) following a modularized ontology design (with 2559 terms imported from the NCRO ontology); (2) encoding all 1884 human miRNAs (vs. 300 in previous versions); and (3) setting up a GitHub project site along with an issue tracker for more effective community collaboration on the ontology development. The OMIT ontology is free and open to all users, accessible at: http://purl.obolibrary.org/obo/omit.owl. The OmniSearch system is also free and open to all users, accessible at: http://omnisearch.soc.southalabama.edu/index.php/Software.

  3. Ontological Planning

    Directory of Open Access Journals (Sweden)

    Ahmet Alkan

    2017-12-01

    Is it possible to redefine ontology within the hierarchical structure of planning? We are going to seek answers to some of these questions within the limited scope of this paper, and we are going to offer the rest for discussion by just asking them. In light of these assessments, one of the limited objectives of the paper is to draw attention, on the basis of ontological knowledge relying on the wholeness of the universe, to the question of whether the ontological realities of man, energy and movements of thinking, as important factors affecting mankind, can provide macro-level data for planning on a universal level.

  4. Comparative mapping reveals similar linkage of functional genes to ...

    Indian Academy of Sciences (India)

    genes between O. sativa and B. napus may have consistent function and control similar traits, which may be ..... acea chromosomes reveals islands of conserved organization. ... 1998 Conserved structure and function of the Arabidopsis flow-.

  5. Revealing gene action for production characteristics by inbreeding ...

    African Journals Online (AJOL)

    Revealing gene action for production characteristics by inbreeding, based on a long-term selection ... The gene action involved in the expression of production characters was investigated, using the effect of the theoretical inbreeding ..... and predicted selection responses for growth, fat and lean traits in mice. J. Anim. Sci.

  6. The Use of Gene Ontology Term and KEGG Pathway Enrichment for Analysis of Drug Half-Life.

    Directory of Open Access Journals (Sweden)

    Yu-Hang Zhang

    Full Text Available A drug's biological half-life is defined as the time required for the human body to metabolize or eliminate 50% of the initial drug dosage. Correctly measuring the half-life of a given drug is helpful for the safe and accurate usage of the drug. In this study, we investigated which gene ontology (GO) terms and biological pathways were highly related to the determination of drug half-life. The investigated drugs, with known half-lives, were analyzed based on their enrichment scores for associated GO terms and KEGG pathways. These scores indicate which GO terms or KEGG pathways the drug targets. The feature selection method, minimum redundancy maximum relevance, was used to analyze these GO terms and KEGG pathways and to identify important GO terms and pathways, such as sodium-independent organic anion transmembrane transporter activity (GO:0015347), monoamine transmembrane transporter activity (GO:0008504), negative regulation of synaptic transmission (GO:0050805), neuroactive ligand-receptor interaction (hsa04080), serotonergic synapse (hsa04726), and linoleic acid metabolism (hsa00591), among others. This analysis confirmed our results and may show evidence for a new method in studying drug half-lives and building effective computational methods for the prediction of drug half-lives.
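
    An enrichment score of the kind mentioned above is often derived from a hypergeometric test. The abstract does not define its score, so the sketch below assumes the common form, -log10 of the upper-tail hypergeometric p-value; the function names and the example counts are illustrative only.

```python
import math


def hypergeom_upper_tail(population: int, annotated: int,
                         sample: int, overlap: int) -> float:
    """P(X >= overlap) when drawing `sample` genes from `population`,
    of which `annotated` carry the GO term or pathway of interest."""
    total = math.comb(population, sample)
    upper = min(annotated, sample)
    favourable = sum(
        math.comb(annotated, i) * math.comb(population - annotated, sample - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


def enrichment_score(population: int, annotated: int,
                     sample: int, overlap: int) -> float:
    """Assumed score: -log10 of the p-value (infinite if the overlap is impossible)."""
    p_value = hypergeom_upper_tail(population, annotated, sample, overlap)
    return math.inf if p_value == 0 else -math.log10(p_value)


if __name__ == "__main__":
    # Illustrative counts: 20000 genes, 150 annotated, 40 drug targets, 6 in common.
    print(round(enrichment_score(20000, 150, 40, 6), 2))
```

    Larger scores indicate that the overlap between a drug's target set and a term's gene set is less likely to arise by chance under this model.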

  7. Transcriptome sequencing of Mycosphaerella fijiensis during association with Musa acuminata reveals candidate pathogenicity genes.

    Science.gov (United States)

    Noar, Roslyn D; Daub, Margaret E

    2016-08-30

    Mycosphaerella fijiensis, causative agent of the black Sigatoka disease of banana, is considered the most economically damaging banana disease. Despite its importance, the genetics of pathogenicity are poorly understood. Previous studies have characterized polyketide pathways with possible roles in pathogenicity. To identify additional candidate pathogenicity genes, we compared the transcriptome of this fungus during the necrotrophic phase of infection with that during saprophytic growth in medium. Transcriptome analysis was conducted, and the functions of differentially expressed genes were predicted by identifying conserved domains, Gene Ontology (GO) annotation and GO enrichment analysis, Carbohydrate-Active EnZymes (CAZy) annotation, and identification of genes encoding effector-like proteins. The analysis showed that genes commonly involved in secondary metabolism have higher expression in infected leaf tissue, including genes encoding cytochrome P450s, short-chain dehydrogenases, and oxidoreductases in the 2-oxoglutarate and Fe(II)-dependent oxygenase superfamily. Other pathogenicity-related genes with higher expression in infected leaf tissue include genes encoding salicylate hydroxylase-like proteins, hydrophobic surface binding proteins, CFEM domain-containing proteins, and genes encoding secreted cysteine-rich proteins characteristic of effectors. More genes encoding amino acid transporters, oligopeptide transporters, peptidases, proteases, proteinases, sugar transporters, and proteins containing Domain of Unknown Function (DUF) 3328 had higher expression in infected leaf tissue, while more genes encoding inhibitors of peptidases and proteinases had higher expression in medium. Sixteen gene clusters with higher expression in leaf tissue were identified including clusters for the synthesis of a non-ribosomal peptide. A cluster encoding a novel fusicoccane was also identified. Two putative dispensable scaffolds were identified with a large proportion of

  8. Lentiviral gene ontology (LeGO) vectors equipped with novel drug-selectable fluorescent proteins: new building blocks for cell marking and multi-gene analysis.

    Science.gov (United States)

    Weber, K; Mock, U; Petrowitz, B; Bartsch, U; Fehse, B

    2010-04-01

    Vector-encoded fluorescent proteins (FPs) facilitate unambiguous identification or sorting of gene-modified cells by fluorescence-activated cell sorting (FACS). Exploiting this feature, we have recently developed lentiviral gene ontology (LeGO) vectors (www.LentiGO-Vectors.de) for multi-gene analysis in different target cells. In this study, we extend the LeGO principle by introducing 10 different drug-selectable FPs created by fusing one of the five selection markers (protecting against blasticidin, hygromycin, neomycin, puromycin and zeocin) and one of the five FP genes (Cerulean, eGFP, Venus, dTomato and mCherry). All tested fusion proteins allowed both fluorescence-mediated detection and drug-mediated selection of LeGO-transduced cells. Newly generated codon-optimized hygromycin- and neomycin-resistance genes showed improved expression as compared with their ancestors. New LeGO constructs were produced at titers >10^6 per ml (for non-concentrated supernatants). We show efficient combinatorial marking and selection of various cells, including mesenchymal stem cells, simultaneously transduced with different LeGO constructs. Inclusion of the cytomegalovirus early enhancer/chicken beta-actin promoter into LeGO vectors facilitated robust transgene expression in and selection of neural stem cells and their differentiated progeny. We suppose that the new drug-selectable markers combining advantages of FACS and drug selection are well suited for numerous applications and vector systems. Their inclusion into LeGO vectors opens new possibilities for (stem) cell tracking and functional multi-gene analysis.

  9. The use of semantic similarity measures for optimally integrating heterogeneous Gene Ontology data from large scale annotation pipelines

    Directory of Open Access Journals (Sweden)

    Gaston K Mazandu

    2014-08-01

    Full Text Available With the advancement of new high throughput sequencing technologies, there has been an increase in the number of genome sequencing projects worldwide, which has yielded complete genome sequences of human, animals and plants. Subsequently, several labs have focused on genome annotation, consisting of assigning functions to gene products, mostly using Gene Ontology (GO) terms. As a consequence, there is an increased heterogeneity in annotations across genomes due to different approaches used by different pipelines to infer these annotations and also due to the nature of the GO structure itself. This makes a curator's task difficult, even if they adhere to the established guidelines for assessing these protein annotations. Here we develop a genome-scale approach for integrating GO annotations from different pipelines using semantic similarity measures. We used this approach to identify inconsistencies and similarities in functional annotations between orthologs of human and Drosophila melanogaster, to assess the quality of GO annotations derived from InterPro2GO mappings compared to manually annotated GO annotations for the Drosophila melanogaster proteome from a FlyBase dataset and human, and to filter GO annotation data for these proteomes. Results obtained indicate that an efficient integration of GO annotations eliminates redundancy up to 27.08 and 22.32% in the Drosophila melanogaster and human GO annotation datasets, respectively. Furthermore, we identified lack of and missing annotations for some orthologs, and annotation mismatches between InterPro2GO and manual pipelines in these two proteomes, thus requiring further curation. This simplifies and facilitates tasks of curators in assessing protein annotations, reduces redundancy and eliminates inconsistencies in large annotation datasets for ease of comparative functional genomics.

  10. [Using (1)H-nuclear magnetic resonance metabolomics and gene ontology to establish pathological staging model for esophageal cancer patients].

    Science.gov (United States)

    Chen, X; Wang, K; Chen, W; Jiang, H; Deng, P C; Li, Z J; Peng, J; Zhou, Z Y; Yang, H; Huang, G X; Zeng, J

    2016-07-01

    (ethanol amine, hydroxy-propionic acid, homocysteine and estriol) were eventually selected. Gene ontology analysis showed that 54 enzymes and genes regulated the 4 key metabolic markers. A quantitative prediction model of esophageal cancer staging based on the esophageal cancer NMR spectrum was established. Cross-validation results showed that the predicted effect was good (root mean square error=5.3, R(2)=0.47, P=0.036). The systems biology approaches based on metabolomics and enzyme-gene regulatory network analysis can be used to quantify the metabolic network disturbance of patients with advanced esophageal cancer, and to predict preoperative clinical staging of esophageal cancer patients by plasma NMR metabolomics.

  11. Analysis of global gene expression in Brachypodium distachyon reveals extensive network plasticity in response to abiotic stress.

    Directory of Open Access Journals (Sweden)

    Henry D Priest

    Full Text Available Brachypodium distachyon is a close relative of many important cereal crops. Abiotic stress tolerance has a significant impact on productivity of agriculturally important food and feedstock crops. Analysis of the transcriptome of Brachypodium after chilling, high-salinity, drought, and heat stresses revealed diverse differential expression of many transcripts. Weighted Gene Co-Expression Network Analysis revealed 22 distinct gene modules with specific profiles of expression under each stress. Promoter analysis implicated short DNA sequences directly upstream of module members in the regulation of 21 of 22 modules. Functional analysis of module members revealed enrichment in functional terms for 10 of 22 network modules. Analysis of condition-specific correlations between differentially expressed gene pairs revealed extensive plasticity in the expression relationships of gene pairs. Photosynthesis, cell cycle, and cell wall expression modules were down-regulated by all abiotic stresses. Modules which were up-regulated by each abiotic stress fell into diverse and unique Gene Ontology (GO) categories. This study provides genomics resources and improves our understanding of abiotic stress responses of Brachypodium.

  12. Delineation and interpretation of gene networks towards their effect in cellular physiology- a reverse engineering approach for the identification of critical molecular players, through the use of ontologies.

    Science.gov (United States)

    Moutselos, K; Maglogiannis, I; Chatziioannou, A

    2010-01-01

    Exploiting ontologies provides clues regarding the involvement of certain molecular processes in the cellular phenotypic manifestation. However, identifying individual molecular actors (genes, proteins, etc.) for targeted biological validation in a generic, prioritized fashion, based on objective measures of their effects on cellular physiology, remains a challenge. In this work, a new meta-analysis algorithm is proposed for the holistic interpretation of the information captured in -omic experiments; it is showcased on a dynamic transcriptomic DNA microarray dataset that examines the effect of mastic oil treatment in Lewis lung carcinoma cells. Through the use of the Gene Ontology, this algorithm relates genes to specific cellular pathways and vice versa in order to further reverse engineer the critical role of specific genes, starting from the results of various statistical enrichment analyses. The algorithm is able to discriminate candidate hub-genes, implying critical biochemical cross-talk. Moreover, performance measures of the algorithm are derived, when evaluated with respect to the differential expression gene list of the dataset.

  13. SUGOI: automated ontology interchangeability

    CSIR Research Space (South Africa)

    Khan, ZC

    2015-04-01

    Full Text Available A foundational ontology can solve interoperability issues among the domain ontologies aligned to it. However, several foundational ontologies have been developed, hence such interoperability issues exist among domain ontologies. The novel SUGOI tool...

  14. Inferring ontology graph structures using OWL reasoning

    KAUST Repository

    Rodriguez-Garcia, Miguel Angel

    2018-01-05

    Ontologies are representations of a conceptualization of a domain. Traditionally, ontologies in biology were represented as directed acyclic graphs (DAG) which represent the backbone taxonomy and additional relations between classes. These graphs are widely exploited for data analysis in the form of ontology enrichment or computation of semantic similarity. More recently, ontologies are developed in a formal language such as the Web Ontology Language (OWL) and consist of a set of axioms through which classes are defined or constrained. While the taxonomy of an ontology can be inferred directly from the axioms of an ontology as one of the standard OWL reasoning tasks, creating general graph structures from OWL ontologies that exploit the ontologies' semantic content remains a challenge. We developed a method to transform ontologies into graphs using an automated reasoner while taking into account all relations between classes. Searching for (existential) patterns in the deductive closure of ontologies, we can identify relations between classes that are implied but not asserted and generate graph structures that encode for a large part of the ontologies' semantic content. We demonstrate the advantages of our method by applying it to inference of protein-protein interactions through semantic similarity over the Gene Ontology and demonstrate that performance is increased when graph structures are inferred using deductive inference according to our method. Our software and experiment results are available at http://github.com/bio-ontology-research-group/Onto2Graph . Onto2Graph is a method to generate graph structures from OWL ontologies using automated reasoning. The resulting graphs can be used for improved ontology visualization and ontology-based data analysis.

  15. Inferring ontology graph structures using OWL reasoning.

    Science.gov (United States)

    Rodríguez-García, Miguel Ángel; Hoehndorf, Robert

    2018-01-05

    Ontologies are representations of a conceptualization of a domain. Traditionally, ontologies in biology were represented as directed acyclic graphs (DAG) which represent the backbone taxonomy and additional relations between classes. These graphs are widely exploited for data analysis in the form of ontology enrichment or computation of semantic similarity. More recently, ontologies are developed in a formal language such as the Web Ontology Language (OWL) and consist of a set of axioms through which classes are defined or constrained. While the taxonomy of an ontology can be inferred directly from the axioms of an ontology as one of the standard OWL reasoning tasks, creating general graph structures from OWL ontologies that exploit the ontologies' semantic content remains a challenge. We developed a method to transform ontologies into graphs using an automated reasoner while taking into account all relations between classes. Searching for (existential) patterns in the deductive closure of ontologies, we can identify relations between classes that are implied but not asserted and generate graph structures that encode for a large part of the ontologies' semantic content. We demonstrate the advantages of our method by applying it to inference of protein-protein interactions through semantic similarity over the Gene Ontology and demonstrate that performance is increased when graph structures are inferred using deductive inference according to our method. Our software and experiment results are available at http://github.com/bio-ontology-research-group/Onto2Graph . Onto2Graph is a method to generate graph structures from OWL ontologies using automated reasoning. The resulting graphs can be used for improved ontology visualization and ontology-based data analysis.

  16. Ontology evolution in physics

    OpenAIRE

    Chan, Michael

    2013-01-01

    With the advent of reasoning problems in dynamic environments, there is an increasing need for automated reasoning systems to automatically adapt to unexpected changes in representations. In particular, the automation of the evolution of their ontologies needs to be enhanced without substantially sacrificing expressivity in the underlying representation. Revision of beliefs is not enough, as adding to or removing from beliefs does not change the underlying formal language. Gene...

  17. Analysis of the robustness of network-based disease-gene prioritization methods reveals redundancy in the human interactome and functional diversity of disease-genes.

    Directory of Open Access Journals (Sweden)

    Emre Guney

    Full Text Available Complex biological systems usually pose a trade-off between robustness and fragility where a small number of perturbations can substantially disrupt the system. Although biological systems are robust against changes in many external and internal conditions, even a single mutation can perturb the system substantially, giving rise to a pathophenotype. Recent advances in identifying and analyzing the sequential variations beneath human disorders help to comprehend a systemic view of the mechanisms underlying various disease phenotypes. Network-based disease-gene prioritization methods rank the relevance of genes in a disease under the hypothesis that genes whose proteins interact with each other tend to exhibit similar phenotypes. In this study, we have tested the robustness of several network-based disease-gene prioritization methods with respect to the perturbations of the system using various disease phenotypes from the Online Mendelian Inheritance in Man database. These perturbations have been introduced either in the protein-protein interaction network or in the set of known disease-gene associations. As the network-based disease-gene prioritization methods are based on the connectivity between known disease-gene associations, we have further used these methods to categorize the pathophenotypes with respect to the recoverability of hidden disease-genes. Our results have suggested that, in general, disease-genes are connected through multiple paths in the human interactome. Moreover, even when these paths are disturbed, network-based prioritization can reveal hidden disease-gene associations in some pathophenotypes such as breast cancer, cardiomyopathy, diabetes, leukemia, Parkinson disease and obesity to a greater extent compared to the rest of the pathophenotypes tested in this study. Gene Ontology (GO) analysis highlighted the role of functional diversity for such diseases.

  18. Digital Gene Expression Analysis Based on De Novo Transcriptome Assembly Reveals New Genes Associated with Floral Organ Differentiation of the Orchid Plant Cymbidium ensifolium.

    Directory of Open Access Journals (Sweden)

    Fengxi Yang

    Full Text Available Cymbidium ensifolium belongs to the genus Cymbidium of the orchid family. Owing to its spectacular flower morphology, C. ensifolium has considerable ecological and cultural value. However, limited genetic data is available for this non-model plant, and the molecular mechanism underlying floral organ identity is still poorly understood. In this study, we characterize the floral transcriptome of C. ensifolium and present, for the first time, extensive sequence and transcript abundance data of individual floral organs. After sequencing, over 10 Gb clean sequence data were generated and assembled into 111,892 unigenes with an average length of 932.03 base pairs, including 1,227 clusters and 110,665 singletons. Assembled sequences were annotated with gene descriptions, gene ontology, clusters of orthologous group terms, the Kyoto Encyclopedia of Genes and Genomes, and the plant transcription factor database. From these annotations, 131 flowering-associated unigenes, 61 CONSTANS-LIKE (COL) unigenes and 90 floral homeotic genes were identified. In addition, four digital gene expression libraries were constructed for the sepal, petal, labellum and gynostemium, and 1,058 genes corresponding to individual floral organ development were identified. Among them, eight MADS-box genes were further investigated by full-length cDNA sequence analysis and expression validation, which revealed two APETALA1/AGL9-like MADS-box genes preferentially expressed in the sepal and petal, two AGAMOUS-like genes particularly restricted to the gynostemium, and four DEF-like genes distinctively expressed in different floral organs. The spatial expression of these genes varied distinctly in different floral mutants corresponding to different floral morphogenesis, which validated their specialized roles in floral patterning and further supported the effectiveness of our in silico analysis. This dataset generated in our study provides new insights into the molecular mechanisms

  19. RNA-Seq reveals seven promising candidate genes affecting the proportion of thick egg albumen in layer-type chickens.

    Science.gov (United States)

    Wan, Yi; Jin, Sihua; Ma, Chendong; Wang, Zhicheng; Fang, Qi; Jiang, Runshen

    2017-12-22

    Eggs with a much higher proportion of thick albumen are preferred in the layer industry, as they are favoured by consumers. However, the genetic factors affecting the thick egg albumen trait have not been elucidated. Using RNA sequencing, we explored the magnum transcriptome in 9 Rhode Island white layers: four layers with phenotypes of extremely high ratios of thick to thin albumen (high thick albumen, HTA) and five with extremely low ratios (low thick albumen, LTA). A total of 220 genes were differentially expressed, among which 150 genes were up-regulated and 70 were down-regulated in the HTA group compared with the LTA group. Gene Ontology (GO) analysis revealed that the up-regulated genes in HTA were mainly involved in a wide range of regulatory functions. In addition, a large number of these genes were related to glycosphingolipid biosynthesis, focal adhesion, ECM-receptor interactions and cytokine-cytokine receptor interactions. Based on functional analysis, ST3GAL4, FUT4, ITGA2, SDC3, PRLR, CDH4 and GALNT9 were identified as promising candidate genes for thick albumen synthesis and metabolism during egg formation. These results provide new insights into the molecular mechanisms of egg albumen traits and may contribute to future breeding strategies that optimise the proportion of thick egg albumen.

  20. Building ontologies with basic formal ontology

    CERN Document Server

    Arp, Robert; Spear, Andrew D.

    2015-01-01

    In the era of "big data," science is increasingly information driven, and the potential for computers to store, manage, and integrate massive amounts of data has given rise to such new disciplinary fields as biomedical informatics. Applied ontology offers a strategy for the organization of scientific information in computer-tractable form, drawing on concepts not only from computer and information science but also from linguistics, logic, and philosophy. This book provides an introduction to the field of applied ontology that is of particular relevance to biomedicine, covering theoretical components of ontologies, best practices for ontology design, and examples of biomedical ontologies in use. After defining an ontology as a representation of the types of entities in a given domain, the book distinguishes between different kinds of ontologies and taxonomies, and shows how applied ontology draws on more traditional ideas from metaphysics. It presents the core features of the Basic Formal Ontology (BFO), now u...

  1. Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks

    Directory of Open Access Journals (Sweden)

    Kohane Isaac S

    2005-09-01

    Full Text Available Abstract Background Biological processes are carried out by coordinated modules of interacting molecules. As clustering methods demonstrate that genes with similar expression display increased likelihood of being associated with a common functional module, networks of coexpressed genes provide one framework for assigning gene function. This has informed the guilt-by-association (GBA) heuristic, widely invoked in functional genomics. Yet although the idea of GBA is accepted, the breadth of GBA applicability is uncertain. Results We developed methods to systematically explore the breadth of GBA across a large and varied corpus of expression data to answer the following question: To what extent is the GBA heuristic broadly applicable to the transcriptome and conversely how broadly is GBA captured by a priori knowledge represented in the Gene Ontology (GO)? Our study provides an investigation of the functional organization of five coexpression networks using data from three mammalian organisms. Our method calculates a probabilistic score between each gene and each Gene Ontology category that reflects coexpression enrichment of a GO module. For each GO category we use Receiver Operating Curves to assess whether these probabilistic scores reflect GBA. This methodology applied to five different coexpression networks demonstrates that the signature of guilt-by-association is ubiquitous and reproducible and that the GBA heuristic is broadly applicable across the population of nine hundred Gene Ontology categories. We also demonstrate the existence of highly reproducible patterns of coexpression between some pairs of GO categories. Conclusion We conclude that GBA has universal value and that transcriptional control may be more modular than previously realized. Our analyses also suggest that methodologies combining coexpression measurements across multiple genes in a biologically-defined module can aid in characterizing gene function or in characterizing
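
The per-category ROC assessment described in this record can be sketched as follows. This is only an illustration under stated assumptions: the record does not specify the authors' probabilistic score, so the sketch substitutes a simpler stand-in (the mean coexpression of a gene with the other members of a category), and the gene names, coexpression values and function names are hypothetical.

```python
# Hypothetical toy data: four genes, one GO category containing "a" and "b".
GENES = ["a", "b", "c", "d"]
MEMBERS = ["a", "b"]
PAIRS = {
    ("a", "b"): 0.9, ("a", "c"): 0.2, ("a", "d"): 0.1,
    ("b", "c"): 0.3, ("b", "d"): 0.2, ("c", "d"): 0.8,
}

# Symmetric lookup table: coexpr[x][y] == coexpr[y][x].
coexpr = {g: {} for g in GENES}
for (x, y), value in PAIRS.items():
    coexpr[x][y] = value
    coexpr[y][x] = value


def gba_scores(coexpr, genes, members):
    """Stand-in score: mean coexpression with category members other than the gene itself."""
    scores = []
    for gene in genes:
        others = [m for m in members if m != gene]
        scores.append(sum(coexpr[gene][m] for m in others) / len(others))
    return scores


def auc(scores, labels):
    """Area under the ROC curve: chance that a member outscores a non-member (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [gene in MEMBERS for gene in GENES]
scores = gba_scores(coexpr, GENES, MEMBERS)
category_auc = auc(scores, labels)
print(category_auc)
```

An AUC near 1 means category members are ranked above non-members by coexpression alone, which is the kind of signal the guilt-by-association heuristic relies on; an AUC near 0.5 means no such signal.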

  2. The Cell Ontology 2016: enhanced content, modularization, and ontology interoperability.

    Science.gov (United States)

    Diehl, Alexander D; Meehan, Terrence F; Bradford, Yvonne M; Brush, Matthew H; Dahdul, Wasila M; Dougall, David S; He, Yongqun; Osumi-Sutherland, David; Ruttenberg, Alan; Sarntivijai, Sirarat; Van Slyke, Ceri E; Vasilevsky, Nicole A; Haendel, Melissa A; Blake, Judith A; Mungall, Christopher J

    2016-07-04

    The Cell Ontology (CL) is an OBO Foundry candidate ontology covering the domain of canonical, natural biological cell types. Since its inception in 2005, the CL has undergone multiple rounds of revision and expansion, most notably in its representation of hematopoietic cells. For in vivo cells, the CL focuses on vertebrates but provides general classes that can be used for other metazoans, which can be subtyped in species-specific ontologies. Recent work on the CL has focused on extending the representation of various cell types, and developing new modules in the CL itself, and in related ontologies in coordination with the CL. For example, the Kidney and Urinary Pathway Ontology was used as a template to populate the CL with additional cell types. In addition, subtypes of the class 'cell in vitro' have received improved definitions and labels to provide for modularity with the representation of cells in the Cell Line Ontology and Reagent Ontology. Recent changes in the ontology development methodology for CL include a switch from OBO to OWL for the primary encoding of the ontology, and an increasing reliance on logical definitions for improved reasoning. The CL is now mandated as a metadata standard for large functional genomics and transcriptomics projects, and is used extensively for annotation, querying, and analyses of cell type specific data in sequencing consortia such as FANTOM5 and ENCODE, as well as for the NIAID ImmPort database and the Cell Image Library. The CL is also a vital component used in the modular construction of other biomedical ontologies; for example, the Gene Ontology and the cross-species anatomy ontology, Uberon, use CL to support the consistent representation of cell types across different levels of anatomical granularity, such as tissues and organs. The ongoing improvements to the CL make it a valuable resource to both the OBO Foundry community and the wider scientific community, and we continue to experience increased interest in the

  3. GOASVM: a subcellular location predictor by incorporating term-frequency gene ontology into the general form of Chou's pseudo-amino acid composition.

    Science.gov (United States)

    Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan

    2013-04-21

    Prediction of protein subcellular localization is an important yet challenging problem. Recently, several computational methods based on Gene Ontology (GO) have been proposed to tackle this problem and have demonstrated superiority over methods based on other features. Existing GO-based methods, however, do not fully use the GO information. This paper proposes an efficient GO method called GOASVM that exploits the information from the GO term frequencies and distant homologs to represent a protein in the general form of Chou's pseudo-amino acid composition. The method first selects a subset of relevant GO terms to form a GO vector space. Then for each protein, the method uses the accession number (AC) of the protein or the ACs of its homologs to find the number of occurrences of the selected GO terms in the Gene Ontology annotation (GOA) database as a means to construct GO vectors for support vector machines (SVMs) classification. With the advantages of GO term frequencies and a new strategy to incorporate useful homologous information, GOASVM can achieve a prediction accuracy of 72.2% on a new independent test set comprising novel proteins that were added to Swiss-Prot six years later than the creation date of the training set. GOASVM and Supplementary materials are available online at http://bioinfo.eie.polyu.edu.hk/mGoaSvmServer/GOASVM.html. Copyright © 2013 Elsevier Ltd. All rights reserved.
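
The vector-construction step described in this record can be sketched in a few lines. This is a simplified illustration, not the GOASVM code: the accession numbers and their annotations are hypothetical, the vocabulary is simply every GO term seen in the toy data rather than a selected subset, and the SVM classification step is left out.

```python
from collections import Counter

# Hypothetical annotation records (accession -> GO terms, repeats allowed).
# The accessions and pairings are invented; a real run would query the GOA database.
ANNOTATIONS = {
    "P00001": ["GO:0005634", "GO:0005634", "GO:0005737"],
    "P00002": ["GO:0005739"],
}


def build_vocabulary(annotations):
    """Ordered list of GO terms that spans the vector space."""
    return sorted({term for terms in annotations.values() for term in terms})


def go_vector(accessions, annotations, vocabulary):
    """Term-frequency vector over the vocabulary for one protein.

    `accessions` is the protein's own accession number or, when it has no
    annotations, the accession numbers of its homologs.
    """
    counts = Counter()
    for accession in accessions:
        counts.update(annotations.get(accession, []))
    return [counts[term] for term in vocabulary]


vocabulary = build_vocabulary(ANNOTATIONS)
own_vector = go_vector(["P00001"], ANNOTATIONS, vocabulary)
# A protein without annotations of its own, represented through a homolog.
homolog_vector = go_vector(["Q99999", "P00002"], ANNOTATIONS, vocabulary)
print(vocabulary, own_vector, homolog_vector)
```

Using counts rather than 0/1 presence flags is what distinguishes a term-frequency representation: a term annotated twice contributes 2 to its component.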

  4. Ontology authoring with Forza

    CSIR Research Space (South Africa)

    Keet, CM

    2014-11-01

    Full Text Available Generic, reusable ontology elements, such as a foundational ontology's categories and part-whole relations, are essential for good and interoperable knowledge representation. Ontology developers, which include domain experts and novices, face...

  5. Ontological Surprises

    DEFF Research Database (Denmark)

    Leahu, Lucian

    2016-01-01

    This paper investigates how we might rethink design as the technological crafting of human-machine relations in the context of a machine learning technique called neural networks. It analyzes Google’s Inceptionism project, which uses neural networks for image recognition. The surprising output... a hybrid approach where machine learning algorithms are used to identify objects as well as connections between them; finally, it argues for remaining open to ontological surprises in machine learning as they may enable the crafting of different relations with and through technologies.

  6. Genes but not genomes reveal bacterial domestication of Lactococcus lactis.

    Directory of Open Access Journals (Sweden)

    Delphine Passerini

    Full Text Available BACKGROUND: The population structure and diversity of Lactococcus lactis subsp. lactis, a major industrial bacterium involved in milk fermentation, was determined at both gene and genome level. Seventy-six lactococcal isolates of various origins were studied by different genotyping methods and thirty-six strains displaying unique macrorestriction fingerprints were analyzed by a new multilocus sequence typing (MLST) scheme. This gene-based analysis was compared to genomic characteristics determined by pulsed-field gel electrophoresis (PFGE). METHODOLOGY/PRINCIPAL FINDINGS: The MLST analysis revealed that L. lactis subsp. lactis is essentially clonal with infrequent intra- and intergenic recombination; also, despite its taxonomical classification as a subspecies, it displays a genetic diversity as substantial as that within several other bacterial species. Genome-based analysis revealed a genome size variability of 20%, a value typical of bacteria inhabiting different ecological niches, and that suggests a large pan-genome for this subspecies. However, the genomic characteristics (macrorestriction pattern, genome or chromosome size, plasmid content) did not correlate to the MLST-based phylogeny, with strains from the same sequence type (ST) differing by up to 230 kb in genome size. CONCLUSION/SIGNIFICANCE: The gene-based phylogeny was not fully consistent with the traditional classification into dairy and non-dairy strains but supported a new classification based on ecological separation between "environmental" strains, the main contributors to the genetic diversity within the subspecies, and "domesticated" strains, subject to recent genetic bottlenecks. Comparison between gene- and genome-based analyses revealed little relationship between core and dispensable genome phylogenies, indicating that clonal diversification and phenotypic variability of the "domesticated" strains essentially arose through substantial genomic flux within the dispensable

  7. Gene Expression Profiling Reveals Potential Players of Left-Right Asymmetry in Female Chicken Gonads

    Directory of Open Access Journals (Sweden)

    Zhiyi Wan

    2017-06-01

    Full Text Available Most female birds develop only a left ovary, whereas males develop bilateral testes. The mechanism underlying this process is still not completely understood. Here, we provide a comprehensive transcriptional analysis of female chicken gonads and identify novel candidate side-biased genes. RNA-Seq analysis was carried out on total RNA harvested from the left and right gonads on embryonic day 6 (E6), E12, and post-hatching day 1 (D1). By comparing the gene expression profiles between the left and right gonads, 347 differentially expressed genes (DEGs) were obtained on E6, 3730 were obtained on E12, and 2787 were obtained on D1. Side-specific genes were primarily derived from the autosome rather than the sex chromosome. Gene ontology and pathway analysis showed that the DEGs were most enriched in the Piwi-interacting RNA (piRNA) metabolic process, germ plasm, chromatoid body, P granule, neuroactive ligand-receptor interaction, microbial metabolism in diverse environments, and methane metabolism. A total of 111 DEGs, five gene ontology (GO) terms, and three pathways were significantly different between the left and right gonads among all the development stages. We also present the gene number and the percentage within eight development-dependent expression patterns of DEGs in the left and right gonads of female chicken.

  8. Gene Expression Profiling Reveals Potential Players of Left-Right Asymmetry in Female Chicken Gonads.

    Science.gov (United States)

    Wan, Zhiyi; Lu, Yanan; Rui, Lei; Yu, Xiaoxue; Yang, Fang; Tu, Chengfang; Li, Zandong

    2017-06-20

    Most female birds develop only a left ovary, whereas males develop bilateral testes. The mechanism underlying this process is still not completely understood. Here, we provide a comprehensive transcriptional analysis of female chicken gonads and identify novel candidate side-biased genes. RNA-Seq analysis was carried out on total RNA harvested from the left and right gonads on embryonic day 6 (E6), E12, and post-hatching day 1 (D1). By comparing the gene expression profiles between the left and right gonads, 347 differentially expressed genes (DEGs) were obtained on E6, 3730 were obtained on E12, and 2787 were obtained on D1. Side-specific genes were primarily derived from the autosome rather than the sex chromosome. Gene ontology and pathway analysis showed that the DEGs were most enriched in the Piwi-interacting RNA (piRNA) metabolic process, germ plasm, chromatoid body, P granule, neuroactive ligand-receptor interaction, microbial metabolism in diverse environments, and methane metabolism. A total of 111 DEGs, five gene ontology (GO) terms, and three pathways were significantly different between the left and right gonads among all the development stages. We also present the gene number and the percentage within eight development-dependent expression patterns of DEGs in the left and right gonads of female chicken.

  9. CRISPR loci reveal networks of gene exchange in archaea

    Directory of Open Access Journals (Sweden)

    Brodt Avital

    2011-12-01

    Full Text Available Abstract Background CRISPR (Clustered, Regularly, Interspaced, Short, Palindromic Repeats) loci provide prokaryotes with an adaptive immunity against viruses and other mobile genetic elements. CRISPR arrays can be transcribed and processed into small crRNA molecules, which are then used by the cell to target the foreign nucleic acid. Since spacers are accumulated by active CRISPR/Cas systems, the sequences of these spacers provide a record of the past "infection history" of the organism. Results Here we analyzed all currently known spacers present in archaeal genomes and identified their source by DNA similarity. While nearly 50% of archaeal spacers matched mobile genetic elements, such as plasmids or viruses, several others matched chromosomal genes of other organisms, primarily other archaea. Thus, networks of gene exchange between archaeal species were revealed by the spacer analysis, including many cases of inter-genus and inter-species gene transfer events. Spacers that recognize viral sequences tend to be located further away from the leader sequence, implying that there exists a selective pressure for their retention. Conclusions CRISPR spacers provide direct evidence for extensive gene exchange in archaea, especially within genera, and support the current dogma where the primary role of the CRISPR/Cas system is anti-viral and anti-plasmid defense. Open peer review This article was reviewed by: Profs. W. Ford Doolittle, John van der Oost, Christa Schleper (nominated by board member Prof. J Peter Gogarten).

  10. Genomic analysis of primordial dwarfism reveals novel disease genes.

    Science.gov (United States)

    Shaheen, Ranad; Faqeih, Eissa; Ansari, Shinu; Abdel-Salam, Ghada; Al-Hassnan, Zuhair N; Al-Shidi, Tarfa; Alomar, Rana; Sogaty, Sameera; Alkuraya, Fowzan S

    2014-02-01

    Primordial dwarfism (PD) is a disease in which severely impaired fetal growth persists throughout postnatal development and results in stunted adult size. The condition is highly heterogeneous clinically, but the use of certain phenotypic aspects such as head circumference and facial appearance has proven helpful in defining clinical subgroups. In this study, we present the results of clinical and genomic characterization of 16 new patients in whom a broad definition of PD was used (e.g., 3M syndrome was included). We report a novel PD syndrome with distinct facies in two unrelated patients, each with a different homozygous truncating mutation in CRIPT. Our analysis also reveals, in addition to mutations in known PD disease genes, the first instance of biallelic truncating BRCA2 mutation causing PD with normal bone marrow analysis. In addition, we have identified a novel locus for Seckel syndrome based on a consanguineous multiplex family and identified a homozygous truncating mutation in DNA2 as the likely cause. An additional novel PD disease candidate gene XRCC4 was identified by autozygome/exome analysis, and the knockout mouse phenotype is highly compatible with PD. Thus, we add a number of novel genes to the growing list of PD-linked genes, including one which we show to be linked to a novel PD syndrome with a distinct facial appearance. PD is extremely heterogeneous genetically and clinically, and genomic tools are often required to reach a molecular diagnosis.

  11. CRISPR loci reveal networks of gene exchange in archaea.

    Science.gov (United States)

    Brodt, Avital; Lurie-Weinberger, Mor N; Gophna, Uri

    2011-12-21

    CRISPR (Clustered, Regularly, Interspaced, Short, Palindromic Repeats) loci provide prokaryotes with an adaptive immunity against viruses and other mobile genetic elements. CRISPR arrays can be transcribed and processed into small crRNA molecules, which are then used by the cell to target the foreign nucleic acid. Since spacers are accumulated by active CRISPR/Cas systems, the sequences of these spacers provide a record of the past "infection history" of the organism. Here we analyzed all currently known spacers present in archaeal genomes and identified their source by DNA similarity. While nearly 50% of archaeal spacers matched mobile genetic elements, such as plasmids or viruses, several others matched chromosomal genes of other organisms, primarily other archaea. Thus, networks of gene exchange between archaeal species were revealed by the spacer analysis, including many cases of inter-genus and inter-species gene transfer events. Spacers that recognize viral sequences tend to be located further away from the leader sequence, implying that there exists a selective pressure for their retention. CRISPR spacers provide direct evidence for extensive gene exchange in archaea, especially within genera, and support the current dogma where the primary role of the CRISPR/Cas system is anti-viral and anti-plasmid defense. This article was reviewed by: Profs. W. Ford Doolittle, John van der Oost, Christa Schleper (nominated by board member Prof. J Peter Gogarten).

  12. Novel candidate genes important for asthma and hypertension comorbidity revealed from associative gene networks.

    Science.gov (United States)

    Saik, Olga V; Demenkov, Pavel S; Ivanisenko, Timofey V; Bragina, Elena Yu; Freidin, Maxim B; Goncharova, Irina A; Dosenko, Victor E; Zolotareva, Olga I; Hofestaedt, Ralf; Lavrik, Inna N; Rogaev, Evgeny I; Ivanisenko, Vladimir A

    2018-02-13

    Hypertension and bronchial asthma are a major issue for people's health. As of 2014, approximately one billion adults, or ~22% of the world population, have had hypertension. As of 2011, 235-330 million people globally have been affected by asthma and approximately 250,000-345,000 people have died each year from the disease. The development of effective therapies against these diseases is complicated by their comorbidity, which is often a major problem in their diagnosis and treatment. Hence, in this study a bioinformatics methodology for the analysis of the comorbidity of these two diseases has been developed. The search for candidate genes related to the comorbid conditions of asthma and hypertension can help in elucidating the molecular mechanisms underlying the comorbid condition of these two diseases, and can also be useful for genotyping and identifying new drug targets. Using ANDSystem, the reconstruction and analysis of gene networks associated with asthma and hypertension was carried out. The gene network of asthma included 755 genes/proteins and 62,603 interactions, while the gene network of hypertension included 713 genes/proteins and 45,479 interactions. Two hundred and five genes/proteins and 9638 interactions were shared between asthma and hypertension. An approach for ranking genes implicated in the comorbid condition of two diseases was proposed. The approach is based on nine criteria for ranking genes by their importance, including standard methods of gene prioritization (Endeavor, ToppGene) as well as original criteria that take into account the characteristics of an associative gene network and the presence of known polymorphisms in the analysed genes. According to the proposed approach, the genes IL10, TLR4, and CAT had the highest priority in the development of comorbidity of these two diseases. Additionally, it was revealed that the list of top genes is enriched with apoptotic genes and genes involved in

  13. Gene expression profiling, pathway analysis and subtype classification reveal molecular heterogeneity in hepatocellular carcinoma and suggest subtype specific therapeutic targets.

    Science.gov (United States)

    Agarwal, Rahul; Narayan, Jitendra; Bhattacharyya, Amitava; Saraswat, Mayank; Tomar, Anil Kumar

    2017-10-01

    A very low 5-year survival rate among hepatocellular carcinoma (HCC) patients is mainly due to lack of early stage diagnosis, distant metastasis and high risk of postoperative recurrence. Hence ascertaining novel biomarkers for early diagnosis and patient specific therapeutics is crucial and urgent. Here, we have performed a comprehensive analysis of the expression data of 423 HCC patients (373 tumors and 50 controls) downloaded from The Cancer Genome Atlas (TCGA) followed by pathway enrichment by gene ontology annotations, subtype classification and overall survival analysis. The differential gene expression analysis using non-parametric Wilcoxon test revealed a total of 479 up-regulated and 91 down-regulated genes in HCC compared to controls. The list of top differentially expressed genes mainly consists of tumor/cancer associated genes, such as AFP, THBS4, LCN2, GPC3, NUF2, etc. The genes over-expressed in HCC were mainly associated with cell cycle pathways. In total, 59 kinase-associated genes were found over-expressed in HCC, including TTK, MELK, BUB1, NEK2, BUB1B, AURKB, PLK1, CDK1, PKMYT1, PBK, etc. Overall, four distinct HCC subtypes were predicted using the consensus clustering method. Each subtype was unique in terms of gene expression, pathway enrichment and median survival. In conclusion, this study has identified a number of genes that could be exploited in future as potential diagnostic and prognostic markers of HCC, and the subtype classification may guide improved, subtype-specific therapy. Copyright © 2017 Elsevier Inc. All rights reserved.

  14. Anatomy Ontology Matching Using Markov Logic Networks

    Directory of Open Access Journals (Sweden)

    Chunhua Li

    2016-01-01

    Full Text Available The anatomy of model species is described in ontologies, which are used to standardize the annotations of experimental data, such as gene expression patterns. To compare such data between species, we need to establish relationships between ontologies describing different species. Ontology matching is one way to find semantic correspondences between entities of different ontologies. Markov logic networks, which unify probabilistic graphical models and first-order logic, provide an excellent framework for ontology matching. We combine several different matching strategies through first-order logic formulas according to the structure of anatomy ontologies. Experiments on the adult mouse anatomy and the human anatomy have demonstrated the effectiveness of the proposed approach in terms of the quality of the resulting alignment.

  15. InfAcrOnt: calculating cross-ontology term similarities using information flow by a random walk

    OpenAIRE

    Cheng, Liang; Jiang, Yue; Ju, Hong; Sun, Jie; Peng, Jiajie; Zhou, Meng; Hu, Yang

    2018-01-01

    Background Since the establishment of the first biomedical ontology, the Gene Ontology (GO), the number of biomedical ontologies has increased dramatically. Nowadays over 300 ontologies have been built, including the extensively used Disease Ontology (DO) and Human Phenotype Ontology (HPO). Because of the advantage of identifying novel relationships between terms, calculating similarity between ontology terms is one of the major tasks in this research area. Though similarities between terms within each o...

  16. Using Gene Ontology to describe the role of the neurexin-neuroligin-SHANK complex in human, mouse and rat and its relevance to autism.

    Science.gov (United States)

    Patel, Sejal; Roncaglia, Paola; Lovering, Ruth C

    2015-06-06

    People with an autistic spectrum disorder (ASD) display a variety of characteristic behavioral traits, including impaired social interaction, communication difficulties and repetitive behavior. This complex neurodevelopment disorder is known to be associated with a combination of genetic and environmental factors. Neurexins and neuroligins play a key role in synaptogenesis and neurexin-neuroligin adhesion is one of several processes that have been implicated in autism spectrum disorders. In this report we describe the manual annotation of a selection of gene products known to be associated with autism and/or the neurexin-neuroligin-SHANK complex and demonstrate how a focused annotation approach leads to the creation of more descriptive Gene Ontology (GO) terms, as well as an increase in both the number of gene product annotations and their granularity, thus improving the data available in the GO database. The manual annotations we describe will impact on the functional analysis of a variety of future autism-relevant datasets. Comprehensive gene annotation is an essential aspect of genomic and proteomic studies, as the quality of gene annotations incorporated into statistical analysis tools affects the effective interpretation of data obtained through genome wide association studies, next generation sequencing, proteomic and transcriptomic datasets.

  17. Didactical Ontologies

    Directory of Open Access Journals (Sweden)

    Steffen Mencke, Reiner Dumke

    2008-03-01

    Full Text Available Ontologies are a fundamental concept of the Semantic Web envisioned by Tim Berners-Lee [1]. Together with explicit representation of the semantics of data for machine-accessibility such domain theories are the basis for intelligent next generation applications for the web and other areas of interest [2]. Their application for special aspects within the domain of e-learning is often proposed to support the increasing complexity ([3], [4], [5], [6]). So they can provide a better support for course generation or learning scenario description [7]. By the modeling of didactics-related expertise and their provision for the creators of courses many improvements like reuse, rapid development and of course increased learning performance become possible due to the separation from other aspects of e-learning platforms as already proposed in [8].

  18. Gene expression profiling in susceptible interaction of grapevine with its fungal pathogen Eutypa lata: Extending MapMan ontology for grapevine

    Directory of Open Access Journals (Sweden)

    Usadel Björn

    2009-08-01

    Full Text Available Abstract Background Whole genome transcriptomics analysis is a very powerful approach because it gives an overview of the activity of genes in certain cells or tissue types. However, biological interpretation of such results can be rather tedious. MapMan is a software tool that displays large datasets (e.g. gene expression data) onto diagrams of metabolic pathways or other processes and thus enables easier interpretation of results. The grapevine (Vitis vinifera) genome sequence has recently become available bringing a new dimension into associated research. Two microarray platforms were designed based on the TIGR Gene Index database and used in several physiological studies. Results To enable easy and effective visualization of those and further experiments, annotation of Vitis vinifera Gene Index (VvGI) version 5 to MapMan ontology was set up. Due to specificities of grape physiology, we have created new pictorial representations focusing on three selected pathways: carotenoid pathway, terpenoid pathway and phenylpropanoid pathway, the products of these pathways being important for wine aroma, flavour and colour, as well as plant defence against pathogens. This new tool was validated on Affymetrix microarrays data obtained during berry ripening and it allowed the discovery of new aspects in process regulation. We here also present results on transcriptional profiling of grape plantlets after exposure to the fungal pathogen Eutypa lata using Operon microarrays including visualization of results with MapMan. The data show that the genes induced in infected plants encode pathogenesis related proteins and enzymes of the flavonoid metabolism, which are well known as being responsive to fungal infection. Conclusion The extension of MapMan ontology to grapevine together with the newly constructed pictorial representations for carotenoid, terpenoid and phenylpropanoid metabolism provide an alternative approach to the analysis of grapevine gene expression

  19. Markov Chain Ontology Analysis (MCOA).

    Science.gov (United States)

    Frost, H Robert; McCray, Alexa T

    2012-02-03

    Biomedical ontologies have become an increasingly critical lens through which researchers analyze the genomic, clinical and bibliographic data that fuels scientific research. Of particular relevance are methods, such as enrichment analysis, that quantify the importance of ontology classes relative to a collection of domain data. Current analytical techniques, however, remain limited in their ability to handle many important types of structural complexity encountered in real biological systems including class overlaps, continuously valued data, inter-instance relationships, non-hierarchical relationships between classes, semantic distance and sparse data. In this paper, we describe a methodology called Markov Chain Ontology Analysis (MCOA) and illustrate its use through a MCOA-based enrichment analysis application based on a generative model of gene activation. MCOA models the classes in an ontology, the instances from an associated dataset and all directional inter-class, class-to-instance and inter-instance relationships as a single finite ergodic Markov chain. The adjusted transition probability matrix for this Markov chain enables the calculation of eigenvector values that quantify the importance of each ontology class relative to other classes and the associated data set members. On both controlled Gene Ontology (GO) data sets created with Escherichia coli, Drosophila melanogaster and Homo sapiens annotations and real gene expression data extracted from the Gene Expression Omnibus (GEO), the MCOA enrichment analysis approach provides the best performance of comparable state-of-the-art methods. A methodology based on Markov chain models and network analytic metrics can help detect the relevant signal within large, highly interdependent and noisy data sets and, for applications such as enrichment analysis, has been shown to generate superior performance on both real and simulated data relative to existing state-of-the-art approaches.
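
    The eigenvector step described in this abstract can be illustrated with a generic power iteration on a small ergodic Markov chain. This is a minimal sketch only: the 3-state transition matrix `P` and the function name `stationary_distribution` are hypothetical, and the abstract does not specify how MCOA builds or adjusts its transition matrix, so nothing here should be read as the authors' implementation.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Approximate the stationary distribution pi (pi = pi * P) of a
    row-stochastic matrix P by power iteration.

    Assumes the chain is finite and ergodic, so that pi is unique and
    the iteration converges from any starting distribution.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # One step of the chain: nxt[j] = sum_i pi[i] * P[i][j]
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical toy chain (each row sums to 1); not data from the paper.
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

pi = stationary_distribution(P)
# Larger entries of pi mark states the chain visits more often, which is
# the sense in which an eigenvector can rank classes by "importance".
```

    In this toy chain the middle state receives the most probability mass because both outer states feed into it.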

  20. Multilocus analysis reveals three candidate genes for Chinese migraine susceptibility.

    Science.gov (United States)

    An, X-K; Fang, J; Yu, Z-Z; Lin, Q; Lu, C-X; Qu, H-L; Ma, Q-L

    2017-08-01

    Several genome-wide association studies (GWASs) in Caucasian populations have identified 12 loci that are significantly associated with migraine. More evidence suggests that serotonin receptors are also involved in migraine pathophysiology. In the present study, a case-control study was conducted in a cohort of 581 migraine cases and 533 ethnically matched controls among a Chinese population. Eighteen polymorphisms from serotonin receptors and GWASs were selected, and genotyping was performed using a Sequenom MALDI-TOF mass spectrometry iPLEX platform. The genotypic and allelic distributions of MEF2D rs2274316 and ASTN2 rs6478241 were significantly different between migraine patients and controls. Univariate and multivariate analysis revealed significant associations of polymorphisms in the MEF2D and ASTN2 genes with migraine susceptibility. MEF2D, PRDM16 and ASTN2 were also found to be associated with migraine without aura (MO) and migraine with family history. MEF2D and ASTN2 also served as genetic risk factors for migraine without family history. The generalized multifactor dimensionality reduction analysis identified that MEF2D and HTR2E constituted the two-factor interaction model. Our study suggests that the MEF2D, PRDM16 and ASTN2 genes from GWAS are associated with migraine susceptibility, especially MO, among Chinese patients. It appears that there is no association with serotonin receptor related genes. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  1. Leave-two-out stability of ontology learning algorithm

    International Nuclear Information System (INIS)

    Wu, Jianzhang; Yu, Xiao; Zhu, Linli; Gao, Wei

    2016-01-01

    Ontology is a semantic analysis and calculation model, which has been applied to many subjects. Ontology similarity calculation and ontology mapping are employed as machine learning approaches. The purpose of this paper is to study the leave-two-out stability of ontology learning algorithms. Several leave-two-out stabilities are defined in the ontology learning setting and the relationships among these stabilities are presented. Furthermore, the results reveal that leave-two-out stability is a sufficient and necessary condition for ontology learning algorithm.

  2. High throughput analysis reveals dissociable gene expression profiles in two independent neural systems involved in the regulation of social behavior

    Directory of Open Access Journals (Sweden)

    Stevenson Tyler J

    2012-10-01

    Full Text Available Abstract Background Production of contextually appropriate social behaviors involves integrated activity across many brain regions. Many songbird species produce complex vocalizations called ‘songs’ that serve to attract potential mates, defend territories, and/or maintain flock cohesion. There is a series of discrete interconnected brain regions that are essential for the successful production of song. The probability and intensity of singing behavior is influenced by the reproductive state. The objectives of this study were to examine the broad changes in gene expression in brain regions that control song production with a brain region that governs the reproductive state. Results We show using microarray cDNA analysis that two discrete brain systems that are both involved in governing singing behavior show markedly different gene expression profiles. We found that cortical and basal ganglia-like brain regions that control the socio-motor production of song in birds exhibit a categorical switch in gene expression that was dependent on their reproductive state. This pattern is in stark contrast to the pattern of expression observed in a hypothalamic brain region that governs the neuroendocrine control of reproduction. Subsequent gene ontology analysis revealed marked variation in the functional categories of active genes dependent on reproductive state and anatomical localization. HVC, one cortical-like structure, displayed significant gene expression changes associated with microtubule and neurofilament cytoskeleton organization, MAP kinase activity, and steroid hormone receptor complex activity. The transitions observed in the preoptic area, a nucleus that governs the motivation to engage in singing, exhibited variation in functional categories that included thyroid hormone receptor activity, epigenetic and angiogenetic processes. Conclusions These findings highlight the importance of considering the temporal patterns of gene expression

  3. Memory functions reveal structural properties of gene regulatory networks

    Science.gov (United States)

    Perez-Carrasco, Ruben

    2018-01-01

    Gene regulatory networks (GRNs) control cellular function and decision making during tissue development and homeostasis. Mathematical tools based on dynamical systems theory are often used to model these networks, but the size and complexity of these models mean that their behaviour is not always intuitive and the underlying mechanisms can be difficult to decipher. For this reason, methods that simplify and aid exploration of complex networks are necessary. To this end we develop a broadly applicable form of the Zwanzig-Mori projection. By first converting a thermodynamic state ensemble model of gene regulation into mass action reactions we derive a general method that produces a set of time evolution equations for a subset of components of a network. The influence of the rest of the network, the bulk, is captured by memory functions that describe how the subnetwork reacts to its own past state via components in the bulk. These memory functions provide probes of near-steady state dynamics, revealing information not easily accessible otherwise. We illustrate the method on a simple cross-repressive transcriptional motif to show that memory functions not only simplify the analysis of the subnetwork but also have a natural interpretation. We then apply the approach to a GRN from the vertebrate neural tube, a well characterised developmental transcriptional network composed of four interacting transcription factors. The memory functions reveal the function of specific links within the neural tube network and identify features of the regulatory structure that specifically increase the robustness of the network to initial conditions. Taken together, the study provides evidence that Zwanzig-Mori projections offer powerful and effective tools for simplifying and exploring the behaviour of GRNs. PMID:29470492

  4. Bioinformatics Analysis Reveals Genes Involved in the Pathogenesis of Ameloblastoma and Keratocystic Odontogenic Tumor.

    Science.gov (United States)

    Santos, Eliane Macedo Sobrinho; Santos, Hércules Otacílio; Dos Santos Dias, Ivoneth; Santos, Sérgio Henrique; Batista de Paula, Alfredo Maurício; Feltenberger, John David; Sena Guimarães, André Luiz; Farias, Lucyana Conceição

    2016-01-01

    Pathogenesis of odontogenic tumors is not well known. It is important to identify genetic deregulations and molecular alterations. This study aimed to investigate, through bioinformatic analysis, the possible genes involved in the pathogenesis of ameloblastoma (AM) and keratocystic odontogenic tumor (KCOT). Genes involved in the pathogenesis of AM and KCOT were identified in GeneCards. The gene list was expanded, and the gene interactions network was mapped using the STRING software. "Weighted number of links" (WNL) was calculated to identify "leader genes" (highest WNL). Genes were ranked by K-means method and Kruskal-Wallis test was used (Preview data was used to corroborate the bioinformatics data. CDK1 was identified as leader gene for AM. In KCOT group, results show PCNA and TP53. Both tumors exhibit a power law behavior. Our topological analysis suggested leader genes possibly important in the pathogenesis of AM and KCOT, by clustering coefficient calculated for both odontogenic tumors (0.028 for AM, zero for KCOT). The results obtained in the scatter diagram suggest an important relationship of these genes with the molecular processes involved in AM and KCOT. Ontological analysis for both AM and KCOT demonstrated different mechanisms. Bioinformatics analyses were confirmed through literature review. These results may suggest the involvement of promising genes for a better understanding of the pathogenesis of AM and KCOT.
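
    The "leader gene" ranking mentioned in this abstract can be sketched as follows. The abstract does not define WNL precisely, so this sketch assumes it is the sum of the interaction-confidence weights on all edges touching a gene. The gene labels `G1` to `G4`, the edge weights, and the function name are hypothetical placeholders, not data or code from the study.

```python
from collections import defaultdict


def weighted_number_of_links(edges):
    """Return {gene: WNL}, where WNL is assumed to be the sum of the
    weights of all edges incident to that gene.

    `edges` is an iterable of (gene_a, gene_b, weight) tuples for an
    undirected interaction network.
    """
    wnl = defaultdict(float)
    for gene_a, gene_b, weight in edges:
        wnl[gene_a] += weight
        wnl[gene_b] += weight
    return dict(wnl)


# Hypothetical interaction network with confidence scores in [0, 1].
edges = [
    ("G1", "G2", 0.9),
    ("G1", "G3", 0.7),
    ("G2", "G3", 0.4),
    ("G3", "G4", 0.8),
]

scores = weighted_number_of_links(edges)
# Under this assumption, the leader gene is the one with the highest WNL.
leader = max(scores, key=scores.get)
```

    The abstract's subsequent K-means ranking step would operate on scores like these, but it is not reproduced here.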

  5. Genome Wide Association Analysis Reveals New Production Trait Genes in a Male Duroc Population.

    Directory of Open Access Journals (Sweden)

    Kejun Wang

    Full Text Available In this study, 796 male Duroc pigs were used to identify genomic regions controlling growth traits. Three production traits were studied: food conversion ratio, days to 100 KG, and average daily gain, using a panel of 39,436 single nucleotide polymorphisms. In total, we detected 11 genome-wide and 162 chromosome-wide single nucleotide polymorphism trait associations. The Gene ontology analysis identified 14 candidate genes close to significant single nucleotide polymorphisms, with growth-related functions: six for days to 100 KG (WT1, FBXO3, DOCK7, PPP3CA, AGPAT9, and NKX6-1), seven for food conversion ratio (MAP2, TBX15, IVL, ARL15, CPS1, VWC2L, and VAV3), and one for average daily gain (COL27A1). Gene ontology analysis indicated that most of the candidate genes are involved in muscle, fat, bone or nervous system development, nutrient absorption, and metabolism, which are all either directly or indirectly related to growth traits in pigs. Additionally, we found four haplotype blocks composed of suggestive single nucleotide polymorphisms located in the growth trait-related quantitative trait loci and further narrowed down the ranges, the largest of which decreased by ~60 Mb. Hence, our results could be used to improve pig production traits by increasing the frequency of favorable alleles via artificial selection.

  6. Comparing Relational and Ontological Triple Stores in Healthcare Domain

    Directory of Open Access Journals (Sweden)

    Ozgu Can

    2017-01-01

    Full Text Available Today’s technological improvements have made ubiquitous healthcare systems that converge into smart healthcare applications in order to solve patients’ problems, to communicate effectively with patients, and to improve healthcare service quality. The first step of building a smart healthcare information system is representing the healthcare data as connected, reachable, and sharable. In order to achieve this representation, ontologies are used to describe the healthcare data. Combining ontological healthcare data with the used and obtained data can be maintained by storing the entire health domain data inside big data stores that support both relational and graph-based ontological data. There are several big data stores and different types of big data sets in the healthcare domain. The goal of this paper is to determine the most applicable ontology data store for storing the big healthcare data. For this purpose, AllegroGraph and Oracle 12c data stores are compared based on their infrastructural capacity, loading time, and query response times. Hence, healthcare ontologies (GENE Ontology, Gene Expression Ontology (GEXO), Regulation of Transcription Ontology (RETO), Regulation of Gene Expression Ontology (REXO)) are used to measure the ontology loading time. Thereafter, various queries are constructed and executed for GENE ontology in order to measure the capacity and query response times for the performance comparison between AllegroGraph and Oracle 12c triple stores.

  7. Construction of ontology augmented networks for protein complex prediction.

    Science.gov (United States)

    Zhang, Yijia; Lin, Hongfei; Yang, Zhihao; Wang, Jian

    2013-01-01

    Protein complexes are of great importance in understanding the principles of cellular organization and function. The increase in available protein-protein interaction data, gene ontology and other resources make it possible to develop computational methods for protein complex prediction. Most existing methods focus mainly on the topological structure of protein-protein interaction networks, and largely ignore the gene ontology annotation information. In this article, we constructed ontology augmented networks with protein-protein interaction data and gene ontology, which effectively unified the topological structure of protein-protein interaction networks and the similarity of gene ontology annotations into unified distance measures. After constructing ontology augmented networks, a novel method (clustering based on ontology augmented networks) was proposed to predict protein complexes, which was capable of taking into account the topological structure of the protein-protein interaction network, as well as the similarity of gene ontology annotations. Our method was applied to two different yeast protein-protein interaction datasets and predicted many well-known complexes. The experimental results showed that (i) ontology augmented networks and the unified distance measure can effectively combine the structure closeness and gene ontology annotation similarity; (ii) our method is valuable in predicting protein complexes and has higher F1 and accuracy compared to other competing methods.

  8. XML, Ontologies, and Their Clinical Applications.

    Science.gov (United States)

    Yu, Chunjiang; Shen, Bairong

    2016-01-01

    The development of information technology has resulted in its penetration into every area of clinical research. Various clinical systems have been developed, which produce increasing volumes of clinical data. However, saving, exchanging, querying, and exploiting these data are challenging issues. The development of Extensible Markup Language (XML) has allowed the generation of flexible information formats to facilitate the electronic sharing of structured data via networks, and it has been used widely for clinical data processing. In particular, XML is very useful in the fields of data standardization, data exchange, and data integration. Moreover, ontologies have been attracting increased attention in various clinical fields in recent years. An ontology is the basic level of a knowledge representation scheme, and various ontology repositories have been developed, such as Gene Ontology and BioPortal. The creation of these standardized repositories greatly facilitates clinical research in related fields. In this chapter, we discuss the basic concepts of XML and ontologies, as well as their clinical applications.

  9. Semantic similarity between ontologies at different scales

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Qingpeng; Haglin, David J.

    2016-04-01

    In the past decade, existing and new knowledge and datasets have been encoded in different ontologies for semantic web and biomedical research. The size of ontologies is often very large in terms of number of concepts and relationships, which makes the analysis of ontologies and the represented knowledge graph computationally expensive and time consuming. As the ontologies of various semantic web and biomedical applications usually show explicit hierarchical structures, it is interesting to explore the trade-offs between ontological scales and preservation/precision of results when we analyze ontologies. This paper presents the first effort of examining the capability of this idea via studying the relationship between scaling biomedical ontologies at different levels and the semantic similarity values. We evaluate the semantic similarity between three Gene Ontology slims (Plant, Yeast, and Candida, among which the latter two belong to the same kingdom, Fungi) using four popular measures commonly applied to biomedical ontologies (Resnik, Lin, Jiang-Conrath, and SimRel). The results of this study demonstrate that with proper selection of scaling levels and similarity measures, we can significantly reduce the size of ontologies without losing substantial detail. In particular, the performance of Jiang-Conrath and Lin are more reliable and stable than that of the other two in this experiment, as proven by (a) consistently showing that Yeast and Candida are more similar (as compared to Plant) at different scales, and (b) small deviations of the similarity values after excluding a majority of nodes from several lower scales. This study provides a deeper understanding of the application of semantic similarity to biomedical ontologies, and sheds light on how to choose appropriate semantic similarity measures for biomedical engineering.
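
    The four measures named in this abstract are all built on the information content (IC) of ontology terms, IC(t) = -log p(t), where p(t) is the probability of observing term t or any of its descendants in an annotation corpus. The sketch below states their standard textbook forms in terms of the most informative common ancestor (MICA) of two terms. The probabilities are hypothetical, the SimRel form is simplified to use the MICA only, and nothing here reproduces the paper's ontology-level comparison procedure.

```python
import math


def information_content(p):
    """IC of a term whose annotation probability is p (0 < p <= 1)."""
    return -math.log(p)


def resnik(ic_mica):
    """Resnik similarity: the IC of the most informative common ancestor."""
    return ic_mica


def lin(ic_a, ic_b, ic_mica):
    """Lin similarity: shared IC normalised by the terms' own IC."""
    total = ic_a + ic_b
    return 2.0 * ic_mica / total if total > 0 else 0.0


def jiang_conrath_distance(ic_a, ic_b, ic_mica):
    """Jiang-Conrath distance; 0 means the two terms are identical."""
    return ic_a + ic_b - 2.0 * ic_mica


def sim_rel(ic_a, ic_b, ic_mica, p_mica):
    """Relevance similarity (simplified): Lin down-weighted by how
    common the shared ancestor is."""
    return lin(ic_a, ic_b, ic_mica) * (1.0 - p_mica)


# Hypothetical annotation probabilities for two terms and their MICA.
p_a, p_b, p_mica = 0.01, 0.02, 0.10
ic_a, ic_b, ic_mica = (information_content(p) for p in (p_a, p_b, p_mica))

lin_ab = lin(ic_a, ic_b, ic_mica)
jc_ab = jiang_conrath_distance(ic_a, ic_b, ic_mica)
rel_ab = sim_rel(ic_a, ic_b, ic_mica, p_mica)
```

    Note that Resnik is unbounded above, Lin and SimRel lie in [0, 1], and Jiang-Conrath is a distance rather than a similarity, so the four values are not directly comparable without rescaling.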

  10. Comparative Transcriptomics Reveals Differential Gene Expression Related to Colletotrichum gloeosporioides Resistance in the Octoploid Strawberry

    Directory of Open Access Journals (Sweden)

    Feng Wang

    2017-05-01

    Full Text Available The strawberry is an important fruit worldwide; however, the development of the strawberry industry is limited by fungal disease. Anthracnose is caused by the pathogen Colletotrichum gloeosporioides and leads to large-scale losses in strawberry quality and production. However, the transcriptional response of strawberry to infection with C. gloeosporioides is poorly understood. In the present study, the strawberry leaf transcriptomes of the ‘Yanli’ and ‘Benihoppe’ cultivars were deep sequenced via an RNA-seq analysis to study C. gloeosporioides resistance in strawberry. Among the sequences, differentially expressed genes were annotated with Gene Ontology terms and subjected to pathway enrichment analysis. Significant categories, including defense, plant–pathogen interactions and flavonoid biosynthesis, were identified. The comprehensive transcriptome data set provides molecular insight into C. gloeosporioides resistance genes in resistant and susceptible strawberry cultivars. Our findings can enhance breeding efforts in strawberry.

  11. Gene Expression Analysis in Tubule Interstitial Compartments Reveals Candidate Agents for IgA Nephropathy

    Directory of Open Access Journals (Sweden)

    Jinling Wang

    2014-09-01

    Full Text Available Background/Aims: Our aim was to explore the molecular mechanism underlying development of IgA nephropathy and discover candidate agents for IgA nephropathy. Methods: The differentially expressed genes (DEGs) between patients with IgA nephropathy and normal controls were identified by the data of GSE35488 downloaded from GEO (Gene Expression Omnibus) database. The co-expressed gene pairs among DEGs were screened to construct the gene-gene interaction network. Gene Ontology (GO) enrichment analysis was performed to analyze the functions of DEGs. The biologically active small molecules capable of targeting IgA nephropathy were identified using the Connectivity Map (cMap) database. Results: A total of 55 genes involved in response to organic substance, transcription factor activity and response to steroid hormone stimulus were identified to be differentially expressed in IgA nephropathy patients compared to healthy individuals. A network with 45 co-expressed gene pairs was constructed. DEGs in the network were significantly enriched in response to organic substance. Additionally, a group of small molecules were identified, such as doxorubicin and thapsigargin. Conclusion: Our work provided a systematic insight in understanding the mechanism of IgA nephropathy. Small molecules such as thapsigargin might be potential candidate agents for the treatment of IgA nephropathy.
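
    GO enrichment analysis of a DEG list, as used in this and several neighbouring records, is commonly carried out with a one-sided hypergeometric test. The abstract does not say which test or tool the authors used, so the following is a generic sketch under that assumption. Apart from the DEG count of 55 taken from the abstract, every number below (background size, term size, overlap) is hypothetical, as is the function name.

```python
from math import comb


def enrichment_p_value(background, annotated, selected, overlap):
    """One-sided hypergeometric p-value for GO term over-representation.

    background: number of genes in the background set (N)
    annotated:  background genes annotated with the GO term (K)
    selected:   size of the gene list under test, e.g. DEGs (n)
    overlap:    selected genes annotated with the term (k)

    Returns P(X >= overlap) when `selected` genes are drawn without
    replacement from the background.
    """
    upper = min(annotated, selected)
    # Sum the upper tail with exact integer arithmetic, divide once.
    tail = sum(
        comb(annotated, i) * comb(background - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / comb(background, selected)


# Hypothetical example: 20,000 background genes, 300 annotated with a
# term, 55 DEGs, of which 6 carry the term.
p = enrichment_p_value(background=20_000, annotated=300, selected=55, overlap=6)
```

    In practice one such p-value is computed per GO term and the resulting set is corrected for multiple testing (for example with the Benjamini-Hochberg procedure) before terms are reported as enriched.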

  12. Meta-Analysis of Multiple Sclerosis Microarray Data Reveals Dysregulation in RNA Splicing Regulatory Genes

    Directory of Open Access Journals (Sweden)

    Elvezia Maria Paraboschi

    2015-09-01

    Full Text Available Abnormalities in RNA metabolism and alternative splicing (AS) are emerging as important players in complex disease phenotypes. In particular, accumulating evidence suggests the existence of pathogenic links between multiple sclerosis (MS) and altered AS, including functional studies showing that an imbalance in alternatively-spliced isoforms may contribute to disease etiology. Here, we tested whether the altered expression of AS-related genes represents a MS-specific signature. A comprehensive comparative analysis of gene expression profiles of publicly-available microarray datasets (190 MS cases, 182 controls), followed by gene-ontology enrichment analysis, highlighted a significant enrichment for differentially-expressed genes involved in RNA metabolism/AS. In detail, a total of 17 genes were found to be differentially expressed in MS in multiple datasets, with CELF1 being dysregulated in five out of seven studies. We confirmed CELF1 downregulation in MS (p = 0.0015) by real-time RT-PCRs on RNA extracted from blood cells of 30 cases and 30 controls. As a proof of concept, we experimentally verified the unbalance in alternatively-spliced isoforms in MS of the NFAT5 gene, a putative CELF1 target. In conclusion, for the first time we provide evidence of a consistent dysregulation of splicing-related genes in MS and we discuss its possible implications in modulating specific AS events in MS susceptibility genes.

  13. Assessment Applications of Ontologies.

    Science.gov (United States)

    Chung, Gregory K. W. K.; Niemi, David; Bewley, William L.

    This paper discusses the use of ontologies and their applications to assessment. An ontology provides a shared and common understanding of a domain that can be communicated among people and computational systems. The ontology captures one or more experts' conceptual representation of a domain expressed in terms of concepts and the relationships…

  14. Gene expression profiles reveal key pathways and genes associated with neuropathic pain in patients with spinal cord injury.

    Science.gov (United States)

    He, Xijing; Fan, Liying; Wu, Zhongheng; He, Jiaxuan; Cheng, Bin

    2017-04-01

    Previous gene expression profiling studies of neuropathic pain (NP) following spinal cord injury (SCI) have predominantly been performed in animal models. The present study aimed to investigate gene alterations in patients with spinal cord injury and to further examine the mechanisms underlying NP following SCI. The GSE69901 gene expression profile was downloaded from the public Gene Expression Omnibus database. Samples of peripheral blood mononuclear cells (PBMCs) derived from 12 patients with intractable NP and 13 control patients without pain were analyzed to identify the differentially expressed genes (DEGs), followed by functional enrichment analysis and protein‑protein interaction (PPI) network construction. In addition, a transcriptional regulation network was constructed and functional gene clustering was performed. A total of 70 upregulated and 61 downregulated DEGs were identified in the PBMC samples from patients with NP. The upregulated and downregulated genes were significantly involved in different Gene Ontology terms and pathways, including focal adhesion, T cell receptor signaling pathway and mitochondrial function. Glycogen synthase kinase 3 β (GSK3B) was identified as a hub protein in the PPI network. In addition, ornithine decarboxylase 1 (ODC1) and ornithine aminotransferase (OAT) were regulated by additional transcription factors in the regulation network. GSK3B, OAT and ODC1 were significantly enriched in two functional gene clusters, the function of mitochondrial membrane and DNA binding. Focal adhesion and the T cell receptor signaling pathway may be significantly linked with NP, and GSK3B, OAT and ODC1 may be potential targets for the treatment of NP.

  15. Novel algorithms reveal streptococcal transcriptomes and clues about undefined genes.

    Science.gov (United States)

    Ryan, Patricia A; Kirk, Brian W; Euler, Chad W; Schuch, Raymond; Fischetti, Vincent A

    2007-07-01

    Bacteria-host interactions are dynamic processes, and understanding transcriptional responses that directly or indirectly regulate the expression of genes involved in initial infection stages would illuminate the molecular events that result in host colonization. We used oligonucleotide microarrays to monitor (in vitro) differential gene expression in group A streptococci during pharyngeal cell adherence, the first overt infection stage. We present neighbor clustering, a new computational method for further analyzing bacterial microarray data that combines two informative characteristics of bacterial genes that share common function or regulation: (1) similar gene expression profiles (i.e., co-expression); and (2) physical proximity of genes on the chromosome. This method identifies statistically significant clusters of co-expressed gene neighbors that potentially share common function or regulation by coupling statistically analyzed gene expression profiles with the chromosomal position of genes. We applied this method to our own data and to those of others, and we show that it identified a greater number of differentially expressed genes, facilitating the reconstruction of more multimeric proteins and complete metabolic pathways than would have been possible without its application. We assessed the biological significance of two identified genes by assaying deletion mutants for adherence in vitro and show that neighbor clustering indeed provides biologically relevant data. Neighbor clustering provides a more comprehensive view of the molecular responses of streptococci during pharyngeal cell adherence.

  16. Ontologies vs. Classification Systems

    DEFF Research Database (Denmark)

    Madsen, Bodil Nistrup; Erdman Thomsen, Hanne

    2009-01-01

    What is an ontology compared to a classification system? Is a taxonomy a kind of classification system or a kind of ontology? These are questions that we meet when working with people from industry and public authorities, who need methods and tools for concept clarification, for developing meta...... data sets or for obtaining advanced search facilities. In this paper we will present an attempt at answering these questions. We will give a presentation of various types of ontologies and briefly introduce terminological ontologies. Furthermore we will argue that classification systems, e.g. product...... classification systems and meta data taxonomies, should be based on ontologies....

  17. Toxicology ontology perspectives.

    Science.gov (United States)

    Hardy, Barry; Apic, Gordana; Carthew, Philip; Clark, Dominic; Cook, David; Dix, Ian; Escher, Sylvia; Hastings, Janna; Heard, David J; Jeliazkova, Nina; Judson, Philip; Matis-Mitchell, Sherri; Mitic, Dragana; Myatt, Glenn; Shah, Imran; Spjuth, Ola; Tcheremenskaia, Olga; Toldo, Luca; Watson, David; White, Andrew; Yang, Chihae

    2012-01-01

    The field of predictive toxicology requires the development of open, public, computable, standardized toxicology vocabularies and ontologies to support the applications required by in silico, in vitro, and in vivo toxicology methods and related analysis and reporting activities. In this article we review ontology developments based on a set of perspectives showing how ontologies are being used in predictive toxicology initiatives and applications. Perspectives on resources and initiatives reviewed include OpenTox, eTOX, Pistoia Alliance, ToxWiz, Virtual Liver, EU-ADR, BEL, ToxML, and Bioclipse. We also review existing ontology developments in neighboring fields that can contribute to establishing an ontological framework for predictive toxicology. A significant set of resources is already available to provide a foundation for an ontological framework for 21st century mechanistic-based toxicology research. Ontologies such as ToxWiz provide a basis for application to toxicology investigations, whereas other ontologies under development in the biological, chemical, and biomedical communities could be incorporated in an extended future framework. OpenTox has provided a semantic web framework for the implementation of such ontologies into software applications and linked data resources. Bioclipse developers have shown the benefit of interoperability obtained through ontology by being able to link their workbench application with remote OpenTox web services. Although these developments are promising, an increased international coordination of efforts is greatly needed to develop a more unified, standardized, and open toxicology ontology framework.

  18. Gene expression profiling reveals multiple toxicity endpoints induced by hepatotoxicants

    Energy Technology Data Exchange (ETDEWEB)

    Huang Qihong; Jin Xidong; Gaillard, Elias T.; Knight, Brian L.; Pack, Franklin D.; Stoltz, James H.; Jayadev, Supriya; Blanchard, Kerry T

    2004-05-18

    Microarray technology continues to gain increased acceptance in the drug development process, particularly at the stage of toxicology and safety assessment. In the current study, microarrays were used to investigate gene expression changes associated with hepatotoxicity, the most commonly reported clinical liability with pharmaceutical agents. Acetaminophen, methotrexate, methapyrilene, furan and phenytoin were used as benchmark compounds capable of inducing specific but different types of hepatotoxicity. The goal of the work was to define gene expression profiles capable of distinguishing the different subtypes of hepatotoxicity. Sprague-Dawley rats were orally dosed with acetaminophen (single dose, 4500 mg/kg for 6, 24 and 72 h), methotrexate (1 mg/kg per day for 1, 7 and 14 days), methapyrilene (100 mg/kg per day for 3 and 7 days), furan (40 mg/kg per day for 1, 3, 7 and 14 days) or phenytoin (300 mg/kg per day for 14 days). Hepatic gene expression was assessed using toxicology-specific gene arrays containing 684 target genes or expressed sequence tags (ESTs). Principal component analysis (PCA) of gene expression data was able to provide a clear distinction of each compound, suggesting that gene expression data can be used to discern different hepatotoxic agents and toxicity endpoints. Gene expression data were applied to the multiplicity-adjusted permutation test and significantly changed genes were categorized and correlated to hepatotoxic endpoints. Repression of enzymes involved in lipid oxidation (medium-chain acyl-CoA dehydrogenase, enoyl-CoA hydratase, very long-chain acyl-CoA synthetase) was associated with microvesicular lipidosis. Likewise, subsets of genes associated with hepatocellular necrosis, inflammation, hepatitis, bile duct hyperplasia and fibrosis have been identified. The current study illustrates that expression profiling can be used to: (1) distinguish different hepatotoxic endpoints; (2) predict the development of toxic endpoints; and

  19. Pancreatic cancer genomes reveal aberrations in axon guidance pathway genes.

    Science.gov (United States)

    Biankin, Andrew V; Waddell, Nicola; Kassahn, Karin S; Gingras, Marie-Claude; Muthuswamy, Lakshmi B; Johns, Amber L; Miller, David K; Wilson, Peter J; Patch, Ann-Marie; Wu, Jianmin; Chang, David K; Cowley, Mark J; Gardiner, Brooke B; Song, Sarah; Harliwong, Ivon; Idrisoglu, Senel; Nourse, Craig; Nourbakhsh, Ehsan; Manning, Suzanne; Wani, Shivangi; Gongora, Milena; Pajic, Marina; Scarlett, Christopher J; Gill, Anthony J; Pinho, Andreia V; Rooman, Ilse; Anderson, Matthew; Holmes, Oliver; Leonard, Conrad; Taylor, Darrin; Wood, Scott; Xu, Qinying; Nones, Katia; Fink, J Lynn; Christ, Angelika; Bruxner, Tim; Cloonan, Nicole; Kolle, Gabriel; Newell, Felicity; Pinese, Mark; Mead, R Scott; Humphris, Jeremy L; Kaplan, Warren; Jones, Marc D; Colvin, Emily K; Nagrial, Adnan M; Humphrey, Emily S; Chou, Angela; Chin, Venessa T; Chantrill, Lorraine A; Mawson, Amanda; Samra, Jaswinder S; Kench, James G; Lovell, Jessica A; Daly, Roger J; Merrett, Neil D; Toon, Christopher; Epari, Krishna; Nguyen, Nam Q; Barbour, Andrew; Zeps, Nikolajs; Kakkar, Nipun; Zhao, Fengmei; Wu, Yuan Qing; Wang, Min; Muzny, Donna M; Fisher, William E; Brunicardi, F Charles; Hodges, Sally E; Reid, Jeffrey G; Drummond, Jennifer; Chang, Kyle; Han, Yi; Lewis, Lora R; Dinh, Huyen; Buhay, Christian J; Beck, Timothy; Timms, Lee; Sam, Michelle; Begley, Kimberly; Brown, Andrew; Pai, Deepa; Panchal, Ami; Buchner, Nicholas; De Borja, Richard; Denroche, Robert E; Yung, Christina K; Serra, Stefano; Onetto, Nicole; Mukhopadhyay, Debabrata; Tsao, Ming-Sound; Shaw, Patricia A; Petersen, Gloria M; Gallinger, Steven; Hruban, Ralph H; Maitra, Anirban; Iacobuzio-Donahue, Christine A; Schulick, Richard D; Wolfgang, Christopher L; Morgan, Richard A; Lawlor, Rita T; Capelli, Paola; Corbo, Vincenzo; Scardoni, Maria; Tortora, Giampaolo; Tempero, Margaret A; Mann, Karen M; Jenkins, Nancy A; Perez-Mancera, Pedro A; Adams, David J; Largaespada, David A; Wessels, Lodewyk F A; Rust, Alistair G; Stein, Lincoln D; Tuveson, David A; Copeland, Neal G; Musgrove, Elizabeth A; Scarpa, Aldo; Eshleman, James R; Hudson, Thomas J; Sutherland, Robert L; Wheeler, David A; Pearson, John V; McPherson, John D; Gibbs, Richard A; Grimmond, Sean M

    2012-11-15

    Pancreatic cancer is a highly lethal malignancy with few effective therapies. We performed exome sequencing and copy number analysis to define genomic aberrations in a prospectively accrued clinical cohort (n = 142) of early (stage I and II) sporadic pancreatic ductal adenocarcinoma. Detailed analysis of 99 informative tumours identified substantial heterogeneity with 2,016 non-silent mutations and 1,628 copy-number variations. We define 16 significantly mutated genes, reaffirming known mutations (KRAS, TP53, CDKN2A, SMAD4, MLL3, TGFBR2, ARID1A and SF3B1), and uncover novel mutated genes including additional genes involved in chromatin modification (EPC1 and ARID2), DNA damage repair (ATM) and other mechanisms (ZIM2, MAP2K4, NALCN, SLC16A4 and MAGEA6). Integrative analysis with in vitro functional data and animal models provided supportive evidence for potential roles for these genetic aberrations in carcinogenesis. Pathway-based analysis of recurrently mutated genes recapitulated clustering in core signalling pathways in pancreatic ductal adenocarcinoma, and identified new mutated genes in each pathway. We also identified frequent and diverse somatic aberrations in genes described traditionally as embryonic regulators of axon guidance, particularly SLIT/ROBO signalling, which was also evident in murine Sleeping Beauty transposon-mediated somatic mutagenesis models of pancreatic cancer, providing further supportive evidence for the potential involvement of axon guidance genes in pancreatic carcinogenesis.

  20. Transcriptome analysis of paired primary colorectal carcinoma and liver metastases reveals fusion transcripts and similar gene expression profiles in primary carcinoma and liver metastases

    International Nuclear Information System (INIS)

    Lee, Ja-Rang; Kwon, Chae Hwa; Choi, Yuri; Park, Hye Ji; Kim, Hyun Sung; Jo, Hong-Jae; Oh, Nahmgun; Park, Do Youn

    2016-01-01

    Despite the clinical significance of liver metastases, the difference between molecular and cellular changes in primary colorectal cancers (CRC) and matched liver metastases is poorly understood. In order to compare gene expression patterns and identify fusion genes in these two types of tumors, we performed high-throughput transcriptome sequencing of five sets of quadruple-matched tissues (primary CRC, liver metastases, normal colon, and liver). The gene expression patterns in normal colon and liver were successfully distinguished from those in CRCs; however, RNA sequencing revealed that gene expression in primary CRCs and in their matched liver metastases is highly similar. We identified 1895 genes that were differentially expressed in the primary carcinoma and liver metastases compared with normal colon tissues. A major proportion of the transcripts identified by gene expression profiling as significantly enriched in the primary carcinoma and metastases belonged to gene ontology categories involved in the cell cycle, mitosis, and cell division. Furthermore, we identified gene fusion events in primary carcinoma and metastases, and the fusion transcripts were experimentally confirmed. Among these, a chimeric transcript resulting from the fusion of RNF43 and SUPT4H1 was found to occur frequently in primary colorectal carcinoma. In addition, knockdown of the expression of this RNF43-SUPT4H1 chimeric transcript was found to have a growth-inhibitory effect in colorectal cancer cells. The present study reports a high concordance of gene expression in the primary carcinoma and liver metastases, and reveals potential new targets, such as fusion genes, against primary and metastatic colorectal carcinoma. The online version of this article (doi:10.1186/s12885-016-2596-3) contains supplementary material, which is available to authorized users.

  1. Functional gene polymorphism to reveal species history: the case of the CRTISO gene in cultivated carrots.

    Directory of Open Access Journals (Sweden)

    Vanessa Soufflet-Freslon

    Full Text Available Carrot is a vegetable cultivated worldwide for the consumption of its root. Historical data indicate that root colour has been differentially selected over time and according to geographical areas. Root pigmentation depends on the relative proportion of different carotenoids for the white, yellow, orange and red types but only internally for the purple one. The genetic control for root carotenoid content might be partially associated with carotenoid biosynthetic genes. Carotenoid isomerase (CRTISO) has emerged as a regulatory step in the carotenoid biosynthesis pathway and could be a good candidate to show how a metabolic pathway gene reflects a species genetic history. In this study, the nucleotide polymorphism and the linkage disequilibrium among the complete CRTISO sequence, and the deviation from neutral expectation were analysed by considering population subdivision revealed with 17 microsatellite markers. A sample of 39 accessions, which represented different geographical origins and root colours, was used. Cultivated carrot was divided into two genetic groups: one from Middle East and Asia (Eastern group), and another one mainly from Europe (Western group). The Western and Eastern genetic groups were suggested to be differentially affected by selection: a signature of balancing selection was detected within the first group whereas the second one showed no selection. A focus on orange-rooted carrots revealed that cultivars cultivated in Asia were mainly assigned to the Western group but showed CRTISO haplotypes common to Eastern carrots. The carotenoid pathway CRTISO gene data proved to be complementary to neutral markers in order to bring critical insight in the cultivated carrot history. We confirmed the occurrence of two migration events since domestication. Our results showed a European background in material from Japan and Central Asia. While confirming the introduction of European carrots in Japanese resources, the history of Central Asia

  2. Quantum physics and relational ontology

    Energy Technology Data Exchange (ETDEWEB)

    Cordovil, Joao [Center of Philosophy of Sciences of University of Lisbon (Portugal)

    2013-07-01

    The discovery of the quantum domain of reality posed a serious ontological challenge, one that is still very much present in recent developments of Quantum Physics. Physics was conceived from an atomistic conception of the world, reducing it, in all its diversity, to two types of entities: simple, individual and immutable entities (atoms, in the metaphysical sense) and composite entities, resulting solely from combinations, that is, linear, additive combinations indifferent to structure or context. However, the discovery of wave-particle dualism and the developments in Quantum Field Theories and in Quantum Nonlinear Physics showed that quantum entities are, in the metaphysical sense, neither simple nor merely the result of linear (or additive) combinations. In other words, the ontological foundations of Physics proved inadequate to account for the nature of quantum entities. A fundamental challenge then arises: how should we think about the ontic nature of these entities? In my view, this challenge calls for a relational and dynamist ontology of physical entities. This is the central hypothesis of this communication. Accordingly, this communication has two main aims: 1) to characterize this relational and dynamist ontology positively; 2) to show some elements of its metaphysical suitability to contemporary Quantum Physics.

  3. Different gene expression patterns between leaves and flowers in Lonicera japonica revealed by transcriptome analysis

    Directory of Open Access Journals (Sweden)

    Libin Zhang

    2016-05-01

    Full Text Available The perennial and evergreen twining vine, Lonicera japonica, is an important herbal medicine with great economic value. However, gene expression information for flowers and leaves of L. japonica remains elusive, which greatly impedes functional genomics research on this species. In this study, transcriptome profiles from leaves and flowers of L. japonica were examined using next-generation sequencing technology. A total of 239.41 million clean reads were used for de novo assembly with Trinity software, which generated 150,523 unigenes with an N50 of 947 bp. All the unigenes were annotated using the Nr, SwissProt, COGs (Clusters of Orthologous Groups), GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) databases. A total of 35,327 differentially expressed genes (DEGs, P≤0.05) between leaves and flowers were detected. Among them, a total of 6,602 DEGs were assigned to important biological processes including Metabolic process, Response to stimulus and Cellular process. KEGG analysis showed that three possible enzymes involved in the biosynthesis of chlorogenic acid were up-regulated in flowers. Furthermore, the TF-based regulation network in L. japonica identified three differentially expressed transcription factors between leaves and flowers, suggesting distinct regulatory roles in L. japonica. Taken together, this study has provided a global picture of differential gene expression patterns between leaves and flowers in L. japonica, providing a useful genomic resource that can also be used for functional genomics research on L. japonica in the future.
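
    The N50 statistic quoted for the assembly can be made concrete with a short Python sketch. N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length. The function name and the toy lengths below are illustrative assumptions and are not taken from the L. japonica data.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        # Accumulate from the longest sequence downwards.
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for non-negative lengths


# Made-up unigene lengths in bp, for illustration only.
toy_lengths = [2000, 1500, 947, 800, 400, 300]
toy_n50 = n50(toy_lengths)
```

    Because N50 is weighted by length, a handful of long unigenes can raise it even when most sequences are short, so it is usually read alongside the total number of unigenes.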

  4. Phylogenetic analysis of ferlin genes reveals ancient eukaryotic origins

    Directory of Open Access Journals (Sweden)

    Lek Monkol

    2010-07-01

    Full Text Available Abstract Background: The ferlin gene family possesses a rare and identifying feature consisting of multiple tandem C2 domains and a C-terminal transmembrane domain. Much currently remains unknown about the fundamental function of this gene family; however, mutations in its two most well-characterised members, dysferlin and otoferlin, have been implicated in human disease. The availability of genome sequences from a wide range of species makes it possible to explore the evolution of the ferlin family, providing contextual insight into characteristic features that define the ferlin gene family in its present form in humans. Results: Ferlin genes were detected from all species of representative phyla, with two ferlin subgroups partitioned within the ferlin phylogenetic tree based on the presence or absence of a DysF domain. Invertebrates generally possessed two ferlin genes (one with DysF and one without), with six ferlin genes in most vertebrates (three DysF, three non-DysF). Expansion of the ferlin gene family is evident between the divergence of lamprey (jawless vertebrates) and shark (cartilaginous fish). Common to almost all ferlins is an N-terminal C2-FerI-C2 sandwich, a FerB motif, and two C-terminal C2 domains (C2E and C2F) adjacent to the transmembrane domain. Preservation of these structural elements throughout eukaryotic evolution suggests a fundamental role of these motifs for ferlin function. In contrast, DysF, C2DE, and FerA are optional, giving rise to subtle differences in domain topologies of ferlin genes. Despite conservation of multiple C2 domains in all ferlins, the C-terminal C2 domains (C2E and C2F) displayed higher sequence conservation and greater conservation of putative calcium binding residues across paralogs and orthologs. Interestingly, the two most studied non-mammalian ferlins (Fer-1 and Misfire) in model organisms C. elegans and D. melanogaster present as outgroups in the phylogenetic analysis, with results suggesting

  5. Heart morphogenesis gene regulatory networks revealed by temporal expression analysis.

    Science.gov (United States)

    Hill, Jonathon T; Demarest, Bradley; Gorsi, Bushra; Smith, Megan; Yost, H Joseph

    2017-10-01

    During embryogenesis the heart forms as a linear tube that then undergoes multiple simultaneous morphogenetic events to obtain its mature shape. To understand the gene regulatory networks (GRNs) driving this phase of heart development, during which many congenital heart disease malformations likely arise, we conducted an RNA-seq timecourse in zebrafish from 30 hpf to 72 hpf and identified 5861 genes with altered expression. We clustered the genes by temporal expression pattern, identified transcription factor binding motifs enriched in each cluster, and generated a model GRN for the major gene batteries in heart morphogenesis. This approach predicted hundreds of regulatory interactions and found batteries enriched in specific cell and tissue types, indicating that the approach can be used to narrow the search for novel genetic markers and regulatory interactions. Subsequent analyses confirmed the GRN using two mutants, Tbx5 and nkx2-5, and identified sets of duplicated zebrafish genes that do not show temporal subfunctionalization. This dataset provides an essential resource for future studies on the genetic/epigenetic pathways implicated in congenital heart defects and the mechanisms of cardiac transcriptional regulation. © 2017. Published by The Company of Biologists Ltd.
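
    The abstract does not say which algorithm was used to cluster genes by temporal expression pattern. As a deliberately simple stand-in, the Python sketch below groups genes by the time point at which their expression peaks. The gene names, expression values and the intermediate 48 hpf time point are hypothetical; only the 30 hpf and 72 hpf endpoints come from the abstract.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_peak(
    profiles: Dict[str, List[float]],
    timepoints: List[str],
) -> Dict[str, List[str]]:
    """Group genes by the time point of their maximum expression.

    This is a crude proxy for temporal-pattern clustering, not the
    method used in the study.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for gene, values in profiles.items():
        if len(values) != len(timepoints):
            raise ValueError(f"{gene}: expected {len(timepoints)} values")
        # Index of the largest value; ties resolve to the earliest time point.
        peak_index = max(range(len(values)), key=values.__getitem__)
        groups[timepoints[peak_index]].append(gene)
    return dict(groups)


# Hypothetical time course; the 48 hpf sample is an assumed midpoint.
timepoints = ["30 hpf", "48 hpf", "72 hpf"]
profiles = {
    "gene_a": [5.0, 2.0, 1.0],
    "gene_b": [1.0, 4.0, 2.0],
    "gene_c": [0.5, 1.0, 3.0],
    "gene_d": [6.0, 3.0, 2.0],
}
peak_groups = group_by_peak(profiles, timepoints)
```

    Each resulting group could then be scanned for enriched transcription factor binding motifs, which is the step the abstract describes next.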

  6. Comparative genomics of Geobacter chemotaxis genes reveals diverse signaling function

    Directory of Open Access Journals (Sweden)

    Antommattei Frances M

    2008-10-01

    Full Text Available Abstract Background: Geobacter species are δ-Proteobacteria and are often the predominant species in a variety of sedimentary environments where Fe(III) reduction is important. Their ability to remediate contaminated environments and produce electricity makes them attractive for further study. Cell motility, biofilm formation, and type IV pili all appear important for the growth of Geobacter in changing environments and for electricity production. Recent studies in other bacteria have demonstrated that signaling pathways homologous to the paradigm established for Escherichia coli chemotaxis can regulate type IV pili-dependent motility, the synthesis of flagella and type IV pili, the production of extracellular matrix material, and biofilm formation. The classification of these pathways by comparative genomics improves the ability to understand how Geobacter thrives in natural environments and to improve their use in microbial fuel cells. Results: The genomes of G. sulfurreducens, G. metallireducens, and G. uraniireducens contain multiple (~70) homologs of chemotaxis genes arranged in several major clusters (six, seven, and seven, respectively). Unlike the single gene cluster of E. coli, the Geobacter clusters are not all located near the flagellar genes. The probable functions of some Geobacter clusters are assignable by homology to known pathways; others appear to be unique to the Geobacter sp. and contain genes of unknown function. We identified large numbers of methyl-accepting chemotaxis protein (MCP) homologs that have diverse sensing domain architectures and generate a potential for sensing a great variety of environmental signals. We discuss mechanisms for class-specific segregation of the MCPs in the cell membrane, which serve to maintain pathway specificity and diminish crosstalk. Finally, the regulation of gene expression in Geobacter differs from E. coli. The sequences of predicted promoter elements suggest that the alternative sigma factors

  7. Genetic diversity and gene flow revealed by microsatellite DNA ...

    African Journals Online (AJOL)

    Dacryodes edulis is a multipurpose tree integrated in the cropping system of Central African region still dominated by subsistence agriculture. Some populations grown are wild which can provide information on the domestication process, and could also represent a potential source of gene flow. Leaves samples for DNA ...

  8. Comparative genome analysis of PHB gene family reveals deep evolutionary origins and diverse gene function.

    Science.gov (United States)

    Di, Chao; Xu, Wenying; Su, Zhen; Yuan, Joshua S

    2010-10-07

    The PHB (Prohibitin) gene family is involved in a variety of functions important for different biological processes. PHB genes are ubiquitously present in divergent species from prokaryotes to eukaryotes. Human PHB genes have been found to be associated with various diseases. Recent studies by our group and others have shown diverse functions of PHB genes in plants, in development, senescence, defence, and other processes. Despite the importance of the PHB gene family, no comprehensive gene family analysis has been carried out to evaluate the relatedness of PHB genes across different species. In order to better guide gene function analysis and understand the evolution of the PHB gene family, we therefore carried out a comparative genome analysis of the PHB genes across different kingdoms. The relatedness, motif distribution, and intron/exon distribution all indicated that the PHB genes form a relatively conserved gene family. The PHB genes can be classified into 5 classes, and each class has a very deep evolutionary origin. The PHB genes within each class maintained the same motif patterns during evolution. With Arabidopsis as the model species, we found that PHB gene intron/exon structure and domains are also conserved during evolution. Despite this being a conserved gene family, various gene duplication events led to the expansion of the PHB genes. Both segmental and tandem gene duplication were involved in Arabidopsis PHB gene family expansion; however, segmental duplication is predominant in Arabidopsis. Moreover, most of the duplicated genes experienced neofunctionalization. The results highlighted that PHB genes might be involved in important functions, so that the duplicated genes are under evolutionary pressure to derive new functions. The PHB gene family is a conserved gene family and accounts for diverse but important biological functions based on similar molecular mechanisms. The highly diverse biological function indicated that more research needs to be carried out

  9. Integrating phenotype ontologies with PhenomeNET

    KAUST Repository

    Rodriguez-Garcia, Miguel Angel

    2017-12-19

    Background: Integration and analysis of phenotype data from humans and model organisms is a key challenge in building our understanding of normal biology and pathophysiology. However, the range of phenotypes and anatomical details being captured in clinical and model organism databases presents complex problems when attempting to match classes across species and across phenotypes as diverse as behaviour and neoplasia. We have previously developed PhenomeNET, a system for disease gene prioritization that includes as one of its components an ontology designed to integrate phenotype ontologies. While not applicable to matching arbitrary ontologies, PhenomeNET can be used to identify related phenotypes in different species, including human, mouse, zebrafish, nematode worm, fruit fly, and yeast. Results: Here, we apply PhenomeNET to identify related classes from two phenotype and two disease ontologies using automated reasoning. We demonstrate that we can identify a large number of mappings, some of which require automated reasoning and cannot easily be identified through lexical approaches alone. Combining automated reasoning with lexical matching further improves results in aligning ontologies. Conclusions: PhenomeNET can be used to align and integrate phenotype ontologies. The results can be utilized for biomedical analyses in which phenomena observed in model organisms are used to identify causative genes and mutations underlying human disease.

  10. ESTs analysis reveals putative genes involved in symbiotic seed germination in Dendrobium officinale.

    Science.gov (United States)

    Zhao, Ming-Ming; Zhang, Gang; Zhang, Da-Wei; Hsiao, Yu-Yun; Guo, Shun-Xing

    2013-01-01

    Dendrobium officinale (Orchidaceae) is one of the world's most endangered plants with great medicinal value. In nature, D. officinale seeds must establish symbiotic relationships with fungi to germinate. However, the molecular events involved in the interaction between fungus and plant during this process are poorly understood. To isolate the genes involved in symbiotic germination, a suppression subtractive hybridization (SSH) cDNA library of symbiotically germinated D. officinale seeds was constructed. From this library, 1437 expressed sequence tags (ESTs) were clustered to 1074 Unigenes (including 902 singletons and 172 contigs), which were searched against the NCBI non-redundant (NR) protein database (E-value cutoff, e(-5)). Based on sequence similarity with known proteins, 579 differentially expressed genes in D. officinale were identified and classified into different functional categories by Gene Ontology (GO), Clusters of Orthologous Groups of proteins (COGs) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The expression levels of 15 selected genes emblematic of symbiotic germination were confirmed via real-time quantitative PCR. These genes were classified into various categories, including defense and stress response, metabolism, transcriptional regulation, transport process and signal transduction pathways. All transcripts were upregulated in the symbiotically germinated seeds (SGS). The functions of these genes in symbiotic germination were predicted. Furthermore, two fungus-induced calcium-dependent protein kinases (CDPKs), which were upregulated 6.76- and 26.69-fold in SGS compared with un-germinated seeds (UGS), were cloned from D. officinale and characterized for the first time. This study provides the first global overview of genes putatively involved in D. officinale symbiotic seed germination and provides a foundation for further functional research regarding symbiotic relationships in orchids.

  11. ESTs Analysis Reveals Putative Genes Involved in Symbiotic Seed Germination in Dendrobium officinale

    Science.gov (United States)

    Zhao, Ming-Ming; Zhang, Gang; Zhang, Da-Wei; Hsiao, Yu-Yun; Guo, Shun-Xing

    2013-01-01

    Dendrobium officinale (Orchidaceae) is one of the world’s most endangered plants with great medicinal value. In nature, D. officinale seeds must establish symbiotic relationships with fungi to germinate. However, the molecular events involved in the interaction between fungus and plant during this process are poorly understood. To isolate the genes involved in symbiotic germination, a suppression subtractive hybridization (SSH) cDNA library of symbiotically germinated D. officinale seeds was constructed. From this library, 1437 expressed sequence tags (ESTs) were clustered to 1074 Unigenes (including 902 singletons and 172 contigs), which were searched against the NCBI non-redundant (NR) protein database (E-value cutoff, e-5). Based on sequence similarity with known proteins, 579 differentially expressed genes in D. officinale were identified and classified into different functional categories by Gene Ontology (GO), Clusters of Orthologous Groups of proteins (COGs) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The expression levels of 15 selected genes emblematic of symbiotic germination were confirmed via real-time quantitative PCR. These genes were classified into various categories, including defense and stress response, metabolism, transcriptional regulation, transport process and signal transduction pathways. All transcripts were upregulated in the symbiotically germinated seeds (SGS). The functions of these genes in symbiotic germination were predicted. Furthermore, two fungus-induced calcium-dependent protein kinases (CDPKs), which were upregulated 6.76- and 26.69-fold in SGS compared with un-germinated seeds (UGS), were cloned from D. officinale and characterized for the first time. This study provides the first global overview of genes putatively involved in D. officinale symbiotic seed germination and provides a foundation for further functional research regarding symbiotic relationships in orchids. PMID:23967335

  12. ESTs analysis reveals putative genes involved in symbiotic seed germination in Dendrobium officinale.

    Directory of Open Access Journals (Sweden)

    Ming-Ming Zhao

    Full Text Available Dendrobium officinale (Orchidaceae) is one of the world's most endangered plants with great medicinal value. In nature, D. officinale seeds must establish symbiotic relationships with fungi to germinate. However, the molecular events involved in the interaction between fungus and plant during this process are poorly understood. To isolate the genes involved in symbiotic germination, a suppression subtractive hybridization (SSH) cDNA library of symbiotically germinated D. officinale seeds was constructed. From this library, 1437 expressed sequence tags (ESTs) were clustered to 1074 Unigenes (including 902 singletons and 172 contigs), which were searched against the NCBI non-redundant (NR) protein database (E-value cutoff, e(-5)). Based on sequence similarity with known proteins, 579 differentially expressed genes in D. officinale were identified and classified into different functional categories by Gene Ontology (GO), Clusters of Orthologous Groups of proteins (COGs) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The expression levels of 15 selected genes emblematic of symbiotic germination were confirmed via real-time quantitative PCR. These genes were classified into various categories, including defense and stress response, metabolism, transcriptional regulation, transport process and signal transduction pathways. All transcripts were upregulated in the symbiotically germinated seeds (SGS). The functions of these genes in symbiotic germination were predicted. Furthermore, two fungus-induced calcium-dependent protein kinases (CDPKs), which were upregulated 6.76- and 26.69-fold in SGS compared with un-germinated seeds (UGS), were cloned from D. officinale and characterized for the first time. This study provides the first global overview of genes putatively involved in D. officinale symbiotic seed germination and provides a foundation for further functional research regarding symbiotic relationships in orchids.

  13. MetaGO: Predicting Gene Ontology of Non-homologous Proteins Through Low-Resolution Protein Structure Prediction and Protein-Protein Network Mapping.

    Science.gov (United States)

    Zhang, Chengxin; Zheng, Wei; Freddolino, Peter L; Zhang, Yang

    2018-03-10

    Homology-based transferal remains the major approach to computational protein function annotations, but it becomes increasingly unreliable when the sequence identity between query and template decreases below 30%. We propose a novel pipeline, MetaGO, to deduce Gene Ontology attributes of proteins by combining sequence homology-based annotation with low-resolution structure prediction and comparison, and partner's homology-based protein-protein network mapping. The pipeline was tested on a large-scale set of 1000 non-redundant proteins from the CAFA3 experiment. Under the stringent benchmark conditions where templates with >30% sequence identity to the query are excluded, MetaGO achieves average F-measures of 0.487, 0.408, and 0.598, for Molecular Function, Biological Process, and Cellular Component, respectively, which are significantly higher than those achieved by other state-of-the-art function annotation methods. Detailed data analysis shows that the major advantage of MetaGO lies in the new functional homolog detections from partner's homology-based network mapping and structure-based local and global structure alignments, the confidence scores of which can be optimally combined through logistic regression. These data demonstrate the power of using a hybrid model incorporating protein structure and interaction networks to deduce new functional insights beyond traditional sequence homology-based referrals, especially for proteins that lack homologous function templates. The MetaGO pipeline is available at http://zhanglab.ccmb.med.umich.edu/MetaGO/. Copyright © 2018. Published by Elsevier Ltd.
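
    The F-measure reported in the record above is the harmonic mean of precision and recall over predicted versus true GO terms. The following is a minimal sketch of that calculation for a single protein, not the MetaGO or CAFA3 evaluation code; the function name `f_measure` and the GO term sets are hypothetical, and the abstract does not state how confidence thresholds were chosen.

```python
def f_measure(predicted, actual):
    """Return (precision, recall, F-measure) for two collections of GO term IDs."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical term sets, for illustration only
predicted_terms = {"GO:0003677", "GO:0005524", "GO:0016301"}
actual_terms = {"GO:0005524", "GO:0016301", "GO:0004672", "GO:0006468"}
precision, recall, f1 = f_measure(predicted_terms, actual_terms)
print(f"precision={precision:.3f} recall={recall:.3f} F={f1:.3f}")
```

    Averaging this value over all benchmark proteins, separately for each GO aspect, would give per-aspect figures comparable in kind to those quoted in the abstract.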

  14. Deep sequencing of the Camellia sinensis transcriptome revealed candidate genes for major metabolic pathways of tea-specific compounds

    Energy Technology Data Exchange (ETDEWEB)

    Shi, CY; Yang, H; Wei, CL; Yu, O; Zhang, ZZ; Sun, J; Wan, XC

    2011-01-01

    Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro, to transform, and has a large genome, rendering little genomic information available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generate large expression datasets for functional genomic analysis, which is especially suitable for non-model species with un-sequenced genomes. Using high-throughput Illumina RNA-seq, the transcriptome from poly(A)+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximately 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, with an average length of 355 bp and an N50 of 506 bp, which consisted of 788 contig clusters and 126,306 singletons. This number of unigenes was 10-fold higher than existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximately 20-fold increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated. Their expression patterns in different organs of the tea plant were analyzed by RT-PCR and quantitative real
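
    The N50 quoted in the record above is a standard assembly statistic: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases. A minimal sketch of the calculation follows; the function name `n50` and the example lengths are hypothetical and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical unigene lengths, for illustration only
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))
```

    Because many short singletons pull the mean down more than they pull the N50 down, an N50 above the average length, as reported here (506 bp versus 355 bp), is the usual pattern.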

  15. Deep sequencing of the Camellia sinensis transcriptome revealed candidate genes for major metabolic pathways of tea-specific compounds

    Directory of Open Access Journals (Sweden)

    Chen Qi

    2011-02-01

    Full Text Available Abstract Background: Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro, to transform, and has a large genome, rendering little genomic information available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generate large expression datasets for functional genomic analysis, which is especially suitable for non-model species with un-sequenced genomes. Results: Using high-throughput Illumina RNA-seq, the transcriptome from poly(A)+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximately 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, with an average length of 355 bp and an N50 of 506 bp, which consisted of 788 contig clusters and 126,306 singletons. This number of unigenes was 10-fold higher than existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximately 20-fold increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated. Their expression patterns in different organs of the tea plant were

  16. A systems level approach reveals new gene regulatory modules in the developing ear

    OpenAIRE

    Chen, Jingchen; Tambalo, Monica; Barembaum, Meyer; Ranganathan, Ramya; Simões-Costa, Marcos; Bronner, Marianne E.; Streit, Andrea

    2017-01-01

    The inner ear is a complex vertebrate sense organ, yet it arises from a simple epithelium, the otic placode. Specification towards otic fate requires diverse signals and transcriptional inputs that act sequentially and/or in parallel. Using the chick embryo, we uncover novel genes in the gene regulatory network underlying otic commitment and reveal dynamic changes in gene expression. Functional analysis of selected transcription factors reveals the genetic hierarchy underlying the transition ...

  17. Listening to the Noise: Random Fluctuations Reveal Gene Network Parameters

    Science.gov (United States)

    Munsky, Brian; Trinh, Brooke; Khammash, Mustafa

    2010-03-01

    The cellular environment is abuzz with noise originating from the inherent random motion of reacting molecules in the living cell. In this noisy environment, clonal cell populations exhibit cell-to-cell variability that can manifest significant prototypical differences. Noise-induced stochastic fluctuations in cellular constituents can be measured and their statistics quantified using flow cytometry, single molecule fluorescence in situ hybridization, time lapse fluorescence microscopy and other single cell and single molecule measurement techniques. We show that these random fluctuations carry within them valuable information about the underlying genetic network. Far from being a nuisance, the ever-present cellular noise acts as a rich source of excitation that, when processed through a gene network, carries its distinctive fingerprint that encodes a wealth of information about that network. We demonstrate that in some cases the analysis of these random fluctuations enables the full identification of network parameters, including those that may otherwise be difficult to measure. We use theoretical investigations to establish experimental guidelines for the identification of gene regulatory networks, and we apply these guidelines to experimentally identify predictive models for different regulatory mechanisms in bacteria and yeast.

  18. Constructive Ontology Engineering

    Science.gov (United States)

    Sousan, William L.

    2010-01-01

    The proliferation of the Semantic Web depends on ontologies for knowledge sharing, semantic annotation, data fusion, and descriptions of data for machine interpretation. However, ontologies are difficult to create and maintain. In addition, their structure and content may vary depending on the application and domain. Several methods described in…

  19. A UML profile for the OBO relation ontology

    Science.gov (United States)

    2012-01-01

    Background: Ontologies have increasingly been used in the biomedical domain, which has prompted the emergence of different initiatives to facilitate their development and integration. The Open Biological and Biomedical Ontologies (OBO) Foundry consortium provides a repository of life-science ontologies, which are developed according to a set of shared principles. This consortium has developed an ontology called OBO Relation Ontology aiming at standardizing the different types of biological entity classes and associated relationships. Since ontologies are primarily intended to be used by humans, the use of graphical notations for ontology development facilitates the capture, comprehension and communication of knowledge between its users. However, OBO Foundry ontologies are captured and represented basically using text-based notations. The Unified Modeling Language (UML) provides a standard and widely-used graphical notation for modeling computer systems. UML provides a well-defined set of modeling elements, which can be extended using a built-in extension mechanism named Profile. Thus, this work aims at developing a UML profile for the OBO Relation Ontology to provide a domain-specific set of modeling elements that can be used to create standard UML-based ontologies in the biomedical domain. Results: We have studied the OBO Relation Ontology, the UML metamodel and the UML profiling mechanism. Based on these studies, we have proposed an extension to the UML metamodel in conformance with the OBO Relation Ontology and we have defined a profile that implements the extended metamodel. Finally, we have applied the proposed UML profile in the development of a number of fragments from different ontologies. Particularly, we have considered the Gene Ontology (GO), the PRotein Ontology (PRO) and the Xenopus Anatomy and Development Ontology (XAO). Conclusions: The use of an established and well-known graphical language in the development of biomedical ontologies provides a more

  20. Predicting protein-protein interactions in Arabidopsis thaliana through integration of orthology, gene ontology and co-expression

    Directory of Open Access Journals (Sweden)

    Vandepoele Klaas

    2009-06-01

    Full Text Available Abstract Background: Large-scale identification of the interrelationships between different components of the cell, such as the interactions between proteins, has recently gained great interest. However, unraveling large-scale protein-protein interaction maps is laborious and expensive. Moreover, assessing the reliability of the interactions can be cumbersome. Results: In this study, we have developed a computational method that exploits the existing knowledge on protein-protein interactions in diverse species through orthologous relations on the one hand, and functional association data on the other hand to predict and filter protein-protein interactions in Arabidopsis thaliana. A highly reliable set of protein-protein interactions is predicted through this integrative approach making use of existing protein-protein interaction data from yeast, human, C. elegans and D. melanogaster. Localization, biological process, and co-expression data are used as powerful indicators for protein-protein interactions. The functional repertoire of the identified interactome reveals interactions between proteins functioning in well-conserved as well as plant-specific biological processes. We observe that although common mechanisms (e.g. actin polymerization) and components (e.g. ARPs, actin-related proteins) exist between different lineages, they are active in specific processes such as growth, cancer metastasis and trichome development in yeast, human and Arabidopsis, respectively. Conclusion: We conclude that the integration of orthology with functional association data is adequate to predict protein-protein interactions. Through this approach, a high number of novel protein-protein interactions with diverse biological roles is discovered. Overall, we have predicted a reliable set of protein-protein interactions suitable for further computational as well as experimental analyses.

  1. Genome Wide Expression Profiling of Cancer Cell Lines Cultured in Microgravity Reveals Significant Dysregulation of Cell Cycle and MicroRNA Gene Networks.

    Directory of Open Access Journals (Sweden)

    Prasanna Vidyasekar

    Full Text Available Zero gravity causes several changes in metabolic and functional aspects of the human body and experiments in space flight have demonstrated alterations in cancer growth and progression. This study reports the genome wide expression profiling of a colorectal cancer cell line, DLD-1, and a lymphoblast leukemic cell line, MOLT-4, under simulated microgravity in an effort to understand central processes and cellular functions that are dysregulated among both cell lines. Altered cell morphology, reduced cell viability and an aberrant cell cycle profile in comparison to their static controls were observed in both cell lines under microgravity. The process of cell cycle in DLD-1 cells was markedly affected with reduced viability, reduced colony forming ability, an apoptotic population and dysregulation of cell cycle genes, oncogenes, and cancer progression and prognostic markers. DNA microarray analysis revealed 1801 (upregulated) and 2542 (downregulated) genes (>2 fold) in DLD-1 cultures under microgravity while MOLT-4 cultures differentially expressed 349 (upregulated) and 444 (downregulated) genes (>2 fold) under microgravity. The loss in cell proliferative capacity was corroborated with the downregulation of the cell cycle process as demonstrated by functional clustering of DNA microarray data using gene ontology terms. The genome wide expression profile also showed significant dysregulation of post transcriptional gene silencing machinery and multiple microRNA host genes that are potential tumor suppressors and proto-oncogenes including MIR22HG, MIR17HG and MIR21HG. MIR22HG, a tumor-suppressor gene, was one of the highest upregulated genes in the microarray data, showing a 4.4 log fold upregulation under microgravity. Real time PCR validated the dysregulation in the host gene by demonstrating a 4.18 log fold upregulation of the miR-22 microRNA. Microarray data also showed dysregulation of direct targets of miR-22, SP1, CDK6 and CCNA2.
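
    The ">2 fold" cutoff in the record above can be read as a threshold on the ratio of expression under microgravity to expression in the static control. The sketch below illustrates one common way to apply such a cutoff; it is not the authors' pipeline. The function name `classify_by_fold_change`, the gene names and the intensity values are hypothetical, and the use of base-2 logarithms is an assumption, since the abstract does not state the base of its "log fold" values.

```python
import math


def classify_by_fold_change(treated, control, threshold=2.0):
    """Label a gene 'up', 'down' or 'unchanged' from two positive expression values."""
    ratio = treated / control
    if ratio > threshold:
        return "up"
    if ratio < 1.0 / threshold:
        return "down"
    return "unchanged"


# Hypothetical (treated, control) intensities, for illustration only
expression = {
    "geneA": (8.0, 2.0),
    "geneB": (1.0, 4.0),
    "geneC": (3.0, 2.0),
}
labels = {
    gene: classify_by_fold_change(treated, control)
    for gene, (treated, control) in expression.items()
}
# Assumed base-2 log fold change for each gene
log2_fold = {
    gene: math.log2(treated / control)
    for gene, (treated, control) in expression.items()
}
print(labels)
print(log2_fold)
```

    Working on the log scale makes up- and downregulation symmetric: a twofold increase and a twofold decrease become +1 and -1 respectively.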

  2. Towards Agile Ontology Maintenance

    Science.gov (United States)

    Luczak-Rösch, Markus

    Ontologies are an appropriate means to represent knowledge on the Web. Research on ontology engineering reached practices for an integrative lifecycle support. However, a broader success of ontologies in Web-based information systems remains unreached while the more lightweight semantic approaches are rather successful. We assume, paired with the emerging trend of services and microservices on the Web, new dynamic scenarios gain momentum in which a shared knowledge base is made available to several dynamically changing services with disparate requirements. Our work envisions a step towards such a dynamic scenario in which an ontology adapts to the requirements of the accessing services and applications as well as the user's needs in an agile way and reduces the experts' involvement in ontology maintenance processes.

  3. Conceptual querying through ontologies

    DEFF Research Database (Denmark)

    Andreasen, Troels; Bulskov, Henrik

    2009-01-01

    We present here an approach to conceptual querying where the aim is, given a collection of textual database objects or documents, to target an abstraction of the entire database content in terms of the concepts appearing in documents, rather than the documents in the collection. The approach is motivated by an obvious need for users to survey huge volumes of objects in query answers. An ontology formalism and a special notion of "instantiated ontology" are introduced. The latter is a structure reflecting the content in the document collection in that it is a restriction of a general world knowledge ontology to the concepts instantiated in the collection. The notion of ontology-based similarity is briefly described, language constructs for direct navigation and retrieval of concepts in the ontology are discussed and approaches to conceptual summarization are presented.

  4. Survey on Ontology Mapping

    Science.gov (United States)

    Zhu, Junwu

    Ontology mapping, which aims to create a sharable semantic space for the terms of different domain ontologies or knowledge systems, has become a hot research topic in the Semantic Web community. In this paper, the motivating factors of ontology mapping research are given first, and then five dominant theories and methods, such as information accessing technology, machine learning, linguistics, structure graph and similarity, are illustrated according to their technology class. Before we analyze the new requirements and take a long view, the contributions of these theories and methods are summarized in detail. Finally, this paper suggests designing a group of semantic connectors with the ability of migration learning for OWL-2 extended with constraints and the ontology mapping theory of axioms, so as to provide a new methodology for ontology mapping.

  5. Annotating breast cancer microarray samples using ontologies

    Science.gov (United States)

    Liu, Hongfang; Li, Xin; Yoon, Victoria; Clarke, Robert

    2008-01-01

    As the most common cancer among women, breast cancer results from the accumulation of mutations in essential genes. Recent advances in high-throughput gene expression microarray technology have inspired researchers to use the technology to assist breast cancer diagnosis, prognosis, and treatment prediction. However, the high dimensionality of microarray experiments and public access of data from many experiments have caused inconsistencies which initiated the development of controlled terminologies and ontologies for annotating microarray experiments, such as the standard Microarray Gene Expression Data (MGED) ontology (MO). In this paper, we developed BCM-CO, an ontology tailored specifically for indexing clinical annotations of breast cancer microarray samples from the NCI Thesaurus. Our research showed that the coverage of NCI Thesaurus is very limited with respect to i) terms used by researchers to describe breast cancer histology (covering 22 out of 48 histology terms); ii) breast cancer cell lines (covering one out of 12 cell lines); and iii) classes corresponding to the breast cancer grading and staging. By incorporating a wider range of those terms into BCM-CO, we were able to index breast cancer microarray samples from GEO using BCM-CO and MGED ontology and developed a prototype system with web interface that allows the retrieval of microarray data based on the ontology annotations. PMID:18999108

  6. Quality control for terms and definitions in ontologies and taxonomies

    Directory of Open Access Journals (Sweden)

    Rüegg Alexander

    2006-04-01

    Full Text Available Abstract Background: Ontologies and taxonomies are among the most important computational resources for molecular biology and bioinformatics. A series of recent papers has shown that the Gene Ontology (GO), the most prominent taxonomic resource in these fields, is marked by flaws of certain characteristic types, which flow from a failure to address basic ontological principles. As yet, no methods have been proposed which would allow ontology curators to pinpoint flawed terms or definitions in ontologies in a systematic way. Results: We present computational methods that automatically identify terms and definitions which are defined in a circular or unintelligible way. We further demonstrate the potential of these methods by applying them to isolate a subset of 6001 problematic GO terms. By automatically aligning GO with other ontologies and taxonomies we were able to propose alternative synonyms and definitions for some of these problematic terms. This allows us to demonstrate that these other resources do not contain definitions superior to those supplied by GO. Conclusion: Our methods provide reliable indications of the quality of terms and definitions in ontologies and taxonomies. Further, they are well suited to assist ontology curators in drawing their attention to those terms that are ill-defined. We have further shown the limitations of ontology mapping and alignment in assisting ontology curators in rectifying problems, thus pointing to the need for manual curation.

  7. Practical ontologies for information professionals

    CERN Document Server

    AUTHOR|(CDS)2071712

    2016-01-01

    Practical Ontologies for Information Professionals provides an introduction to ontologies and their development, an essential tool for fighting back against information overload. The development of robust and widely used ontologies is an increasingly important tool in the fight against information overload. The publishing and sharing of explicit explanations for a wide variety of conceptualizations, in a machine readable format, has the power to both improve information retrieval and identify new knowledge. This new book provides an accessible introduction to the following: * What is an ontology? Defining the concept and why it is increasingly important to the information professional * Ontologies and the semantic web * Existing ontologies, such as SKOS, OWL, FOAF, schema.org, and the DBpedia Ontology * Adopting and building ontologies, showing how to avoid repetition of work and how to build a simple ontology with Protege * Interrogating semantic web ontologies * The future of ontologies and the role of the ...

  8. Gene expression in chicken reveals correlation with structural genomic features and conserved patterns of transcription in the terrestrial vertebrates.

    Directory of Open Access Journals (Sweden)

    Haisheng Nie

    Full Text Available BACKGROUND: The chicken is an important agricultural and avian-model species. A survey of gene expression in a range of different tissues will provide a benchmark for understanding expression levels under normal physiological conditions in birds. With expression data for birds being very scant, this benchmark is of particular interest for comparative expression analysis among various terrestrial vertebrates. METHODOLOGY/PRINCIPAL FINDINGS: We carried out a gene expression survey in eight major chicken tissues using whole genome microarrays. A global picture of gene expression is presented for the eight tissues, and tissue specific as well as common gene expression were identified. A Gene Ontology (GO) term enrichment analysis showed that tissue-specific genes are enriched with GO terms reflecting the physiological functions of the specific tissue, and housekeeping genes are enriched with GO terms related to essential biological functions. Comparisons of structural genomic features between tissue-specific genes and housekeeping genes show that housekeeping genes are more compact. Specifically, their coding sequences and particularly their introns are shorter than those of genes that display more variation in expression between tissues, and their intergenic space is also shorter. Meanwhile, housekeeping genes are more likely to co-localize with other abundantly or highly expressed genes on the same chromosomal regions. Furthermore, comparisons of gene expression in a panel of five common tissues between birds, mammals and amphibians showed that the expression patterns across tissues are highly similar for orthologous genes compared to random gene pairs within each pair-wise comparison, indicating a high degree of functional conservation in gene expression among terrestrial vertebrates. CONCLUSIONS: The housekeeping genes identified in this study have shorter gene length, shorter coding sequence length, shorter introns, and shorter intergenic regions, there seems
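
    GO term enrichment of the kind mentioned in the record above is commonly assessed with a one-sided hypergeometric test: given a background of genes, some of which carry a GO term, how likely is it to see at least the observed number of annotated genes in a selected subset by chance? The sketch below shows that test using only the Python standard library. It is an assumption that a test of this type applies here, since the abstract does not name the statistic used; the function name `enrichment_p_value` and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """Return P(X >= hits) for a hypergeometric draw.

    population: number of genes in the background
    annotated:  background genes carrying the GO term
    sample:     genes in the selected set (e.g. tissue-specific genes)
    hits:       selected genes carrying the GO term
    """
    upper = min(annotated, sample)
    favourable = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return favourable / comb(population, sample)


# Hypothetical counts, for illustration only
p_value = enrichment_p_value(population=20, annotated=5, sample=5, hits=3)
print(f"{p_value:.4f}")
```

    In a real analysis the test is repeated for every GO term, so the resulting p-values would normally be corrected for multiple testing before any term is called enriched.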

  9. Ontological foundations for evolutionary economics: A Darwinian social ontology

    NARCIS (Netherlands)

    Stoelhorst, J.W.

    2008-01-01

    The purpose of this paper is to further the project of generalized Darwinism by developing a social ontology on the basis of a combined commitment to ontological continuity and ontological commonality. Three issues that are central to the development of a social ontology are addressed: (1) the

  10. Ontological realism: A methodology for coordinated evolution of scientific ontologies.

    Science.gov (United States)

    Smith, Barry; Ceusters, Werner

    2010-11-15

    Since 2002 we have been testing and refining a methodology for ontology development that is now being used by multiple groups of researchers in different life science domains. Gary Merrill, in a recent paper in this journal, describes some of the reasons why this methodology has been found attractive by researchers in the biological and biomedical sciences. At the same time he assails the methodology on philosophical grounds, focusing specifically on our recommendation that ontologies developed for scientific purposes should be constructed in such a way that their terms are seen as referring to what we call universals or types in reality. As we show, Merrill's critique is of little relevance to the success of our realist project, since it not only reveals no actual errors in our work but also criticizes views on universals that we do not in fact hold. However, it nonetheless provides us with a valuable opportunity to clarify the realist methodology, and to show how some of its principles are being applied, especially within the framework of the OBO (Open Biomedical Ontologies) Foundry initiative.

  11. DNA capture reveals transoceanic gene flow in endangered river sharks.

    Science.gov (United States)

    Li, Chenhong; Corrigan, Shannon; Yang, Lei; Straube, Nicolas; Harris, Mark; Hofreiter, Michael; White, William T; Naylor, Gavin J P

    2015-10-27

    For over a hundred years, the "river sharks" of the genus Glyphis were only known from the type specimens of species that had been collected in the 19th century. They were widely considered extinct until populations of Glyphis-like sharks were rediscovered in remote regions of Borneo and Northern Australia at the end of the 20th century. However, the genetic affinities between the newly discovered Glyphis-like populations and the poorly preserved, original museum-type specimens have never been established. Here, we present the first (to our knowledge) fully resolved, complete phylogeny of Glyphis that includes both archival-type specimens and modern material. We used a sensitive DNA hybridization capture method to obtain complete mitochondrial genomes from all of our samples and show that three of the five described river shark species are probably conspecific and widely distributed in Southeast Asia. Furthermore we show that there has been recent gene flow between locations that are separated by large oceanic expanses. Our data strongly suggest marine dispersal in these species, overturning the widely held notion that river sharks are restricted to freshwater. It seems that species in the genus Glyphis are euryhaline with an ecology similar to the bull shark, in which adult individuals live in the ocean while the young grow up in river habitats with reduced predation pressure. Finally, we discovered a previously unidentified species within the genus Glyphis that is deeply divergent from all other lineages, underscoring the current lack of knowledge about the biodiversity and ecology of these mysterious sharks.

  12. The Planteome database: an integrated resource for reference ontologies, plant genomics and phenomics

    Science.gov (United States)

    Cooper, Laurel; Meier, Austin; Laporte, Marie-Angélique; Elser, Justin L; Mungall, Chris; Sinn, Brandon T; Cavaliere, Dario; Carbon, Seth; Dunn, Nathan A; Smith, Barry; Qu, Botong; Preece, Justin; Zhang, Eugene; Todorovic, Sinisa; Gkoutos, Georgios; Doonan, John H; Stevenson, Dennis W; Arnaud, Elizabeth

    2018-01-01

    Abstract The Planteome project (http://www.planteome.org) provides a suite of reference and species-specific ontologies for plants and annotations to genes and phenotypes. Ontologies serve as common standards for semantic integration of a large and growing corpus of plant genomics, phenomics and genetics data. The reference ontologies include the Plant Ontology, Plant Trait Ontology and the Plant Experimental Conditions Ontology developed by the Planteome project, along with the Gene Ontology, Chemical Entities of Biological Interest, Phenotype and Attribute Ontology, and others. The project also provides access to species-specific Crop Ontologies developed by various plant breeding and research communities from around the world. We provide integrated data on plant traits, phenotypes, and gene function and expression from 95 plant taxa, annotated with reference ontology terms. The Planteome project is developing a plant gene annotation platform, Planteome Noctua, to facilitate community engagement. All the Planteome ontologies are publicly available and are maintained at the Planteome GitHub site (https://github.com/Planteome) for sharing, tracking revisions and new requests. The annotated data are freely accessible from the ontology browser (http://browser.planteome.org/amigo) and our data repository. PMID:29186578

  13. Placental gene-expression profiles of intrahepatic cholestasis of pregnancy reveal involvement of multiple molecular pathways in blood vessel formation and inflammation.

    Science.gov (United States)

    Du, QiaoLing; Pan, YouDong; Zhang, YouHua; Zhang, HaiLong; Zheng, YaJuan; Lu, Ling; Wang, JunLei; Duan, Tao; Chen, JianFeng

    2014-07-07

    Intrahepatic cholestasis of pregnancy (ICP) is a pregnancy-associated liver disease with potentially deleterious consequences for the fetus, particularly when maternal serum bile-acid concentration exceeds 40 μM. However, the etiology and pathogenesis of ICP remain elusive. To reveal the underlying molecular mechanisms for the association of maternal serum bile-acid level and fetal outcome in ICP patients, DNA microarray was applied to characterize the whole-genome expression profiles of placentas from healthy women and women diagnosed with ICP. Thirty pregnant women recruited in this study were categorized evenly into three groups: healthy group; mild ICP, with serum bile-acid concentration ranging from 10-40 μM; and severe ICP, with bile-acid concentration >40 μM. Gene Ontology analysis in combination with construction of gene-interaction and gene co-expression networks was applied to identify the core regulatory genes associated with ICP pathogenesis, which were further validated by quantitative real-time PCR and histological staining. The core regulatory genes were mainly involved in immune response, VEGF signaling pathway and G-protein-coupled receptor signaling, implying essential roles of immune response, vasculogenesis and angiogenesis in ICP pathogenesis. This implication was supported by the observed aggregated immune-cell infiltration and deficient blood vessel formation in ICP placentas. Our study provides a system-level insight into the placental gene-expression profiles of women with mild or severe ICP, and reveals multiple molecular pathways in immune response and blood vessel formation that might contribute to ICP pathogenesis.

  14. Gene organization in rice revealed by full-length cDNA mapping and gene expression analysis through microarray.

    Directory of Open Access Journals (Sweden)

    Kouji Satoh

    Full Text Available Rice (Oryza sativa L.) is a model organism for the functional genomics of monocotyledonous plants since the genome size is considerably smaller than those of other monocotyledonous plants. Although highly accurate genome sequences of indica and japonica rice are available, additional resources such as full-length complementary DNA (FL-cDNA) sequences are also indispensable for comprehensive analyses of gene structure and function. We cross-referenced 28.5K individual loci in the rice genome defined by mapping of 578K FL-cDNA clones with the 56K loci predicted in the TIGR genome assembly. Based on the annotation status and the presence of corresponding cDNA clones, genes were classified into 23K annotated expressed (AE) genes, 33K annotated non-expressed (ANE) genes, and 5.5K non-annotated expressed (NAE) genes. We developed a 60mer oligo-array for analysis of gene expression from each locus. Analysis of gene structures and expression levels revealed that the general features of gene structure and expression of NAE and ANE genes were considerably different from those of AE genes. The results also suggested that the cloning efficiency of rice FL-cDNA is associated with the transcription activity of the corresponding genetic locus, although other factors may also have an effect. Comparison of the coverage of FL-cDNA among gene families suggested that FL-cDNA from genes encoding rice- or eukaryote-specific domains, and those involved in regulatory functions were difficult to produce in bacterial cells. Collectively, these results indicate that rice genes can be divided into distinct groups based on transcription activity and gene structure, and that the coverage bias of FL-cDNA clones exists due to the incompatibility of certain eukaryotic genes in bacteria.

  15. Perspectives on ontology learning

    CERN Document Server

    Lehmann, J

    2014-01-01

    Perspectives on Ontology Learning brings together researchers and practitioners from different communities − natural language processing, machine learning, and the semantic web − in order to give an interdisciplinary overview of recent advances in ontology learning. Starting with a comprehensive introduction to the theoretical foundations of ontology learning methods, the edited volume presents the state-of-the-art in automated knowledge acquisition and maintenance. It outlines future challenges in this area with a special focus on technologies suitable for pushing the boundaries beyond the c

  16. Constraints on genome dynamics revealed from gene distribution among the Ralstonia solanacearum species.

    Directory of Open Access Journals (Sweden)

    Pierre Lefeuvre

    Full Text Available Because it is suspected that gene content may partly explain host adaptation and ecology of pathogenic bacteria, it is important to study factors affecting genome composition and its evolution. While recent genomic advances have revealed extremely large pan-genomes for some bacterial species, it remains difficult to predict to what extent the gene pool is accessible within or transferable between populations. As genomes bear imprints of the history of the organisms, gene distribution pattern analyses should provide insights into the forces and factors at play in the shaping and maintaining of bacterial genomes. In this study, we revisited the data obtained from a previous CGH microarray analysis in order to assess the genomic plasticity of the R. solanacearum species complex. Gene distribution analyses demonstrated the remarkably dispersed genome of R. solanacearum with more than half of the genes being accessory. From the reconstruction of the ancestral genome compositions, we were able to infer the number of gene gain and loss events along the phylogeny. Analyses of gene movement patterns reveal that factors associated with gene function, genomic localization and ecology delineate gene flow patterns. While the chromosome displayed lower rates of movement, the megaplasmid was clearly associated with hot-spots of gene gain and loss. Gene function was also confirmed to be an essential factor in gene gain and loss dynamics with significant differences in movement patterns between different COG categories. Finally, analyses of gene distribution highlighted possible highways of horizontal gene transfer. Due to sampling and design bias, we can only speculate on factors at play in this gene movement dynamic. Further studies examining precise conditions that favor gene transfer would provide invaluable insights into the fate of bacteria, species delineation and the emergence of successful pathogens.

  17. Data mining for ontology development.

    Energy Technology Data Exchange (ETDEWEB)

    Davidson, George S.; Strasburg, Jana (Pacific Northwest National Laboratory, Richland, WA); Stampf, David (Brookhaven National Laboratory, Upton, NY); Neymotin, Lev (Brookhaven National Laboratory, Upton, NY); Czajkowski, Carl (Brookhaven National Laboratory, Upton, NY); Shine, Eugene (Savannah River National Laboratory, Aiken, SC); Bollinger, James (Savannah River National Laboratory, Aiken, SC); Ghosh, Vinita (Brookhaven National Laboratory, Upton, NY); Sorokine, Alexandre (Oak Ridge National Laboratory, Oak Ridge, TN); Ferrell, Regina (Oak Ridge National Laboratory, Oak Ridge, TN); Ward, Richard (Oak Ridge National Laboratory, Oak Ridge, TN); Schoenwald, David Alan

    2010-06-01

    A multi-laboratory ontology construction effort during the summer and fall of 2009 prototyped an ontology for counterfeit semiconductor manufacturing. This effort included an ontology development team and an ontology validation methods team. Here the third team of the Ontology Project, the Data Analysis (DA) team reports on their approaches, the tools they used, and results for mining literature for terminology pertinent to counterfeit semiconductor manufacturing. A discussion of the value of ontology-based analysis is presented, with insights drawn from other ontology-based methods regularly used in the analysis of genomic experiments. Finally, suggestions for future work are offered.

  18. Dual gene activation and knockout screen reveals directional dependencies in genetic networks. | Office of Cancer Genomics

    Science.gov (United States)

    Understanding the direction of information flow is essential for characterizing how genetic networks affect phenotypes. However, methods to find genetic interactions largely fail to reveal directional dependencies. We combine two orthogonal Cas9 proteins from Streptococcus pyogenes and Staphylococcus aureus to carry out a dual screen in which one gene is activated while a second gene is deleted in the same cell. We analyze the quantitative effects of activation and knockout to calculate genetic interaction and directionality scores for each gene pair.

  19. Gene expression profiles reveal key genes for early diagnosis and treatment of adamantinomatous craniopharyngioma.

    Science.gov (United States)

    Yang, Jun; Hou, Ziming; Wang, Changjiang; Wang, Hao; Zhang, Hongbing

    2018-04-23

    Adamantinomatous craniopharyngioma (ACP) is an aggressive brain tumor that occurs predominantly in the pediatric population. Conventional diagnostic methods and standard therapies cannot treat ACPs effectively. In this paper, we aimed to identify key genes for ACP early diagnosis and treatment. Datasets GSE94349 and GSE68015 were obtained from the Gene Expression Omnibus database. Consensus clustering was applied to discover the gene clusters in the expression data of GSE94349, and functional enrichment analysis was performed on the gene set in each cluster. The protein-protein interaction (PPI) network was built by the Search Tool for the Retrieval of Interacting Genes, and hubs were selected. A support vector machine (SVM) model was built based on the signature genes identified from enrichment analysis and the PPI network. Dataset GSE94349 was used for training and testing, and GSE68015 was used for validation. In addition, RT-qPCR analysis was performed to analyze the expression of signature genes in ACP samples compared with normal controls. Seven gene clusters were discovered in the differentially expressed genes identified from the GSE94349 dataset. Enrichment analysis of each cluster identified 25 pathways that were highly associated with ACP. The PPI network was built and 46 hubs were determined. Twenty-five pathway-related genes that overlapped with the hubs in the PPI network were used as signatures to establish the SVM diagnosis model for ACP. The prediction accuracies of the SVM model for training, testing, and validation data were 94, 85, and 74%, respectively. The expression of CDH1, CCL2, ITGA2, COL8A1, COL6A2, and COL6A3 was significantly upregulated in ACP tumor samples, while CAMK2A, RIMS1, NEFL, SYT1, and STX1A were significantly downregulated, which was consistent with the differentially expressed gene analysis. The SVM model is a promising classification tool for screening and early diagnosis of ACP. The ACP-related pathways and signature genes will advance our knowledge of ACP pathogenesis

  20. “Force,” ontology, and language

    Science.gov (United States)

    Brookes, David T.; Etkina, Eugenia

    2009-06-01

    We introduce a linguistic framework through which one can interpret systematically students’ understanding of and reasoning about force and motion. Some researchers have suggested that students have robust misconceptions or alternative frameworks grounded in everyday experience. Others have pointed out the inconsistency of students’ responses and presented a phenomenological explanation for what is observed, namely, knowledge in pieces. We wish to present a view that builds on and unifies aspects of this prior research. Our argument is that many students’ difficulties with force and motion are primarily due to a combination of linguistic and ontological difficulties. It is possible that students are primarily engaged in trying to define and categorize the meaning of the term “force” as spoken about by physicists. We found that this process of negotiation of meaning is remarkably similar to that engaged in by physicists in history. In this paper we will describe a study of the historical record that reveals an analogous process of meaning negotiation, spanning multiple centuries. Using methods from cognitive linguistics and systemic functional grammar, we will present an analysis of the force and motion literature, focusing on prior studies with interview data. We will then discuss the implications of our findings for physics instruction.

  1. Ontology of fractures

    Science.gov (United States)

    Zhong, Jian; Aydina, Atilla; McGuinness, Deborah L.

    2009-03-01

    Fractures are fundamental structures in the Earth's crust and they can impact many societal and industrial activities including oil and gas exploration and production, aquifer management, CO2 sequestration, waste isolation, the stabilization of engineering structures, and assessing natural hazards (earthquakes, volcanoes, and landslides). Therefore, an ontology which organizes the concepts of fractures could help facilitate a sound education within, and communication among, the highly diverse professional and academic community interested in the problems cited above. We developed a process-based ontology that makes explicit specifications about fractures, their properties, and the deformation mechanisms which lead to their formation and evolution. Our ontology emphasizes the relationships among concepts such as the factors that influence the mechanism(s) responsible for the formation and evolution of specific fracture types. Our ontology is a valuable resource with potential applications in a number of fields utilizing recent advances in Information Technology, specifically for digital data and information in computers, grids, and Web services.

  2. A Method for Evaluating and Standardizing Ontologies

    Science.gov (United States)

    Seyed, Ali Patrice

    2012-01-01

    The Open Biomedical Ontology (OBO) Foundry initiative is a collaborative effort for developing interoperable, science-based ontologies. The Basic Formal Ontology (BFO) serves as the upper ontology for the domain-level ontologies of OBO. BFO is an upper ontology of types as conceived by defenders of realism. Among the ontologies developed for OBO…

  3. Manufacturing ontology through templates

    Directory of Open Access Journals (Sweden)

    Diciuc Vlad

    2017-01-01

    Full Text Available The manufacturing industry holds a large volume of high-value know-how, much of it held by key persons in the company. The passing on of this know-how is the basis of manufacturing ontology. Among other methods, such as advanced filtering and algorithm-based decision making, one way of handling the manufacturing ontology is via templates. The current paper tackles this approach and highlights its advantages, concluding with some recommendations.

  4. The Electronic Notebook Ontology

    OpenAIRE

    Chalk, Stuart

    2016-01-01

    Science is rapidly being brought into the electronic realm and electronic laboratory notebooks (ELN) are a big part of this activity. The representation of the scientific process in the context of an ELN is an important component to making the data recorded in ELNs semantically integrated. This presentation will outline initial developments of an Electronic Notebook Ontology (ENO) that will help tie together the ExptML ontology, HCLS Community Profile data descriptions, and the VIVO-ISF ontol...

  5. Isolation of Hox cluster genes from insects reveals an accelerated sequence evolution rate.

    Directory of Open Access Journals (Sweden)

    Heike Hadrys

    Full Text Available Among gene families it is the Hox genes and among metazoan animals it is the insects (Hexapoda) that have attracted particular attention for studying the evolution of development. Surprisingly though, no Hox genes have been isolated from 26 out of 35 insect orders yet, and the existing sequences derive mainly from only two orders (61% from Hymenoptera and 22% from Diptera). We have designed insect-specific primers and isolated 37 new partial homeobox sequences of Hox cluster genes (lab, pb, Hox3, ftz, Antp, Scr, abd-a, Abd-B, Dfd, and Ubx) from six insect orders, which are crucial to insect phylogenetics. These new gene sequences provide a first step towards comparative Hox gene studies in insects. Furthermore, comparative distance analyses of homeobox sequences reveal a correlation between gene divergence rate and species radiation success with insects showing the highest rate of homeobox sequence evolution.

  6. Relaxation rates of gene expression kinetics reveal the feedback signs of autoregulatory gene networks

    Science.gov (United States)

    Jia, Chen; Qian, Hong; Chen, Min; Zhang, Michael Q.

    2018-03-01

    The transient response to a stimulus and subsequent recovery to a steady state are the fundamental characteristics of a living organism. Here we study the relaxation kinetics of autoregulatory gene networks based on the chemical master equation model of single-cell stochastic gene expression with nonlinear feedback regulation. We report a novel relation between the rate of relaxation, characterized by the spectral gap of the Markov model, and the feedback sign of the underlying gene circuit. When a network has no feedback, the relaxation rate is exactly the decaying rate of the protein. We further show that positive feedback always slows down the relaxation kinetics while negative feedback always speeds it up. Numerical simulations demonstrate that this relation provides a possible method to infer the feedback topology of autoregulatory gene networks by using time-series data of gene expression.
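
    The no-feedback statement above can be illustrated on the simplest possible case. The following is a sketch under assumptions not stated in the abstract: constitutive protein synthesis at a constant rate $s$ and first-order protein decay at rate $\gamma$ (both symbols are introduced here for illustration only), tracking just the mean protein number $m(t)$ rather than the full master equation.

```latex
\frac{\mathrm{d}m}{\mathrm{d}t} = s - \gamma\, m
\quad\Longrightarrow\quad
m(t) = \frac{s}{\gamma} + \left( m(0) - \frac{s}{\gamma} \right) e^{-\gamma t}
```

    In this toy model the deviation from the steady state $s/\gamma$ decays exactly at the protein decay rate $\gamma$, which is consistent with the no-feedback case described in the abstract. How feedback shifts this rate depends on the nonlinear model studied in the paper and is not reproduced here.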

  7. Ontology Update in the Cognitive Model of Ontology Learning

    Directory of Open Access Journals (Sweden)

    Zhang De-Hai

    2016-01-01

    Full Text Available Ontology has been used in many hot-spot fields, but most ontology construction methods are semiautomatic, and the construction of an ontology is still a tedious and painstaking task. In this paper, a cognitive model is presented for ontology learning which can simulate how human beings learn from the world. In this model, the cognitive strategies are applied with the constrained axioms. Ontology update is a key step when new knowledge is added into the existing ontology and conflicts with old knowledge in the process of ontology learning. This proposal designs and validates a method of ontology update based on the axiomatic cognitive model, which includes the ontology update postulates, axioms and operations of the learning model. It is proved that these operators are subject to the established axiom system.

  8. Concurrent growth rate and transcript analyses reveal essential gene stringency in Escherichia coli.

    Directory of Open Access Journals (Sweden)

    Shan Goh

    Full Text Available BACKGROUND: Genes essential for bacterial growth are of particular scientific interest. Many putative essential genes have been identified or predicted in several species; however, little is known about gene expression requirement stringency, which may be an important aspect of bacterial physiology and likely a determining factor in drug target development. METHODOLOGY/PRINCIPAL FINDINGS: Working from the premise that essential genes differ in absolute requirement for growth, we describe silencing of putative essential genes in E. coli to obtain a titration of declining growth rates and transcript levels by using antisense peptide nucleic acids (PNA) and expressed antisense RNA. The relationship between mRNA decline and growth rate decline reflects the degree of essentiality, or stringency, of an essential gene, which is here defined by the minimum transcript level for a 50% reduction in growth rate (MTL(50)). When applied to four growth essential genes, both RNA silencing methods resulted in MTL(50) values that reveal acpP as the most stringently required of the four genes examined, with ftsZ the next most stringently required. The established antibacterial targets murA and fabI were less stringently required. CONCLUSIONS: RNA silencing can reveal stringent requirements for gene expression with respect to growth. This method may be used to validate existing essential genes and to quantify drug target requirement.
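
    To make the MTL(50) definition above concrete, here is a minimal sketch of how such a value could be read off a titration series. It assumes, purely for illustration, that transcript level and growth rate are both normalized to 1.0 in the unsilenced control and that linear interpolation between neighbouring titration points is acceptable; the function name `mtl50` and the numbers are hypothetical and are not taken from the study.

```python
def mtl50(points):
    """Estimate the transcript level at which growth rate drops to 50%.

    points: iterable of (relative_transcript_level, relative_growth_rate),
    both normalized to 1.0 for the unsilenced control (an assumption).
    Returns None if the series never crosses 50% growth.
    """
    # Walk the titration from the highest transcript level to the lowest.
    pts = sorted(points, reverse=True)
    for (t_hi, g_hi), (t_lo, g_lo) in zip(pts, pts[1:]):
        if g_hi >= 0.5 >= g_lo and g_hi != g_lo:
            # Linear interpolation between the two bracketing points.
            frac = (g_hi - 0.5) / (g_hi - g_lo)
            return t_hi - frac * (t_hi - t_lo)
    return None


# Hypothetical titration data, not measurements from the paper.
stringent_gene = [(1.0, 1.0), (0.8, 0.7), (0.6, 0.3), (0.4, 0.1)]
tolerant_gene = [(1.0, 1.0), (0.5, 0.9), (0.2, 0.6), (0.1, 0.2)]

print(mtl50(stringent_gene))
print(mtl50(tolerant_gene))
```

    Under this reading, a gene whose growth rate halves while much of its transcript is still present gets a higher MTL(50) and would count as more stringently required.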

  9. Recent adaptive events in human brain revealed by meta-analysis of positively selected genes.

    Directory of Open Access Journals (Sweden)

    Yue Huang

    Full Text Available BACKGROUND AND OBJECTIVES: Analysis of positively-selected genes can help us understand how humans evolved, especially the evolution of highly developed cognitive functions. However, previous works have reached conflicting conclusions regarding whether human neuronal genes are over-represented among genes under positive selection. METHODS AND RESULTS: We divided positively-selected genes into four groups according to the identification approaches, compiling a comprehensive list from 27 previous studies. We showed that genes that are highly expressed in the central nervous system are enriched in recent positive selection events in human history identified by intra-species genomic scan, especially in brain regions related to cognitive functions. This pattern holds when different datasets, parameters and analysis pipelines are used. Functional category enrichment analysis supported these findings, showing that synapse-related functions are enriched in genes under recent positive selection. In contrast, immune-related functions, for instance, are enriched in genes under ancient positive selection revealed by inter-species coding region comparison. We further demonstrated that most of these patterns still hold even after controlling for genomic characteristics that might bias genome-wide identification of positively-selected genes including gene length, gene density, GC composition, and intensity of negative selection. CONCLUSION: Our rigorous analysis resolved previous conflicting conclusions and revealed recent adaptation of human brain functions.
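
    The abstract does not say which statistic was used for the functional category enrichment analysis. A common choice for this kind of question is a one-sided hypergeometric test, sketched below as an assumption rather than as the authors' method; the function name `enrichment_p` and the toy counts are hypothetical.

```python
from math import comb


def enrichment_p(population, category, selected, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    population: number of genes in the background set
    category:   background genes annotated with the functional category
    selected:   genes in the list of interest (e.g. positively selected)
    overlap:    selected genes that carry the category annotation
    """
    upper = min(category, selected)
    total = comb(population, selected)
    # Sum the probability of every outcome at least as extreme as observed.
    tail = sum(
        comb(category, k) * comb(population - category, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers for illustration only.
print(enrichment_p(population=10, category=4, selected=3, overlap=3))
```

    A small p-value indicates that the category appears in the gene list more often than expected by chance; in practice such p-values are usually corrected for the number of categories tested.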

  10. Ontology Design Patterns for Combining Pathology and Anatomy: Application to Study Aging and Longevity in Inbred Mouse Strains

    KAUST Repository

    Alghamdi, Sarah M.

    2018-01-01

    To evaluate the generated ontologies, we utilize these in ontology-based data analysis, including ontology enrichment analysis and computation of semantic similarity. We demonstrate that there are significant differences between the four ontologies in different analysis approaches. In addition, when using semantic similarity to confirm the hypothesis that genetically identical mice should develop more similar diseases, the generated combined ontologies lead to significantly better analysis results compared to using each ontology individually. Our results reveal that using ontology design patterns to combine different facets characterizing a dataset can improve established analysis methods.

  11. Comparative Genomic Analysis of Transgenic Poplar Dwarf Mutant Reveals Numerous Differentially Expressed Genes Involved in Energy Flow

    Directory of Open Access Journals (Sweden)

    Su Chen

    2014-09-01

    Full Text Available In our previous research, the Tamarix androssowii LEA gene (Tamarix androssowii late embryogenesis abundant protein mRNA, GenBank ID: DQ663481) was transferred into Populus simonii × Populus nigra. Among the eleven transgenic lines, one, named dwf1, exhibited a dwarf phenotype compared to the wild type and other transgenic lines. To uncover the mechanisms underlying this phenotype, digital gene expression libraries were produced from dwf1, wild-type, and other normal transgenic lines, XL-5 and XL-6. Gene expression profile analysis indicated that dwf1 had a unique gene expression pattern in comparison to the other two transgenic lines. Finally, a total of 1246 dwf1-unique differentially expressed genes were identified. These genes were further subjected to gene ontology and pathway analysis. Results indicated that photosynthesis and carbohydrate metabolism related genes were significantly affected. In addition, many transcription factor genes were also differentially expressed in dwf1. These various differentially expressed genes may be critical for dwarf mutant formation; thus, the findings presented here might provide insight for our understanding of the mechanisms of tree growth and development.

  12. Systematic Prioritization and Integrative Analysis of Copy Number Variations in Schizophrenia Reveal Key Schizophrenia Susceptibility Genes

    Science.gov (United States)

    Luo, Xiongjian; Huang, Liang; Han, Leng; Luo, Zhenwu; Hu, Fang; Tieu, Roger; Gan, Lin

    2014-01-01

    Schizophrenia is a common mental disorder with high heritability and strong genetic heterogeneity. The common disease-common variant hypothesis predicts that schizophrenia is attributable in part to common genetic variants. However, recent studies have clearly demonstrated that copy number variations (CNVs) also play pivotal roles in schizophrenia susceptibility and explain a proportion of missing heritability. Though numerous CNVs have been identified, many of the regions affected by CNVs show poor overlap among different studies, and it is not known whether the genes disrupted by CNVs contribute to the risk of schizophrenia. By using cumulative scoring, we systematically prioritized the genes affected by CNVs in schizophrenia. We identified 8 top genes that are frequently disrupted by CNVs, including NRXN1, CHRNA7, BCL9, CYFIP1, GJA8, NDE1, SNAP29, and GJA5. Integration of genes affected by CNVs with known schizophrenia susceptibility genes (from previous genetic linkage and association studies) reveals that many genes disrupted by CNVs are also associated with schizophrenia. Further protein-protein interaction (PPI) analysis indicates that protein products of genes affected by CNVs frequently interact with known schizophrenia-associated proteins. Finally, systematic integration of CNV prioritization data with genetic association and PPI data identifies key schizophrenia candidate genes. Our results provide a global overview of genes impacted by CNVs in schizophrenia and reveal a densely interconnected molecular network of de novo CNVs in schizophrenia. Though the prioritized top genes represent promising schizophrenia risk genes, further work with different prioritization methods and independent samples is needed to confirm these findings. Nevertheless, the identified key candidate genes may have important roles in the pathogenesis of schizophrenia, and further functional characterization of these genes may provide pivotal targets for future therapeutics and
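
    The abstract names cumulative scoring but does not describe the scheme. One simple reading, shown below as an assumption only, is to give a gene one point for every study whose CNVs disrupt it and rank genes by the total. The gene symbols come from the abstract; the study labels, the gene-to-study assignments, and the function name `prioritize` are hypothetical.

```python
from collections import Counter


def prioritize(cnv_genes_by_study):
    """Rank genes by the number of studies whose CNVs affect them.

    cnv_genes_by_study: dict mapping a study label to the genes
    reported as disrupted by CNVs in that study.
    Returns (gene, score) pairs, highest score first, ties by name.
    """
    scores = Counter()
    for genes in cnv_genes_by_study.values():
        # set() ensures a gene counts at most once per study.
        scores.update(set(genes))
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical input for illustration, not data from the paper.
studies = {
    "study_A": ["NRXN1", "CHRNA7"],
    "study_B": ["NRXN1", "BCL9"],
    "study_C": ["NRXN1", "CHRNA7", "NRXN1"],
}

for gene, score in prioritize(studies):
    print(gene, score)
```

    Counting each gene once per study keeps a single large study from dominating the ranking; a weighted scheme would be an equally plausible interpretation.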

  13. The Mapping of Predicted Triplex DNA:RNA in the Drosophila Genome Reveals a Prominent Location in Development- and Morphogenesis-Related Genes

    Directory of Open Access Journals (Sweden)

    Claude Pasquier

    2017-07-01

    Full Text Available Double-stranded DNA is able to form triple-helical structures by accommodating a third nucleotide strand. A nucleic acid triplex occurs according to Hoogsteen rules that predict the stability and affinity of the third strand bound to the Watson–Crick duplex. The “triplex-forming oligonucleotide” (TFO can be a short sequence of RNA that binds to the major groove of the targeted duplex only when this duplex presents a sequence of purine or pyrimidine bases in one of the DNA strands. Many nuclear proteins are known to bind triplex DNA or DNA:RNA, but their biological functions are unexplored. We identified sequences that are capable of engaging as the “triplex-forming oligonucleotide” in both the pre-lncRNA and pre-mRNA collections of Drosophila melanogaster. These motifs were matched against the Drosophila genome in order to identify putative sequences of triplex formation in intergenic regions, promoters, and introns/exons. Most of the identified TFOs appear to be located in the intronic region of the analyzed genes. Computational prediction of the most targeted genes by TFOs originating from pre-lncRNAs and pre-mRNAs revealed that they are restrictively associated with development- and morphogenesis-related gene networks. The refined analysis by Gene Ontology enrichment demonstrates that some individual TFOs present genome-wide scale matches that are located in numerous genes and regulatory sequences. The triplex DNA:RNA computational mapping at the genome-wide scale suggests broad interference in the regulatory process of the gene networks orchestrated by TFO RNAs acting in association simultaneously at multiple sites.

  14. Cross-species transcriptomic approach reveals genes in hamster implantation sites.

    Science.gov (United States)

    Lei, Wei; Herington, Jennifer; Galindo, Cristi L; Ding, Tianbing; Brown, Naoko; Reese, Jeff; Paria, Bibhash C

    2014-12-01

    The mouse model has greatly contributed to understanding molecular mechanisms involved in the regulation of the progesterone (P4) plus estrogen (E)-dependent blastocyst implantation process. However, little is known about contributory molecular mechanisms of the P4-only-dependent blastocyst implantation process that occurs in species such as hamsters, guinea pigs, rabbits, pigs, rhesus monkeys, and perhaps humans. We used the hamster as a model of P4-only-dependent blastocyst implantation and carried out cross-species microarray (CSM) analyses to reveal differentially expressed genes at the blastocyst implantation site (BIS), in order to advance the understanding of molecular mechanisms of implantation. Upregulation of 112 genes and downregulation of 77 genes at the BIS were identified using a mouse microarray platform, while use of the human microarray revealed 62 up- and 38 down-regulated genes at the BIS. Excitingly, a sizable number of genes (30 up- and 11 down-regulated genes) were identified as a shared pool by both CSMs. Real-time RT-PCR and in situ hybridization validated the expression patterns of several up- and down-regulated genes identified by both CSMs at the hamster and mouse BIS to demonstrate the merit of CSM findings across species, in addition to revealing genes specific to hamsters. Functional annotation analysis found that genes involved in the spliceosome, proteasome, and ubiquitination pathways are enriched at the hamster BIS, while genes associated with tight junction, SAPK/JNK signaling, and PPARα/RXRα signaling are repressed at the BIS. Overall, this study provides a pool of genes and evidence of their participation in up- and down-regulated cellular functions/pathways at the hamster BIS. © 2014 Society for Reproduction and Fertility.

  15. Mining rare associations between biological ontologies.

    Science.gov (United States)

    Benites, Fernando; Simon, Svenja; Sapozhnikova, Elena

    2014-01-01

    The constantly increasing volume and complexity of available biological data requires new methods for their management and analysis. An important challenge is the integration of information from different sources in order to discover possible hidden relations between already known data. In this paper we introduce a data mining approach which relates biological ontologies by mining cross and intra-ontology pairwise generalized association rules. Its advantage is sensitivity to rare associations, for these are important for biologists. We propose a new class of interestingness measures designed for hierarchically organized rules. These measures allow one to select the most important rules and to take into account rare cases. They favor rules with an actual interestingness value that exceeds the expected value. The latter is calculated taking into account the parent rule. We demonstrate this approach by applying it to the analysis of data from Gene Ontology and GPCR databases. Our objective is to discover interesting relations between two different ontologies or parts of a single ontology. The association rules that are thus discovered can provide the user with new knowledge about underlying biological processes or help improve annotation consistency. The obtained results show that produced rules represent meaningful and quite reliable associations.
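
    The idea of favouring rules whose interestingness exceeds what the parent rule would predict can be sketched as follows. This is not the measure proposed in the paper, whose definition is not given in the abstract. As stated assumptions, the sketch uses plain rule confidence as the interestingness value, takes the confidence of the parent rule as the expected value, and requires each record to list a term together with its ancestors. All term names and records are hypothetical.

```python
def confidence(records, antecedent, consequent):
    """Fraction of records containing the antecedent that also
    contain the consequent (confidence of antecedent -> consequent)."""
    with_antecedent = [r for r in records if antecedent in r]
    if not with_antecedent:
        return 0.0
    hits = sum(1 for r in with_antecedent if consequent in r)
    return hits / len(with_antecedent)


def exceeds_parent(records, child, parent, consequent):
    """True if the rule child -> consequent is more confident than
    the more general rule parent -> consequent."""
    actual = confidence(records, child, consequent)
    expected = confidence(records, parent, consequent)
    return actual > expected


# Hypothetical annotations: each set mixes terms from two ontologies.
records = [
    {"term_parent", "term_child", "class_A"},
    {"term_parent", "term_child", "class_A"},
    {"term_parent", "class_B"},
    {"term_parent", "class_B"},
]

print(confidence(records, "term_child", "class_A"))
print(confidence(records, "term_parent", "class_A"))
print(exceeds_parent(records, "term_child", "term_parent", "class_A"))
```

    Here the child rule covers only half of the records, yet it is kept because it is more reliable than its parent rule, which is the kind of rare association the abstract aims to surface.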

  16. Mining rare associations between biological ontologies.

    Directory of Open Access Journals (Sweden)

    Fernando Benites

    Full Text Available The constantly increasing volume and complexity of available biological data requires new methods for their management and analysis. An important challenge is the integration of information from different sources in order to discover possible hidden relations between already known data. In this paper we introduce a data mining approach which relates biological ontologies by mining cross and intra-ontology pairwise generalized association rules. Its advantage is sensitivity to rare associations, for these are important for biologists. We propose a new class of interestingness measures designed for hierarchically organized rules. These measures allow one to select the most important rules and to take into account rare cases. They favor rules with an actual interestingness value that exceeds the expected value. The latter is calculated taking into account the parent rule. We demonstrate this approach by applying it to the analysis of data from Gene Ontology and GPCR databases. Our objective is to discover interesting relations between two different ontologies or parts of a single ontology. The association rules that are thus discovered can provide the user with new knowledge about underlying biological processes or help improve annotation consistency. The obtained results show that produced rules represent meaningful and quite reliable associations.

  17. Identification of horizontally transferred genes in the genus Colletotrichum reveals a steady tempo of bacterial to fungal gene transfer.

    Science.gov (United States)

    Jaramillo, Vinicio D Armijos; Sukno, Serenella A; Thon, Michael R

    2015-01-02

    Horizontal gene transfer (HGT) is the stable transmission of genetic material between organisms by means other than vertical inheritance. HGT has an important role in the evolution of prokaryotes but is relatively rare in eukaryotes. HGT has been shown to contribute to virulence in eukaryotic pathogens. We studied the importance of HGT in plant pathogenic fungi by identifying horizontally transferred genes in the genomes of three members of the genus Colletotrichum. We identified eleven HGT events from bacteria into members of the genus Colletotrichum or their ancestors. The HGT events include genes involved in amino acid, lipid and sugar metabolism as well as lytic enzymes. Additionally, the putative minimal dates of transference were calculated using a time calibrated phylogenetic tree. This analysis reveals a constant flux of genes from bacteria to fungi throughout the evolution of subphylum Pezizomycotina. Genes that are typically transferred by HGT are those that are constantly subject to gene duplication and gene loss. The functions of some of these genes suggest roles in niche adaptation and virulence. We found no evidence of a burst of HGT events coinciding with major geological events. In contrast, HGT appears to be a constant, albeit rare phenomenon in the Pezizomycotina, occurring at a steady rate during their evolution.

  18. DeMO: An Ontology for Discrete-event Modeling and Simulation

    Science.gov (United States)

    Silver, Gregory A; Miller, John A; Hybinette, Maria; Baramidze, Gregory; York, William S

    2011-01-01

    Several fields have created ontologies for their subdomains. For example, the biological sciences have developed extensive ontologies such as the Gene Ontology, which is considered a great success. Ontologies could provide similar advantages to the Modeling and Simulation community. They provide a way to establish common vocabularies and capture knowledge about a particular domain with community-wide agreement. Ontologies can support significantly improved (semantic) search and browsing, integration of heterogeneous information sources, and improved knowledge discovery capabilities. This paper discusses the design and development of an ontology for Modeling and Simulation called the Discrete-event Modeling Ontology (DeMO), and it presents prototype applications that demonstrate various uses and benefits that such an ontology may provide to the Modeling and Simulation community. PMID:22919114

  19. Gene expression profiling in equine polysaccharide storage myopathy revealed inflammation, glycogenesis inhibition, hypoxia and mitochondrial dysfunctions

    Directory of Open Access Journals (Sweden)

    Benech Philippe

    2009-08-01

    Full Text Available Abstract Background Several cases of myopathies have been observed in the horse Norman Cob breed. Muscle histology examinations revealed that some families suffer from a polysaccharide storage myopathy (PSSM). It is assumed that a gene expression signature related to PSSM should be observed at the transcriptional level because the glycogen storage disease could also be linked to other dysfunctions in gene regulation. Thus, the functional genomic approach could be conducted in order to provide new knowledge about the metabolic disorders related to PSSM. We propose exploring the PSSM muscle fiber metabolic disorders by measuring gene expression in relationship with the histological phenotype. Results Genotyping analysis of GYS1 mutation revealed 2 homozygous (AA) and 5 heterozygous (GA) PSSM horses. In the PSSM muscles, histological data revealed PAS positive amylase resistant abnormal polysaccharides, inflammation, necrosis, and lipomatosis and active regeneration of fibers. Ultrastructural evaluation revealed a decrease of mitochondrial number and structural disorders. Extensive accumulation of an abnormal polysaccharide displaced and partially replaced mitochondria and myofibrils. The severity of the disease was higher in the two homozygous PSSM horses. Gene expression analysis revealed 129 genes significantly modulated (p Conclusion The main disorders observed in PSSM muscles could be related to mitochondrial dysfunctions, glycogenesis inhibition and the chronic hypoxia of the PSSM muscles.

  20. Gene expression profiling of canine osteosarcoma reveals genes associated with short and long survival times

    Directory of Open Access Journals (Sweden)

    Rao Nagesha AS

    2009-09-01

    Full Text Available Abstract Background Gene expression profiling of spontaneous tumors in the dog offers a unique translational opportunity to identify prognostic biomarkers and signaling pathways that are common to both canine and human. Osteosarcoma (OS) accounts for approximately 80% of all malignant bone tumors in the dog. Canine OS are highly comparable with their human counterpart with respect to histology, high metastatic rate and poor long-term survival. This study investigates the prognostic gene profile among thirty-two primary canine OS using canine specific cDNA microarrays representing 20,313 genes to identify genes and cellular signaling pathways associated with survival. This, the first report of its kind in dogs with OS, also demonstrates the advantages of cross-species comparison with human OS. Results The 32 tumors were classified into two prognostic groups based on survival time (ST). They were defined as short survivors (dogs with poor prognosis: surviving fewer than 6 months) and long survivors (dogs with better prognosis: surviving 6 months or longer). Fifty-one transcripts were found to be differentially expressed, with common upregulation of these genes in the short survivors. The overexpressed genes in short survivors are associated with possible roles in proliferation, drug resistance or metastasis. Several deregulated pathways identified in the present study, including Wnt signaling, Integrin signaling and Chemokine/cytokine signaling are comparable to the pathway analysis conducted on human OS gene profiles, emphasizing the value of the dog as an excellent model for humans. Conclusion A molecular-based method for discrimination of outcome for short and long survivors is useful for future prognostic stratification at initial diagnosis, where genes and pathways associated with cell cycle/proliferation, drug resistance and metastasis could be potential targets for diagnosis and therapy. The similarities between human and canine OS makes the

  1. Annotating the human genome with Disease Ontology

    Science.gov (United States)

    Osborne, John D; Flatow, Jared; Holko, Michelle; Lin, Simon M; Kibbe, Warren A; Zhu, Lihua (Julie); Danila, Maria I; Feng, Gang; Chisholm, Rex L

    2009-01-01

    Background The human genome has been extensively annotated with Gene Ontology for biological functions, but minimally computationally annotated for diseases. Results We used the Unified Medical Language System (UMLS) MetaMap Transfer tool (MMTx) to discover gene-disease relationships from the GeneRIF database. We utilized a comprehensive subset of UMLS, which is disease-focused and structured as a directed acyclic graph (the Disease Ontology), to filter and interpret results from MMTx. The results were validated against the Homayouni gene collection using recall and precision measurements. We compared our results with the widely used Online Mendelian Inheritance in Man (OMIM) annotations. Conclusion The validation data set suggests a 91% recall rate and 97% precision rate of disease annotation using GeneRIF, in contrast with a 22% recall and 98% precision using OMIM. Our thesaurus-based approach allows for comparisons to be made between disease containing databases and allows for increased accuracy in disease identification through synonym matching. The much higher recall rate of our approach demonstrates that annotating human genome with Disease Ontology and GeneRIF for diseases dramatically increases the coverage of the disease annotation of human genome. PMID:19594883
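
    The recall and precision figures quoted in this record can be illustrated with a minimal sketch. It assumes annotations are represented as (gene, disease) pairs and that a reference set is available; the function name and the toy data are hypothetical and are not taken from the study.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) for two collections of annotations."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    # Precision: share of predicted annotations that are in the reference set.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of reference annotations that were predicted.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical gene-disease pairs, invented purely for illustration.
reference = {
    ("GENE1", "disease A"),
    ("GENE2", "disease B"),
    ("GENE3", "disease C"),
    ("GENE4", "disease D"),
}
predicted = {
    ("GENE1", "disease A"),
    ("GENE2", "disease B"),
    ("GENE3", "disease C"),
    ("GENE5", "disease E"),
}
precision, recall = precision_recall(predicted, reference)
```

    A method can score high on precision while missing most reference annotations, which is the pattern the record reports for OMIM (98% precision, 22% recall).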

  2. Sexually Dimorphic Gene Expression Associated with Growth and Reproduction of Tongue Sole (Cynoglossus semilaevis) Revealed by Brain Transcriptome Analysis.

    Science.gov (United States)

    Wang, Pingping; Zheng, Min; Liu, Jian; Liu, Yongzhuang; Lu, Jianguo; Sun, Xiaowen

    2016-08-26

    In this study, we performed a comprehensive analysis of the transcriptome of one- and two-year-old male and female brains of Cynoglossus semilaevis by high-throughput Illumina sequencing. A total of 77,066 transcripts, corresponding to 21,475 unigenes, were obtained with a N50 value of 4349 bp. Of these unigenes, 33 genes were found to have significant differential expression and potentially associated with growth, from which 18 genes were down-regulated and 12 genes were up-regulated in two-year-old males, most of these genes had no significant differences in expression among one-year-old males and females and two-year-old females. A similar analysis was conducted to look for genes associated with reproduction; 25 genes were identified, among them, five genes were found to be down regulated and 20 genes up regulated in two-year-old males, again, most of the genes had no significant expression differences among the other three. The performance of up regulated genes in Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis was significantly different between two-year-old males and females. Males had a high gene expression in genetic information processing, while female's highly expressed genes were mainly enriched on organismal systems. Our work identified a set of sex-biased genes potentially associated with growth and reproduction that might be the candidate factors affecting sexual dimorphism of tongue sole, laying the foundation to understand the complex process of sex determination of this economic valuable species.
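
    The N50 value quoted in this record is a common assembly summary statistic. The sketch below uses the usual definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length). The function name and the example lengths are hypothetical; the study's own tooling is not described in the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical transcript lengths in bp, for illustration only.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
example_n50 = n50(example_lengths)
```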

  3. Sexually Dimorphic Gene Expression Associated with Growth and Reproduction of Tongue Sole (Cynoglossus semilaevis) Revealed by Brain Transcriptome Analysis

    Directory of Open Access Journals (Sweden)

    Pingping Wang

    2016-08-01

    Full Text Available In this study, we performed a comprehensive analysis of the transcriptome of one- and two-year-old male and female brains of Cynoglossus semilaevis by high-throughput Illumina sequencing. A total of 77,066 transcripts, corresponding to 21,475 unigenes, were obtained with a N50 value of 4349 bp. Of these unigenes, 33 genes were found to have significant differential expression and potentially associated with growth, from which 18 genes were down-regulated and 12 genes were up-regulated in two-year-old males, most of these genes had no significant differences in expression among one-year-old males and females and two-year-old females. A similar analysis was conducted to look for genes associated with reproduction; 25 genes were identified, among them, five genes were found to be down regulated and 20 genes up regulated in two-year-old males, again, most of the genes had no significant expression differences among the other three. The performance of up regulated genes in Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis was significantly different between two-year-old males and females. Males had a high gene expression in genetic information processing, while female’s highly expressed genes were mainly enriched on organismal systems. Our work identified a set of sex-biased genes potentially associated with growth and reproduction that might be the candidate factors affecting sexual dimorphism of tongue sole, laying the foundation to understand the complex process of sex determination of this economic valuable species.

  4. OpenDMAP: An open source, ontology-driven concept analysis engine, with applications to capturing knowledge regarding protein transport, protein interactions and cell-type-specific gene expression

    Directory of Open Access Journals (Sweden)

    Johnson Helen L

    2008-01-01

    Full Text Available Abstract Background Information extraction (IE) efforts are widely acknowledged to be important in harnessing the rapid advance of biomedical knowledge, particularly in areas where important factual information is published in a diverse literature. Here we report on the design, implementation and several evaluations of OpenDMAP, an ontology-driven, integrated concept analysis system. It significantly advances the state of the art in information extraction by leveraging knowledge in ontological resources, integrating diverse text processing applications, and using an expanded pattern language that allows the mixing of syntactic and semantic elements and variable ordering. Results OpenDMAP information extraction systems were produced for extracting protein transport assertions (transport), protein-protein interaction assertions (interaction) and assertions that a gene is expressed in a cell type (expression). Evaluations were performed on each system, resulting in F-scores ranging from .26 – .72 (precision .39 – .85, recall .16 – .85). Additionally, each of these systems was run over all abstracts in MEDLINE, producing a total of 72,460 transport instances, 265,795 interaction instances and 176,153 expression instances. Conclusion OpenDMAP advances the performance standards for extracting protein-protein interaction predications from the full texts of biomedical research articles. Furthermore, this level of performance appears to generalize to other information extraction tasks, including extracting information about predicates of more than two arguments. The output of the information extraction system is always constructed from elements of an ontology, ensuring that the knowledge representation is grounded with respect to a carefully constructed model of reality. The results of these efforts can be used to increase the efficiency of manual curation efforts and to provide additional features in systems that integrate multiple sources for

  5. QTL mapping and transcriptome analysis of cowpea reveals candidate genes for root-knot nematode resistance.

    Science.gov (United States)

    Santos, Jansen Rodrigo Pereira; Ndeve, Arsenio Daniel; Huynh, Bao-Lam; Matthews, William Charles; Roberts, Philip Alan

    2018-01-01

    Cowpea is one of the most important food and forage legumes in drier regions of the tropics and subtropics. However, cowpea yield worldwide is markedly below the known potential due to abiotic and biotic stresses, including parasitism by root-knot nematodes (Meloidogyne spp., RKN). Two resistance genes with dominant effect, Rk and Rk2, have been reported to provide resistance against RKN in cowpea. Despite their description and use in breeding for resistance to RKN and particularly genetic mapping of the Rk locus, the exact genes conferring resistance to RKN remain unknown. In the present work, QTL mapping using recombinant inbred line (RIL) population 524B x IT84S-2049 segregating for a newly mapped locus and analysis of the transcriptome changes in two cowpea near-isogenic lines (NIL) were used to identify candidate genes for Rk and the newly mapped locus. A major QTL, designated QRk-vu9.1, associated with resistance to Meloidogyne javanica reproduction, was detected and mapped on linkage group LG9 at position 13.37 cM using egg production data. Transcriptome analysis on resistant and susceptible NILs 3 and 9 days after inoculation revealed up-regulation of 109 and 98 genes and down-regulation of 110 and 89 genes, respectively, out of 19,922 unique genes mapped to the common bean reference genome. Among the differentially expressed genes, four and nine genes were found within the QRk-vu9.1 and QRk-vu11.1 QTL intervals, respectively. Six of these genes belong to the TIR-NBS-LRR family of resistance genes and three were upregulated at one or more time-points. Quantitative RT-PCR validated gene expression to be positively correlated with RNA-seq expression pattern for eight genes. Future functional analysis of these cowpea genes will enhance our understanding of Rk-mediated resistance and identify the specific gene responsible for the resistance.

  6. QTL mapping and transcriptome analysis of cowpea reveals candidate genes for root-knot nematode resistance.

    Directory of Open Access Journals (Sweden)

    Jansen Rodrigo Pereira Santos

    Full Text Available Cowpea is one of the most important food and forage legumes in drier regions of the tropics and subtropics. However, cowpea yield worldwide is markedly below the known potential due to abiotic and biotic stresses, including parasitism by root-knot nematodes (Meloidogyne spp., RKN). Two resistance genes with dominant effect, Rk and Rk2, have been reported to provide resistance against RKN in cowpea. Despite their description and use in breeding for resistance to RKN and particularly genetic mapping of the Rk locus, the exact genes conferring resistance to RKN remain unknown. In the present work, QTL mapping using recombinant inbred line (RIL) population 524B x IT84S-2049 segregating for a newly mapped locus and analysis of the transcriptome changes in two cowpea near-isogenic lines (NIL) were used to identify candidate genes for Rk and the newly mapped locus. A major QTL, designated QRk-vu9.1, associated with resistance to Meloidogyne javanica reproduction, was detected and mapped on linkage group LG9 at position 13.37 cM using egg production data. Transcriptome analysis on resistant and susceptible NILs 3 and 9 days after inoculation revealed up-regulation of 109 and 98 genes and down-regulation of 110 and 89 genes, respectively, out of 19,922 unique genes mapped to the common bean reference genome. Among the differentially expressed genes, four and nine genes were found within the QRk-vu9.1 and QRk-vu11.1 QTL intervals, respectively. Six of these genes belong to the TIR-NBS-LRR family of resistance genes and three were upregulated at one or more time-points. Quantitative RT-PCR validated gene expression to be positively correlated with RNA-seq expression pattern for eight genes. Future functional analysis of these cowpea genes will enhance our understanding of Rk-mediated resistance and identify the specific gene responsible for the resistance.

  7. Enhanced Botrytis cinerea resistance of Arabidopsis plants grown in compost may be explained by increased expression of defense-related genes, as revealed by microarray analysis.

    Directory of Open Access Journals (Sweden)

    Guillem Segarra

    Full Text Available Composts are the products obtained after the aerobic degradation of different types of organic matter waste and can be used as substrates or substrate/soil amendments for plant cultivation. There is a small but increasing number of reports that suggest that foliar diseases may be reduced when using compost, rather than standard substrates, as growing medium. The purpose of this study was to examine the gene expression alteration produced by the compost to gain knowledge of the mechanisms involved in compost-induced systemic resistance. A compost from olive marc and olive tree leaves was able to induce resistance against Botrytis cinerea in Arabidopsis, unlike the standard substrate, perlite. Microarray analyses revealed that 178 genes were differently expressed, with a fold change cut-off of 1, of which 155 were up-regulated and 23 were down-regulated in compost-grown, as against perlite-grown plants. A functional enrichment study of up-regulated genes revealed that 38 Gene Ontology terms were significantly enriched. Response to stress, biotic stimulus, other organism, bacterium, fungus, chemical and abiotic stimulus, SA and ABA stimulus, oxidative stress, water, temperature and cold were significantly enriched, as were immune and defense responses, systemic acquired resistance, secondary metabolic process and oxidoreductase activity. Interestingly, PR1 expression, which was equally enhanced by growing the plants in compost and by B. cinerea inoculation, was further boosted in compost-grown pathogen-inoculated plants. Compost triggered a plant response that shares similarities with both systemic acquired resistance and ABA-dependent/independent abiotic stress responses.

  8. Molecular evolution and diversification of snake toxin genes, revealed by analysis of intron sequences.

    Science.gov (United States)

    Fujimi, T J; Nakajyo, T; Nishimura, E; Ogura, E; Tsuchiya, T; Tamiya, T

    2003-08-14

    The genes encoding erabutoxin (short chain neurotoxin) isoforms (Ea, Eb, and Ec), LsIII (long chain neurotoxin) and a novel long chain neurotoxin pseudogene were cloned from a Laticauda semifasciata genomic library. Short and long chain neurotoxin genes were also cloned from the genome of Laticauda laticaudata, a closely related species of L. semifasciata, by PCR. A putative matrix attached region (MAR) sequence was found in the intron I of the LsIII gene. Comparative analysis of 11 structurally relevant snake toxin genes (three-finger-structure toxins) revealed the molecular evolution of these toxins. Three-finger-structure toxin genes diverged from a common ancestor through two types of evolutionary pathways (long and short types), early in the course of evolution. At a later stage of evolution in each gene, the accumulation of mutations in the exons, especially exon II, by accelerated evolution may have caused the increased diversification in their functions. It was also revealed that the putative MAR sequence found in the LsIII gene was integrated into the gene after the species-level divergence.

  9. The gene expression profile of resistant and susceptible Bombyx mori strains reveals cypovirus-associated variations in host gene transcript levels.

    Science.gov (United States)

    Guo, Rui; Wang, Simei; Xue, Renyu; Cao, Guangli; Hu, Xiaolong; Huang, Moli; Zhang, Yangqi; Lu, Yahong; Zhu, Liyuan; Chen, Fei; Liang, Zi; Kuang, Sulan; Gong, Chengliang

    2015-06-01

    High-throughput paired-end RNA sequencing (RNA-Seq) was performed to investigate the gene expression profile of a susceptible Bombyx mori strain, Lan5, and a resistant B. mori strain, Ou17, which were both orally infected with B. mori cypovirus (BmCPV) in the midgut. There were 330 and 218 up-regulated genes, while there were 147 and 260 down-regulated genes in the Lan5 and Ou17 strains, respectively. Gene ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment for differentially expressed genes (DEGs) were carried out. Moreover, gene interaction network (STRING) analyses were performed to analyze the relationships among the shared DEGs. Some of these genes were related and formed a large network, in which the genes for B. mori cuticular protein RR-2 motif 123 (BmCPR123) and the gene for B. mori DNA replication licensing factor Mcm2-like (BmMCM2) were key genes among the common up-regulated DEGs, whereas the gene for B. mori heat shock protein 20.1 (Bmhsp20.1) was the central gene among the shared down-regulated DEGs between Lan5 vs Lan5-CPV and Ou17 vs Ou17-CPV. These findings established a comprehensive database of genes that are differentially expressed in response to BmCPV infection between silkworm strains that differed in resistance to BmCPV and implied that these DEGs might be involved in B. mori immune responses against BmCPV infection.

  10. Iron homeostasis in Arabidopsis thaliana: transcriptomic analyses reveal novel FIT-regulated genes, iron deficiency marker genes and functional gene networks.

    Science.gov (United States)

    Mai, Hans-Jörg; Pateyron, Stéphanie; Bauer, Petra

    2016-10-03

    FIT (FER-LIKE IRON DEFICIENCY-INDUCED TRANSCRIPTION FACTOR) is the central regulator of iron uptake in Arabidopsis thaliana roots. We performed transcriptome analyses of six day-old seedlings and roots of six week-old plants using wild type, a fit knock-out mutant and a FIT over-expression line grown under iron-sufficient or iron-deficient conditions. We compared genes regulated in a FIT-dependent manner depending on the developmental stage of the plants. We assembled a high likelihood dataset which we used to perform co-expression and functional analysis of the most stably iron deficiency-induced genes. 448 genes were found FIT-regulated. Out of these, 34 genes were robustly FIT-regulated in root and seedling samples and included 13 novel FIT-dependent genes. Three hundred thirty-one genes showed differential regulation in response to the presence and absence of FIT only in the root samples, while this was the case for 83 genes in the seedling samples. We assembled a virtual dataset of iron-regulated genes based on a total of 14 transcriptomic analyses of iron-deficient and iron-sufficient wild-type plants to pinpoint the best marker genes for iron deficiency and analyzed this dataset in depth. Co-expression analysis of this dataset revealed 13 distinct regulons part of which predominantly contained functionally related genes. We could enlarge the list of FIT-dependent genes and discriminate between genes that are robustly FIT-regulated in roots and seedlings or only in one of those. FIT-regulated genes were mostly induced, few of them were repressed by FIT. With the analysis of a virtual dataset we could filter out and pinpoint new candidates among the most reliable marker genes for iron deficiency. Moreover, co-expression and functional analysis of this virtual dataset revealed iron deficiency-induced and functionally distinct regulons.

  11. Genome association study through nonlinear mixed models revealed new candidate genes for pig growth curves

    Directory of Open Access Journals (Sweden)

    Fabyano Fonseca e Silva

    Full Text Available ABSTRACT: Genome association analyses have been successful in identifying quantitative trait loci (QTLs) for pig body weights measured at a single age. However, when considering the whole weight trajectories over time in the context of genome association analyses, it is important to look at the markers that affect growth curve parameters. The easiest way to consider them is via the two-step method, in which the growth curve parameters and marker effects are estimated separately, thereby resulting in a reduction of the statistical power and the precision of estimates. One efficient solution is to adopt nonlinear mixed models (NMM), which enables a joint modeling of the individual growth curves and marker effects. Our aim was to propose a genome association analysis for growth curves in pigs based on NMM as well as to compare it with the traditional two-step method. In addition, we also aimed to identify the nearest candidate genes related to significant SNP (single nucleotide polymorphism) markers. The NMM presented a higher number of significant SNPs for adult weight (A) and maturity rate (K), and provided a direct way to test SNP significance simultaneously for both the A and K parameters. Furthermore, all significant SNPs from the two-step method were also reported in the NMM analysis. The ontology of the three candidate genes (SH3BGRL2, MAPK14, and MYL9) derived from significant SNPs (simultaneously affecting A and K) allows us to make inferences with regards to their contribution to the pig growth process in the population studied.

  12. Gene expression profiling reveals candidate genes related to residual feed intake in duodenum of laying ducks.

    Science.gov (United States)

    Zeng, T; Huang, L; Ren, J; Chen, L; Tian, Y; Huang, Y; Zhang, H; Du, J; Lu, L

    2017-12-01

    Feed represents two-thirds of the total costs of poultry production, especially in developing countries. Improvement in feed efficiency would reduce the amount of feed required for production (growth or laying), the production cost, and the amount of nitrogenous waste. The most commonly used measures for feed efficiency are feed conversion ratio (FCR) and residual feed intake (RFI). As a more suitable indicator assessing feed efficiency, RFI is defined as the difference between observed and expected feed intake based on maintenance and growth or laying. However, the genetic and biological mechanisms regulating RFI are largely unknown. Identifying molecular mechanisms explaining divergence in RFI in laying ducks would lead to the development of early detection methods for the selection of more efficient breeding poultry. The objective of this study was to identify duodenum genes and pathways through transcriptional profiling in 2 extreme RFI phenotypes (HRFI and LRFI) of the duck population. Phenotypic aspects of feed efficiency showed that RFI was strongly positive with FCR and feed intake (FI). Transcriptomic analysis identified 35 differentially expressed genes between LRFI and HRFI ducks. These genes play an important role in metabolism, digestibility, secretion, and innate immunity including (), (), (), β (), and (). These results improve our knowledge of the biological basis underlying RFI, which would be useful for further investigations of key candidate genes for RFI and for the development of biomarkers.
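
    The two feed-efficiency measures named in this record can be sketched as follows. The abstract defines RFI as observed minus expected feed intake, where the expectation depends on maintenance and on growth or laying; as a simplifying assumption, this sketch predicts feed intake from a single hypothetical production variable by ordinary least squares. All function names and data are illustrative and are not the study's model.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residual_feed_intake(feed_intake, production):
    """RFI per animal: observed feed intake minus the fitted expectation."""
    intercept, slope = fit_line(production, feed_intake)
    return [fi - (intercept + slope * p) for fi, p in zip(feed_intake, production)]


def feed_conversion_ratio(feed_intake, production):
    """FCR: feed consumed per unit of output."""
    return feed_intake / production


# Hypothetical per-animal values (assumed units: g/day), for illustration only.
production = [50, 55, 60, 65]
feed_intake = [150, 170, 165, 185]
rfi = residual_feed_intake(feed_intake, production)
fcr = [feed_conversion_ratio(f, p) for f, p in zip(feed_intake, production)]
```

    Under this definition an animal with negative RFI eats less than expected for its output, which is how low-RFI (LRFI) animals are generally interpreted as the more efficient group.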

  13. Global Analysis of miRNA Gene Clusters and Gene Families Reveals Dynamic and Coordinated Expression

    Directory of Open Access Journals (Sweden)

    Li Guo

    2014-01-01

    Full Text Available To further understand the potential expression relationships of miRNAs in miRNA gene clusters and gene families, a global analysis was performed in 4 paired tumor (breast cancer) and adjacent normal tissue samples using deep sequencing datasets. The compositions of miRNA gene clusters and families are not random, and clustered and homologous miRNAs may have close relationships with overlapped miRNA species. Members in the miRNA group always had various expression levels, and even some showed larger expression divergence. Despite the dynamic expression as well as individual difference, these miRNAs always indicated consistent or similar deregulation patterns. The consistent deregulation expression may contribute to dynamic and coordinated interaction between different miRNAs in regulatory network. Further, we found that those clustered or homologous miRNAs that were also identified as sense and antisense miRNAs showed larger expression divergence. miRNA gene clusters and families indicated important biological roles, and the specific distribution and expression further enrich and ensure the flexible and robust regulatory network.

  14. Matching biomedical ontologies based on formal concept analysis.

    Science.gov (United States)

    Zhao, Mengyi; Zhang, Songmao; Li, Weizhuo; Chen, Guowei

    2018-03-19

    demonstrates the effectiveness of FCA-Map and its competitiveness with the top-ranked systems. FCA-Map can achieve a better balance between precision and recall for large-scale domain ontologies through constructing multiple FCA structures, whereas it performs unsatisfactorily for smaller-sized ontologies with less lexical and semantic expressions. Compared with other FCA-based OM systems, the study in this paper is more comprehensive as an attempt to push the envelope of the Formal Concept Analysis formalism in ontology matching tasks. Five types of formal contexts are constructed incrementally, and their derived concept lattices are used to cluster the commonalities among classes at lexical and structural level, respectively. Experiments on large, real-world domain ontologies show promising results and reveal the power of FCA.

  15. Ontology: ambiguity and accuracy

    Directory of Open Access Journals (Sweden)

    Marcelo Schiessl

    2012-08-01

    Full Text Available Ambiguity is a major obstacle to information retrieval and the subject of much research in Information Science. Ontologies have been studied as a way to solve problems related to ambiguity. Paradoxically, the term “ontology” is itself ambiguous, and its meaning depends on the community using it. Philosophy and Computer Science seem to differ most sharply in the sense they give the term. The former holds undisputed tradition and authority. The latter, despite being quite recent, uses the term in an informal but pragmatic sense. Information Science draws on approaches ranging from the philosophical to the computational in order to organize collections based on a balance between users’ needs and available information. The semantic web requires automation of the information cycle and demands studies related to ontologies. Consequently, revisiting relevant approaches to the study of ontologies is a way to provide researchers with useful ideas that combine philosophical rigor with the convenience provided by computers.

  16. Ontological engineering versus metaphysics

    Science.gov (United States)

    Tataj, Emanuel; Tomanek, Roman; Mulawka, Jan

    2011-10-01

    It has been recognized that ontologies are a semantic version of the World Wide Web and can be found in knowledge-based systems. A recent survey of this field also suggests that practical artificial intelligence systems may be motivated by this research. Strong artificial intelligence in particular, as well as the concept of homo computer, can also benefit from their use. The main objective of this contribution is to present and review already created ontologies and to identify the main advantages that such an approach offers for knowledge management systems. We would like to present what ontological engineering borrows from metaphysics and what feedback it can provide to natural language processing, simulations and modelling. Potential topics for further development from a philosophical point of view are also outlined.

  17. Gene response profiles for Daphnia pulex exposed to the environmental stressor cadmium reveals novel crustacean metallothioneins

    Directory of Open Access Journals (Sweden)

    Davey Jennifer C

    2007-12-01

    Full Text Available Abstract. Background: Genomic research tools such as microarrays are proving to be important resources to study the complex regulation of genes that respond to environmental perturbations. A first generation cDNA microarray was developed for the environmental indicator species Daphnia pulex, to identify genes whose regulation is modulated following exposure to the metal stressor cadmium. Our experiments revealed interesting changes in gene transcription that suggest their biological roles and their potentially toxicological features in responding to this important environmental contaminant. Results: Our microarray identified genes reported in the literature to be regulated in response to cadmium exposure, suggested functional attributes for genes that share no sequence similarity to proteins in the public databases, and pointed to genes that are likely members of expanded gene families in the Daphnia genome. Genes identified on the microarray also were associated with cadmium induced phenotypes and population-level outcomes that we experimentally determined. A subset of genes regulated in response to cadmium exposure was independently validated using quantitative real-time PCR (Q-RT-PCR). These microarray studies led to the discovery of three genes coding for the metal detoxication protein metallothionein (MT). The gene structures and predicted translated sequences of D. pulex MTs clearly place them in this gene family. Yet, they share little homology with previously characterized MTs. Conclusion: The genomic information obtained from this study represents an important first step in characterizing microarray patterns that may be diagnostic to specific environmental contaminants and give insights into their toxicological mechanisms, while also providing a practical tool for evolutionary, ecological, and toxicological functional gene discovery studies. Advances in Daphnia genomics will enable the further development of this species as a model organism for

  18. Chemical Genomic Screening of a Saccharomyces cerevisiae Genomewide Mutant Collection Reveals Genes Required for Defense against Four Antimicrobial Peptides Derived from Proteins Found in Human Saliva

    Science.gov (United States)

    Bhatt, Sanjay; Schoenly, Nathan E.; Lee, Anna Y.; Nislow, Corey; Bobek, Libuse A.

    2013-01-01

    To compare the effects of four antimicrobial peptides (MUC7 12-mer, histatin 12-mer, cathelicidin KR20, and a peptide containing lactoferricin amino acids 1 to 11) on the yeast Saccharomyces cerevisiae, we employed a genomewide fitness screen of combined collections of mutants with homozygous deletions of nonessential genes and heterozygous deletions of essential genes. When arbitrary fitness score cutoffs of 1 (indicating a fitness defect, or hypersensitivity) and −1 (indicating a fitness gain, or resistance) were used, 425 of the 5,902 mutants tested exhibited altered fitness when treated with at least one peptide. Functional analysis of the 425 strains revealed enrichment among the identified deletions in gene groups associated with the Gene Ontology (GO) terms “ribosomal subunit,” “ribosome biogenesis,” “protein glycosylation,” “vacuolar transport,” “Golgi vesicle transport,” “negative regulation of transcription,” and others. Fitness profiles of all four tested peptides were highly similar, particularly among mutant strains exhibiting the greatest fitness defects. The latter group included deletions in several genes involved in induction of the RIM101 signaling pathway, including several components of the ESCRT sorting machinery. RIM101 signaling regulates the response of yeasts to alkaline and neutral pH and high salts, and our data indicate that this pathway also plays a prominent role in regulating protective measures against all four tested peptides. In summary, the results of the chemical genomic screens of the S. cerevisiae mutant collection suggest that the four antimicrobial peptides, despite their differences in structure and physical properties, share many interactions with S. cerevisiae cells and consequently a high degree of similarity between their modes of action. PMID:23208710
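
    The GO enrichment step in the record above can be illustrated with a one-sided hypergeometric test, which is a common choice for this kind of analysis. The abstract does not say which test the authors used, so the test itself, the function name and the toy counts below are all assumptions made for illustration.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    population: number of genes (or mutants) tested in total
    annotated:  how many of them carry the GO term of interest
    selected:   size of the hit list (e.g. strains with altered fitness)
    overlap:    hits that carry the GO term
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 10 genes, 4 annotated, 3 selected, 2 in common.
p = enrichment_p_value(population=10, annotated=4, selected=3, overlap=2)
print(f"p = {p:.4f}")
```

    For the screen described above, `population` would be 5,902 and `selected` would be 425; the per-term counts (`annotated`, `overlap`) are not given in the abstract and would have to come from the annotation data.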

  19. Divergent and convergent modes of interaction between wheat and Puccinia graminis f. sp. tritici isolates revealed by the comparative gene co-expression network and genome analyses.

    Science.gov (United States)

    Rutter, William B; Salcedo, Andres; Akhunova, Alina; He, Fei; Wang, Shichen; Liang, Hanquan; Bowden, Robert L; Akhunov, Eduard

    2017-04-12

    Two opposing evolutionary constraints exert pressure on plant pathogens: one to diversify virulence factors in order to evade plant defenses, and the other to retain virulence factors critical for maintaining a compatible interaction with the plant host. To better understand how the diversified arsenals of fungal genes promote interaction with the same compatible wheat line, we performed a comparative genomic analysis of two North American isolates of Puccinia graminis f. sp. tritici (Pgt). The patterns of inter-isolate divergence in the secreted candidate effector genes were compared with the levels of conservation and divergence of plant-pathogen gene co-expression networks (GCN) developed for each isolate. Comparative genomic analyses revealed substantial levels of inter-isolate divergence in effector gene complement and sequence. Gene Ontology (GO) analyses of the conserved and unique parts of the isolate-specific GCNs identified a number of conserved host pathways targeted by both isolates. Interestingly, the degree of inter-isolate sub-network conservation varied widely for the different host pathways and was positively associated with the proportion of conserved effector candidates associated with each sub-network. While different Pgt isolates tended to exploit similar wheat pathways for infection, the mode of plant-pathogen interaction varied for different pathways with some pathways being associated with the conserved set of effectors and others being linked with the diverged or isolate-specific effectors. Our data suggest that at the intra-species level pathogen populations likely maintain divergent sets of effectors capable of targeting the same plant host pathways. This functional redundancy may play an important role in the dynamic of the "arms-race" between host and pathogen serving as the basis for diverse virulence strategies and creating conditions where mutations in certain effector groups will not have a major effect on the pathogen

  20. A Method for Building Personalized Ontology Summaries

    OpenAIRE

    Queiroz-Sousa, Paulo Orlando; Salgado, Ana Carolina; Pires, Carlos Eduardo

    2013-01-01

    In the context of ontology engineering, the ontology understanding is the basis for its further development and reuse. One intuitive effective approach to support ontology understanding is the process of ontology summarization which highlights the most important concepts of an ontology. Ontology summarization identifies an excerpt from an ontology that contains the most relevant concepts and produces an abridged ontology. In this article, we present a method for summarizing ontologies that represent ...

  1. Ontology and medical diagnosis.

    Science.gov (United States)

    Bertaud-Gounot, Valérie; Duvauferrier, Régis; Burgun, Anita

    2012-03-01

    Ontology and associated generic tools are appropriate for knowledge modeling and reasoning, but most of the time, disease definitions in existing description logic (DL) ontology are not sufficient to classify patient's characteristics under a particular disease because they do not formalize operational definitions of diseases (association of signs and symptoms=diagnostic criteria). The main objective of this study is to propose an ontological representation which takes into account the diagnostic criteria on which specific patient conditions may be classified under a specific disease. This method needs as a prerequisite a clear list of necessary and sufficient diagnostic criteria as defined for lots of diseases by learned societies. It does not include probability/uncertainty which Web Ontology Language (OWL 2.0) cannot handle. We illustrate it with spondyloarthritis (SpA). Ontology has been designed in Protégé 4.1 OWL-DL2.0. Several kinds of criteria were formalized: (1) mandatory criteria, (2) picking two criteria among several diagnostic criteria, (3) numeric criteria. Thirty real patient cases were successfully classified with the reasoner. This study shows that it is possible to represent operational definitions of diseases with OWL and successfully classify real patient cases. Representing diagnostic criteria as descriptive knowledge (instead of rules in Semantic Web Rule Language or Prolog) allows us to take advantage of tools already available for OWL. While we focused on Assessment of SpondyloArthritis international Society SpA criteria, we believe that many of the representation issues addressed here are relevant to using OWL-DL for operational definition of other diseases in ontology.
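
    The three kinds of criteria listed in the record above (mandatory, "pick two among several", numeric) can be pictured with a small rule check. The study itself encodes them as OWL class definitions evaluated by a reasoner; the Python below is only a sketch of the same logic, and every name in it (`DiseaseDefinition`, `sign_a`, `duration_months`, the thresholds) is hypothetical rather than taken from the paper or from the ASAS criteria.

```python
from dataclasses import dataclass, field


@dataclass
class DiseaseDefinition:
    """Operational disease definition built from three kinds of criteria."""
    mandatory: set = field(default_factory=set)   # all must be present
    pick_from: set = field(default_factory=set)   # at least `pick_at_least` of these
    pick_at_least: int = 2
    numeric: dict = field(default_factory=dict)   # measurement name -> inclusive minimum

    def matches(self, findings, measurements):
        """Return True if a patient's findings and measurements satisfy the definition."""
        if not self.mandatory <= findings:
            return False
        if len(self.pick_from & findings) < self.pick_at_least:
            return False
        for name, minimum in self.numeric.items():
            value = measurements.get(name)
            if value is None or value < minimum:
                return False
        return True


# Hypothetical definition and patient, for illustration only.
definition = DiseaseDefinition(
    mandatory={"sign_a"},
    pick_from={"sign_b", "sign_c", "sign_d"},
    pick_at_least=2,
    numeric={"duration_months": 3},
)
patient_findings = {"sign_a", "sign_b", "sign_d"}
patient_measurements = {"duration_months": 4}
print(definition.matches(patient_findings, patient_measurements))
```

    Unlike an OWL reasoner, this check treats missing information as failure (closed world), which is one reason the declarative OWL representation discussed in the abstract is not interchangeable with a rule script like this one.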

  2. Mapping of gene expression reveals CYP27A1 as a susceptibility gene for sporadic ALS.

    Directory of Open Access Journals (Sweden)

    Frank P Diekstra

    Full Text Available Amyotrophic lateral sclerosis (ALS) is a progressive, neurodegenerative disease characterized by loss of upper and lower motor neurons. ALS is considered to be a complex trait and genome-wide association studies (GWAS) have implicated a few susceptibility loci. However, many more causal loci remain to be discovered. Since it has been shown that genetic variants associated with complex traits are more likely to be eQTLs than frequency-matched variants from GWAS platforms, we conducted a two-stage genome-wide screening for eQTLs associated with ALS. In addition, we applied an eQTL analysis to fine-map association loci. Expression profiles using peripheral blood of 323 sporadic ALS patients and 413 controls were mapped to genome-wide genotyping data. Subsequently, data from a two-stage GWAS (3,568 patients and 10,163 controls) were used to prioritize eQTLs identified in the first stage (162 ALS, 207 controls). These prioritized eQTLs were carried forward to the second sample with both gene-expression and genotyping data (161 ALS, 206 controls). Replicated eQTL SNPs were then tested for association in the second-stage GWAS data to find SNPs associated with disease that survived correction for multiple testing. We thus identified twelve cis eQTLs with nominally significant associations in the second-stage GWAS data. Eight SNP-transcript pairs of highest significance (lowest p = 1.27 × 10^-51) withstood multiple-testing correction in the second stage and modulated CYP27A1 gene expression. Additionally, we show that C9orf72 appears to be the only gene in the 9p21.2 locus that is regulated in cis, showing the potential of this approach in identifying causative genes in association loci in ALS. This study has identified candidate genes for sporadic ALS, most notably CYP27A1. Mutations in CYP27A1 are causal to cerebrotendinous xanthomatosis which can present as a clinical mimic of ALS with progressive upper motor neuron loss, making it a plausible

  3. Core Semantics for Public Ontologies

    National Research Council Canada - National Science Library

    Suni, Niranjan

    2005-01-01

    ... (schemas or ontologies) with respect to objects. The DARPA Agent Markup Language (DAML) through the use of ontologies provides a very powerful way to describe objects and their relationships to other objects...

  4. Male-biased genes in catfish as revealed by RNA-Seq analysis of the testis transcriptome.

    Directory of Open Access Journals (Sweden)

    Fanyue Sun

    Full Text Available BACKGROUND: Catfish has a male-heterogametic (XY) sex determination system, but genes involved in gonadogenesis, spermatogenesis, testicular determination, and sex determination are poorly understood. As a first step of understanding the transcriptome of the testis, here, we conducted RNA-Seq analysis using high throughput Illumina sequencing. METHODOLOGY/PRINCIPAL FINDINGS: A total of 269.6 million high quality reads were assembled into 193,462 contigs with a N50 length of 806 bp. Of these contigs, 67,923 contigs had hits to a set of 25,307 unigenes, including 167 unique genes that had not been previously identified in catfish. A meta-analysis of expressed genes in the testis and in the gynogen (double haploid female) allowed the identification of 5,450 genes that are preferentially expressed in the testis, providing a pool of putative male-biased genes. Gene ontology and annotation analysis suggested that many of these male-biased genes were involved in gonadogenesis, spermatogenesis, testicular determination, gametogenesis, gonad differentiation, and possibly sex determination. CONCLUSION/SIGNIFICANCE: We provide the first transcriptome-level analysis of the catfish testis. Our analysis would lay the basis for sequential follow-up studies of genes involved in sex determination and differentiation in catfish.
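
    The "N50 length of 806 bp" quoted above is an assembly statistic: the contig length such that contigs of that length or longer together cover at least half of the assembled bases. A minimal sketch of the standard calculation follows; the function name and the toy contig lengths are illustrative and are not data from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    Contigs are sorted from longest to shortest and their lengths are
    accumulated; the N50 is the length of the contig at which the running
    total first reaches half of the total assembly length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy assembly: total length 32, so the two contigs of length 8 already cover half.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))  # prints 8
```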

  5. Gene expression profiles of prostate cancer reveal involvement of multiple molecular pathways in the metastatic process

    International Nuclear Information System (INIS)

    Chandran, Uma R; Ma, Changqing; Dhir, Rajiv; Bisceglia, Michelle; Lyons-Weiler, Maureen; Liang, Wenjing; Michalopoulos, George; Becich, Michael; Monzon, Federico A

    2007-01-01

    Prostate cancer is characterized by heterogeneity in the clinical course that often does not correlate with morphologic features of the tumor. Metastasis reflects the most adverse outcome of prostate cancer, and to date there are no reliable morphologic features or serum biomarkers that can reliably predict which patients are at higher risk of developing metastatic disease. Understanding the differences in the biology of metastatic and organ confined primary tumors is essential for developing new prognostic markers and therapeutic targets. Using Affymetrix oligonucleotide arrays, we analyzed gene expression profiles of 24 androgen-ablation resistant metastatic samples obtained from 4 patients and a previously published dataset of 64 primary prostate tumor samples. Differential gene expression was analyzed after removing potentially uninformative stromal genes, addressing the differences in cellular content between primary and metastatic tumors. The metastatic samples are highly heterogenous in expression; however, differential expression analysis shows that 415 genes are upregulated and 364 genes are downregulated at least 2 fold in every patient with metastasis. The expression profile of metastatic samples reveals changes in expression of a unique set of genes representing both the androgen ablation related pathways and other metastasis related gene networks such as cell adhesion, bone remodelling and cell cycle. The differentially expressed genes include metabolic enzymes, transcription factors such as Forkhead Box M1 (FoxM1) and cell adhesion molecules such as Osteopontin (SPP1). We hypothesize that these genes have a role in the biology of metastatic disease and that they represent potential therapeutic targets for prostate cancer

  6. DNA microarray revealed and RNAi plants confirmed key genes conferring low Cd accumulation in barley grains

    DEFF Research Database (Denmark)

    Sun, Hongyan; Chen, Zhong-Hua; Chen, Fei

    2015-01-01

    Background Understanding the mechanism of low Cd accumulation in crops is crucial for sustainable safe food production in Cd-contaminated soils. Results Confocal microscopy, atomic absorption spectrometry, gas exchange and chlorophyll fluorescence analyses revealed a distinct difference in Cd...... with a substantial difference between the two genotypes. Cd stress led to higher expression of genes involved in transport, carbohydrate metabolism and signal transduction in the low-grain-Cd-accumulating genotype. Novel transporter genes such as zinc transporter genes were identified as being associated with low Cd...... accumulation. Quantitative RT-PCR confirmed our microarray data. Furthermore, suppression of the zinc transporter genes HvZIP3 and HvZIP8 by RNAi silencing showed increased Cd accumulation and reduced Zn and Mn concentrations in barley grains. Thus, HvZIP3 and HvZIP8 could be candidate genes related to low...

  7. Learning expressive ontologies

    CERN Document Server

    Völker, J

    2009-01-01

    This publication advances the state-of-the-art in ontology learning by presenting a set of novel approaches to the semi-automatic acquisition, refinement and evaluation of logically complex axiomatizations. It has been motivated by the fact that the realization of the semantic web envisioned by Tim Berners-Lee is still hampered by the lack of ontological resources, while at the same time more and more applications of semantic technologies emerge from fast-growing areas such as e-business or life sciences. Such knowledge-intensive applications, requiring large scale reasoning over complex domai

  8. ONTOLOGY IN PHARMACY

    Directory of Open Access Journals (Sweden)

    L. Yu. Babintseva

    2015-05-01

    Full Text Available Ontological models for the formalization of knowledge in pharmacy are considered. The view is emphasized that rapid exchange of information in the pharmaceutical industry requires the creation of a single information space. This means not only establishing uniform standards for presenting information on pharmaceutical groups and pharmacotherapeutic classifications, but also creating a unified and standardized system for the transfer and renewal of knowledge. Organizing information in an ontology makes it possible to build expert systems and applications for working with data quickly in the future.

  9. Summarization by domain ontology navigation

    DEFF Research Database (Denmark)

    Andreasen, Troels; Bulskov, Henrik

    2013-01-01

    of the subject. In between these two extremes, conceptual summaries encompass selected concepts derived using background knowledge. We address in this paper an approach where conceptual summaries are provided through a conceptualization as given by an ontology. The ontology guiding the summarization can...... be a simple taxonomy or a generative domain ontology. A domain ontology can be provided by a preanalysis of a domain corpus and can be used to condense improved summaries that better reflects the conceptualization of a given domain....

  10. Gene Profiling in Patients with Systemic Sclerosis Reveals the Presence of Oncogenic Gene Signatures

    Directory of Open Access Journals (Sweden)

    Marzia Dolcino

    2018-03-01

    Full Text Available Systemic sclerosis (SSc) is a rare connective tissue disease characterized by three pathogenetic hallmarks: vasculopathy, dysregulation of the immune system, and fibrosis. A particular feature of SSc is the increased frequency of some types of malignancies, namely breast, lung, and hematological malignancies. Moreover, SSc may also be a paraneoplastic disease, again indicating a strong link between cancer and scleroderma. The reason for this association is still unknown; therefore, we aimed at investigating whether particular genetic or epigenetic factors may play a role in promoting cancer development in patients with SSc and whether some features are shared by the two conditions. We therefore performed a gene expression profiling of peripheral blood mononuclear cells (PBMCs) derived from patients with limited and diffuse SSc, showing that the various classes of genes potentially linked to the pathogenesis of SSc (such as apoptosis, endothelial cell activation, extracellular matrix remodeling, immune response, and inflammation) include genes that directly participate in the development of malignancies or that are involved in pathways known to be associated with carcinogenesis. The transcriptional analysis was then complemented by a complex network analysis of modulated genes which further confirmed the presence of signaling pathways associated with carcinogenesis. Since epigenetic mechanisms, such as microRNAs (miRNAs), are believed to play a central role in the pathogenesis of SSc, we also evaluated whether specific cancer-related miRNAs could be deregulated in the serum of SSc patients. We focused our attention on miRNAs already found upregulated in SSc such as miR-21-5p, miR-92a-3p, and on miR-155-5p, miR-126-3p and miR-16-5p known to be deregulated in malignancies associated with SSc, i.e., breast, lung, and hematological malignancies. miR-21-5p, miR-92a-3p, miR-155-5p, and miR-16-5p expression was significantly higher in SSc sera compared to

  11. InfAcrOnt: calculating cross-ontology term similarities using information flow by a random walk.

    Science.gov (United States)

    Cheng, Liang; Jiang, Yue; Ju, Hong; Sun, Jie; Peng, Jiajie; Zhou, Meng; Hu, Yang

    2018-01-19

    Since the establishment of the first biomedical ontology, Gene Ontology (GO), the number of biomedical ontologies has increased dramatically. Nowadays over 300 ontologies have been built, including the extensively used Disease Ontology (DO) and Human Phenotype Ontology (HPO). Because of the advantage of identifying novel relationships between terms, calculating similarity between ontology terms is one of the major tasks in this research area. Though similarities between terms within each ontology have been studied with in silico methods, term similarities across different ontologies were not investigated as deeply. The latest method took advantage of a gene functional interaction network (GFIN) to explore such inter-ontology similarities of terms. However, it only used gene interactions and failed to make full use of the connectivity among gene nodes of the network. In addition, all existing methods are particularly designed for GO and their performance on the extended ontology community remains unknown. We proposed a method, InfAcrOnt, to infer similarities between terms across ontologies utilizing the entire GFIN. InfAcrOnt builds a term-gene-gene network comprising ontology annotations and the GFIN, and acquires similarities between terms across ontologies by modeling the information flow within the network by random walk. In our benchmark experiments on sub-ontologies of GO, InfAcrOnt achieves a high average area under the receiver operating characteristic curve (AUC) (0.9322 and 0.9309) and low standard deviations (1.8746e-6 and 3.0977e-6) in both human and yeast benchmark datasets, exhibiting superior performance. Meanwhile, comparisons of InfAcrOnt results and prior knowledge on pair-wise DO-HPO terms and pair-wise DO-GO terms show high correlations. The experiment results show that InfAcrOnt significantly improves the performance of inferring similarities between terms across ontologies in the benchmark set.
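
    The idea of scoring cross-ontology term pairs by letting information flow through a term-gene-gene network can be pictured with a random walk with restart on a toy graph. This is a generic sketch, not InfAcrOnt's published algorithm: the restart formulation, the restart probability, the node labels and the tiny network below are all assumptions chosen for illustration.

```python
def random_walk_with_restart(adjacency, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walker that starts at `seed`.

    adjacency: dict mapping every node to a list of its neighbours
               (an undirected edge must be listed in both directions).
    At each step the walker jumps back to `seed` with probability `restart`,
    otherwise it moves to a uniformly chosen neighbour.
    """
    prob = {node: 0.0 for node in adjacency}
    prob[seed] = 1.0
    for _ in range(max_iter):
        nxt = {node: 0.0 for node in adjacency}
        nxt[seed] = restart
        for node, neighbours in adjacency.items():
            if not neighbours:
                # Isolated node: send its remaining mass back to the seed.
                nxt[seed] += (1 - restart) * prob[node]
                continue
            share = (1 - restart) * prob[node] / len(neighbours)
            for neighbour in neighbours:
                nxt[neighbour] += share
        delta = sum(abs(nxt[node] - prob[node]) for node in adjacency)
        prob = nxt
        if delta < tol:
            break
    return prob


# Hypothetical term-gene-gene network: terms are linked to the genes they
# annotate, and genes are linked by functional interactions.
network = {
    "GO:term_a": ["gene1"],
    "DO:term_x": ["gene1"],
    "DO:term_y": ["gene3"],
    "gene1": ["GO:term_a", "DO:term_x", "gene2"],
    "gene2": ["gene1", "gene3"],
    "gene3": ["gene2", "DO:term_y"],
}
scores = random_walk_with_restart(network, seed="GO:term_a")
for term in ("DO:term_x", "DO:term_y"):
    print(term, round(scores[term], 4))
```

    In this toy network DO:term_x shares an annotated gene with the seed term, while DO:term_y is reachable only through gene-gene interactions, so the walk assigns DO:term_x the higher score. Using gene-gene connectivity in this way is the property the abstract highlights over methods that rely on direct gene interactions only.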

  12. RNA-Seq analysis during the life cycle of Cryptosporidium parvum reveals significant differential gene expression between proliferating stages in the intestine and infectious sporozoites.

    Science.gov (United States)

    Lippuner, Christoph; Ramakrishnan, Chandra; Basso, Walter U; Schmid, Marc W; Okoniewski, Michal; Smith, Nicholas C; Hässig, Michael; Deplazes, Peter; Hehl, Adrian B

    2018-05-01

    Cryptosporidium parvum is a major cause of diarrhoea in humans and animals. There are no vaccines and few drugs available to control C. parvum. In this study, we used RNA-Seq to compare gene expression in sporozoites and intracellular stages of C. parvum to identify genes likely to be important for successful completion of the parasite's life cycle and, thereby, possible targets for drugs or vaccines. We identified 3774 protein-encoding transcripts in C. parvum. Applying a stringent cut-off of eight fold for determination of differential expression, we identified 173 genes (26 coding for predicted secreted proteins) upregulated in sporozoites. On the other hand, expression of 1259 genes was upregulated in intestinal stages (merozoites/gamonts) with a gene ontology enrichment for 63 biological processes and upregulation of 117 genes in 23 metabolic pathways. There was no clear stage specificity of expression of AP2-domain containing transcription factors, although sporozoites had a relatively small repertoire of these important regulators. Our RNA-Seq analysis revealed a new calcium-dependent protein kinase, bringing the total number of known calcium-dependent protein kinases (CDPKs) in C. parvum to 11. One of these, CDPK1, was expressed in all stages, strengthening the notion that it is a valid drug target. By comparing parasites grown in vivo (which produce bona fide thick-walled oocysts) and in vitro (which are arrested in sexual development prior to oocyst generation) we were able to confirm that genes encoding oocyst wall proteins are expressed in gametocytes and that the proteins are stockpiled rather than generated de novo in zygotes. RNA-Seq analysis of C. parvum revealed genes expressed in a stage-specific manner and others whose expression is required at all stages of development. The functional significance of these can now be addressed through recent advances in transgenics for C. parvum, and may lead to the identification of viable drug and vaccine

  13. Bioinformatics analysis of RNA-seq data revealed critical genes in colon adenocarcinoma.

    Science.gov (United States)

    Xi, W-D; Liu, Y-J; Sun, X-B; Shan, J; Yi, L; Zhang, T-T

    2017-07-01

    RNA-seq data of colon adenocarcinoma (COAD) were analyzed with bioinformatics tools to discover critical genes in the disease. Relevant small molecule drugs, transcription factors (TFs) and microRNAs (miRNAs) were also investigated. RNA-seq data of COAD were downloaded from The Cancer Genome Atlas (TCGA). Differential analysis was performed with package edgeR. False positive discovery (FDR) 1 were set as the cut-offs to screen out differentially expressed genes (DEGs). Gene coexpression network was constructed with package EBcoexpress. GO enrichment analysis was performed for the DEGs in the gene coexpression network with DAVID. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis was also performed for the genes with KOBAS 2.0. Modules were identified with MCODE of Cytoscape. Relevant small molecule drugs were predicted by Connectivity Map. Relevant miRNAs and TFs were searched by WebGestalt. A total of 457 DEGs, including 255 up-regulated and 202 down-regulated genes, were identified from 437 COAD and 39 control samples. A gene coexpression network was constructed containing 40 DEGs and 101 edges. The genes were mainly associated with collagen fibril organization, extracellular matrix organization and translation. Two modules were identified from the gene coexpression network, which were implicated in muscle contraction and extracellular matrix organization, respectively. Several critical genes were disclosed, such as MYH11, COL5A2 and ribosomal proteins. Nine relevant small molecule drugs were identified, such as scriptaid and STOCK1N-35874. Accordingly, a total of 17 TFs and 10 miRNAs related to COAD were acquired, such as ETS2, NFAT, AP4, miR-124A, miR-9, miR-96 and let-7. Several critical genes and relevant drugs, TFs and miRNAs were revealed in COAD. These findings could advance the understanding of the disease and benefit therapy development.

  14. Biomedical ontologies: toward scientific debate.

    Science.gov (United States)

    Maojo, V; Crespo, J; García-Remesal, M; de la Iglesia, D; Perez-Rey, D; Kulikowski, C

    2011-01-01

    Biomedical ontologies have been very successful in structuring knowledge for many different applications, receiving widespread praise for their utility and potential. Yet, the role of computational ontologies in scientific research, as opposed to knowledge management applications, has not been extensively discussed. We aim to stimulate further discussion on the advantages and challenges presented by biomedical ontologies from a scientific perspective. We review various aspects of biomedical ontologies going beyond their practical successes, and focus on some key scientific questions in two ways. First, we analyze and discuss current approaches to improve biomedical ontologies that are based largely on classical, Aristotelian ontological models of reality. Second, we raise various open questions about biomedical ontologies that require further research, analyzing in more detail those related to visual reasoning and spatial ontologies. We outline significant scientific issues that biomedical ontologies should consider, beyond current efforts of building practical consensus between them. For spatial ontologies, we suggest an approach for building "morphospatial" taxonomies, as an example that could stimulate research on fundamental open issues for biomedical ontologies. Analysis of a large number of problems with biomedical ontologies suggests that the field is very much open to alternative interpretations of current work, and in need of scientific debate and discussion that can lead to new ideas and research directions.

  15. Using a Foundational Ontology for Reengineering a Software Enterprise Ontology

    Science.gov (United States)

    Perini Barcellos, Monalessa; de Almeida Falbo, Ricardo

    Knowledge about software organizations is considerably relevant to software engineers. The use of a common vocabulary for representing the useful knowledge about software organizations involved in software projects is important for several reasons, such as to support knowledge reuse and to allow communication and interoperability between tools. Domain ontologies can be used to define a common vocabulary for sharing and reuse of knowledge about some domain. Foundational ontologies can be used for evaluating and re-designing domain ontologies, giving them real-world semantics. This paper presents an evaluation of a Software Enterprise Ontology that was reengineered using the Unified Foundational Ontology (UFO) as a basis.

  16. Gain and loss of phototrophic genes revealed by comparison of two Citromicrobium bacterial genomes.

    Directory of Open Access Journals (Sweden)

    Qiang Zheng

    Full Text Available Proteobacteria are thought to have diverged from a phototrophic ancestor, according to the scattered distribution of phototrophy throughout the proteobacterial clade, and so the occurrence of numerous closely related phototrophic and chemotrophic microorganisms may be the result of the loss of genes for phototrophy. A widespread form of bacterial phototrophy is based on the photochemical reaction center, encoded by puf and puh operons that typically are in a 'photosynthesis gene cluster' (abbreviated as the PGC) with pigment biosynthesis genes. Comparison of two closely related Citromicrobial genomes (98.1% sequence identity of complete 16S rRNA genes), Citromicrobium sp. JL354, which contains two copies of reaction center genes, and Citromicrobium strain JLT1363, which is chemotrophic, revealed evidence for the loss of phototrophic genes. However, evidence of horizontal gene transfer was found in these two bacterial genomes. An incomplete PGC (pufLMC-puhCBA) in strain JL354 was located within an integrating conjugative element, which indicates a potential mechanism for the horizontal transfer of genes for phototrophy.

  17. The design ontology

    DEFF Research Database (Denmark)

    Storga, Mario; Andreasen, Mogens Myrup; Marjanovic, Dorian

    2010-01-01

    The article presents the research of the nature, building and practical role of a Design Ontology as a potential framework for the more efficient product development (PD) data-, information- and knowledge- description, -explanation, -understanding and -reusing. In the methodology for development ...

  18. Dahlbeck and Pure Ontology

    Science.gov (United States)

    Mackenzie, Jim

    2016-01-01

    This article responds to Johan Dahlbeck's "Towards a pure ontology: Children's bodies and morality" ["Educational Philosophy and Theory," vol. 46 (1), 2014, pp. 8-23 (EJ1026561)]. His arguments from Nietzsche and Spinoza do not carry the weight he supposes, and the conclusions he draws from them about pedagogy would be…

  19. Audit Validation Using Ontologies

    Directory of Open Access Journals (Sweden)

    Ion IVAN

    2015-01-01

    Full Text Available Requirements for increasing the quality of audit processes in enterprises are defined. The paper substantiates the need for assessment and management of audit processes using ontologies. Sets of rules and ways to assess the consistency of rules and behavior within the organization are defined. Using ontologies, qualifications that assess the organization's audit are obtained. Elaboration of the audit reports is a perfect algorithm-based activity characterized by generality, determinism, reproducibility, accuracy and a well-established. The auditors obtain effective levels. Through ontologies obtain the audit calculated level. Because the audit report is a qualitative structure of information and knowledge, it is very hard for different groups of users (shareholders, managers or stakeholders) to analyze and interpret. Developing an ontology for audit report validation will be a useful instrument for both auditors and report users. In this paper we propose an instrument for validation of audit reports contain a lot of keywords that calculates indicators, a lot of indicators for each key word there is an indicator, qualitative levels; interpreter who builds a table of indicators, levels of actual and calculated levels.

  20. Biomedicine: an ontological dissection.

    Science.gov (United States)

    Baronov, David

    2008-01-01

    Though ubiquitous across the medical social sciences literature, the term "biomedicine" as an analytical concept remains remarkably slippery. It is argued here that this imprecision is due in part to the fact that biomedicine is comprised of three interrelated ontological spheres, each of which frames biomedicine as a distinct subject of investigation. This suggests that, depending upon one's ontological commitment, the meaning of biomedicine will shift. From an empirical perspective, biomedicine takes on the appearance of a scientific enterprise and is defined as a derivative category of Western science more generally. From an interpretive perspective, biomedicine represents a symbolic-cultural expression whose adherence to the principles of scientific objectivity conceals an ideological agenda. From a conceptual perspective, biomedicine represents an expression of social power that reflects structures of power and privilege within capitalist society. No one perspective exists in isolation and so the image of biomedicine from any one presents an incomplete understanding. It is the mutually-conditioning interrelations between these ontological spheres that account for biomedicine's ongoing development. Thus, the ontological dissection of biomedicine that follows, with particular emphasis on the period of its formal crystallization in the latter nineteenth and early twentieth century, is intended to deepen our understanding of biomedicine as an analytical concept across the medical social sciences literature.

  1. Transcriptome analysis by GeneTrail revealed regulation of functional categories in response to alterations of iron homeostasis in Arabidopsis thaliana

    Directory of Open Access Journals (Sweden)

    Lenhof Hans-Peter

    2011-05-01

    Full Text Available Abstract Background High-throughput technologies have opened new avenues to study biological processes and pathways. The interpretation of the immense amount of data sets generated nowadays needs to be facilitated in order to enable biologists to identify complex gene networks and functional pathways. To cope with this task multiple computer-based programs have been developed. GeneTrail is a freely available online tool that screens comparative transcriptomic data for differentially regulated functional categories and biological pathways extracted from common databases like KEGG, Gene Ontology (GO), TRANSPATH and TRANSFAC. Additionally, GeneTrail offers a feature that allows screening of individually defined biological categories that are relevant for the respective research topic. Results We have set up GeneTrail for the use of Arabidopsis thaliana. To test the functionality of this tool for plant analysis, we generated transcriptome data of root and leaf responses to Fe deficiency and the Arabidopsis metal homeostasis mutant nas4x-1. We performed Gene Set Enrichment Analysis (GSEA) with eight meaningful pairwise comparisons of transcriptome data sets. We were able to uncover several functional pathways including metal homeostasis that were affected in our experimental situations. Representation of the differentially regulated functional categories in Venn diagrams uncovered regulatory networks at the level of whole functional pathways. Over-Representation Analysis (ORA) of differentially regulated genes identified in pairwise comparisons revealed specific functional plant physiological categories as major targets upon Fe deficiency and in nas4x-1. Conclusion Here, we obtained supporting evidence that the nas4x-1 mutant was defective in metal homeostasis. It was confirmed that nas4x-1 showed Fe deficiency in roots and signs of Fe deficiency and Fe sufficiency in leaves. Besides metal homeostasis, biotic stress, root carbohydrate, leaf
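
    The Over-Representation Analysis mentioned in this record can be illustrated with a small sketch. The abstract does not say which statistic GeneTrail computes, so the one-sided hypergeometric test below is an assumption (it is a common choice for ORA), and the function name, parameter names and example counts are hypothetical.

```python
from math import comb


def ora_p_value(background: int, in_category: int, selected: int, hits: int) -> float:
    """One-sided hypergeometric p-value for over-representation.

    background  -- number of genes measured in total (assumed reference set)
    in_category -- genes in the reference set annotated to the category
    selected    -- differentially regulated genes
    hits        -- differentially regulated genes annotated to the category

    Returns P(X >= hits) when `selected` genes are drawn at random
    from the reference set without replacement.
    """
    upper = min(in_category, selected)
    favourable = sum(
        comb(in_category, i) * comb(background - in_category, selected - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(background, selected)


if __name__ == "__main__":
    # Hypothetical counts: 20000 genes measured, 150 in the category,
    # 300 differentially regulated, 12 of those in the category.
    print(ora_p_value(20000, 150, 300, 12))
```

    A small p-value suggests the category contains more differentially regulated genes than chance alone would explain; in practice the values would still need correction for testing many categories.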

  2. Epistemology and ontology in core ontologies: FOLaw and LRI-Core, two core ontologies for law

    NARCIS (Netherlands)

    Breukers, J.A.P.J.; Hoekstra, R.J.

    2004-01-01

    After more than a decade of constructing ontologies for legal domains, we at the Leibniz Center for Law felt a real need to develop a core ontology for law that would enable us to re-use the common denominator of the various legal domains. In this paper we present two core ontologies for law. The

  3. Differential Gene Expression by Lactobacillus plantarum WCFS1 in Response to Phenolic Compounds Reveals New Genes Involved in Tannin Degradation.

    Science.gov (United States)

    Reverón, Inés; Jiménez, Natalia; Curiel, José Antonio; Peñas, Elena; López de Felipe, Félix; de Las Rivas, Blanca; Muñoz, Rosario

    2017-04-01

    Lactobacillus plantarum is a lactic acid bacterium that can degrade food tannins by the successive action of tannase and gallate decarboxylase enzymes. In the L. plantarum genome, the gene encoding the catalytic subunit of gallate decarboxylase (lpdC, or lp_2945) is only 6.5 kb distant from the gene encoding inducible tannase (L. plantarum tanB [tanB Lp], or lp_2956). This genomic context suggests concomitant activity and regulation of both enzymatic activities. Reverse transcription analysis revealed that subunits B (lpdB, or lp_0271) and D (lpdD, or lp_0272) of the gallate decarboxylase are cotranscribed, whereas subunit C (lpdC, or lp_2945) is cotranscribed with a gene encoding a transport protein (gacP, or lp_2943). In contrast, the tannase gene is transcribed as a monocistronic mRNA. Investigation of knockout mutations of genes located in this chromosomal region indicated that only mutants of the gallate decarboxylase (subunits B and C), tannase, GacP transport protein, and TanR transcriptional regulator (lp_2942) genes exhibited altered tannin metabolism. The expression profile of genes involved in tannin metabolism was also analyzed in these mutants in the presence of methyl gallate and gallic acid. It is noteworthy that inactivation of tanR suppresses the induction of all genes overexpressed in the presence of methyl gallate and gallic acid. This transcriptional regulator was also induced in the presence of other phenolic compounds, such as kaempferol and myricetin. This study complements the catalog of L. plantarum expression profiles responsive to phenolic compounds, which enable this bacterium to adapt to a plant food environment. IMPORTANCE Lactobacillus plantarum is a bacterial species frequently found in the fermentation of vegetables when tannins are present. L. plantarum strains degrade tannins to the less-toxic pyrogallol by the successive action of tannase and gallate decarboxylase enzymes. The genes encoding these enzymes are

  4. [Towards a structuring fibrillar ontology].

    Science.gov (United States)

    Guimberteau, J-C

    2012-10-01

    Over previous decades and centuries, the difficulty encountered in the manner in which the tissue of our bodies is organised, and structured, is clearly explained by the impossibility of exploring it in detail. Since the creation of the microscope, the perception of the basic unity, which is the cell, has been essential in understanding the functioning of reproduction and of transmission, but has not been able to explain the notion of form; since the cells are not everywhere and are not distributed in an apparently balanced manner. The problems that remain are those of form and volume and also of connection. The concept of multifibrillar architecture, shaping the interfibrillar microvolumes in space, represents a solution to all these questions. The architectural structures revealed, made up of fibres, fibrils and microfibrils, from the mesoscopic to the microscopic level, provide the concept of a living form with structural rationalism that permits the association of psychochemical molecular biodynamics and quantum physics: the form can thus be described and interpreted, and a true structural ontology is elaborated from a basic functional unity, which is the microvacuole, the intra and interfibrillar volume of the fractal organisation, and the chaotic distribution. Naturally, new, less linear, less conclusive, and less specific concepts will be implied by this ontology, leading one to believe that the emergence of life takes place under submission to forces that the original form will have imposed and oriented the adaptive finality. Copyright © 2012. Published by Elsevier SAS.

  5. Development of an Ontology for Periodontitis.

    Science.gov (United States)

    Suzuki, Asami; Takai-Igarashi, Takako; Nakaya, Jun; Tanaka, Hiroshi

    2015-01-01

    In the clinical dentists and periodontal researchers' community, there is an obvious demand for a systems model capable of linking the clinical presentation of periodontitis to underlying molecular knowledge. A computer-readable representation of processes on disease development will give periodontal researchers opportunities to elucidate pathways and mechanisms of periodontitis. An ontology for periodontitis can be a model for integration of large variety of factors relating to a complex disease such as chronic inflammation in different organs accompanied by bone remodeling and immune system disorders, which has recently been referred to as osteoimmunology. Terms characteristic of descriptions related to the onset and progression of periodontitis were manually extracted from 194 review articles and PubMed abstracts by experts in periodontology. We specified all the relations between the extracted terms and constructed them into an ontology for periodontitis. We also investigated matching between classes of our ontology and that of Gene Ontology Biological Process. We developed an ontology for periodontitis called Periodontitis-Ontology (PeriO). The pathological progression of periodontitis is caused by complex, multi-factor interrelationships. PeriO consists of all the required concepts to represent the pathological progression and clinical treatment of periodontitis. The pathological processes were formalized with reference to Basic Formal Ontology and Relation Ontology, which accounts for participants in the processes realized by biological objects such as molecules and cells. We investigated the peculiarity of biological processes observed in pathological progression and medical treatments for the disease in comparison with Gene Ontology Biological Process (GO-BP) annotations. 
The results indicated that peculiarities of PeriO existed in 1) granularity and context dependency of both the conceptualizations, and 2) causality intrinsic to the pathological processes

  6. Benchmarking ontologies: bigger or better?

    Directory of Open Access Journals (Sweden)

    Lixia Yao

    2011-01-01

    Full Text Available A scientific ontology is a formal representation of knowledge within a domain, typically including central concepts, their properties, and relations. With the rise of computers and high-throughput data collection, ontologies have become essential to data mining and sharing across communities in the biomedical sciences. Powerful approaches exist for testing the internal consistency of an ontology, but not for assessing the fidelity of its domain representation. We introduce a family of metrics that describe the breadth and depth with which an ontology represents its knowledge domain. We then test these metrics using (1) four of the most common medical ontologies with respect to a corpus of medical documents and (2) seven of the most popular English thesauri with respect to three corpora that sample language from medicine, news, and novels. Here we show that our approach captures the quality of ontological representation and guides efforts to narrow the breach between ontology and collective discourse within a domain. Our results also demonstrate key features of medical ontologies, English thesauri, and discourse from different domains. Medical ontologies have a small intersection, as do English thesauri. Moreover, dialects characteristic of distinct domains vary strikingly as many of the same words are used quite differently in medicine, news, and novels. As ontologies are intended to mirror the state of knowledge, our methods to tighten the fit between ontology and domain will increase their relevance for new areas of biomedical science and improve the accuracy and power of inferences computed across them.

  7. DNA entropy reveals a significant difference in complexity between housekeeping and tissue specific gene promoters.

    Science.gov (United States)

    Thomas, David; Finan, Chris; Newport, Melanie J; Jones, Susan

    2015-10-01

    The complexity of DNA can be quantified using estimates of entropy. Variation in DNA complexity is expected between the promoters of genes with different transcriptional mechanisms, namely housekeeping (HK) and tissue specific (TS). The former are transcribed constitutively to maintain general cellular functions, and the latter are transcribed in restricted tissue and cell types for specific molecular events. It is known that promoter features in the human genome are related to tissue specificity, but this has been difficult to quantify on a genomic scale. If entropy effectively quantifies DNA complexity, calculating the entropies of HK and TS gene promoters as profiles may reveal significant differences. Entropy profiles were calculated for a total dataset of 12,003 human gene promoters and for 501 housekeeping (HK) and 587 tissue specific (TS) human gene promoters. The mean profiles show the TS promoters have a significantly lower entropy (p […] entropy distributions for the 3 datasets show that promoter entropies could be used to identify novel HK genes. Functional features comprise DNA sequence patterns that are non-random and hence they have lower entropies. The lower entropy of TS gene promoters can be explained by a higher density of positive and negative regulatory elements, required for genes with complex spatial and temporal expression. Copyright © 2015 Elsevier Ltd. All rights reserved.
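
    As an illustration of what an entropy profile of a promoter might look like in code, here is a minimal sketch. The abstract does not state which entropy estimator, window size or step the authors used, so plain Shannon entropy over k-mer frequencies in sliding windows is an assumption, and the function names and default parameters are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(seq: str, k: int = 1) -> float:
    """Shannon entropy in bits of the k-mer frequency distribution of seq."""
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0
    total = len(kmers)
    counts = Counter(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_profile(seq: str, window: int = 50, step: int = 10, k: int = 1) -> list:
    """Entropy of each sliding window along seq (assumed window and step sizes)."""
    return [
        shannon_entropy(seq[start:start + window], k)
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Hypothetical sequences: a repetitive stretch next to a mixed one.
    promoter = "AT" * 50 + "ACGTTGCAGGCTAACGTCGATCGGATCCTAGCTAGGCTTAACGGATCCGA"
    print(entropy_profile(promoter, window=50, step=50))
```

    With single nucleotides the value ranges from 0 bits (one repeated base) to 2 bits (all four bases equally frequent), so a repetitive or motif-dense window scores lower than a mixed one.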

  8. Characterization of the avian Trojan gene family reveals contrasting evolutionary constraints.

    Directory of Open Access Journals (Sweden)

    Petar Petrov

    Full Text Available "Trojan" is a leukocyte-specific, cell surface protein originally identified in the chicken. Its molecular function has been hypothesized to be related to anti-apoptosis and the proliferation of immune cells. The Trojan gene has been localized onto the Z sex chromosome. The adjacent two genes also show significant homology to Trojan, suggesting the existence of a novel gene/protein family. Here, we characterize this Trojan family, identify homologues in other species and predict evolutionary constraints on these genes. The two Trojan-related proteins in chicken were predicted as a receptor-type tyrosine phosphatase and a transmembrane protein, bearing a cytoplasmic immuno-receptor tyrosine-based activation motif. We identified the Trojan gene family in ten other bird species and found related genes in three reptiles and a fish species. The phylogenetic analysis of the homologues revealed a gradual diversification among the family members. Evolutionary analyses of the avian genes predicted that the extracellular regions of the proteins have been subjected to positive selection. Such selection was possibly a response to evolving interacting partners or to pathogen challenges. We also observed an almost complete lack of intracellular positively selected sites, suggesting a conserved signaling mechanism of the molecules. Therefore, the contrasting patterns of selection likely correlate with the interaction and signaling potential of the molecules.

  9. Characterization of the avian Trojan gene family reveals contrasting evolutionary constraints.

    Science.gov (United States)

    Petrov, Petar; Syrjänen, Riikka; Smith, Jacqueline; Gutowska, Maria Weronika; Uchida, Tatsuya; Vainio, Olli; Burt, David W

    2015-01-01

    "Trojan" is a leukocyte-specific, cell surface protein originally identified in the chicken. Its molecular function has been hypothesized to be related to anti-apoptosis and the proliferation of immune cells. The Trojan gene has been localized onto the Z sex chromosome. The adjacent two genes also show significant homology to Trojan, suggesting the existence of a novel gene/protein family. Here, we characterize this Trojan family, identify homologues in other species and predict evolutionary constraints on these genes. The two Trojan-related proteins in chicken were predicted as a receptor-type tyrosine phosphatase and a transmembrane protein, bearing a cytoplasmic immuno-receptor tyrosine-based activation motif. We identified the Trojan gene family in ten other bird species and found related genes in three reptiles and a fish species. The phylogenetic analysis of the homologues revealed a gradual diversification among the family members. Evolutionary analyses of the avian genes predicted that the extracellular regions of the proteins have been subjected to positive selection. Such selection was possibly a response to evolving interacting partners or to pathogen challenges. We also observed an almost complete lack of intracellular positively selected sites, suggesting a conserved signaling mechanism of the molecules. Therefore, the contrasting patterns of selection likely correlate with the interaction and signaling potential of the molecules.

  10. Semantics in support of biodiversity knowledge discovery: an introduction to the biological collections ontology and related ontologies.

    Science.gov (United States)

    Walls, Ramona L; Deck, John; Guralnick, Robert; Baskauf, Steve; Beaman, Reed; Blum, Stanley; Bowers, Shawn; Buttigieg, Pier Luigi; Davies, Neil; Endresen, Dag; Gandolfo, Maria Alejandra; Hanner, Robert; Janning, Alyssa; Krishtalka, Leonard; Matsunaga, Andréa; Midford, Peter; Morrison, Norman; Ó Tuama, Éamonn; Schildhauer, Mark; Smith, Barry; Stucky, Brian J; Thomer, Andrea; Wieczorek, John; Whitacre, Jamie; Wooley, John

    2014-01-01

    The study of biodiversity spans many disciplines and includes data pertaining to species distributions and abundances, genetic sequences, trait measurements, and ecological niches, complemented by information on collection and measurement protocols. A review of the current landscape of metadata standards and ontologies in biodiversity science suggests that existing standards such as the Darwin Core terminology are inadequate for describing biodiversity data in a semantically meaningful and computationally useful way. Existing ontologies, such as the Gene Ontology and others in the Open Biological and Biomedical Ontologies (OBO) Foundry library, provide a semantic structure but lack many of the necessary terms to describe biodiversity data in all its dimensions. In this paper, we describe the motivation for and ongoing development of a new Biological Collections Ontology, the Environment Ontology, and the Population and Community Ontology. These ontologies share the aim of improving data aggregation and integration across the biodiversity domain and can be used to describe physical samples and sampling processes (for example, collection, extraction, and preservation techniques), as well as biodiversity observations that involve no physical sampling. Together they encompass studies of: 1) individual organisms, including voucher specimens from ecological studies and museum specimens, 2) bulk or environmental samples (e.g., gut contents, soil, water) that include DNA, other molecules, and potentially many organisms, especially microbes, and 3) survey-based ecological observations. We discuss how these ontologies can be applied to biodiversity use cases that span genetic, organismal, and ecosystem levels of organization. We argue that if adopted as a standard and rigorously applied and enriched by the biodiversity community, these ontologies would significantly reduce barriers to data discovery, integration, and exchange among biodiversity resources and researchers.

  11. Semantics in Support of Biodiversity Knowledge Discovery: An Introduction to the Biological Collections Ontology and Related Ontologies

    Science.gov (United States)

    Baskauf, Steve; Blum, Stanley; Bowers, Shawn; Davies, Neil; Endresen, Dag; Gandolfo, Maria Alejandra; Hanner, Robert; Janning, Alyssa; Krishtalka, Leonard; Matsunaga, Andréa; Midford, Peter; Tuama, Éamonn Ó.; Schildhauer, Mark; Smith, Barry; Stucky, Brian J.; Thomer, Andrea; Wieczorek, John; Whitacre, Jamie; Wooley, John

    2014-01-01

    The study of biodiversity spans many disciplines and includes data pertaining to species distributions and abundances, genetic sequences, trait measurements, and ecological niches, complemented by information on collection and measurement protocols. A review of the current landscape of metadata standards and ontologies in biodiversity science suggests that existing standards such as the Darwin Core terminology are inadequate for describing biodiversity data in a semantically meaningful and computationally useful way. Existing ontologies, such as the Gene Ontology and others in the Open Biological and Biomedical Ontologies (OBO) Foundry library, provide a semantic structure but lack many of the necessary terms to describe biodiversity data in all its dimensions. In this paper, we describe the motivation for and ongoing development of a new Biological Collections Ontology, the Environment Ontology, and the Population and Community Ontology. These ontologies share the aim of improving data aggregation and integration across the biodiversity domain and can be used to describe physical samples and sampling processes (for example, collection, extraction, and preservation techniques), as well as biodiversity observations that involve no physical sampling. Together they encompass studies of: 1) individual organisms, including voucher specimens from ecological studies and museum specimens, 2) bulk or environmental samples (e.g., gut contents, soil, water) that include DNA, other molecules, and potentially many organisms, especially microbes, and 3) survey-based ecological observations. We discuss how these ontologies can be applied to biodiversity use cases that span genetic, organismal, and ecosystem levels of organization. We argue that if adopted as a standard and rigorously applied and enriched by the biodiversity community, these ontologies would significantly reduce barriers to data discovery, integration, and exchange among biodiversity resources and researchers

  12. Owlready: Ontology-oriented programming in Python with automatic classification and high level constructs for biomedical ontologies.

    Science.gov (United States)

    Lamy, Jean-Baptiste

    2017-07-01

    Ontologies are widely used in the biomedical domain. While many tools exist for the edition, alignment or evaluation of ontologies, few solutions have been proposed for ontology programming interface, i.e. for accessing and modifying an ontology within a programming language. Existing query languages (such as SPARQL) and APIs (such as OWLAPI) are not as easy-to-use as object programming languages are. Moreover, they provide few solutions to difficulties encountered with biomedical ontologies. Our objective was to design a tool for accessing easily the entities of an OWL ontology, with high-level constructs helping with biomedical ontologies. From our experience on medical ontologies, we identified two difficulties: (1) many entities are represented by classes (rather than individuals), but the existing tools do not permit manipulating classes as easily as individuals, (2) ontologies rely on the open-world assumption, whereas the medical reasoning must consider only evidence-based medical knowledge as true. We designed a Python module for ontology-oriented programming. It allows access to the entities of an OWL ontology as if they were objects in the programming language. We propose a simple high-level syntax for managing classes and the associated "role-filler" constraints. We also propose an algorithm for performing local closed world reasoning in simple situations. We developed Owlready, a Python module for a high-level access to OWL ontologies. The paper describes the architecture and the syntax of the module version 2. It details how we integrated the OWL ontology model with the Python object model. The paper provides examples based on Gene Ontology (GO). We also demonstrate the interest of Owlready in a use case focused on the automatic comparison of the contraindications of several drugs. This use case illustrates the use of the specific syntax proposed for manipulating classes and for performing local closed world reasoning. Owlready has been successfully

  13. Comprehensive gene expression profiling reveals synergistic functional networks in cerebral vessels after hypertension or hypercholesterolemia.

    Directory of Open Access Journals (Sweden)

    Wei-Yi Ong

    Full Text Available Atherosclerotic stenosis of cerebral arteries or intracranial large artery disease (ICLAD) is a major cause of stroke especially in Asians, Hispanics and Africans, but relatively little is known about gene expression changes in vessels at risk. This study compares comprehensive gene expression profiles in the middle cerebral artery (MCA) of New Zealand White rabbits exposed to two stroke risk factors i.e. hypertension and/or hypercholesterolemia, by the 2-Kidney-1-Clip method, or dietary supplementation with cholesterol. Microarray and Ingenuity Pathway Analyses of the MCA of the hypertensive rabbits showed up-regulated genes in networks containing the node molecules: UBC (ubiquitin), P38 MAPK, ERK, NFkB, SERPINB2, MMP1 and APP (amyloid precursor protein); and down-regulated genes related to MAPK, ERK 1/2, Akt, 26S proteasome, histone H3 and UBC. The MCA of hypercholesterolemic rabbits showed differentially expressed genes that are, surprisingly, linked to almost the same node molecules as the hypertensive rabbits, despite a relatively low percentage of 'common genes' (21 and 7%) between the two conditions. Up-regulated common genes were related to: UBC, SERPINB2, TNF, HNF4A (hepatocyte nuclear factor 4A) and APP, and down-regulated genes, related to UBC. Increased HNF4A message and protein were verified in the aorta. Together, these findings reveal similar nodal molecules and gene pathways in cerebral vessels affected by hypertension or hypercholesterolemia, which could be a basis for synergistic action of risk factors in the pathogenesis of ICLAD.

  14. Peripheral blood transcriptome sequencing reveals rejection-relevant genes in long-term heart transplantation.

    Science.gov (United States)

    Chen, Yan; Zhang, Haibo; Xiao, Xue; Jia, Yixin; Wu, Weili; Liu, Licheng; Jiang, Jun; Zhu, Baoli; Meng, Xu; Chen, Weijun

    2013-10-03

    Peripheral blood-based gene expression patterns have been investigated as biomarkers to monitor the immune system and rule out rejection after heart transplantation. Recent advances in the high-throughput deep sequencing (HTS) technologies provide new leads in transcriptome analysis. By performing Solexa/Illumina's digital gene expression (DGE) profiling, we analyzed gene expression profiles of PBMCs from 6 quiescent (grade 0) and 6 rejection (grade 2R&3R) heart transplant recipients at more than 6 months after transplantation. Subsequently, quantitative real-time polymerase chain reaction (qRT-PCR) was carried out in an independent validation cohort of 47 individuals from three rejection groups (ISHLT, grade 0,1R, 2R&3R). Through DGE sequencing and qPCR validation, 10 genes were identified as informative genes for detection of cardiac transplant rejection. A further clustering analysis showed that the 10 genes were not only effective for distinguishing patients with acute cardiac allograft rejection, but also informative for discriminating patients with renal allograft rejection based on both blood and biopsy samples. Moreover, PPI network analysis revealed that the 10 genes were connected to each other within a short interaction distance. We proposed a 10-gene signature for heart transplant patients at high-risk of developing severe rejection, which was found to be effective as well in other organ transplant. Moreover, we supposed that these genes function systematically as biomarkers in long-time allograft rejection. Further validation in broad transplant population would be required before the non-invasive biomarkers can be generally utilized to predict the risk of transplant rejection. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  15. Integrated bioinformatics analysis reveals key candidate genes and pathways in breast cancer.

    Science.gov (United States)

    Wang, Yuzhi; Zhang, Yi; Huang, Qian; Li, Chengwen

    2018-04-19

    Breast cancer (BC) is the leading malignancy in women worldwide, yet relatively little is known about the genes and signaling pathways involved in BC tumorigenesis and progression. The present study aimed to elucidate potential key candidate genes and pathways in BC. Five gene expression profile data sets (GSE22035, GSE3744, GSE5764, GSE21422 and GSE26910) were downloaded from the Gene Expression Omnibus (GEO) database, which included data from 113 tumorous and 38 adjacent non‑tumorous tissue samples. Differentially expressed genes (DEGs) were identified using t‑tests in the limma R package. These DEGs were subsequently investigated by pathway enrichment analysis and a protein‑protein interaction (PPI) network was constructed. The most significant module from the PPI network was selected for pathway enrichment analysis. In total, 227 DEGs were identified, of which 82 were upregulated and 145 were downregulated. Pathway enrichment analysis results revealed that the upregulated DEGs were mainly enriched in 'cell division', the 'proteinaceous extracellular matrix (ECM)', 'ECM structural constituents' and 'ECM‑receptor interaction', whereas downregulated genes were mainly enriched in 'response to drugs', 'extracellular space', 'transcriptional activator activity' and the 'peroxisome proliferator‑activated receptor signaling pathway'. The PPI network contained 174 nodes and 1,257 edges. DNA topoisomerase 2‑a, baculoviral inhibitor of apoptosis repeat‑containing protein 5, cyclin‑dependent kinase 1, G2/mitotic‑specific cyclin‑B1 and kinetochore protein NDC80 homolog were identified as the top 5 hub genes. Furthermore, the genes in the most significant module were predominantly involved in 'mitotic nuclear division', 'mid‑body', 'protein binding' and 'cell cycle'. In conclusion, the DEGs, relative pathways and hub genes identified in the present study may aid in understanding of the molecular mechanisms underlying BC progression and provide

  16. Comparative study of human mitochondrial proteome reveals extensive protein subcellular relocalization after gene duplications

    Directory of Open Access Journals (Sweden)

    Huang Yong

    2009-11-01

    Full Text Available Abstract Background Gene and genome duplication is the principal creative force in evolution. Recently, protein subcellular relocalization, or neolocalization, was proposed as one of the mechanisms responsible for the retention of duplicated genes. This hypothesis received support from the analysis of yeast genomes, but has not been tested thoroughly on animal genomes. In order to evaluate the importance of subcellular relocalizations for retention of duplicated genes in animal genomes, we systematically analyzed nuclear encoded mitochondrial proteins in the human genome by reconstructing phylogenies of mitochondrial multigene families. Results The 456 human mitochondrial proteins selected for this study were clustered into 305 gene families including 92 multigene families. Among the multigene families, 59 (64%) consisted of both mitochondrial and cytosolic (non-mitochondrial) proteins (mt-cy families), while the remaining 33 (36%) were composed of mitochondrial proteins (mt-mt families). Phylogenetic analyses of mt-cy families revealed three different scenarios of their neolocalization following gene duplication: (1) relocalization from mitochondria to cytosol, (2) from cytosol to mitochondria and (3) multiple subcellular relocalizations. The neolocalizations were most commonly enabled by the gain or loss of N-terminal mitochondrial targeting signals. The majority of detected subcellular relocalization events occurred early in animal evolution, preceding the evolution of tetrapods. Mt-mt protein families showed a somewhat different pattern, where gene duplication occurred more evenly in time. However, for both types of protein families, most duplication events appear to roughly coincide with two rounds of genome duplications early in vertebrate evolution.
Finally, we evaluated the effects of inaccurate and incomplete annotation of mitochondrial proteins and found that our conclusion of the importance of subcellular relocalization after gene duplication on

  17. Genome-wide analysis of gene expression in primate taste buds reveals links to diverse processes.

    Directory of Open Access Journals (Sweden)

    Peter Hevezi

    Full Text Available Efforts to unravel the mechanisms underlying taste sensation (gustation) have largely focused on rodents. Here we present the first comprehensive characterization of gene expression in primate taste buds. Our findings reveal unique new insights into the biology of taste buds. We generated a taste bud gene expression database using laser capture microdissection (LCM) procured fungiform (FG) and circumvallate (CV) taste buds from primates. We also used LCM to collect the top and bottom portions of CV taste buds. Affymetrix genome wide arrays were used to analyze gene expression in all samples. Known taste receptors are preferentially expressed in the top portion of taste buds. Genes associated with the cell cycle and stem cells are preferentially expressed in the bottom portion of taste buds, suggesting that precursor cells are located there. Several chemokines including CXCL14 and CXCL8 are among the highest expressed genes in taste buds, indicating that immune system related processes are active in taste buds. Several genes expressed specifically in endocrine glands including growth hormone releasing hormone and its receptor are also strongly expressed in taste buds, suggesting a link between metabolism and taste. Cell type-specific expression of transcription factors and signaling molecules involved in cell fate, including KIT, reveals the taste bud as an active site of cell regeneration, differentiation, and development. IKBKAP, a gene mutated in familial dysautonomia, a disease that results in loss of taste buds, is expressed in taste cells that communicate with afferent nerve fibers via synaptic transmission. This database highlights the power of LCM coupled with transcriptional profiling to dissect the molecular composition of normal tissues, represents the most comprehensive molecular analysis of primate taste buds to date, and provides a foundation for further studies in diverse aspects of taste biology.

  18. Completeness, supervenience and ontology

    International Nuclear Information System (INIS)

    Maudlin, Tim W E

    2007-01-01

    In 1935, Einstein, Podolsky and Rosen raised the issue of the completeness of the quantum description of a physical system. What they had in mind is whether or not the quantum description is informationally complete, in that all physical features of a system can be recovered from it. In a collapse theory such as the theory of Ghirardi, Rimini and Weber, the quantum wavefunction is informationally complete, and this has often been taken to suggest that according to that theory the wavefunction is all there is. If we distinguish the ontological completeness of a description from its informational completeness, we can see that the best interpretations of the GRW theory must postulate more physical ontology than just the wavefunction.

  19. Completeness, supervenience and ontology

    Energy Technology Data Exchange (ETDEWEB)

    Maudlin, Tim W E [Department of Philosophy, Rutgers University, 26 Nichol Avenue, New Brunswick, NJ 08901-1411 (United States)]

    2007-03-23

    In 1935, Einstein, Podolsky and Rosen raised the issue of the completeness of the quantum description of a physical system. What they had in mind is whether or not the quantum description is informationally complete, in that all physical features of a system can be recovered from it. In a collapse theory such as the theory of Ghirardi, Rimini and Weber, the quantum wavefunction is informationally complete, and this has often been taken to suggest that according to that theory the wavefunction is all there is. If we distinguish the ontological completeness of a description from its informational completeness, we can see that the best interpretations of the GRW theory must postulate more physical ontology than just the wavefunction.

  20. Gene expression analysis reveals new possible mechanisms of vancomycin-induced nephrotoxicity and identifies gene markers candidates.

    Science.gov (United States)

    Dieterich, Christine; Puey, Angela; Lin, Sylvia; Lyn, Sylvia; Swezey, Robert; Furimsky, Anna; Fairchild, David; Mirsalis, Jon C; Ng, Hanna H

    2009-01-01

    Vancomycin, one of few effective treatments against methicillin-resistant Staphylococcus aureus, is nephrotoxic. The goals of this study were to (1) gain insights into molecular mechanisms of nephrotoxicity at the genomic level, (2) evaluate gene markers of vancomycin-induced kidney injury, and (3) compare gene expression responses after iv and ip administration. Groups of six female BALB/c mice were treated with seven daily iv or ip doses of vancomycin (50, 200, and 400 mg/kg) or saline, and sacrificed on day 8. Clinical chemistry and histopathology demonstrated kidney injury at 400 mg/kg only. Hierarchical clustering analysis revealed that kidney gene expression profiles of all mice treated at 400 mg/kg clustered with those of mice administered 200 mg/kg iv. Transcriptional profiling might thus be more sensitive than current clinical markers for detecting kidney damage, though the profiles can differ with the route of administration. Analysis of transcripts whose expression was changed by at least twofold compared with vehicle saline after high iv and ip doses of vancomycin suggested the possibility of oxidative stress and mitochondrial damage in vancomycin-induced toxicity. In addition, our data showed changes in expression of several transcripts from the complement and inflammatory pathways. Such expression changes were confirmed by relative real-time reverse transcription-polymerase chain reaction. Finally, our results further substantiate the use of gene markers of kidney toxicity such as KIM-1/Havcr1, as indicators of renal injury.
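The "at least twofold" criterion mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `twofold_changed`, the transcript names and the expression values below are all hypothetical, and the sketch assumes strictly positive expression levels for both the treated and the vehicle (saline) groups.

```python
import math


def twofold_changed(treated, vehicle, threshold=2.0):
    """Return transcripts changed at least `threshold`-fold, up or down.

    Both arguments map transcript name -> mean expression level.
    The returned dict maps transcript name -> log2 fold change.
    """
    cutoff = math.log2(threshold)
    changed = {}
    for transcript, control_level in vehicle.items():
        treated_level = treated.get(transcript)
        # Skip transcripts that are missing or have non-positive levels,
        # because a ratio (and its logarithm) is undefined for them.
        if treated_level is None or control_level <= 0 or treated_level <= 0:
            continue
        log2_fc = math.log2(treated_level / control_level)
        if abs(log2_fc) >= cutoff:
            changed[transcript] = log2_fc
    return changed


# Hypothetical expression values, for illustration only.
vehicle = {"tx_a": 100.0, "tx_b": 80.0, "tx_c": 50.0}
treated = {"tx_a": 250.0, "tx_b": 90.0, "tx_c": 20.0}
changed = twofold_changed(treated, vehicle)
```

Working on the log2 scale treats a twofold increase and a twofold decrease symmetrically, which is why the sketch compares the absolute log2 ratio with the cutoff.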

  1. The king cobra genome reveals dynamic gene evolution and adaptation in the snake venom system.

    Science.gov (United States)

    Vonk, Freek J; Casewell, Nicholas R; Henkel, Christiaan V; Heimberg, Alysha M; Jansen, Hans J; McCleary, Ryan J R; Kerkkamp, Harald M E; Vos, Rutger A; Guerreiro, Isabel; Calvete, Juan J; Wüster, Wolfgang; Woods, Anthony E; Logan, Jessica M; Harrison, Robert A; Castoe, Todd A; de Koning, A P Jason; Pollock, David D; Yandell, Mark; Calderon, Diego; Renjifo, Camila; Currier, Rachel B; Salgado, David; Pla, Davinia; Sanz, Libia; Hyder, Asad S; Ribeiro, José M C; Arntzen, Jan W; van den Thillart, Guido E E J M; Boetzer, Marten; Pirovano, Walter; Dirks, Ron P; Spaink, Herman P; Duboule, Denis; McGlinn, Edwina; Kini, R Manjunatha; Richardson, Michael K

    2013-12-17

    Snakes are limbless predators, and many species use venom to help overpower relatively large, agile prey. Snake venoms are complex protein mixtures encoded by several multilocus gene families that function synergistically to cause incapacitation. To examine venom evolution, we sequenced and interrogated the genome of a venomous snake, the king cobra (Ophiophagus hannah), and compared it, together with our unique transcriptome, microRNA, and proteome datasets from this species, with data from other vertebrates. In contrast to the platypus, the only other venomous vertebrate with a sequenced genome, we find that snake toxin genes evolve through several distinct co-option mechanisms and exhibit surprisingly variable levels of gene duplication and directional selection that correlate with their functional importance in prey capture. The enigmatic accessory venom gland shows a very different pattern of toxin gene expression from the main venom gland and seems to have recruited toxin-like lectin genes repeatedly for new nontoxic functions. In addition, tissue-specific microRNA analyses suggested the co-option of core genetic regulatory components of the venom secretory system from a pancreatic origin. Although the king cobra is limbless, we recovered coding sequences for all Hox genes involved in amniote limb development, with the exception of Hoxd12. Our results provide a unique view of the origin and evolution of snake venom and reveal multiple genome-level adaptive responses to natural selection in this complex biological weapon system. More generally, they provide insight into mechanisms of protein evolution under strong selection.

  2. The king cobra genome reveals dynamic gene evolution and adaptation in the snake venom system

    Science.gov (United States)

    Vonk, Freek J.; Casewell, Nicholas R.; Henkel, Christiaan V.; Heimberg, Alysha M.; Jansen, Hans J.; McCleary, Ryan J. R.; Kerkkamp, Harald M. E.; Vos, Rutger A.; Guerreiro, Isabel; Calvete, Juan J.; Wüster, Wolfgang; Woods, Anthony E.; Logan, Jessica M.; Harrison, Robert A.; Castoe, Todd A.; de Koning, A. P. Jason; Pollock, David D.; Yandell, Mark; Calderon, Diego; Renjifo, Camila; Currier, Rachel B.; Salgado, David; Pla, Davinia; Sanz, Libia; Hyder, Asad S.; Ribeiro, José M. C.; Arntzen, Jan W.; van den Thillart, Guido E. E. J. M.; Boetzer, Marten; Pirovano, Walter; Dirks, Ron P.; Spaink, Herman P.; Duboule, Denis; McGlinn, Edwina; Kini, R. Manjunatha; Richardson, Michael K.

    2013-01-01

    Snakes are limbless predators, and many species use venom to help overpower relatively large, agile prey. Snake venoms are complex protein mixtures encoded by several multilocus gene families that function synergistically to cause incapacitation. To examine venom evolution, we sequenced and interrogated the genome of a venomous snake, the king cobra (Ophiophagus hannah), and compared it, together with our unique transcriptome, microRNA, and proteome datasets from this species, with data from other vertebrates. In contrast to the platypus, the only other venomous vertebrate with a sequenced genome, we find that snake toxin genes evolve through several distinct co-option mechanisms and exhibit surprisingly variable levels of gene duplication and directional selection that correlate with their functional importance in prey capture. The enigmatic accessory venom gland shows a very different pattern of toxin gene expression from the main venom gland and seems to have recruited toxin-like lectin genes repeatedly for new nontoxic functions. In addition, tissue-specific microRNA analyses suggested the co-option of core genetic regulatory components of the venom secretory system from a pancreatic origin. Although the king cobra is limbless, we recovered coding sequences for all Hox genes involved in amniote limb development, with the exception of Hoxd12. Our results provide a unique view of the origin and evolution of snake venom and reveal multiple genome-level adaptive responses to natural selection in this complex biological weapon system. More generally, they provide insight into mechanisms of protein evolution under strong selection. PMID:24297900

  3. Reanalysis of RNA-sequencing data reveals several additional fusion genes with multiple isoforms.

    Science.gov (United States)

    Kangaspeska, Sara; Hultsch, Susanne; Edgren, Henrik; Nicorici, Daniel; Murumägi, Astrid; Kallioniemi, Olli

    2012-01-01

    RNA-sequencing and tailored bioinformatic methodologies have paved the way for identification of expressed fusion genes from the chaotic genomes of solid tumors. We have recently successfully exploited RNA-sequencing for the discovery of 24 novel fusion genes in breast cancer. Here, we demonstrate the importance of continuous optimization of the bioinformatic methodology for this purpose, and report the discovery and experimental validation of 13 additional fusion genes from the same samples. Integration of copy number profiling with the RNA-sequencing results revealed that the majority of the gene fusions were promoter-donating events that occurred at copy number transition points or involved high-level DNA-amplifications. Sequencing of genomic fusion break points confirmed that DNA-level rearrangements underlie selected fusion transcripts. Furthermore, a significant portion (>60%) of the fusion genes were alternatively spliced. This illustrates the importance of reanalyzing sequencing data as gene definitions change and bioinformatic methods improve, and highlights the previously unforeseen isoform diversity among fusion transcripts.

  4. Reanalysis of RNA-sequencing data reveals several additional fusion genes with multiple isoforms.

    Directory of Open Access Journals (Sweden)

    Sara Kangaspeska

    Full Text Available RNA-sequencing and tailored bioinformatic methodologies have paved the way for identification of expressed fusion genes from the chaotic genomes of solid tumors. We have recently successfully exploited RNA-sequencing for the discovery of 24 novel fusion genes in breast cancer. Here, we demonstrate the importance of continuous optimization of the bioinformatic methodology for this purpose, and report the discovery and experimental validation of 13 additional fusion genes from the same samples. Integration of copy number profiling with the RNA-sequencing results revealed that the majority of the gene fusions were promoter-donating events that occurred at copy number transition points or involved high-level DNA-amplifications. Sequencing of genomic fusion break points confirmed that DNA-level rearrangements underlie selected fusion transcripts. Furthermore, a significant portion (>60%) of the fusion genes were alternatively spliced. This illustrates the importance of reanalyzing sequencing data as gene definitions change and bioinformatic methods improve, and highlights the previously unforeseen isoform diversity among fusion transcripts.

  5. Computational integration of homolog and pathway gene module expression reveals general stemness signatures.

    Directory of Open Access Journals (Sweden)

    Martina Koeva

    Full Text Available The stemness hypothesis states that all stem cells use common mechanisms to regulate self-renewal and multi-lineage potential. However, gene expression meta-analyses at the single gene level have failed to identify a significant number of genes selectively expressed by a broad range of stem cell types. We hypothesized that stemness may be regulated by modules of homologs. While the expression of any single gene within a module may vary from one stem cell type to the next, it is possible that the expression of the module as a whole is required so that the expression of different, yet functionally-synonymous, homologs is needed in different stem cells. Thus, we developed a computational method to test for stem cell-specific gene expression patterns from a comprehensive collection of 49 murine datasets covering 12 different stem cell types. We identified 40 individual genes and 224 stemness modules with reproducible and specific up-regulation across multiple stem cell types. The stemness modules included families regulating chromatin remodeling, DNA repair, and Wnt signaling. Strikingly, the majority of modules represent evolutionarily related homologs. Moreover, a score based on the discovered modules could accurately distinguish stem cell-like populations from other cell types in both normal and cancer tissues. This scoring system revealed that both mouse and human metastatic populations exhibit higher stemness indices than non-metastatic populations, providing further evidence for a stem cell-driven component underlying the transformation to metastatic disease.
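The module-level idea in this abstract, where a module counts as active when any one of its functionally synonymous homologs is expressed, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the actual scoring method, so the "any homolog above a threshold" rule, the function names `module_is_up` and `stemness_score`, and all gene names and values are hypothetical.

```python
def module_is_up(expr, module_genes, threshold=1.0):
    """A module counts as up-regulated if at least one homolog reaches the threshold."""
    return any(expr.get(gene, 0.0) >= threshold for gene in module_genes)


def stemness_score(expr, modules, threshold=1.0):
    """Fraction of modules with at least one up-regulated homolog."""
    if not modules:
        return 0.0
    hits = sum(module_is_up(expr, genes, threshold) for genes in modules.values())
    return hits / len(modules)


# Hypothetical modules of homologous genes and hypothetical expression values.
modules = {
    "module_1": ["gene_a1", "gene_a2"],
    "module_2": ["gene_b1", "gene_b2"],
}
stem_like = {"gene_a1": 0.1, "gene_a2": 2.3, "gene_b1": 1.8, "gene_b2": 0.0}
other = {"gene_a1": 0.2, "gene_a2": 0.1, "gene_b1": 0.3, "gene_b2": 1.5}

stem_like_score = stemness_score(stem_like, modules)
other_score = stemness_score(other, modules)
```

In this toy example no single gene separates the two samples consistently, but the module-level fraction does, which is the intuition behind testing modules rather than individual genes.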

  6. Chicken genome analysis reveals novel genes encoding biotin-binding proteins related to avidin family

    Directory of Open Access Journals (Sweden)

    Nordlund Henri R

    2005-03-01

    Full Text Available Abstract Background A chicken egg contains several biotin-binding proteins (BBPs), whose complete DNA and amino acid sequences are not known. In order to identify and characterise these genes and proteins we studied chicken cDNAs and genes available in the NCBI database and chicken genome database using the reported N-terminal amino acid sequences of chicken egg-yolk BBPs as search strings. Results Two separate hits showing significant homology for these N-terminal sequences were discovered. For one of these hits, the chromosomal location in the immediate proximity of the avidin gene family was found. Both of these hits encode proteins having high sequence similarity with avidin suggesting that chicken BBPs are paralogous to avidin family. In particular, almost all residues corresponding to biotin binding in avidin are conserved in these putative BBP proteins. One of the found DNA sequences, however, seems to encode a carboxy-terminal extension not present in avidin. Conclusion We describe here the predicted properties of the putative BBP genes and proteins. Our present observations link BBP genes together with avidin gene family and shed more light on the genetic arrangement and variability of this family. In addition, comparative modelling revealed the potential structural elements important for the functional and structural properties of the putative BBP proteins.

  7. LOGISTICS OPTIMIZATION USING ONTOLOGIES

    OpenAIRE

    Hendi, Hayder; Ahmad, Adeel; Bouneffa, Mourad; Fonlupt, Cyril

    2014-01-01

    Logistics processes involve complex physical flows and integration of different elements. It is widely observed that uncontrolled processes can degrade the state of logistics. The optimization of logistic processes can support the desired growth and consistent continuity of logistics. In this paper, we present a software framework for logistic processes optimization. It primarily defines logistic ontologies and then optimizes them. It intends to assist the design of...

  8. A unified anatomy ontology of the vertebrate skeletal system.

    Directory of Open Access Journals (Sweden)

    Wasila M Dahdul

    Full Text Available The skeleton is of fundamental importance in research in comparative vertebrate morphology, paleontology, biomechanics, developmental biology, and systematics. Motivated by research questions that require computational access to and comparative reasoning across the diverse skeletal phenotypes of vertebrates, we developed a module of anatomical concepts for the skeletal system, the Vertebrate Skeletal Anatomy Ontology (VSAO), to accommodate and unify the existing skeletal terminologies for the species-specific (mouse, the frog Xenopus, zebrafish) and multispecies (teleost, amphibian) vertebrate anatomy ontologies. Previous differences between these terminologies prevented even simple queries across databases pertaining to vertebrate morphology. This module of upper-level and specific skeletal terms currently includes 223 defined terms and 179 synonyms that integrate skeletal cells, tissues, biological processes, organs (skeletal elements such as bones and cartilages), and subdivisions of the skeletal system. The VSAO is designed to integrate with other ontologies, including the Common Anatomy Reference Ontology (CARO), Gene Ontology (GO), Uberon, and Cell Ontology (CL), and it is freely available to the community to be updated with additional terms required for research. Its structure accommodates anatomical variation among vertebrate species in development, structure, and composition. Annotation of diverse vertebrate phenotypes with this ontology will enable novel inquiries across the full spectrum of phenotypic diversity.

  9. A unified anatomy ontology of the vertebrate skeletal system.

    Science.gov (United States)

    Dahdul, Wasila M; Balhoff, James P; Blackburn, David C; Diehl, Alexander D; Haendel, Melissa A; Hall, Brian K; Lapp, Hilmar; Lundberg, John G; Mungall, Christopher J; Ringwald, Martin; Segerdell, Erik; Van Slyke, Ceri E; Vickaryous, Matthew K; Westerfield, Monte; Mabee, Paula M

    2012-01-01

    The skeleton is of fundamental importance in research in comparative vertebrate morphology, paleontology, biomechanics, developmental biology, and systematics. Motivated by research questions that require computational access to and comparative reasoning across the diverse skeletal phenotypes of vertebrates, we developed a module of anatomical concepts for the skeletal system, the Vertebrate Skeletal Anatomy Ontology (VSAO), to accommodate and unify the existing skeletal terminologies for the species-specific (mouse, the frog Xenopus, zebrafish) and multispecies (teleost, amphibian) vertebrate anatomy ontologies. Previous differences between these terminologies prevented even simple queries across databases pertaining to vertebrate morphology. This module of upper-level and specific skeletal terms currently includes 223 defined terms and 179 synonyms that integrate skeletal cells, tissues, biological processes, organs (skeletal elements such as bones and cartilages), and subdivisions of the skeletal system. The VSAO is designed to integrate with other ontologies, including the Common Anatomy Reference Ontology (CARO), Gene Ontology (GO), Uberon, and Cell Ontology (CL), and it is freely available to the community to be updated with additional terms required for research. Its structure accommodates anatomical variation among vertebrate species in development, structure, and composition. Annotation of diverse vertebrate phenotypes with this ontology will enable novel inquiries across the full spectrum of phenotypic diversity.

  10. A Unified Anatomy Ontology of the Vertebrate Skeletal System

    Science.gov (United States)

    Dahdul, Wasila M.; Balhoff, James P.; Blackburn, David C.; Diehl, Alexander D.; Haendel, Melissa A.; Hall, Brian K.; Lapp, Hilmar; Lundberg, John G.; Mungall, Christopher J.; Ringwald, Martin; Segerdell, Erik; Van Slyke, Ceri E.; Vickaryous, Matthew K.; Westerfield, Monte; Mabee, Paula M.

    2012-01-01

    The skeleton is of fundamental importance in research in comparative vertebrate morphology, paleontology, biomechanics, developmental biology, and systematics. Motivated by research questions that require computational access to and comparative reasoning across the diverse skeletal phenotypes of vertebrates, we developed a module of anatomical concepts for the skeletal system, the Vertebrate Skeletal Anatomy Ontology (VSAO), to accommodate and unify the existing skeletal terminologies for the species-specific (mouse, the frog Xenopus, zebrafish) and multispecies (teleost, amphibian) vertebrate anatomy ontologies. Previous differences between these terminologies prevented even simple queries across databases pertaining to vertebrate morphology. This module of upper-level and specific skeletal terms currently includes 223 defined terms and 179 synonyms that integrate skeletal cells, tissues, biological processes, organs (skeletal elements such as bones and cartilages), and subdivisions of the skeletal system. The VSAO is designed to integrate with other ontologies, including the Common Anatomy Reference Ontology (CARO), Gene Ontology (GO), Uberon, and Cell Ontology (CL), and it is freely available to the community to be updated with additional terms required for research. Its structure accommodates anatomical variation among vertebrate species in development, structure, and composition. Annotation of diverse vertebrate phenotypes with this ontology will enable novel inquiries across the full spectrum of phenotypic diversity. PMID:23251424

  11. Dissection of a locus on mouse chromosome 5 reveals arthritis promoting and inhibitory genes

    DEFF Research Database (Denmark)

    Lindvall, Therese; Karlsson, Jenny; Holmdahl, Rikard

    2009-01-01

    with Eae39 congenic- and sub-interval congenic mice, carrying RIIIS/J genes on the B10.RIII genetic background, revealed three loci within Eae39 that control disease and anti-collagen antibody titers. Two of the loci promoted disease and the third locus was protecting from collagen induced arthritis...... development. By further breeding of mice with small congenic fragments, we identified a 3.2 Megabasepair (Mbp) interval that regulates disease. CONCLUSIONS: Disease promoting- and protecting genes within the Eae39 locus on mouse chromosome 5, control susceptibility to collagen induced arthritis. A disease......-protecting locus in the telomeric part of Eae39 results in lower anti-collagen antibody responses. The study shows the importance of breeding sub-congenic mouse strains to reveal genetic effects on complex diseases....

  12. Comprehensive Gene Expression Profiling Reveals Synergistic Functional Networks in Cerebral Vessels after Hypertension or Hypercholesterolemia

    Science.gov (United States)

    Ong, Wei-Yi; Ng, Mary Pei-Ern; Loke, Sau-Yeen; Jin, Shalai; Wu, Ya-Jun; Tanaka, Kazuhiro; Wong, Peter Tsun-Hon

    2013-01-01

    Atherosclerotic stenosis of cerebral arteries or intracranial large artery disease (ICLAD) is a major cause of stroke especially in Asians, Hispanics and Africans, but relatively little is known about gene expression changes in vessels at risk. This study compares comprehensive gene expression profiles in the middle cerebral artery (MCA) of New Zealand White rabbits exposed to two stroke risk factors, i.e. hypertension and/or hypercholesterolemia, by the 2-Kidney-1-Clip method, or dietary supplementation with cholesterol. Microarray and Ingenuity Pathway Analyses of the MCA of the hypertensive rabbits showed up-regulated genes in networks containing the node molecules: UBC (ubiquitin), P38 MAPK, ERK, NFkB, SERPINB2, MMP1 and APP (amyloid precursor protein); and down-regulated genes related to MAPK, ERK 1/2, Akt, 26S proteasome, histone H3 and UBC. The MCA of hypercholesterolemic rabbits showed differentially expressed genes that are, surprisingly, linked to almost the same node molecules as the hypertensive rabbits, despite a relatively low percentage of ‘common genes’ (21 and 7%) between the two conditions. Up-regulated common genes were related to UBC, SERPINB2, TNF, HNF4A (hepatocyte nuclear factor 4A) and APP, and down-regulated genes were related to UBC. Increased HNF4A message and protein were verified in the aorta. Together, these findings reveal similar nodal molecules and gene pathways in cerebral vessels affected by hypertension or hypercholesterolemia, which could be a basis for synergistic action of risk factors in the pathogenesis of ICLAD. PMID:23874591

  13. Feasibility of automated foundational ontology interchangeability

    CSIR Research Space (South Africa)

    Khan, ZC

    2014-11-01

    Full Text Available the Source Domain Ontology (sOd), with the domain knowledge component of the source ontology, the Source Foundational Ontology (sOf) that is the foundational ontology component of the source ontology that is to be interchanged, and any equivalence... or subsumption mappings between entities in sOd and sOf. – The Target Ontology (tO) which has been interchanged, which comprises the Target Domain Ontology (tOd), with the domain knowledge component of the target ontology, and the Target Foundational Ontology...

  14. An Ontology for Software Engineering Education

    Science.gov (United States)

    Ling, Thong Chee; Jusoh, Yusmadi Yah; Adbullah, Rusli; Alwi, Nor Hayati

    2013-01-01

    Software agents communicate using ontologies. It is important to build an ontology for a specific domain such as Software Engineering Education. Building an ontology from scratch is not only hard, but also incurs much time and cost. This study aims to propose an ontology through adaptation of the existing ontology which is originally built based on a…

  15. A Frame Work for Ontological Privacy Preserved Mining

    OpenAIRE

    Sriman Narayana Iyengar. N.Ch.; Geetha Mary. A

    2010-01-01

    Data mining analyses stored data and helps in foretelling future trends. There are different techniques by which data can be mined, and these different techniques reveal different types of hidden knowledge. When the right technique is applied, specific patterns emerge. Ontology is a specification of conceptualization. It is a description of concepts and relationships that can exist for an agent or a community of agents. To make software more user-friendly, ontology could be used t...

  16. Metatranscriptomics reveals the diversity of genes expressed by eukaryotes in forest soils.

    Directory of Open Access Journals (Sweden)

    Coralie Damon

    Full Text Available Eukaryotic organisms play essential roles in the biology and fertility of soils. For example the micro and mesofauna contribute to the fragmentation and homogenization of plant organic matter, while its hydrolysis is primarily performed by the fungi. To get a global picture of the activities carried out by soil eukaryotes we sequenced 2×10,000 cDNAs synthesized from polyadenylated mRNA directly extracted from soils sampled in beech (Fagus sylvatica) and spruce (Picea abies) forests. Taxonomic affiliation of both cDNAs and 18S rRNA sequences showed a dominance of sequences from fungi (up to 60%) and metazoans while protists represented less than 12% of the 18S rRNA sequences. Sixty percent of cDNA sequences from beech forest soil and 52% from spruce forest soil had no homologs in the GenBank/EMBL/DDBJ protein database. A Gene Ontology term was attributed to 39% and 31.5% of the spruce and beech soil sequences respectively. Altogether 2076 sequences were putative homologs to different enzyme classes participating in 129 KEGG pathways among which several were implicated in the utilisation of soil nutrients such as nitrogen (ammonium, amino acids, oligopeptides), sugars, phosphates and sulfate. Specific annotation of plant cell wall degrading enzymes identified enzymes active on major polymers (cellulose, hemicelluloses, pectin, lignin) and glycoside hydrolases represented 0.5% (beech soil) to 0.8% (spruce soil) of the cDNAs. Other sequences coding enzymes active on organic matter (extracellular proteases, lipases, a phytase, P450 monooxygenases) were identified, thus underlining the biotechnological potential of eukaryotic metatranscriptomes. The phylogenetic affiliation of 12 full-length carbohydrate active enzymes showed that most of them were distantly related to sequences from known fungi. For example, a putative GH45 endocellulase was closely associated to molluscan sequences, while a GH7 cellobiohydrolase was closest to crustacean sequences, thus

  17. Comparative expression profiling reveals gene functions in female meiosis and gametophyte development in Arabidopsis.

    Science.gov (United States)

    Zhao, Lihua; He, Jiangman; Cai, Hanyang; Lin, Haiyan; Li, Yanqiang; Liu, Renyi; Yang, Zhenbiao; Qin, Yuan

    2014-11-01

    Megasporogenesis is essential for female fertility, and requires the accomplishment of meiosis and the formation of functional megaspores. The inaccessibility and low abundance of female meiocytes make it particularly difficult to elucidate the molecular basis underlying megasporogenesis. We used high-throughput tag-sequencing analysis to identify genes expressed in female meiocytes (FMs) by comparing gene expression profiles from wild-type ovules undergoing megasporogenesis with those from the spl mutant ovules, which lack megasporogenesis. A total of 862 genes were identified as FMs, with levels that are consistently reduced in spl ovules in two biological replicates. Fluorescence-assisted cell sorting followed by RNA-seq analysis of DMC1:GFP-labeled female meiocytes confirmed that 90% of the FMs are indeed detected in the female meiocyte protoplast profiling. We performed reverse genetic analysis of 120 candidate genes and identified four FM genes with a function in female meiosis progression in Arabidopsis. We further revealed that KLU, a putative cytochrome P450 monooxygenase, is involved in chromosome pairing during female meiosis, most likely by affecting the normal expression pattern of DMC1 in ovules during female meiosis. Our studies provide valuable information for functional genomic analyses of plant germline development as well as insights into meiosis. © 2014 The Authors The Plant Journal © 2014 John Wiley & Sons Ltd.

  18. Genetic and epigenetic variation in 5S ribosomal RNA genes reveals genome dynamics in Arabidopsis thaliana.

    Science.gov (United States)

    Simon, Lauriane; Rabanal, Fernando A; Dubos, Tristan; Oliver, Cecilia; Lauber, Damien; Poulet, Axel; Vogt, Alexander; Mandlbauer, Ariane; Le Goff, Samuel; Sommer, Andreas; Duborjal, Hervé; Tatout, Christophe; Probst, Aline V

    2018-04-06

    Organized in tandem repeat arrays in most eukaryotes and transcribed by RNA polymerase III, expression of 5S rRNA genes is under epigenetic control. To unveil mechanisms of transcriptional regulation, we obtained here in depth sequence information on 5S rRNA genes from the Arabidopsis thaliana genome and identified differential enrichment in epigenetic marks between the three 5S rDNA loci situated on chromosomes 3, 4 and 5. We reveal the chromosome 5 locus as the major source of an atypical, long 5S rRNA transcript characteristic of an open chromatin structure. 5S rRNA genes from this locus translocated in the Landsberg erecta ecotype as shown by linkage mapping and chromosome-specific FISH analysis. These variations in 5S rDNA locus organization cause changes in the spatial arrangement of chromosomes in the nucleus. Furthermore, 5S rRNA gene arrangements are highly dynamic with alterations in chromosomal positions through translocations in certain mutants of the RNA-directed DNA methylation pathway and important copy number variations among ecotypes. Finally, variations in 5S rRNA gene sequence, chromatin organization and transcripts indicate differential usage of 5S rDNA loci in distinct ecotypes. We suggest that both the usage of existing and new 5S rDNA loci resulting from translocations may impact neighboring chromatin organization.

  19. Potential translational targets revealed by linking mouse grooming behavioral phenotypes to gene expression using public databases.

    Science.gov (United States)

    Roth, Andrew; Kyzar, Evan J; Cachat, Jonathan; Stewart, Adam Michael; Green, Jeremy; Gaikwad, Siddharth; O'Leary, Timothy P; Tabakoff, Boris; Brown, Richard E; Kalueff, Allan V

    2013-01-10

    Rodent self-grooming is an important, evolutionarily conserved behavior, highly sensitive to pharmacological and genetic manipulations. Mice with aberrant grooming phenotypes are currently used to model various human disorders. Therefore, it is critical to understand the biology of grooming behavior, and to assess its translational validity to humans. The present in-silico study used publicly available gene expression and behavioral data obtained from several inbred mouse strains in the open-field, light-dark box, elevated plus- and elevated zero-maze tests. As grooming duration differed between strains, our analysis revealed several candidate genes with significant correlations between gene expression in the brain and grooming duration. The Allen Brain Atlas, STRING, GoMiner and Mouse Genome Informatics databases were used to functionally map and analyze these candidate mouse genes against their human orthologs, assessing the strain ranking of their expression and the regional distribution of expression in the mouse brain. This allowed us to identify an interconnected network of candidate genes that have expression levels correlating with grooming behavior, display altered patterns of expression in key brain areas related to grooming, and underlie important functions in the brain. Collectively, our results demonstrate the utility of large-scale, high-throughput data-mining and in-silico modeling for linking genomic and behavioral data, as well as their potential to identify novel neural targets for complex neurobehavioral phenotypes, including grooming. Copyright © 2012 Elsevier Inc. All rights reserved.

  20. Signature gene expression reveals novel clues to the molecular mechanisms of dimorphic transition in Penicillium marneffei.

    Directory of Open Access Journals (Sweden)

    Ence Yang

    2014-10-01

    Full Text Available Systemic dimorphic fungi cause more than one million new infections each year, ranking them among the significant public health challenges currently encountered. Penicillium marneffei is a systemic dimorphic fungus endemic to Southeast Asia. The temperature-dependent dimorphic phase transition between mycelium and yeast is considered crucial for the pathogenicity and transmission of P. marneffei, but the underlying mechanisms are still poorly understood. Here, we re-sequenced P. marneffei strain PM1 using multiple sequencing platforms and assembled the genome using hybrid genome assembly. We determined gene expression levels using RNA sequencing at the mycelial and yeast phases of P. marneffei, as well as during phase transition. We classified 2,718 genes with variable expression across conditions into 14 distinct groups, each marked by a signature expression pattern implicated at a certain stage in the dimorphic life cycle. Genes with the same expression patterns tend to be clustered together on the genome, suggesting orchestrated regulation of the transcriptional activities of neighboring genes. Using qRT-PCR, we validated expression levels of all genes in one of the clusters highly expressed during the yeast-to-mycelium transition. These included madsA, a gene encoding a MADS-box transcription factor whose gene family is exclusively expanded in P. marneffei. Over-expression of madsA drove P. marneffei to undergo mycelial growth at 37°C, a condition that restricts the wild-type to the yeast phase. Furthermore, analyses of signature expression patterns suggested diverse roles of secreted proteins at different developmental stages and the potential importance of non-coding RNAs in the mycelium-to-yeast transition. We also showed that RNA structural transition in response to temperature changes may be related to the control of thermal dimorphism. Together, our findings have revealed multiple molecular mechanisms that may underlie the dimorphic transition

  1. Fine Mapping and Transcriptome Analysis Reveal Candidate Genes Associated with Hybrid Lethality in Cabbage (Brassica Oleracea).

    Science.gov (United States)

    Xiao, Zhiliang; Hu, Yang; Zhang, Xiaoli; Xue, Yuqian; Fang, Zhiyuan; Yang, Limei; Zhang, Yangyong; Liu, Yumei; Li, Zhansheng; Liu, Xing; Liu, Zezhou; Lv, Honghao; Zhuang, Mu

    2017-06-05

    Hybrid lethality is a deleterious phenotype that is vital to species evolution. We previously reported hybrid lethality in cabbage (Brassica oleracea) and performed preliminary mapping of related genes. In the present study, the fine mapping of hybrid lethal genes revealed that BoHL1 was located on chromosome C1 between BoHLTO124 and BoHLTO130, with an interval of 101 kb. BoHL2 was confirmed to be between insertion-deletion (InDel) markers HL234 and HL235 on C4, with a marker interval of 70 kb. Twenty-eight and nine annotated genes were found within the two intervals of BoHL1 and BoHL2, respectively. We also applied RNA-Seq to analyze hybrid lethality in cabbage. In the region of BoHL1, seven differentially expressed genes (DEGs) and five resistance (R)-related genes (two in common, i.e., Bo1g153320 and Bo1g153380) were found, whereas in the region of BoHL2, two DEGs and four R-related genes (two in common, i.e., Bo4g173780 and Bo4g173810) were found. Along with studies in which R genes were frequently involved in hybrid lethality in other plants, these interesting R-DEGs may be good candidates associated with hybrid lethality. We also used SNP/InDel analyses and quantitative real-time PCR to confirm the results. This work provides new insight into the mechanisms of hybrid lethality in cabbage.

  2. Characterization of the biocontrol activity of pseudomonas fluorescens strain X reveals novel genes regulated by glucose.

    Directory of Open Access Journals (Sweden)

    Gerasimos F Kremmydas

    Full Text Available Pseudomonas fluorescens strain X, a bacterial isolate from the rhizosphere of bean seedlings, has the ability to suppress damping-off caused by the oomycete Pythium ultimum. To determine the genes controlling the biocontrol activity of strain X, transposon mutagenesis, sequencing and complementation were performed. Results indicate that the biocontrol ability of this isolate is attributed to the gcd gene encoding glucose dehydrogenase, genes encoding its co-enzyme pyrroloquinoline quinone (PQQ), and two genes (sup5 and sup6) which seem to be organized in a putative operon. This operon (named supX) consists of five genes, one of which encodes a non-ribosomal peptide synthase. A unique binding site for a GntR-type transcriptional factor is localized upstream of the supX putative operon. Synteny comparison of the genes in supX revealed that they are common in the genus Pseudomonas, but with a low degree of similarity. supX shows high similarity only to the mangotoxin operon of Ps. syringae pv. syringae UMAF0158. Quantitative real-time PCR analysis indicated that transcription of supX is strongly reduced in the gcd and PQQ-minus mutants of Ps. fluorescens strain X. On the contrary, transcription of supX in the wild type is enhanced by glucose, and transcription levels appear to be higher during the stationary phase. Gcd, which uses PQQ as a cofactor, catalyses the oxidation of glucose to gluconic acid, which controls the activity of the GntR family of transcriptional factors. The genes in the supX putative operon have not been implicated before in the biocontrol of plant pathogens by pseudomonads. They are involved in the biosynthesis of an antimicrobial compound by Ps. fluorescens strain X and their transcription is controlled by glucose, possibly through the activity of a GntR-type transcriptional factor binding upstream of this putative operon.

  3. Gene array analysis of PD-1H overexpressing monocytes reveals a pro-inflammatory profile

    Directory of Open Access Journals (Sweden)

    Preeti Bharaj

    2018-02-01

    Full Text Available We have previously reported that overexpression of Programmed Death-1 Homolog (PD-1H) in human monocytes leads to activation and spontaneous secretion of multiple pro-inflammatory cytokines. Here we evaluate changes in monocyte gene expression after enforced PD-1H expression by gene array. The results show that there are significant alterations in 51 potential candidate genes that relate to immune response, cell adhesion and metabolism. Genes corresponding to pro-inflammatory cytokines showed the highest upregulation, with 7-, 3.2-, 3.0-, 5.8-, 4.4- and 3.1-fold upregulation of TNF-α, IL-1β, IFN-α, IFN-γ, IFN-λ and IL-27 relative to vector control. The data are in agreement with cytometric bead array analysis showing induction of the pro-inflammatory cytokines IL-6, IL-1β and TNF-α by PD-1H. Other genes related to inflammation, including transglutaminase 2 (TG2), NF-κB (p65 and p50) and toll-like receptors (TLR) 3 and 4, were upregulated 5-, 4.5- and 2.5-fold, respectively. Gene set enrichment analysis (GSEA) also revealed that signaling pathways related to inflammatory response, such as NFκB, AT1R, PYK2, MAPK, RELA, TNFR1, MTOR and proteasomal degradation, were significantly upregulated in response to PD-1H overexpression. We validated the results utilizing a standard inflammatory sepsis model in humanized BLT mice, finding that PD-1H expression was highly correlated with pro-inflammatory cytokine production. We therefore conclude that PD-1H functions to enhance monocyte activation and the induction of a pro-inflammatory gene expression profile.

  4. Drug target ontology to classify and integrate drug discovery data

    DEFF Research Database (Denmark)

    Lin, Yu; Mehta, Saurabh; Küçük-McGinty, Hande

    2017-01-01

    using a new software tool to auto-generate most axioms from a database while supporting manual knowledge acquisition. A modular, hierarchical implementation facilitates ontology development and maintenance and makes use of various external ontologies, thus integrating the DTO into the ecosystem...... of biomedical ontologies. As a formal OWL-DL ontology, DTO contains asserted and inferred axioms. Modeling data from the Library of Integrated Network-based Cellular Signatures (LINCS) program illustrates the potential of DTO for contextual data integration and nuanced definition of important drug target...... characteristics. DTO has been implemented in the IDG user interface Portal, Pharos and the TIN-X explorer of protein target disease relationships. CONCLUSIONS: DTO was built based on the need for a formal semantic model for druggable targets including various related information such as protein, gene, protein...

  5. Interaction between leptin and leptin receptor in gastric carcinoma: Gene ontology analysis

    Directory of Open Access Journals (Sweden)

    V. Wiwanitkit

    2007-04-01

    Full Text Available Gastric carcinoma is a rare but important malignancy. A link between leptin, a cytokine that is elevated in obese individuals, and cancer development has been proposed. It has been noted that leptin and its receptor may play a positive role in the progression of gastric cancer. However, the exact mechanism resulting from the interaction between leptin and leptin receptor has never been clarified. Here, the author used a new gene ontology technology to predict the molecular function and biological process due to the interaction between leptin and leptin receptor. Compared with leptin and leptin receptor alone, the leptin-leptin receptor complex possesses the same function and biological process as leptin receptor. This can confirm that the leptin receptor has a significant suppressive effect on the expression of leptin. Loss of hormone activity and disturbance of the normal cell signaling pathway of leptin can be seen. Blocking of the receptor might be a rational therapeutic strategy.

  6. Gene Expression Profile Reveals Abnormalities of Multiple Signaling Pathways in Mesenchymal Stem Cell Derived from Patients with Systemic Lupus Erythematosus

    Directory of Open Access Journals (Sweden)

    Yu Tang

    2012-01-01

    Full Text Available We aimed to compare bone-marrow-derived mesenchymal stem cells (BMMSCs) between systemic lupus erythematosus (SLE) patients and normal controls by means of cDNA microarray, immunohistochemistry, immunofluorescence, and immunoblotting. Our results showed there were a total of 1,905 genes which were differentially expressed by BMMSCs derived from SLE patients, of which 652 genes were upregulated and 1,253 were downregulated. Gene ontology (GO) analysis showed that the majority of these genes were related to cell cycle and protein binding. Pathway analysis showed that differentially regulated signal pathways involved actin cytoskeleton, focal adhesion, tight junction, and the TGF-β pathway. The high protein level of BMP-5 and low expression of Id-1 indicated that there might be dysregulation in the BMP/TGF-β signaling pathway. The expression of Id-1 in SLE BMMSCs was inversely correlated with serum TNF-α levels. The protein level of cyclin E decreased in the cell cycle regulation pathway. Moreover, the MAPK signaling pathway was activated in BMMSCs from SLE patients via phosphorylation of ERK1/2 and SAPK/JNK. The actin distribution pattern of BMMSCs from SLE patients was also found to be disordered. Our results suggested that there were distinct differences in BMMSCs between SLE patients and normal controls.
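    A GO analysis of a differentially expressed gene list is often an over-representation test. The abstract above does not say which statistic was used, so the following is only a minimal sketch assuming a one-sided hypergeometric test. The function name and the background and term counts in the usage lines are hypothetical; only the figure of 1,905 differentially expressed genes comes from the abstract.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """One-sided over-representation p-value, P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes carrying the GO term
    sample:     size of the gene list being tested
    hits:       genes in the list that carry the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the hypergeometric probabilities from `hits` up to the maximum
    # possible overlap; comb() returns 0 for impossible configurations.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: a 20,000-gene background, 500 genes annotated to a
# term, and 90 of the 1,905 differentially expressed genes carrying it.
p_value = hypergeom_enrichment_p(20000, 500, 1905, 90)
print(f"p = {p_value:.3g}")
```

    In practice such p-values are computed for many terms at once and then corrected for multiple testing, which this sketch leaves out.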

  7. Ontologies and tag-statistics

    Science.gov (United States)

    Tibély, Gergely; Pollner, Péter; Vicsek, Tamás; Palla, Gergely

    2012-05-01

    Due to the increasing popularity of collaborative tagging systems, the research on tagged networks, hypergraphs, ontologies, folksonomies and other related concepts is becoming an important interdisciplinary area with great potential and relevance for practical applications. In most collaborative tagging systems the tagging by the users is completely ‘flat’, while in some cases they are allowed to define a shallow hierarchy for their own tags. However, usually no overall hierarchical organization of the tags is given, and one of the interesting challenges of this area is to provide an algorithm generating the ontology of the tags from the available data. In contrast, there are also other types of tagged networks available for research, where the tags are already organized into a directed acyclic graph (DAG), encapsulating the ‘is a sub-category of’ type of hierarchy between each other. In this paper, we study how this DAG affects the statistical distribution of tags on the nodes marked by the tags in various real networks. The motivation for this research was the fact that understanding the tagging based on a known hierarchy can help in revealing the hidden hierarchy of tags in collaborative tagging systems. We analyse the relation between the tag-frequency and the position of the tag in the DAG in two large sub-networks of the English Wikipedia and a protein-protein interaction network. We also study the tag co-occurrence statistics by introducing a two-dimensional (2D) tag-distance distribution preserving both the difference in the levels and the absolute distance in the DAG for the co-occurring pairs of tags. Our most interesting finding is that the local relevance of tags in the DAG (i.e. their rank or significance as characterized by, e.g., the length of the branches starting from them) is much more important than their global distance from the root. Furthermore, we also introduce a simple tagging model based on random walks on the DAG, capable of
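    The quantities named in this abstract (tag frequency, the level of a tag in the DAG, and a two-dimensional distribution over level difference and DAG distance for co-occurring tags) can be illustrated on a toy example. This is a rough sketch, not the authors' procedure: the DAG, the tagged nodes, and the specific definitions used here (level as shortest distance from the root, distance as shortest path ignoring edge direction) are all assumptions made for illustration.

```python
from collections import Counter, deque
from itertools import combinations

# Hypothetical toy DAG: each tag maps to its direct sub-categories.
CHILDREN = {
    "root": ["a", "b"],
    "a": ["c", "d"],
    "b": ["d"],
    "c": [],
    "d": ["e"],
    "e": [],
}

# Hypothetical network nodes and the tags attached to them.
NODE_TAGS = {
    "n1": {"c", "d"},
    "n2": {"d", "e"},
    "n3": {"a", "e"},
    "n4": {"d"},
}


def bfs_distances(adjacency, start):
    """Shortest hop count from `start` to every reachable tag."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in adjacency[current]:
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return dist


def undirected(children):
    """Symmetrise the DAG so distances ignore edge direction."""
    adj = {tag: set() for tag in children}
    for parent, kids in children.items():
        for kid in kids:
            adj[parent].add(kid)
            adj[kid].add(parent)
    return adj


def tag_statistics(children, node_tags, root="root"):
    levels = bfs_distances(children, root)  # assumed definition of level
    frequency = Counter(tag for tags in node_tags.values() for tag in tags)
    adj = undirected(children)
    pair_hist = Counter()  # keys are (level difference, DAG distance)
    for tags in node_tags.values():
        for t1, t2 in combinations(sorted(tags), 2):
            level_diff = abs(levels[t1] - levels[t2])
            distance = bfs_distances(adj, t1)[t2]
            pair_hist[(level_diff, distance)] += 1
    return levels, frequency, pair_hist


levels, frequency, pair_hist = tag_statistics(CHILDREN, NODE_TAGS)
for tag in sorted(frequency):
    print(tag, "level", levels[tag], "frequency", frequency[tag])
print(dict(pair_hist))
```

    Keeping the histogram keyed by both coordinates is what preserves the level difference and the distance separately, rather than collapsing them into a single number.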

  8. Ontologies and tag-statistics

    International Nuclear Information System (INIS)

    Tibély, Gergely; Vicsek, Tamás; Pollner, Péter; Palla, Gergely

    2012-01-01

    Due to the increasing popularity of collaborative tagging systems, the research on tagged networks, hypergraphs, ontologies, folksonomies and other related concepts is becoming an important interdisciplinary area with great potential and relevance for practical applications. In most collaborative tagging systems the tagging by the users is completely ‘flat’, while in some cases they are allowed to define a shallow hierarchy for their own tags. However, usually no overall hierarchical organization of the tags is given, and one of the interesting challenges of this area is to provide an algorithm generating the ontology of the tags from the available data. In contrast, there are also other types of tagged networks available for research, where the tags are already organized into a directed acyclic graph (DAG), encapsulating the ‘is a sub-category of’ type of hierarchy between each other. In this paper, we study how this DAG affects the statistical distribution of tags on the nodes marked by the tags in various real networks. The motivation for this research was the fact that understanding the tagging based on a known hierarchy can help in revealing the hidden hierarchy of tags in collaborative tagging systems. We analyse the relation between the tag-frequency and the position of the tag in the DAG in two large sub-networks of the English Wikipedia and a protein-protein interaction network. We also study the tag co-occurrence statistics by introducing a two-dimensional (2D) tag-distance distribution preserving both the difference in the levels and the absolute distance in the DAG for the co-occurring pairs of tags. Our most interesting finding is that the local relevance of tags in the DAG (i.e. their rank or significance as characterized by, e.g., the length of the branches starting from them) is much more important than their global distance from the root. Furthermore, we also introduce a simple tagging model based on random walks on the DAG, capable of

  9. Gene expression profiling reveals new potential players of gonad differentiation in the chicken embryo.

    Directory of Open Access Journals (Sweden)

    Gwenn-Aël Carré

    Full Text Available BACKGROUND: In birds as in mammals, a genetic switch determines whether the undifferentiated gonad develops into an ovary or a testis. However, understanding of the molecular pathway(s) involved in gonad differentiation is still incomplete. METHODOLOGY/PRINCIPAL FINDINGS: With the aim of improving characterization of the molecular pathway(s) involved in gonad differentiation in the chicken embryo, we developed a large scale real time reverse transcription polymerase chain reaction approach on 110 selected genes for evaluation of their expression profiles during chicken gonad differentiation between days 5.5 and 19 of incubation. Hierarchical clustering analysis of the resulting datasets discriminated gene clusters expressed preferentially in the ovary or the testis, and/or at early or later periods of embryonic gonad development. Fitting a linear model and testing the comparisons of interest allowed the identification of new potential actors of gonad differentiation, such as Z-linked ADAMTS12, LOC427192 (corresponding to NIM1 protein) and CFC1, that are upregulated in the developing testis, and BMP3 and Z-linked ADAMTSL1, that are preferentially expressed in the developing ovary. Interestingly, the expression patterns of several members of the transforming growth factor β family were sexually dimorphic, with inhibin subunits upregulated in the testis, and bone morphogenetic protein subfamily members including BMP2, BMP3, BMP4 and BMP7, upregulated in the ovary. This study also highlighted several genes displaying asymmetric expression profiles such as GREM1 and BMP3 that are potentially involved in different aspects of gonad left-right asymmetry. CONCLUSION/SIGNIFICANCE: This study supports the overall conservation of vertebrate sex differentiation pathways but also reveals some particular features of gene expression patterns during gonad development in the chicken. In particular, our study revealed new candidate genes which may be potential actors

  10. Gene Expression Profiling Reveals New Potential Players of Gonad Differentiation in the Chicken Embryo

    Science.gov (United States)

    Carré, Gwenn-Aël; Couty, Isabelle; Hennequet-Antier, Christelle; Govoroun, Marina S.

    2011-01-01

    Background In birds as in mammals, a genetic switch determines whether the undifferentiated gonad develops into an ovary or a testis. However, understanding of the molecular pathway(s) involved in gonad differentiation is still incomplete. Methodology/Principal Findings With the aim of improving characterization of the molecular pathway(s) involved in gonad differentiation in the chicken embryo, we developed a large scale real time reverse transcription polymerase chain reaction approach on 110 selected genes for evaluation of their expression profiles during chicken gonad differentiation between days 5.5 and 19 of incubation. Hierarchical clustering analysis of the resulting datasets discriminated gene clusters expressed preferentially in the ovary or the testis, and/or at early or later periods of embryonic gonad development. Fitting a linear model and testing the comparisons of interest allowed the identification of new potential actors of gonad differentiation, such as Z-linked ADAMTS12, LOC427192 (corresponding to NIM1 protein) and CFC1, that are upregulated in the developing testis, and BMP3 and Z-linked ADAMTSL1, that are preferentially expressed in the developing ovary. Interestingly, the expression patterns of several members of the transforming growth factor β family were sexually dimorphic, with inhibin subunits upregulated in the testis, and bone morphogenetic protein subfamily members including BMP2, BMP3, BMP4 and BMP7, upregulated in the ovary. This study also highlighted several genes displaying asymmetric expression profiles such as GREM1 and BMP3 that are potentially involved in different aspects of gonad left-right asymmetry. Conclusion/Significance This study supports the overall conservation of vertebrate sex differentiation pathways but also reveals some particular feature of gene expression patterns during gonad development in the chicken. In particular, our study revealed new candidate genes which may be potential actors of chicken gonad

  11. A model of gene expression based on random dynamical systems reveals modularity properties of gene regulatory networks.

    Science.gov (United States)

    Antoneli, Fernando; Ferreira, Renata C; Briones, Marcelo R S

    2016-06-01

    Here we propose a new approach to modeling gene expression based on the theory of random dynamical systems (RDS) that provides a general coupling prescription between the nodes of any given regulatory network given the dynamics of each node is modeled by an RDS. The main virtues of this approach are the following: (i) it provides a natural way to obtain arbitrarily large networks by coupling together simple basic pieces, thus revealing the modularity of regulatory networks; (ii) the assumptions about the stochastic processes used in the modeling are fairly general, in the sense that the only requirement is stationarity; (iii) there is a well developed mathematical theory, which is a blend of smooth dynamical systems theory, ergodic theory and stochastic analysis that allows one to extract relevant dynamical and statistical information without solving the system; (iv) one may obtain the classical rate equations from the corresponding stochastic version by averaging the dynamic random variables (small noise limit). It is important to emphasize that unlike the deterministic case, where coupling two equations is a trivial matter, coupling two RDS is non-trivial, especially in our case, where the coupling is performed between a state variable of one gene and the switching stochastic process of another gene and, hence, it is not a priori true that the resulting coupled system will satisfy the definition of a random dynamical system. We shall provide the necessary arguments that ensure that our coupling prescription does indeed furnish a coupled regulatory network of random dynamical systems. Finally, the fact that classical rate equations are the small noise limit of our stochastic model ensures that any validation or prediction made on the basis of the classical theory is also a validation or prediction of our model. We illustrate our framework with some simple examples of single-gene systems and network motifs. Copyright © 2016 Elsevier Inc. All rights reserved.
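    Point (iv) above, recovering a classical rate equation by averaging a switching stochastic model, can be illustrated numerically. The sketch below is not the authors' construction: it assumes a generic single gene whose promoter flips between off and on as a stationary two-state Markov chain and whose product level follows a linear production and decay law. All function names and parameter values are hypothetical.

```python
import random


def simulate_gene(rng, k_on, k_off, synth, decay, dt, steps):
    """One realisation of a gene driven by a random on/off promoter."""
    p_on = k_on / (k_on + k_off)
    active = rng.random() < p_on  # start in the stationary distribution
    x = 0.0
    for _ in range(steps):
        # Euler step of dx/dt = synth * s(t) - decay * x, with s in {0, 1}.
        x += dt * (synth * active - decay * x)
        # Switch the promoter with probability (rate * dt) per step.
        if active:
            if rng.random() < k_off * dt:
                active = False
        elif rng.random() < k_on * dt:
            active = True
    return x


def rate_equation(k_on, k_off, synth, decay, dt, steps):
    """Deterministic counterpart: the switch is replaced by its mean."""
    p_on = k_on / (k_on + k_off)
    x = 0.0
    for _ in range(steps):
        x += dt * (synth * p_on - decay * x)
    return x


def ensemble_mean(n_runs, seed, **params):
    """Average the final product level over independent realisations."""
    rng = random.Random(seed)
    return sum(simulate_gene(rng, **params) for _ in range(n_runs)) / n_runs


PARAMS = dict(k_on=5.0, k_off=5.0, synth=10.0, decay=1.0, dt=0.01, steps=500)

averaged = ensemble_mean(1000, seed=0, **PARAMS)
deterministic = rate_equation(**PARAMS)
print(f"ensemble average: {averaged:.3f}")
print(f"rate equation:    {deterministic:.3f}")
```

    Because the product equation is linear in the switch variable and the switch is stationary, the ensemble average follows the rate equation up to sampling error. Individual trajectories still fluctuate, which is the information the averaged description discards.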

  12. ONSET: Automated foundational ontology selection and explanation

    CSIR Research Space (South Africa)

    Khan, Z

    2012-10-01

    Full Text Available It has been shown that using a foundational ontology for domain ontology development is beneficial in theory and practice. However, developers have difficulty with choosing the appropriate foundational ontology, and why. In order to solve...

  13. Human mast cell tryptase: Multiple cDNAs and genes reveal a multigene serine protease family

    International Nuclear Information System (INIS)

    Vanderslice, P.; Ballinger, S.M.; Tam, E.K.; Goldstein, S.M.; Craik, C.S.; Caughey, G.H.

    1990-01-01

    Three different cDNAs and a gene encoding human skin mast cell tryptase have been cloned and sequenced in their entirety. The deduced amino acid sequences reveal a 30-amino acid prepropeptide followed by a 245-amino acid catalytic domain. The C-terminal undecapeptide of the human preprosequence is identical in dog tryptase and appears to be part of a prosequence unique among serine proteases. The differences among the three human tryptase catalytic domains include the loss of a consensus N-glycosylation site in one cDNA, which may explain some of the heterogeneity in size and susceptibility to deglycosylation seen in tryptase preparations. All three tryptase cDNAs are distinct from a recently reported cDNA obtained from a human lung mast cell library. A skin tryptase cDNA was used to isolate a human tryptase gene, the exons of which match one of the skin-derived cDNAs. The organization of the ∼1.8-kilobase-pair tryptase gene is unique and is not closely related to that of any other mast cell or leukocyte serine protease. The 5' regulatory regions of the gene share features with those of other serine proteases, including mast cell chymase, but are unusual in being separated from the protein-coding sequence by an intron. High-stringency hybridization of a human genomic DNA blot with a fragment of the tryptase gene confirms the presence of multiple tryptase genes. These findings provide genetic evidence that human mast cell tryptases are the products of a multigene family

  14. Genomic characterisation of Wongabel virus reveals novel genes within the Rhabdoviridae.

    Science.gov (United States)

    Gubala, Aneta J; Proll, David F; Barnard, Ross T; Cowled, Chris J; Crameri, Sandra G; Hyatt, Alex D; Boyle, David B

    2008-06-20

    Viruses belonging to the family Rhabdoviridae infect a variety of different hosts, including insects, vertebrates and plants. Currently, there are approximately 200 ICTV-recognised rhabdoviruses isolated around the world. However, the majority remain poorly characterised and only a fraction have been definitively assigned to genera. The genomic and transcriptional complexity displayed by several of the characterised rhabdoviruses indicates large diversity and complexity within this family. To enable an improved taxonomic understanding of this family, it is necessary to gain further information about the poorly characterised members of this family. Here we present the complete genome sequence and predicted transcription strategy of Wongabel virus (WONV), a previously uncharacterised rhabdovirus isolated from biting midges (Culicoides austropalpalis) collected in northern Queensland, Australia. The 13,196 nucleotide genome of WONV encodes five typical rhabdovirus genes N, P, M, G and L. In addition, the WONV genome contains three genes located between the P and M genes (U1, U2, U3) and two open reading frames overlapping with the N and G genes (U4, U5). These five additional genes and their putative protein products appear to be novel, and their functions are unknown. Predictive analysis of the U5 gene product revealed characteristics typical of viroporins, and indicated structural similarities with the alpha-1 protein (putative viroporin) of viruses in the genus Ephemerovirus. Phylogenetic analyses of the N and G proteins of WONV indicated closest similarity with the avian-associated Flanders virus; however, the genomes of these two viruses are significantly diverged. WONV displays a novel and unique genome structure that has not previously been described for any animal rhabdovirus.

  15. Comparative transcriptome analysis reveals differentially expressed genes associated with sex expression in garden asparagus (Asparagus officinalis).

    Science.gov (United States)

    Li, Shu-Fen; Zhang, Guo-Jun; Zhang, Xue-Jin; Yuan, Jin-Hong; Deng, Chuan-Liang; Gao, Wu-Jun

    2017-08-22

    Garden asparagus (Asparagus officinalis) is a highly valuable vegetable crop of commercial and nutritional interest. It is also commonly used to investigate the mechanisms of sex determination and differentiation in plants. However, the sex expression mechanisms in asparagus remain poorly understood. De novo transcriptome sequencing via Illumina paired-end sequencing revealed more than 26 billion bases of high-quality sequence data from male and female asparagus flower buds. A total of 72,626 unigenes with an average length of 979 bp were assembled. In comparative transcriptome analysis, 4876 differentially expressed genes (DEGs) were identified in the possible sex-determining stage of female and male/supermale flower buds. Of these DEGs, 433, including 285 male/supermale-biased and 149 female-biased genes, were annotated as flower related. Of the male/supermale-biased flower-related genes, 102 were probably involved in anther development. In addition, 43 DEGs implicated in hormone response and biosynthesis putatively associated with sex expression and reproduction were discovered. Moreover, 128 transcription factor (TF)-related genes belonging to various families were found to be differentially expressed, and this finding implied the essential roles of TF in sex determination or differentiation in asparagus. Correlation analysis indicated that miRNA-DEG pairs were also implicated in asparagus sexual development. Our study identified a large number of DEGs involved in the sex expression and reproduction of asparagus, including known genes participating in plant reproduction, plant hormone signaling, TF encoding, and genes with unclear functions. We also found that miRNAs might be involved in the sex differentiation process. Our study could provide a valuable basis for further investigations on the regulatory networks of sex determination and differentiation in asparagus and facilitate further genetic and genomic studies on this dioecious species.

  16. Transcriptome and proteome data reveal candidate genes for pollinator attraction in sexually deceptive orchids.

    Science.gov (United States)

    Sedeek, Khalid E M; Qi, Weihong; Schauer, Monica A; Gupta, Alok K; Poveda, Lucy; Xu, Shuqing; Liu, Zhong-Jian; Grossniklaus, Ueli; Schiestl, Florian P; Schlüter, Philipp M

    2013-01-01

    Sexually deceptive orchids of the genus Ophrys mimic the mating signals of their pollinator females to attract males as pollinators. This mode of pollination is highly specific and leads to strong reproductive isolation between species. This study aims to identify candidate genes responsible for pollinator attraction and reproductive isolation between three closely related species, O. exaltata, O. sphegodes and O. garganica. Floral traits such as odour, colour and morphology are necessary for successful pollinator attraction. In particular, different odour hydrocarbon profiles have been linked to differences in specific pollinator attraction among these species. Therefore, the identification of genes involved in these traits is important for understanding the molecular basis of pollinator attraction by sexually deceptive orchids. We have created floral reference transcriptomes and proteomes for these three Ophrys species using a combination of next-generation sequencing (454 and Solexa), Sanger sequencing, and shotgun proteomics (tandem mass spectrometry). In total, 121 917 unique transcripts and 3531 proteins were identified. This represents the first orchid proteome and transcriptome from the orchid subfamily Orchidoideae. Proteome data revealed proteins corresponding to 2644 transcripts and 887 proteins not observed in the transcriptome. Candidate genes for hydrocarbon and anthocyanin biosynthesis were represented by 156 and 61 unique transcripts in 20 and 7 gene classes, respectively. Moreover, transcription factors putatively involved in the regulation of flower odour, colour and morphology were annotated, including Myb, MADS and TCP factors. Our comprehensive data set generated by combining transcriptome and proteome technologies allowed identification of candidate genes for pollinator attraction and reproductive isolation among sexually deceptive orchids. This includes genes for hydrocarbon and anthocyanin biosynthesis and regulation, and the development of

  17. Hierarchical clustering of breast cancer methylomes revealed differentially methylated and expressed breast cancer genes.

    Directory of Open Access Journals (Sweden)

    I-Hsuan Lin

    Oncogenic transformation of normal cells often involves epigenetic alterations, including histone modification and DNA methylation. We conducted whole-genome bisulfite sequencing to determine the DNA methylomes of normal breast, fibroadenoma, invasive ductal carcinomas and MCF7. The emergence, disappearance, expansion and contraction of kilobase-sized hypomethylated regions (HMRs) and the hypomethylation of the megabase-sized partially methylated domains (PMDs) are the major forms of methylation changes observed in breast tumor samples. Hierarchical clustering of HMRs revealed tumor-specific hypermethylated clusters and differentially methylated enhancers specific to normal or breast cancer cell lines. Joint analysis of gene expression and DNA methylation data of normal breast and breast cancer cells identified differentially methylated and expressed genes associated with breast and/or ovarian cancers in cancer-specific HMR clusters. Furthermore, aberrant patterns of X-chromosome inactivation (XCI) were found in breast cancer cell lines as well as breast tumor samples in the TCGA BRCA (breast invasive carcinoma) dataset. They were characterized by a differentially hypermethylated XIST promoter, reduced expression of XIST, and over-expression of hypomethylated X-linked genes. High expression of these genes was significantly associated with lower survival rates in breast cancer patients. Comprehensive analysis of the normal and breast tumor methylomes suggests selective targeting of DNA methylation changes during breast cancer progression. The weak causal relationship between DNA methylation and gene expression observed in this study is evidence of a more complex role of DNA methylation in the regulation of gene expression in human epigenetics that deserves further investigation.

  18. The Ontology for Biomedical Investigations.

    Science.gov (United States)

    Bandrowski, Anita; Brinkman, Ryan; Brochhausen, Mathias; Brush, Matthew H; Bug, Bill; Chibucos, Marcus C; Clancy, Kevin; Courtot, Mélanie; Derom, Dirk; Dumontier, Michel; Fan, Liju; Fostel, Jennifer; Fragoso, Gilberto; Gibson, Frank; Gonzalez-Beltran, Alejandra; Haendel, Melissa A; He, Yongqun; Heiskanen, Mervi; Hernandez-Boussard, Tina; Jensen, Mark; Lin, Yu; Lister, Allyson L; Lord, Phillip; Malone, James; Manduchi, Elisabetta; McGee, Monnie; Morrison, Norman; Overton, James A; Parkinson, Helen; Peters, Bjoern; Rocca-Serra, Philippe; Ruttenberg, Alan; Sansone, Susanna-Assunta; Scheuermann, Richard H; Schober, Daniel; Smith, Barry; Soldatova, Larisa N; Stoeckert, Christian J; Taylor, Chris F; Torniai, Carlo; Turner, Jessica A; Vita, Randi; Whetzel, Patricia L; Zheng, Jie

    2016-01-01

    The Ontology for Biomedical Investigations (OBI) is an ontology that provides terms with precisely defined meanings to describe all aspects of how investigations in the biological and medical domains are conducted. OBI re-uses ontologies that provide a representation of biomedical knowledge from the Open Biological and Biomedical Ontologies (OBO) project and adds the ability to describe how this knowledge was derived. We here describe the state of OBI and several applications that are using it, such as adding semantic expressivity to existing databases, building data entry forms, and enabling interoperability between knowledge resources. OBI covers all phases of the investigation process, such as planning, execution and reporting. It represents information and material entities that participate in these processes, as well as roles and functions. Prior to OBI, it was not possible to use a single internally consistent resource that could be applied to multiple types of experiments for these applications. OBI has made this possible by creating terms for entities involved in biological and medical investigations and by importing parts of other biomedical ontologies such as GO, Chemical Entities of Biological Interest (ChEBI) and Phenotype Attribute and Trait Ontology (PATO) without altering their meaning. OBI is being used in a wide range of projects covering genomics, multi-omics, immunology, and catalogs of services. OBI has also spawned other ontologies (Information Artifact Ontology) and methods for importing parts of ontologies (Minimum information to reference an external ontology term (MIREOT)). The OBI project is an open cross-disciplinary collaborative effort, encompassing multiple research communities from around the globe. To date, OBI has created 2366 classes and 40 relations along with textual and formal definitions. 
    The OBI Consortium maintains a web resource (http://obi-ontology.org) providing details on the people, policies, and issues being addressed

  19. Signalling pathways involved in adult heart formation revealed by gene expression profiling in Drosophila.

    Directory of Open Access Journals (Sweden)

    Bruno Zeitouni

    2007-10-01

    Drosophila provides a powerful system for defining the complex genetic programs that drive organogenesis. Under control of the steroid hormone ecdysone, the adult heart in Drosophila forms during metamorphosis by a remodelling of the larval cardiac organ. Here, we evaluated the extent to which transcriptional signatures revealed by genomic approaches can provide new insights into the molecular pathways that underlie heart organogenesis. Whole-genome expression profiling at eight successive time-points covering adult heart formation revealed a highly dynamic temporal map of gene expression through 13 transcript clusters with distinct expression kinetics. A functional atlas of the transcriptome profile strikingly points to the genomic transcriptional response of the ecdysone cascade, and a sharp regulation of key components belonging to a few evolutionarily conserved signalling pathways. A reverse genetic analysis provided evidence that these specific signalling pathways are involved in discrete steps of adult heart formation. In particular, the Wnt signalling pathway is shown to participate in inflow tract and cardiomyocyte differentiation, while activation of the PDGF-VEGF pathway is required for cardiac valve formation. Thus, a detailed temporal map of gene expression can reveal signalling pathways responsible for specific developmental programs and here provides substantial insight into heart formation.

  20. Circuit-wide Transcriptional Profiling Reveals Brain Region-Specific Gene Networks Regulating Depression Susceptibility.

    Science.gov (United States)

    Bagot, Rosemary C; Cates, Hannah M; Purushothaman, Immanuel; Lorsch, Zachary S; Walker, Deena M; Wang, Junshi; Huang, Xiaojie; Schlüter, Oliver M; Maze, Ian; Peña, Catherine J; Heller, Elizabeth A; Issler, Orna; Wang, Minghui; Song, Won-Min; Stein, Jason L; Liu, Xiaochuan; Doyle, Marie A; Scobie, Kimberly N; Sun, Hao Sheng; Neve, Rachael L; Geschwind, Daniel; Dong, Yan; Shen, Li; Zhang, Bin; Nestler, Eric J

    2016-06-01

    Depression is a complex, heterogeneous disorder and a leading contributor to the global burden of disease. Most previous research has focused on individual brain regions and genes contributing to depression. However, emerging evidence in humans and animal models suggests that dysregulated circuit function and gene expression across multiple brain regions drive depressive phenotypes. Here, we performed RNA sequencing on four brain regions from control animals and those susceptible or resilient to chronic social defeat stress at multiple time points. We employed an integrative network biology approach to identify transcriptional networks and key driver genes that regulate susceptibility to depressive-like symptoms. Further, we validated in vivo several key drivers and their associated transcriptional networks that regulate depression susceptibility and confirmed their functional significance at the levels of gene transcription, synaptic regulation, and behavior. Our study reveals novel transcriptional networks that control stress susceptibility and offers fundamentally new leads for antidepressant drug discovery. Copyright © 2016 Elsevier Inc. All rights reserved.

  1. Reliable and rapid characterization of functional FCN2 gene variants reveals diverse geographical patterns

    Directory of Open Access Journals (Sweden)

    Ojurongbe Olusola

    2012-05-01

    Background: Ficolin-2, encoded by the FCN2 gene, is a soluble serum protein and an innate immune recognition element of the complement system. FCN2 gene polymorphisms reveal distinct geographical patterns and are documented to alter serum ficolin levels and modulate disease susceptibility. Methods: We employed a real-time PCR method based on Fluorescence Resonance Energy Transfer (FRET) to genotype four functional SNPs, -986 G > A (#rs3124952), -602 G > A (#rs3124953), -4A > G (#rs17514136) and +6424 G > T (#rs7851696), in the ficolin-2 (FCN2) gene. We characterized the FCN2 variants in individuals representing Brazilian (n = 176), Nigerian (n = 180), Vietnamese (n = 172) and European Caucasian (n = 165) ethnicity. Results: We observed that the genotype distribution of three functional SNP variants (-986 G > A, -602 G > A and -4A > G) differs significantly between the populations investigated. Conclusions: The observed distribution of the FCN2 functional SNP variants may contribute to altered serum ficolin levels, and this may depend on the different disease settings in world populations. To conclude, the use of FRET-based real-time PCR, especially for the FCN2 gene, will benefit a larger scientific community that depends extensively on a rapid, reliable method for FCN2 genotyping.

  2. Meta-Analysis of Transcriptome Data Related to Hippocampus Biopsies and iPSC-Derived Neuronal Cells from Alzheimer's Disease Patients Reveals an Association with FOXA1 and FOXA2 Gene Regulatory Networks.

    Science.gov (United States)

    Wruck, Wasco; Schröter, Friederike; Adjaye, James

    2016-01-01

    Although the incidence of Alzheimer's disease (AD) is continuously increasing in the aging population worldwide, effective therapies are not available. The interplay between causative genetic and environmental factors is partially understood. Meta-analyses have been performed on aspects such as polymorphisms, cytokines, and cognitive training. Here, we propose a meta-analysis approach based on hierarchical clustering analysis of a reliable training set of hippocampus biopsies, which is condensed to a gene expression signature. This gene expression signature was applied to various test sets of brain biopsies and iPSC-derived neuronal cell models to demonstrate its ability to distinguish AD samples from control. Thus, our identified AD-gene signature may form the basis for determination of biomarkers that are urgently needed to overcome current diagnostic shortfalls. Intriguingly, the well-described AD-related genes APP and APOE are not within the signature because their gene expression profiles show a lower correlation to the disease phenotype than genes from the signature. This is in line with the differing characteristics of the disease as early-/late-onset or with/without genetic predisposition. To investigate the gene signature's systemic role(s), signaling pathways, gene ontologies, and transcription factors were analyzed which revealed over-representation of response to stress, regulation of cellular metabolic processes, and reactive oxygen species. Additionally, our results clearly point to an important role of FOXA1 and FOXA2 gene regulatory networks in the etiology of AD. This finding is in corroboration with the recently reported major role of the dopaminergic system in the development of AD and its regulation by FOXA1 and FOXA2.

  3. Analysis of clock-regulated genes in Neurospora reveals widespread posttranscriptional control of metabolic potential

    Science.gov (United States)

    Hurley, Jennifer M.; Dasgupta, Arko; Emerson, Jillian M.; Zhou, Xiaoying; Ringelberg, Carol S.; Knabe, Nicole; Lipzen, Anna M.; Lindquist, Erika A.; Daum, Christopher G.; Barry, Kerrie W.; Grigoriev, Igor V.; Smith, Kristina M.; Galagan, James E.; Bell-Pedersen, Deborah; Freitag, Michael; Cheng, Chao; Loros, Jennifer J.; Dunlap, Jay C.

    2014-01-01

    Neurospora crassa has been for decades a principal model for filamentous fungal genetics and physiology as well as for understanding the mechanism of circadian clocks. Eukaryotic fungal and animal clocks comprise transcription-translation–based feedback loops that control rhythmic transcription of a substantial fraction of these transcriptomes, yielding the changes in protein abundance that mediate circadian regulation of physiology and metabolism: Understanding circadian control of gene expression is key to understanding eukaryotic, including fungal, physiology. Indeed, the isolation of clock-controlled genes (ccgs) was pioneered in Neurospora where circadian output begins with binding of the core circadian transcription factor WCC to a subset of ccg promoters, including those of many transcription factors. High temporal resolution (2-h) sampling over 48 h using RNA sequencing (RNA-Seq) identified circadianly expressed genes in Neurospora, revealing that from ∼10% to as much as 40% of the transcriptome can be expressed under circadian control. Functional classifications of these genes revealed strong enrichment in pathways involving metabolism, protein synthesis, and stress responses; in broad terms, daytime metabolic potential favors catabolism, energy production, and precursor assembly, whereas night activities favor biosynthesis of cellular components and growth. Discriminative regular expression motif elicitation (DREME) identified key promoter motifs highly correlated with the temporal regulation of ccgs. Correlations between ccg abundance from RNA-Seq, the degree of ccg-promoter activation as reported by ccg-promoter–luciferase fusions, and binding of WCC as measured by ChIP-Seq, are not strong. Therefore, although circadian activation is critical to ccg rhythmicity, posttranscriptional regulation plays a major role in determining rhythmicity at the mRNA level. PMID:25362047

  4. Ontology Based Model Transformation Infrastructure

    NARCIS (Netherlands)

    Göknil, Arda; Topaloglu, N.Y.

    2005-01-01

    Using MDA in ontology development has been investigated in several works recently. The mappings and transformations between the UML constructs and the OWL elements to develop ontologies are the main concern of these research projects. We propose another approach in order to achieve the collaboration

  5. Ontology through a Mindfulness Process

    Science.gov (United States)

    Bearance, Deborah; Holmes, Kimberley

    2015-01-01

    Traditionally, when ontology is taught in a graduate studies course on social research, there is a tendency for this concept to be examined through the process of lectures and readings. Such an approach often leaves graduate students to grapple with a personal embodiment of this concept and to comprehend how ontology can ground their research.…

  6. The foundational ontology library ROMULUS

    CSIR Research Space (South Africa)

    Khan, ZC

    2013-09-01

    We present here a basic step in that direction with the Repository of Ontologies for MULtiple USes, ROMULUS, which is the first online library of machine-processable, modularised, aligned, and logic-based merged foundational ontologies. In addition...

  7. Tracking Changes during Ontology Evolution

    NARCIS (Netherlands)

    Noy, Natalya F.; Kunnatur, Sandhya; Klein, Michel; Musen, Mark A.

    2004-01-01

    As ontology development becomes a collaborative process, developers face the problem of maintaining versions of ontologies akin to maintaining versions of software code or versions of documents in large projects. Traditional versioning systems enable users to compare versions, examine changes, and

  8. Gene expression analysis after receptor tyrosine kinase activation reveals new potential melanoma proteins

    International Nuclear Information System (INIS)

    Teutschbein, Janka; Haydn, Johannes M; Samans, Birgit; Krause, Michael; Eilers, Martin; Schartl, Manfred; Meierjohann, Svenja

    2010-01-01

    Melanoma is an aggressive tumor with increasing incidence. To develop accurate prognostic markers and targeted therapies, changes leading to malignant transformation of melanocytes need to be understood. In the Xiphophorus melanoma model system, a mutated version of the EGF receptor Xmrk (Xiphophorus melanoma receptor kinase) triggers melanomagenesis. Cellular events downstream of Xmrk, such as the activation of Akt, Ras, B-Raf or Stat5, were also shown to play a role in human melanomagenesis. This makes the elucidation of Xmrk downstream targets a useful method for identifying processes involved in melanoma formation. Here, we analyzed Xmrk-induced gene expression using a microarray approach. Several highly expressed genes were confirmed by realtime PCR, and pathways responsible for their induction were revealed using small molecule inhibitors. The expression of these genes was also monitored in human melanoma cell lines, and the target gene FOSL1 was knocked down by siRNA. Proliferation and migration of siRNA-treated melanoma cell lines were then investigated. Genes with the strongest upregulation after receptor activation were FOS-like antigen 1 (Fosl1), early growth response 1 (Egr1), osteopontin (Opn), insulin-like growth factor binding protein 3 (Igfbp3), dual-specificity phosphatase 4 (Dusp4), and tumor-associated antigen L6 (Taal6). Interestingly, most genes were blocked in presence of a SRC kinase inhibitor. Importantly, we found that FOSL1, OPN, IGFBP3, DUSP4, and TAAL6 also exhibited increased expression levels in human melanoma cell lines compared to human melanocytes. Knockdown of FOSL1 in human melanoma cell lines reduced their proliferation and migration. Altogether, the data show that the receptor tyrosine kinase Xmrk is a useful tool in the identification of target genes that are commonly expressed in Xmrk-transgenic melanocytes and melanoma cell lines. The identified molecules constitute new possible molecular players in melanoma development

  9. Gene expression analysis after receptor tyrosine kinase activation reveals new potential melanoma proteins

    Directory of Open Access Journals (Sweden)

    Krause Michael

    2010-07-01

    Background: Melanoma is an aggressive tumor with increasing incidence. To develop accurate prognostic markers and targeted therapies, changes leading to malignant transformation of melanocytes need to be understood. In the Xiphophorus melanoma model system, a mutated version of the EGF receptor Xmrk (Xiphophorus melanoma receptor kinase) triggers melanomagenesis. Cellular events downstream of Xmrk, such as the activation of Akt, Ras, B-Raf or Stat5, were also shown to play a role in human melanomagenesis. This makes the elucidation of Xmrk downstream targets a useful method for identifying processes involved in melanoma formation. Methods: Here, we analyzed Xmrk-induced gene expression using a microarray approach. Several highly expressed genes were confirmed by realtime PCR, and pathways responsible for their induction were revealed using small molecule inhibitors. The expression of these genes was also monitored in human melanoma cell lines, and the target gene FOSL1 was knocked down by siRNA. Proliferation and migration of siRNA-treated melanoma cell lines were then investigated. Results: Genes with the strongest upregulation after receptor activation were FOS-like antigen 1 (Fosl1), early growth response 1 (Egr1), osteopontin (Opn), insulin-like growth factor binding protein 3 (Igfbp3), dual-specificity phosphatase 4 (Dusp4), and tumor-associated antigen L6 (Taal6). Interestingly, most genes were blocked in presence of a SRC kinase inhibitor. Importantly, we found that FOSL1, OPN, IGFBP3, DUSP4, and TAAL6 also exhibited increased expression levels in human melanoma cell lines compared to human melanocytes. Knockdown of FOSL1 in human melanoma cell lines reduced their proliferation and migration. Conclusion: Altogether, the data show that the receptor tyrosine kinase Xmrk is a useful tool in the identification of target genes that are commonly expressed in Xmrk-transgenic melanocytes and melanoma cell lines. The identified molecules constitute

  10. PDON: Parkinson's disease ontology for representation and modeling of the Parkinson's disease knowledge domain.

    Science.gov (United States)

    Younesi, Erfan; Malhotra, Ashutosh; Gündel, Michaela; Scordis, Phil; Kodamullil, Alpha Tom; Page, Matt; Müller, Bernd; Springstubbe, Stephan; Wüllner, Ullrich; Scheller, Dieter; Hofmann-Apitius, Martin

    2015-09-22

    Despite the unprecedented and increasing amount of data, relatively little progress has been made in molecular characterization of mechanisms underlying Parkinson's disease. In the area of Parkinson's research, there is a pressing need to integrate various pieces of information into a meaningful context of presumed disease mechanism(s). Disease ontologies provide a novel means for organizing, integrating, and standardizing the knowledge domains specific to disease in a compact, formalized and computer-readable form and serve as a reference for knowledge exchange or systems modeling of disease mechanism. The Parkinson's disease ontology was built according to the life cycle of ontology building. Structural, functional, and expert evaluation of the ontology was performed to ensure the quality and usability of the ontology. A novelty metric has been introduced to measure the gain of new knowledge using the ontology. Finally, a cause-and-effect model was built around PINK1 and two gene expression studies from the Gene Expression Omnibus database were re-annotated to demonstrate the usability of the ontology. The Parkinson's disease ontology with a subclass-based taxonomic hierarchy covers the broad spectrum of major biomedical concepts from molecular to clinical features of the disease, and also reflects different views on disease features held by molecular biologists, clinicians and drug developers. The current version of the ontology contains 632 concepts, which are organized under nine views. The structural evaluation showed the balanced dispersion of concept classes throughout the ontology. The functional evaluation demonstrated that the ontology-driven literature search could gain novel knowledge not present in the reference Parkinson's knowledge map. The ontology was able to answer specific questions related to Parkinson's when evaluated by experts. 
    Finally, the added value of the Parkinson's disease ontology is demonstrated by ontology-driven modeling of PINK1

  11. Network Analysis Reveals Putative Genes Affecting Meat Quality in Angus Cattle.

    Science.gov (United States)

    Mateescu, Raluca G; Garrick, Dorian J; Reecy, James M

    2017-01-01

    Improvements in eating satisfaction will benefit consumers and should increase beef demand, which is of interest to the beef industry. Tenderness, juiciness, and flavor are major determinants of the palatability of beef and are often used to reflect eating satisfaction. Carcass qualities are used as indicator traits for meat quality, with higher quality grade carcasses expected to relate to more tender and palatable meat. However, meat quality is a complex concept determined by many component traits, making genome-wide association studies (GWAS) on any one component challenging to interpret. Recent approaches combining traditional GWAS with gene network interactions theory could be more efficient in dissecting the genetic architecture of complex traits. Phenotypic measures of 23 traits reflecting carcass characteristics, components of meat quality, and mineral and peptide concentrations were used along with Illumina 54k bovine SNP genotypes to derive an annotated gene network associated with meat quality in 2,110 Angus beef cattle. The efficient mixed model association (EMMAX) approach in combination with a genomic relationship matrix was used to directly estimate the associations between 54k SNP genotypes and each of the 23 component traits. Genomic correlated regions were identified by partial correlations, which were further used along with an information theory algorithm to derive gene network clusters. Correlated SNP across 23 component traits were subjected to network scoring and visualization software to identify significant SNP. Significant pathways implicated in the meat quality complex through GO term enrichment analysis included angiogenesis, inflammation, transmembrane transporter activity, and receptor activity. These results suggest that network analysis using partial correlations and annotation of significant SNP can reveal the genetic architecture of complex traits and provide novel information regarding biological mechanisms

  12. Network analysis of ChIP-Seq data reveals key genes in prostate cancer.

    Science.gov (United States)

    Zhang, Yu; Huang, Zhen; Zhu, Zhiqiang; Liu, Jianwei; Zheng, Xin; Zhang, Yuhai

    2014-09-03

    Prostate cancer (PC) is the second most common cancer among men in the United States, and it imposes a considerable threat to human health. A deep understanding of its underlying molecular mechanisms is the premise for developing effective targeted therapies. Recently, deep transcriptional sequencing has been used as an effective genomic assay to obtain insights into diseases and may be helpful in the study of PC. In the present study, ChIP-Seq data for PC and normal samples were compared, and differential peaks were identified based upon fold changes (with P-values calculated with t-tests). Annotations of these peaks were performed. Protein-protein interaction (PPI) network analysis was performed with BioGRID and constructed with Cytoscape, following which the highly connected genes were screened. We obtained a total of 5,570 differential peaks, including 3,726 differentially enriched peaks in tumor samples and 1,844 differentially enriched peaks in normal samples. There were eight significant regions of the peaks. The intergenic region possessed the highest score (51%), followed by intronic (31%) and exonic (11%) regions. The analysis revealed the top 35 highly connected genes, which comprised 33 differential genes (such as YWHAQ, tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, θ polypeptide) from ChIP-Seq data and 2 differential genes retrieved from the PPI network: UBA52 (ubiquitin A-52 residue ribosomal protein fusion product 1) and SUMO2 (SMT3 suppressor of mif two 3 homolog 2). Our findings regarding potential PC-related genes increase the understanding of PC and provide direction for future research.
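
    The screening of "highly connected genes" in a PPI network can be illustrated by ranking nodes by their number of distinct interaction partners (degree). The following is a minimal sketch only: the abstract does not state the connectivity measure used, so plain degree is an assumption, and the edge list and the names GENE_A and GENE_B are hypothetical toy data rather than BioGRID content.

```python
from collections import defaultdict


def rank_by_degree(edges, top_n=35):
    """Rank nodes of an undirected graph by number of distinct neighbours.

    Ties are broken alphabetically so the result is deterministic.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)
    ranked = sorted(neighbours.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(gene, len(partners)) for gene, partners in ranked[:top_n]]


# Hypothetical toy interactions (not taken from BioGRID); the duplicate
# UBA52-SUMO2 pair is counted once because neighbours are stored in sets.
toy_edges = [
    ("UBA52", "SUMO2"),
    ("UBA52", "YWHAQ"),
    ("UBA52", "GENE_A"),
    ("SUMO2", "YWHAQ"),
    ("SUMO2", "GENE_B"),
    ("UBA52", "SUMO2"),
]

if __name__ == "__main__":
    for gene, degree in rank_by_degree(toy_edges, top_n=3):
        print(gene, degree)
```

    The default top_n=35 mirrors the "top 35" cut-off reported in the abstract; any other threshold could be passed instead.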

  13. Proteomic Analyses Reveal the Mechanism of Dunaliella salina Ds-26-16 Gene Enhancing Salt Tolerance in Escherichia coli.

    Directory of Open Access Journals (Sweden)

    Yanlong Wang

    We previously screened the novel gene Ds-26-16 from a 4 M salt-stressed Dunaliella salina cDNA library and discovered that this gene conferred salt tolerance to broad-spectrum organisms, including E. coli (Escherichia coli), Haematococcus pluvialis and tobacco. To determine the mechanism of this gene conferring salt tolerance, we studied the proteome of E. coli overexpressing the full-length cDNA of Ds-26-16 using the iTRAQ (isobaric tags for relative and absolute quantification) approach. A total of 1,610 proteins were identified, which comprised 39.4% of the whole proteome. Of the 559 differential proteins, 259 were up-regulated and 300 were down-regulated. GO (gene ontology) and KEGG (Kyoto encyclopedia of genes and genomes) enrichment analyses identified 202 major proteins, including those involved in amino acid and organic acid metabolism, energy metabolism, carbon metabolism, ROS (reactive oxygen species) scavenging, membrane proteins and ABC (ATP binding cassette) transporters, and peptidoglycan synthesis, as well as 5 up-regulated transcription factors. Our iTRAQ data suggest that Ds-26-16 up-regulates the transcription factors in E. coli to enhance salt resistance through osmotic balance, energy metabolism, and oxidative stress protection. Changes in the proteome were also observed in E. coli overexpressing the ORF (open reading frame) of Ds-26-16. Furthermore, pH, nitric oxide and glycerol content analyses indicated that Ds-26-16 overexpression increases nitric oxide content but has no effect on glycerol content, thus confirming that enhanced nitric oxide synthesis via lower intercellular pH was one of the mechanisms by which Ds-26-16 confers salt tolerance to E. coli.

  14. Identification of protein features encoded by alternative exons using Exon Ontology.

    Science.gov (United States)

    Tranchevent, Léon-Charles; Aubé, Fabien; Dulaurier, Louis; Benoit-Pilven, Clara; Rey, Amandine; Poret, Arnaud; Chautard, Emilie; Mortada, Hussein; Desmet, François-Olivier; Chakrama, Fatima Zahra; Moreno-Garcia, Maira Alejandra; Goillot, Evelyne; Janczarski, Stéphane; Mortreux, Franck; Bourgeois, Cyril F; Auboeuf, Didier

    2017-06-01

    Transcriptomic genome-wide analyses demonstrate massive variation of alternative splicing in many physiological and pathological situations. One major challenge is now to establish the biological contribution of alternative splicing variation in physiological- or pathological-associated cellular phenotypes. Toward this end, we developed a computational approach, named "Exon Ontology," based on terms corresponding to well-characterized protein features organized in an ontology tree. Exon Ontology is conceptually similar to Gene Ontology-based approaches but focuses on exon-encoded protein features instead of gene level functional annotations. Exon Ontology describes the protein features encoded by a selected list of exons and looks for potential Exon Ontology term enrichment. By applying this strategy to exons that are differentially spliced between epithelial and mesenchymal cells and after extensive experimental validation, we demonstrate that Exon Ontology provides support to discover specific protein features regulated by alternative splicing. We also show that Exon Ontology helps to unravel biological processes that depend on suites of coregulated alternative exons, as we uncovered a role of epithelial cell-enriched splicing factors in the AKT signaling pathway and of mesenchymal cell-enriched splicing factors in driving splicing events impacting on autophagy. Freely available on the web, Exon Ontology is the first computational resource that allows getting a quick insight into the protein features encoded by alternative exons and investigating whether coregulated exons contain the same biological information. © 2017 Tranchevent et al.; Published by Cold Spring Harbor Laboratory Press.

  15. Ontogeny of hepatic energy metabolism genes in mice as revealed by RNA-sequencing.

    Directory of Open Access Journals (Sweden)

    Helen J Renaud

    Full Text Available The liver plays a central role in metabolic homeostasis by coordinating synthesis, storage, breakdown, and redistribution of nutrients. Hepatic energy metabolism is dynamically regulated throughout different life stages due to different demands for energy during growth and development. However, changes in gene expression patterns throughout ontogeny for factors important in hepatic energy metabolism are not well understood. We performed detailed transcript analysis of energy metabolism genes during various stages of liver development in mice. Livers from male C57BL/6J mice were collected at twelve ages, including perinatal and postnatal time points (n = 3/age). The mRNA was quantified by RNA-Sequencing, with transcript abundance estimated by Cufflinks. One thousand sixty energy metabolism genes were examined; 794 were above detection, of which 627 were significantly changed during at least one developmental age compared to adult liver. Two-way hierarchical clustering revealed three major clusters dependent on age: GD17.5-Day 5 (perinatal-enriched), Day 10-Day 20 (pre-weaning-enriched), and Day 25-Day 60 (adolescence/adulthood-enriched). Clustering analysis of cumulative mRNA expression values for individual pathways of energy metabolism revealed three patterns of enrichment: glycolysis, ketogenesis, and glycogenesis were all perinatally-enriched; glycogenolysis was the only pathway enriched during pre-weaning ages; whereas lipid droplet metabolism, cholesterol and bile acid metabolism, gluconeogenesis, and lipid metabolism were all enriched in adolescence/adulthood. This study reveals novel findings such as the divergent expression of the fatty acid β-oxidation enzymes Acyl-CoA oxidase 1 and Carnitine palmitoyltransferase 1a, indicating a switch from mitochondrial to peroxisomal β-oxidation after weaning; as well as the dynamic ontogeny of genes implicated in obesity such as Stearoyl-CoA desaturase 1 and Elongation of very long chain fatty

  16. Comprehensive regional and temporal gene expression profiling of the rat brain during the first 24 h after experimental stroke identifies dynamic ischemia-induced gene expression patterns, and reveals a biphasic activation of genes in surviving tissue

    DEFF Research Database (Denmark)

    Rickhag, Karl Mattias; Wieloch, Tadeusz; Gidö, Gunilla

    2006-01-01

    middle cerebral artery occlusion in the rat. K-means cluster analysis revealed two distinct biphasic gene expression patterns that contained 44 genes (including 18 immediate early genes), involved in cell signaling and plasticity (i.e. MAP2K7, Sprouty2, Irs-2, Homer1, GPRC5B, Grasp). The first gene...

  17. Spatiotemporal network motif reveals the biological traits of developmental gene regulatory networks in Drosophila melanogaster

    Directory of Open Access Journals (Sweden)

    Kim Man-Sun

    2012-05-01

    Full Text Available Abstract Background Network motifs provided a “conceptual tool” for understanding the functional principles of biological networks, but such motifs have primarily been used to consider static network structures. Static networks, however, cannot be used to reveal time- and region-specific traits of biological systems. To overcome this limitation, we proposed the concept of a “spatiotemporal network motif,” a spatiotemporal sequence of network motifs of sub-networks which are active only at specific time points and body parts. Results On the basis of this concept, we analyzed the developmental gene regulatory network of the Drosophila melanogaster embryo. We identified spatiotemporal network motifs and investigated their distribution pattern in time and space. As a result, we found how key developmental processes are temporally and spatially regulated by the gene network. In particular, we found that nested feedback loops appeared frequently throughout the entire developmental process. From mathematical simulations, we found that mutual inhibition in the nested feedback loops contributes to the formation of spatial expression patterns. Conclusions Taken together, the proposed concept and the simulations can be used to unravel the design principle of developmental gene regulatory networks.

  18. Systems Nutrigenomics Reveals Brain Gene Networks Linking Metabolic and Brain Disorders.

    Science.gov (United States)

    Meng, Qingying; Ying, Zhe; Noble, Emily; Zhao, Yuqi; Agrawal, Rahul; Mikhail, Andrew; Zhuang, Yumei; Tyagi, Ethika; Zhang, Qing; Lee, Jae-Hyung; Morselli, Marco; Orozco, Luz; Guo, Weilong; Kilts, Tina M; Zhu, Jun; Zhang, Bin; Pellegrini, Matteo; Xiao, Xinshu; Young, Marian F; Gomez-Pinilla, Fernando; Yang, Xia

    2016-05-01

    Nutrition plays a significant role in the increasing prevalence of metabolic and brain disorders. Here we employ systems nutrigenomics to scrutinize the genomic bases of nutrient-host interaction underlying disease predisposition or therapeutic potential. We conducted transcriptome and epigenome sequencing of hypothalamus (metabolic control) and hippocampus (cognitive processing) from a rodent model of fructose consumption, and identified significant reprogramming of DNA methylation, transcript abundance, alternative splicing, and gene networks governing cell metabolism, cell communication, inflammation, and neuronal signaling. These signals converged with genetic causal risks of metabolic, neurological, and psychiatric disorders revealed in humans. Gene network modeling uncovered the extracellular matrix genes Bgn and Fmod as main orchestrators of the effects of fructose, as validated using two knockout mouse models. We further demonstrate that an omega-3 fatty acid, DHA, reverses the genomic and network perturbations elicited by fructose, providing molecular support for nutritional interventions to counteract diet-induced metabolic and brain disorders. Our integrative approach complementing rodent and human studies supports the applicability of nutrigenomics principles to predict disease susceptibility and to guide personalized medicine. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  19. Logic and Ontology

    Directory of Open Access Journals (Sweden)

    Newton C. A. da Costa

    2002-12-01

    Full Text Available In view of the present state of development of non-classical logic, especially of paraconsistent logic, a new stand regarding the relations between logic and ontology is defended. In a parody of a dictum of Quine, my stand may be summarized as follows: to be is to be the value of a variable in a specific language with a given underlying logic. Yet my stand differs from Quine’s, because, among other reasons, I accept some first-order heterodox logics as genuine alternatives to classical logic. I also discuss some questions of non-classical logic to substantiate my argument, and suggest that my position complements and extends some ideas advanced by L. Apostel.

  20. Cell-type independent MYC target genes reveal a primordial signature involved in biomass accumulation.

    Directory of Open Access Journals (Sweden)

    Hongkai Ji

    Full Text Available The functions of key oncogenic transcription factors independent of context have not been fully delineated despite our richer understanding of the genetic alterations in human cancers. The MYC oncogene, which produces the Myc transcription factor, is frequently altered in human cancer and is a major regulatory hub for many cancers. In this regard, we sought to unravel the primordial signature of Myc function by using high-throughput genomic approaches to identify the cell-type independent core Myc target gene signature. Using a model of human B lymphoma cells bearing inducible MYC, we identified a stringent set of direct Myc target genes via chromatin immunoprecipitation (ChIP), global nuclear run-on assay, and changes in mRNA levels. We also identified direct Myc targets in human embryonic stem cells (ESCs). We further document that a Myc core signature (MCS) set of target genes is shared in mouse and human ESCs as well as in four other human cancer cell types. Remarkably, the expression of the MCS correlates with MYC expression in a cell-type independent manner across 8,129 microarray samples, which include 312 cell and tissue types. Furthermore, the expression of the MCS is elevated in vivo in Eμ-Myc transgenic murine lymphoma cells as compared with premalignant or normal B lymphocytes. Expression of the MCS in human B cell lymphomas, acute leukemia, lung cancers or Ewing sarcomas has the highest correlation with MYC expression. Annotation of this gene signature reveals Myc's primordial function in RNA processing, ribosome biogenesis and biomass accumulation as its key roles in cancer and stem cells.

  1. Dynamic compression of chondrocyte-agarose constructs reveals new candidate mechanosensitive genes.

    Directory of Open Access Journals (Sweden)

    Carole Bougault

    Full Text Available Articular cartilage is physiologically exposed to repeated loads. The mechanical properties of cartilage are due to its extracellular matrix, and homeostasis is maintained by the sole cell type found in cartilage, the chondrocyte. Although mechanical forces clearly control the functions of articular chondrocytes, the biochemical pathways that mediate cellular responses to mechanical stress have not been fully characterised. The aim of our study was to examine early molecular events triggered by dynamic compression in chondrocytes. We used an experimental system consisting of primary mouse chondrocytes embedded within an agarose hydrogel; embedded cells were pre-cultured for one week and subjected to short-term compression experiments. Using Western blots, we demonstrated that chondrocytes maintain a differentiated phenotype in this model system and reproduce typical chondrocyte-cartilage matrix interactions. We investigated the impact of dynamic compression on the phosphorylation state of signalling molecules and genome-wide gene expression. After 15 min of dynamic compression, we observed transient activation of ERK1/2 and p38 (members of the mitogen-activated protein kinase (MAPK) pathways) and Smad2/3 (members of the canonical transforming growth factor (TGF)-β pathways). A microarray analysis performed on chondrocytes compressed for 30 min revealed that only 20 transcripts were modulated more than 2-fold. A less conservative list of 325 modulated genes included genes related to the MAPK and TGF-β pathways and/or known to be mechanosensitive in other biological contexts. Of these candidate mechanosensitive genes, 85% were down-regulated. Down-regulation may therefore represent a general control mechanism for a rapid response to dynamic compression. Furthermore, modulation of transcripts corresponding to different aspects of cellular physiology was observed, such as non-coding RNAs or primary cilium. This study provides new insight into how

  2. Building a developmental toxicity ontology.

    Science.gov (United States)

    Baker, Nancy; Boobis, Alan; Burgoon, Lyle; Carney, Edward; Currie, Richard; Fritsche, Ellen; Knudsen, Thomas; Laffont, Madeleine; Piersma, Aldert H; Poole, Alan; Schneider, Steffen; Daston, George

    2018-04-03

    As more information is generated about modes of action for developmental toxicity and more data are generated using high-throughput and high-content technologies, it is becoming necessary to organize that information. This report discusses the need for a systematic representation of knowledge about developmental toxicity (i.e., an ontology) and proposes a method to build one based on knowledge of developmental biology and mode of action/adverse outcome pathways in developmental toxicity. This report is the result of a consensus working group developing a plan to create an ontology for developmental toxicity that spans multiple levels of biological organization. This report provides a description of some of the challenges in building a developmental toxicity ontology and outlines a proposed methodology to meet those challenges. As the ontology is built on currently available web-based resources, a review of these resources is provided. Case studies on one of the most well-understood morphogens and developmental toxicants, retinoic acid, are presented as examples of how such an ontology might be developed. This report outlines an approach to construct a developmental toxicity ontology. Such an ontology will facilitate computer-based prediction of substances likely to induce human developmental toxicity. © 2018 Wiley Periodicals, Inc.

  3. Comprehensive transcriptional profiling of NaCl-stressed Arabidopsis roots reveals novel classes of responsive genes

    Directory of Open Access Journals (Sweden)

    Deyholos Michael K

    2006-10-01

    Full Text Available Abstract Background Roots are an attractive system for genomic and post-genomic studies of NaCl responses, due to their primary importance to agriculture, and because of their relative structural and biochemical simplicity. Excellent genomic resources have been established for the study of Arabidopsis roots; however, a comprehensive microarray analysis of the root transcriptome following NaCl exposure is required to further understand plant responses to abiotic stress and facilitate future, systems-based analyses of the underlying regulatory networks. Results We used microarrays of 70-mer oligonucleotide probes representing 23,686 Arabidopsis genes to identify root transcripts that changed in relative abundance following 6 h, 24 h, or 48 h of hydroponic exposure to 150 mM NaCl. Enrichment analysis identified groups of structurally or functionally related genes whose members were statistically over-represented among up- or down-regulated transcripts. Our results are consistent with generally observed stress response themes, and highlight potentially important roles for underappreciated gene families, including: several groups of transporters (e.g. MATE, LeOPT1-like); signalling molecules (e.g. PERK kinases, MLO-like receptors); carbohydrate active enzymes (e.g. XTH18); transcription factors (e.g. members of ZIM, WRKY, NAC); and other proteins (e.g. 4CL-like, COMT-like, LOB-Class 1). We verified the NaCl-inducible expression of selected transcription factors and other genes by qRT-PCR. Conclusion Microarray profiling of NaCl-treated Arabidopsis roots revealed dynamic changes in transcript abundance for at least 20% of the genome, including hundreds of transcription factors, kinases/phosphatases, hormone-related genes, and effectors of homeostasis, all of which highlight the complexity of this stress response. Our identification of these transcriptional responses, and groups of evolutionarily related genes with either similar or divergent

  4. Transcriptomic analyses reveal novel genes with sexually dimorphic expression in the zebrafish gonad and brain.

    Directory of Open Access Journals (Sweden)

    Rajini Sreenivasan

    Full Text Available BACKGROUND: Our knowledge on zebrafish reproduction is very limited. We generated a gonad-derived cDNA microarray from zebrafish and used it to analyze large-scale gene expression profiles in adult gonads and other organs. METHODOLOGY/PRINCIPAL FINDINGS: We have identified 116638 gonad-derived zebrafish expressed sequence tags (ESTs), 21% of which were isolated in our lab. Following in silico normalization, we constructed a gonad-derived microarray comprising 6370 unique, full-length cDNAs from differentiating and adult gonads. Labeled targets from adult gonad, brain, kidney and 'rest-of-body' from both sexes were hybridized onto the microarray. Our analyses revealed 1366, 881 and 656 differentially expressed transcripts (34.7% novel) that showed highest expression in ovary, testis and both gonads respectively. Hierarchical clustering showed correlation of the two gonadal transcriptomes and their similarities to those of the brains. In addition, we have identified 276 genes showing sexually dimorphic expression both between the brains and between the gonads. By in situ hybridization, we showed that the gonadal transcripts with the strongest array signal intensities were germline-expressed. We found that five members of the GTP-binding septin gene family, from which only one member (septin 4) has previously been implicated in reproduction in mice, were all strongly expressed in the gonads. CONCLUSIONS/SIGNIFICANCE: We have generated a gonad-derived zebrafish cDNA microarray and demonstrated its usefulness in identifying genes with sexually dimorphic co-expression in both the gonads and the brains. We have also provided the first evidence of large-scale differential gene expression between female and male brains of a teleost. Our microarray would be useful for studying gonad development, differentiation and function not only in zebrafish but also in related teleosts via cross-species hybridizations. Since several genes have been shown to play similar

  5. Polyploid genome of Camelina sativa revealed by isolation of fatty acid synthesis genes

    Directory of Open Access Journals (Sweden)

    Shewmaker Christine K

    2010-10-01

    Full Text Available Abstract Background Camelina sativa, an oilseed crop in the Brassicaceae family, has inspired renewed interest due to its potential for biofuels applications. Little is understood of the nature of the C. sativa genome, however. A study was undertaken to characterize two genes in the fatty acid biosynthesis pathway, fatty acid desaturase (FAD) 2 and fatty acid elongase (FAE) 1, which revealed unexpected complexity in the C. sativa genome. Results In C. sativa, Southern analysis indicates the presence of three copies of both FAD2 and FAE1 as well as LFY, a known single copy gene in other species. All three copies of both CsFAD2 and CsFAE1 are expressed in developing seeds, and sequence alignments show that previously described conserved sites are present, suggesting that all three copies of both genes could be functional. The regions downstream of CsFAD2 and upstream of CsFAE1 demonstrate co-linearity with the Arabidopsis genome. In addition, three expressed haplotypes were observed for six predicted single-copy genes in 454 sequencing analysis and results from flow cytometry indicate that the DNA content of C. sativa is approximately three-fold that of diploid Camelina relatives. Phylogenetic analyses further support a history of duplication and indicate that C. sativa and C. microcarpa might share a parental genome. Conclusions There is compelling evidence for triplication of the C. sativa genome, including a larger chromosome number and three-fold larger measured genome size than other Camelina relatives, three isolated copies of FAD2, FAE1, and the KCS17-FAE1 intergenic region, and three expressed haplotypes observed for six predicted single-copy genes. Based on these results, we propose that C. sativa be considered an allohexaploid. The characterization of fatty acid synthesis pathway genes will allow for the future manipulation of oil composition of this emerging biofuel crop; however, targeted manipulations of oil composition and general

  6. Mutations in THAP1/DYT6 reveal that diverse dystonia genes disrupt similar neuronal pathways and functions.

    Directory of Open Access Journals (Sweden)

    Zuchra Zakirova

    2018-01-01

    Full Text Available Dystonia is characterized by involuntary muscle contractions. Its many forms are genetically, phenotypically and etiologically diverse and it is unknown whether their pathogenesis converges on shared pathways. Mutations in THAP1 [THAP (Thanatos-associated protein) domain containing, apoptosis associated protein 1], a ubiquitously expressed transcription factor with DNA binding and protein-interaction domains, cause dystonia, DYT6. There is a unique, neuronal 50-kDa Thap1-like immunoreactive species, and Thap1 levels are auto-regulated on the mRNA level. However, THAP1 downstream targets in neurons, and the mechanism via which it causes dystonia are largely unknown. We used RNA-Seq to assay the in vivo effect of a heterozygote Thap1 C54Y or ΔExon2 allele on the gene transcription signatures in neonatal mouse striatum and cerebellum. Enriched pathways and gene ontology terms include eIF2α Signaling, Mitochondrial Dysfunction, Neuron Projection Development, Axonal Guidance Signaling, and Synaptic Long-Term Depression, which are dysregulated in a genotype and tissue-dependent manner. Electrophysiological and neurite outgrowth assays were consistent with those enrichments, and the plasticity defects were partially corrected by salubrinal. Notably, several of these pathways were recently implicated in other forms of inherited dystonia, including DYT1. We conclude that dysfunction of these pathways may represent a point of convergence in the pathophysiology of several forms of inherited dystonia.

  7. There is no quantum ontology without classical ontology

    Energy Technology Data Exchange (ETDEWEB)

    Fink, Helmut [Institut fuer Theoretische Physik, Univ. Erlangen-Nuernberg (Germany)

    2011-07-01

    The relation between quantum physics and classical physics is still under debate. In his recent book "Rational Reconstructions of Modern Physics", Peter Mittelstaedt explores a route from classical to quantum mechanics by reduction and elimination of (some of) the ontological hypotheses underlying classical mechanics. While, according to Mittelstaedt, classical mechanics describes a fictitious world that does not exist in reality, he claims to achieve a universal quantum ontology that can be improved by incorporating unsharp properties and equipped with Planck's constant without any need to refer to classical concepts. In this talk, we argue that quantum ontology in Mittelstaedt's sense is not enough. Quantum ontology can never be universal as long as the difference between potential and real properties is not represented adequately. Quantum properties are potential, not (yet) real, be they sharp or unsharp. Hence, preparation and measurement presuppose classical concepts, even in quantum theory. We end up with a classical-quantum sandwich ontology, which is still less extravagant than Bohmian or many-worlds ontologies are.

  8. Large scale gene expression meta-analysis reveals tissue-specific, sex-biased gene expression in humans

    Directory of Open Access Journals (Sweden)

    Benjamin Mayne

    2016-10-01

    Full Text Available The severity and prevalence of many diseases are known to differ between the sexes. Organ specific sex-biased gene expression may underpin these and other sexually dimorphic traits. To further our understanding of sex differences in transcriptional regulation, we performed meta-analyses of sex biased gene expression in multiple human tissues. We analysed 22 publicly available human gene expression microarray data sets including over 2500 samples from 15 different tissues and 9 different organs. Briefly, by using an inverse-variance method we determined the effect size difference of gene expression between males and females. We found the greatest sex differences in gene expression in the brain, specifically in the anterior cingulate cortex (1818 genes), followed by the heart (375 genes), kidney (224 genes), colon (218 genes) and thyroid (163 genes). More interestingly, we found different parts of the brain with varying numbers and identity of sex-biased genes, indicating that specific cortical regions may influence sexually dimorphic traits. The majority of sex-biased genes in other tissues such as the bladder, liver, lungs and pancreas were on the sex chromosomes or involved in sex hormone production. On average in each tissue, 32% of autosomal genes that were expressed in a sex-biased fashion contained androgen or estrogen hormone response elements. Interestingly, across all tissues, we found approximately two-thirds of autosomal genes that were sex-biased were not under direct influence of sex hormones. To our knowledge this is the largest analysis of sex-biased gene expression in human tissues to date. We identified many sex-biased genes that were not under the direct influence of sex chromosome genes or sex hormones. These may provide targets for future development of sex-specific treatments for diseases.
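
    The "inverse-variance method" named above can be sketched as a fixed-effect pooling of per-study effect sizes, where each study is weighted by the reciprocal of its variance. Whether the authors used a fixed-effect or random-effects model is not stated in the abstract, so the fixed-effect form here is an assumption; the function name `inverse_variance_pool` and the example numbers are hypothetical.

```python
import math


def inverse_variance_pool(effects, variances):
    """Fixed-effect inverse-variance pooling.

    effects:   per-study effect sizes (e.g. male vs. female difference)
    variances: per-study variances of those effect sizes (all > 0)
    Returns (pooled_effect, standard_error).
    """
    if len(effects) != len(variances) or not effects:
        raise ValueError("effects and variances must be non-empty and equal length")
    weights = [1.0 / v for v in variances]  # precise studies weigh more
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return pooled, standard_error


if __name__ == "__main__":
    # Two illustrative studies for one gene (made-up numbers).
    pooled, se = inverse_variance_pool([0.5, 0.3], [0.04, 0.01])
    print(f"pooled effect = {pooled:.3f}, SE = {se:.4f}")
```

    The second study has a quarter of the variance of the first, so it receives four times the weight and pulls the pooled estimate toward its own effect size.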

  9. development of ontological knowledge representation

    African Journals Online (AJOL)

    Preferred Customer

    ABSTRACT. This paper presents the development of an ontological knowledge organization and .... intelligence in order to facilitate knowledge sharing and reuse of acquired knowledge (15). Soon, ..... Water Chemistry, AJCE, 1(2), 50-58. 25.

  10. A Mobile Army of Ontologies

    DEFF Research Database (Denmark)

    Juul, Jesper

    2015-01-01

    Presentation at the Ludo-ontologies panel. Do we need ludo-ontologies, and what are they? In this event several scholars of games and videogames discuss these questions from a variety of perspectives. What different game and videogame ontologies exist and could exist, and why they are important...... for game and videogame research? The round table is designed to promote ludo-ontological dialogue in order to make these questions visible and debated. A series of short presentations (approximately 10 minutes each) will be followed by an intense debate through freeform dialogue. After the industrial...... commercialization of games and videogames their study has shifted between approaches focused on players (ludic processes) and artifacts (ludic objects). Some attempts to analyze the relationship between the process and the object have occasionally been done in terms of ‘ontology’ (Zagal 2005; Leino 2010; Gualeni...

  11. NDH expression marks major transitions in plant evolution and reveals coordinate intracellular gene loss.

    Science.gov (United States)

    Ruhlman, Tracey A; Chang, Wan-Jung; Chen, Jeremy J W; Huang, Yao-Ting; Chan, Ming-Tsair; Zhang, Jin; Liao, De-Chih; Blazier, John C; Jin, Xiaohua; Shih, Ming-Che; Jansen, Robert K; Lin, Choun-Sea

    2015-04-11

    Key innovations have facilitated novel niche utilization, such as the movement of the algal predecessors of land plants into terrestrial habitats where drastic fluctuations in light intensity, ultraviolet radiation and water limitation required a number of adaptations. The NDH (NADH dehydrogenase-like) complex of Viridiplantae plastids participates in adapting the photosynthetic response to environmental stress, suggesting its involvement in the transition to terrestrial habitats. Although relatively rare, the loss or pseudogenization of plastid NDH genes is widely distributed across diverse lineages of photoautotrophic seed plants and mutants/transgenics lacking NDH function demonstrate little difference from wild type under non-stressed conditions. This study analyzes large transcriptomic and genomic datasets to evaluate the persistence and loss of NDH expression across plants. Nuclear expression profiles showed accretion of the NDH gene complement at key transitions in land plant evolution, such as the transition to land and at the base of the angiosperm lineage. While detection of transcripts for a selection of non-NDH, photosynthesis related proteins was independent of the state of NDH, coordinate, lineage-specific loss of plastid NDH genes and expression of nuclear-encoded NDH subunits was documented in Pinaceae, gnetophytes, Orchidaceae and Geraniales confirming the independent and complete loss of NDH in these diverse seed plant taxa. The broad phylogenetic distribution of NDH loss and the subtle phenotypes of mutants suggest that the NDH complex is of limited biological significance in contemporary plants. While NDH activity appears dispensable under favorable conditions, there were likely sufficiently frequent episodes of abiotic stress affecting terrestrial habitats to allow the retention of NDH activity. These findings reveal genetic factors influencing plant/environment interactions in a changing climate through 450 million years of land plant

  12. Digital karyotyping reveals probable target genes at 7q21.3 locus in hepatocellular carcinoma

    Directory of Open Access Journals (Sweden)

    Wang Shengyue

    2011-07-01

    Full Text Available Abstract Background Hepatocellular carcinoma (HCC) is a worldwide malignant liver tumor with high incidence in China. Subchromosomal amplifications and deletions accounted for the major genomic alterations that occurred in HCC. Digital karyotyping was an effective method for analyzing genome-wide chromosomal aberrations at high resolution. Methods A digital karyotyping library of HCC was constructed and the 454 Genome Sequencer FLX System (Roche) was applied in large scale sequencing of the library. Digital Karyotyping Data Viewer software was used to analyze genomic amplifications and deletions. Genomic amplifications of genes detected by digital karyotyping were examined by real-time quantitative PCR. The mRNA expression levels of these genes in tumorous and paired nontumorous tissues were also detected by real-time quantitative RT-PCR. Results A total of 821,252 genomic tags were obtained from the digital karyotyping library of HCC, with 529,162 tags (64%) mapped to unique loci of the human genome. Multiple subchromosomal amplifications and deletions were detected through analyzing the digital karyotyping data, among which the amplification of 7q21.3 drew our special attention. Validation of genes harbored within amplicons at the 7q21.3 locus revealed that genomic amplification of SGCE, PEG10, DYNC1I1 and SLC25A13 occurred in 11 (21%), 11 (21%), 11 (21%) and 23 (44%) of the 52 HCC samples respectively. Furthermore, the mRNA expression levels of SGCE, PEG10 and DYNC1I1 were significantly up-regulated in tumorous liver tissues compared with corresponding nontumorous counterparts. Conclusions Our results indicated that the subchromosomal region of 7q21.3 was amplified in HCC, and SGCE, PEG10 and DYNC1I1 were probable protooncogenes located within the 7q21.3 locus.

  13. Comparative Genomics Reveals the Core Gene Toolbox for the Fungus-Insect Symbiosis

    Science.gov (United States)

    Stata, Matt; Wang, Wei; White, Merlin M.; Moncalvo, Jean-Marc

    2018-01-01

    ABSTRACT Modern genomics has shed light on many entomopathogenic fungi and expanded our knowledge widely; however, little is known about the genomic features of the insect-commensal fungi. Harpellales are obligate commensals living in the digestive tracts of disease-bearing insects (black flies, midges, and mosquitoes). In this study, we produced and annotated whole-genome sequences of nine Harpellales taxa and conducted the first comparative analyses to infer the genomic diversity within the members of the Harpellales. The genomes of the insect gut fungi feature low (26% to 37%) GC content and large genome size variations (25 to 102 Mb). Further comparisons with insect-pathogenic fungi (from both Ascomycota and Zoopagomycota), as well as with free-living relatives (as negative controls), helped to identify a gene toolbox that is essential to the fungus-insect symbiosis. The results not only narrow the genomic scope of fungus-insect interactions from several thousands to eight core players but also distinguish host invasion strategies employed by insect pathogens and commensals. The genomic content suggests that insect commensal fungi rely mostly on adhesion protein anchors that target digestive system, while entomopathogenic fungi have higher numbers of transmembrane helices, signal peptides, and pathogen-host interaction (PHI) genes across the whole genome and enrich genes as well as functional domains to inactivate the host inflammation system and suppress the host defense. Phylogenomic analyses have revealed that genome sizes of Harpellales fungi vary among lineages with an integer-multiple pattern, which implies that ancient genome duplications may have occurred within the gut of insects. PMID:29764946

  14. Novel gene function revealed by mouse mutagenesis screens for models of age-related disease.

    Science.gov (United States)

    Potter, Paul K; Bowl, Michael R; Jeyarajan, Prashanthini; Wisby, Laura; Blease, Andrew; Goldsworthy, Michelle E; Simon, Michelle M; Greenaway, Simon; Michel, Vincent; Barnard, Alun; Aguilar, Carlos; Agnew, Thomas; Banks, Gareth; Blake, Andrew; Chessum, Lauren; Dorning, Joanne; Falcone, Sara; Goosey, Laurence; Harris, Shelley; Haynes, Andy; Heise, Ines; Hillier, Rosie; Hough, Tertius; Hoslin, Angela; Hutchison, Marie; King, Ruairidh; Kumar, Saumya; Lad, Heena V; Law, Gemma; MacLaren, Robert E; Morse, Susan; Nicol, Thomas; Parker, Andrew; Pickford, Karen; Sethi, Siddharth; Starbuck, Becky; Stelma, Femke; Cheeseman, Michael; Cross, Sally H; Foster, Russell G; Jackson, Ian J; Peirson, Stuart N; Thakker, Rajesh V; Vincent, Tonia; Scudamore, Cheryl; Wells, Sara; El-Amraoui, Aziz; Petit, Christine; Acevedo-Arozena, Abraham; Nolan, Patrick M; Cox, Roger; Mallon, Anne-Marie; Brown, Steve D M

    2016-08-18

    Determining the genetic bases of age-related disease remains a major challenge requiring a spectrum of approaches from human and clinical genetics to the utilization of model organism studies. Here we report a large-scale genetic screen in mice employing a phenotype-driven discovery platform to identify mutations resulting in age-related disease, both late-onset and progressive. We have utilized N-ethyl-N-nitrosourea mutagenesis to generate pedigrees of mutagenized mice that were subject to recurrent screens for mutant phenotypes as the mice aged. In total, we identify 105 distinct mutant lines from 157 pedigrees analysed, out of which 27 are late-onset phenotypes across a range of physiological systems. Using whole-genome sequencing we uncover the underlying genes for 44 of these mutant phenotypes, including 12 late-onset phenotypes. These genes reveal a number of novel pathways involved with age-related disease. We illustrate our findings by the recovery and characterization of a novel mouse model of age-related hearing loss.

  15. Transcriptomics reveal several gene expression patterns in the piezophile Desulfovibrio hydrothermalis in response to hydrostatic pressure.

    Directory of Open Access Journals (Sweden)

    Amira Amrani

    Full Text Available RNA-seq was used to study the response of Desulfovibrio hydrothermalis, isolated from a deep-sea hydrothermal chimney on the East-Pacific Rise at a depth of 2,600 m, to various hydrostatic pressure growth conditions. The transcriptomic datasets obtained after growth at 26, 10 and 0.1 MPa identified only 65 differentially expressed genes that were distributed among four main categories: aromatic amino acid and glutamate metabolisms, energy metabolism, signal transduction, and unknown function. The gene expression patterns suggest that D. hydrothermalis uses at least three different adaptation mechanisms, according to a hydrostatic pressure threshold (HPt) that was estimated to be above 10 MPa. Both glutamate and energy metabolism were found to play crucial roles in these mechanisms. Quantitation of the glutamate levels in cells revealed its accumulation at high hydrostatic pressure, suggesting its role as a piezolyte. ATP measurements showed that the energy metabolism of this bacterium is optimized for deep-sea life conditions. This study provides new insights into the molecular mechanisms linked to hydrostatic pressure adaptation in sulfate-reducing bacteria.

  16. Separable roles of UFO during floral development revealed by conditional restoration of gene function.

    Science.gov (United States)

    Laufs, Patrick; Coen, Enrico; Kronenberger, Jocelyne; Traas, Jan; Doonan, John

    2003-02-01

    The UNUSUAL FLORAL ORGANS (UFO) gene is required for several aspects of floral development in Arabidopsis including specification of organ identity in the second and third whorls and the proper pattern of primordium initiation in the inner three whorls. UFO is expressed in a dynamic pattern during the early phases of flower development. Here we dissect the role of UFO by ubiquitously expressing it in ufo loss-of-function flowers at different developmental stages and for various durations using an ethanol-inducible expression system. The previously known functions of UFO could be separated and related to its expression at specific stages of development. We show that a 24- to 48-hour period of UFO expression from floral stage 2, before any floral organs are visible, is sufficient to restore normal petal and stamen development. The earliest requirement for UFO is during stage 2, when the endogenous UFO gene is transiently expressed in the centre of the wild-type flower and is required to specify the initiation patterns of petal, stamen and carpel primordia. Petal and stamen identity is determined during stages 2 or 3, when UFO is normally expressed in the presumptive second and third whorl. Although endogenous UFO expression is absent from the stamen whorl from stage 4 onwards, stamen identity can be restored by UFO activation up to stage 6. We also observed floral phenotypes not observed in loss-of-function or constitutive gain-of-function backgrounds, revealing additional roles of UFO in outgrowth of petal primordia.

  17. Analyses of soil microbial community compositions and functional genes reveal potential consequences of natural forest succession.

    Science.gov (United States)

    Cong, Jing; Yang, Yunfeng; Liu, Xueduan; Lu, Hui; Liu, Xiao; Zhou, Jizhong; Li, Diqiang; Yin, Huaqun; Ding, Junjun; Zhang, Yuguang

    2015-05-06

    The succession of microbial community structure and function is a central ecological topic, as microbes drive the Earth's biogeochemical cycles. To elucidate the response and mechanistic underpinnings of soil microbial community structure and metabolic potential relevant to natural forest succession, we compared soil microbial communities from three adjacent natural forests: a coniferous forest (CF), a mixed broadleaf forest (MBF) and a deciduous broadleaf forest (DBF) on Shennongjia Mountain in central China. In contrast to plant communities, the microbial taxonomic diversity of the DBF was significantly (P the DBF. Furthermore, a network analysis of microbial carbon and nitrogen cycling genes showed that the network for the DBF samples was relatively large and tight, revealing strong couplings between microbes. Soil temperature, reflective of climate regimes, was important in shaping microbial communities at both taxonomic and functional gene levels. As a first glimpse of both the taxonomic and functional compositions of soil microbial communities, our results suggest that microbial community structure and functional potential will be altered by future environmental changes, which has implications for forest succession.

  18. Transcriptomics Reveal Several Gene Expression Patterns in the Piezophile Desulfovibrio hydrothermalis in Response to Hydrostatic Pressure

    Science.gov (United States)

    Amrani, Amira; Bergon, Aurélie; Holota, Hélène; Tamburini, Christian; Garel, Marc; Ollivier, Bernard; Imbert, Jean; Dolla, Alain; Pradel, Nathalie

    2014-01-01

    RNA-seq was used to study the response of Desulfovibrio hydrothermalis, isolated from a deep-sea hydrothermal chimney on the East-Pacific Rise at a depth of 2,600 m, to various hydrostatic pressure growth conditions. The transcriptomic datasets obtained after growth at 26, 10 and 0.1 MPa identified only 65 differentially expressed genes that were distributed among four main categories: aromatic amino acid and glutamate metabolisms, energy metabolism, signal transduction, and unknown function. The gene expression patterns suggest that D. hydrothermalis uses at least three different adaptation mechanisms, according to a hydrostatic pressure threshold (HPt) that was estimated to be above 10 MPa. Both glutamate and energy metabolism were found to play crucial roles in these mechanisms. Quantitation of the glutamate levels in cells revealed its accumulation at high hydrostatic pressure, suggesting its role as a piezolyte. ATP measurements showed that the energy metabolism of this bacterium is optimized for deep-sea life conditions. This study provides new insights into the molecular mechanisms linked to hydrostatic pressure adaptation in sulfate-reducing bacteria. PMID:25215865

  19. In vivo RNAi screen reveals neddylation genes as novel regulators of Hedgehog signaling.

    Directory of Open Access Journals (Sweden)

    Juan Du

    Full Text Available Hedgehog (Hh) signaling is highly conserved in all metazoan animals and plays critical roles in many developmental processes. Dysregulation of the Hh signaling cascade has been implicated in many diseases, including cancer. Although key components of the Hh pathway have been identified, significant gaps remain in our understanding of the regulation of individual Hh signaling molecules. Here, we report the identification of novel regulators of the Hh pathway, obtained from an in vivo RNA interference (RNAi) screen in Drosophila. By selectively targeting critical genes functioning in post-translational modification systems utilizing ubiquitin (Ub) and Ub-like proteins, we identify two novel genes (dUba3 and dUbc12) that negatively regulate Hh signaling activity. We provide in vivo and in vitro evidence illustrating that dUba3 and dUbc12 are essential components of the neddylation pathway; they function in an enzyme cascade to conjugate the ubiquitin-like NEDD8 modifier to Cullin proteins. Neddylation activates the Cullin-containing ubiquitin ligase complex, which in turn promotes the degradation of Cubitus interruptus (Ci), the downstream transcription factor of the Hh pathway. Our study reveals a conserved molecular mechanism of the neddylation pathway in Drosophila and sheds light on the complex post-translational regulations in Hh signaling.

  20. Whole genome transcript profiling of drug induced steatosis in rats reveals a gene signature predictive of outcome.

    Directory of Open Access Journals (Sweden)

    Nishika Sahini

    Full Text Available Drug induced steatosis (DIS) is characterised by excess triglyceride accumulation in the form of lipid droplets (LD) in liver cells. To explore mechanisms underlying DIS we interrogated the publicly available microarray data from the Japanese Toxicogenomics Project (TGP) to study comprehensively whole genome gene expression changes in the liver of treated rats. For this purpose a total of 17 and 12 drugs which are diverse in molecular structure and mode of action were considered based on their ability to cause either steatosis or phospholipidosis, respectively, while 7 drugs served as negative controls. In our efforts we focused on 200 genes which are considered to be mechanistically relevant in the process of lipid droplet biogenesis in hepatocytes as recently published (Sahini and Borlak, 2014). Based on mechanistic considerations we identified 19 genes which displayed dose dependent responses while 10 genes showed time dependency. Importantly, the present study defined 9 genes (ANGPTL4, FABP7, FADS1, FGF21, GOT1, LDLR, GK, STAT3, and PKLR) as signature genes to predict DIS. Moreover, cross tabulation revealed 9 genes to be regulated ≥10 times amongst the various conditions and included genes linked to glucose metabolism, lipid transport and lipogenesis as well as signalling events. Additionally, a comparison between drugs causing phospholipidosis and/or steatosis revealed 26 genes to be regulated in common including 4 signature genes to predict DIS (PKLR, GK, FABP7 and FADS1). Furthermore, a comparison between in vivo single dose (3, 6, 9 and 24 h) and findings from rat hepatocyte studies (2 h, 8 h, 24 h) identified 10 genes which are regulated in common and contained 2 DIS signature genes (FABP7, FGF21). Altogether, our studies provide comprehensive information on mechanistically linked gene expression changes of a range of drugs causing steatosis and phospholipidosis and encourage the screening of DIS signature genes at the preclinical stage.
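    The overlaps among the gene lists named in this record can be expressed as set operations. The sketch below uses only the gene symbols given in the abstract; the variable names are illustrative, and the full 26-gene and 10-gene comparison lists are not given in the record, so only their signature-gene subsets appear.

```python
# Gene symbols as listed in the abstract above; variable names are illustrative.
signature = {
    "ANGPTL4", "FABP7", "FADS1", "FGF21", "GOT1",
    "LDLR", "GK", "STAT3", "PKLR",
}
# Signature genes among the 26 genes shared by phospholipidosis/steatosis drugs.
shared_with_phospholipidosis = {"PKLR", "GK", "FABP7", "FADS1"}
# Signature genes among the 10 genes shared with the rat hepatocyte studies.
shared_with_hepatocytes = {"FABP7", "FGF21"}

# Both subsets should be contained in the nine-gene signature.
subsets_consistent = (
    shared_with_phospholipidosis <= signature
    and shared_with_hepatocytes <= signature
)

# Signature genes that recur in both comparisons.
in_both = shared_with_phospholipidosis & shared_with_hepatocytes
# Signature genes not highlighted in either comparison.
in_neither = signature - (shared_with_phospholipidosis | shared_with_hepatocytes)

print("subsets consistent:", subsets_consistent)
print("in both comparisons:", sorted(in_both))
print("in neither comparison:", sorted(in_neither))
```

    Under this reading of the abstract, FABP7 is the only signature gene named in both comparisons.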

  1. Building a Chemical Ontology using Methontology and the Ontology Design Environment

    OpenAIRE

    Fernández López, Mariano; Gómez-Pérez, A.; Pazos Sierra, Alejandro; Pazos Sierra, Juan

    1999-01-01

    Methontology provides guidelines for specifying ontologies at the knowledge level, as a specification of a conceptualization. ODE enables ontology construction, covering the entire life cycle and automatically implementing ontologies

  2. Aligning ontologies and integrating textual evidence for pathway analysis of microarray data

    Energy Technology Data Exchange (ETDEWEB)

    Gopalan, Banu; Posse, Christian; Sanfilippo, Antonio P.; Stenzel-Poore, Mary; Stevens, S.L.; Castano, Jose; Beagley, Nathaniel; Riensche, Roderick M.; Baddeley, Bob; Simon, R.P.; Pustejovsky, James

    2006-10-08

    Expression arrays are introducing a paradigmatic change in biology by shifting experimental approaches from single gene studies to genome-level analysis, monitoring the expression levels of several thousands of genes in parallel. The massive amounts of data obtained from microarrays need to be integrated and interpreted to infer biological meaning within the context of information-rich pathways. In this paper, we present a methodology that integrates textual information with annotations from cross-referenced ontologies to map genes to pathways in a semi-automated way. We illustrate this approach and compare it favorably to other tools by analyzing the gene expression changes underlying the biological phenomena related to stroke. Stroke is the third leading cause of death and a major disabler in the United States. Through years of study, researchers have amassed a significant knowledge base about stroke, and this knowledge, coupled with new technologies, is providing a wealth of new scientific opportunities. The potential for neuroprotective stroke therapy is enormous. However, the roles of neurogenesis, angiogenesis, and other proliferative responses in the recovery process following ischemia and the molecular mechanisms that lead to these processes still need to be uncovered. Improved annotation of genomic and proteomic data, including annotation of pathways in which genes and proteins are involved, is required to facilitate their interpretation and clinical application. While our approach is not aimed at replacing existing curated pathway databases, it reveals multiple hidden relationships that are not evident with the way these databases analyze functional groupings of genes from the Gene Ontology.

  3. DNA Methylation and Gene Expression Profiling of Ewing Sarcoma Primary Tumors Reveal Genes That Are Potential Targets of Epigenetic Inactivation

    Directory of Open Access Journals (Sweden)

    Nikul Patel

    2012-01-01

    Full Text Available The role of aberrant DNA methylation in Ewing sarcoma is not completely understood. The methylation status of 503 genes in 52 formalin-fixed paraffin-embedded EWS tumors and 3 EWS cell lines was compared to human mesenchymal stem cell primary cultures (hMSCs) using bead chip methylation analysis. Relative expression of methylated genes was assessed in 5-Aza-2-deoxycytidine (5-AZA)-treated EWS cell lines and in a cohort of primary EWS samples and hMSCs by gene expression and quantitative RT-PCR. 129 genes demonstrated statistically significant hypermethylation in EWS tumors compared to hMSCs. Thirty-six genes were profoundly methylated in EWS and unmethylated in hMSCs. 5-AZA treatment of EWS cell lines resulted in upregulation of expression of hundreds of genes including 162 that were increased by at least 2-fold. The expression of 19 of 36 candidate hypermethylated genes was increased following 5-AZA. Analysis of gene expression from an independent cohort of tumors confirmed decreased expression of six of nineteen hypermethylated genes (AXL, COL1A1, CYP1B1, LYN, SERPINE1, and VCAN). Comparing gene expression and DNA methylation analyses proved to be an effective way to identify genes epigenetically regulated in EWS. Further investigation is ongoing to elucidate the role of these epigenetic alterations in EWS pathogenesis.

  4. Comparative and functional triatomine genomics reveals reductions and expansions in insecticide resistance-related gene families.

    Science.gov (United States)

    Traverso, Lucila; Lavore, Andrés; Sierra, Ivana; Palacio, Victorio; Martinez-Barnetche, Jesús; Latorre-Estivalis, José Manuel; Mougabure-Cueto, Gaston; Francini, Flavio; Lorenzo, Marcelo G; Rodríguez, Mario Henry; Ons, Sheila; Rivera-Pomar, Rolando V

    2017-02-01

    Triatomine insects are vectors of Trypanosoma cruzi, a protozoan parasite that is the causative agent of Chagas' disease. This is a neglected disease affecting approximately 8 million people in Latin America. The existence of diverse pyrethroid resistant populations of at least two species demonstrates the potential of triatomines to develop high levels of insecticide resistance. Therefore, the incorporation of strategies for resistance management is a main concern for vector control programs. Three enzymatic superfamilies are thought to mediate xenobiotic detoxification and resistance: Glutathione Transferases (GSTs), Cytochromes P450 (CYPs) and Carboxyl/Cholinesterases (CCEs). Improving our knowledge of key triatomine detoxification enzymes will strengthen our understanding of insecticide resistance processes in vectors of Chagas' disease. The discovery and description of detoxification gene superfamilies in normalized transcriptomes of three triatomine species: Triatoma dimidiata, Triatoma infestans and Triatoma pallidipennis is presented. Furthermore, a comparative analysis of these superfamilies among the triatomine transcriptomes and the genome of Rhodnius prolixus, also a triatomine vector of Chagas' disease, and other well-studied insect genomes was performed. The expression pattern of detoxification genes in R. prolixus transcriptomes from key organs was analyzed. The comparisons reveal gene expansions in Sigma class GSTs, CYP3 in CYP superfamily and clade E in CCE superfamily. Moreover, several CYP families identified in these triatomines have not yet been described in other insects. Conversely, several groups of insecticide resistance related enzymes within each enzyme superfamily are reduced or lacking in triatomines. Furthermore, our qRT-PCR results showed an increase in the expression of a CYP4 gene in a T. infestans population resistant to pyrethroids. These results could point to an involvement of metabolic detoxification mechanisms on the high

  5. Antennal and Abdominal Transcriptomes Reveal Chemosensory Genes in the Asian Citrus Psyllid, Diaphorina citri.

    Science.gov (United States)

    Wu, Zhongzhen; Zhang, He; Bin, Shuying; Chen, Lei; Han, Qunxin; Lin, Jintian

    2016-01-01

    The Asian citrus psyllid, Diaphorina citri is the principal vector of the highly destructive citrus disease called Huanglongbing (HLB) or citrus greening, which is a major threat to citrus cultivation worldwide. More effective pest control strategies against this pest entail the identification of potential chemosensory proteins that could be used in the development of attractants or repellents. However, the molecular basis of olfaction in the Asian citrus psyllid is not completely understood. Therefore, we performed this study to analyze the antennal and abdominal transcriptome of the Asian citrus psyllid. We identified a large number of transcripts belonging to nine chemoreception-related gene families and compared their expression in male and female adult antennae and terminal abdomen. In total, 9 odorant binding proteins (OBPs), 12 chemosensory proteins (CSPs), 46 odorant receptors (ORs), 20 gustatory receptors (GRs), 35 ionotropic receptors (IRs), 4 sensory neuron membrane proteins (SNMPs) and 4 different gene families encoding odorant-degrading enzymes (ODEs): 80 cytochrome P450s (CYPs), 12 esterase (ESTs), and 5 aldehyde dehydrogenases (ADE) were annotated in the D. citri antennal and abdominal transcriptomes. Our results revealed that a large proportion of chemosensory genes exhibited no distinct differences in their expression patterns in the antennae and terminal abdominal tissues. Notably, RNA sequencing (RNA-seq) data and quantitative real time-PCR (qPCR) analyses showed that 4 DictOBPs, 4 DictCSPs, 4 DictIRs, 1 DictSNMP, and 2 DictCYPs were upregulated in the antennae relative to that in terminal abdominal tissues. Furthermore, 2 DictOBPs (DictOBP8 and DictOBP9), 2 DictCSPs (DictOBP8 and DictOBP12), 4 DictIRs (DictIR3, DictIR6, DictIR10, and DictIR35), and 1 DictCYP (DictCYP57) were expressed at higher levels in the male antennae than in the female antennae. 
    Our study provides the first insights into the molecular basis of chemoreception in this insect

  6. Antennal and Abdominal Transcriptomes Reveal Chemosensory Genes in the Asian Citrus Psyllid, Diaphorina citri.

    Directory of Open Access Journals (Sweden)

    Zhongzhen Wu

    Full Text Available The Asian citrus psyllid, Diaphorina citri is the principal vector of the highly destructive citrus disease called Huanglongbing (HLB) or citrus greening, which is a major threat to citrus cultivation worldwide. More effective pest control strategies against this pest entail the identification of potential chemosensory proteins that could be used in the development of attractants or repellents. However, the molecular basis of olfaction in the Asian citrus psyllid is not completely understood. Therefore, we performed this study to analyze the antennal and abdominal transcriptome of the Asian citrus psyllid. We identified a large number of transcripts belonging to nine chemoreception-related gene families and compared their expression in male and female adult antennae and terminal abdomen. In total, 9 odorant binding proteins (OBPs), 12 chemosensory proteins (CSPs), 46 odorant receptors (ORs), 20 gustatory receptors (GRs), 35 ionotropic receptors (IRs), 4 sensory neuron membrane proteins (SNMPs) and 4 different gene families encoding odorant-degrading enzymes (ODEs): 80 cytochrome P450s (CYPs), 12 esterase (ESTs), and 5 aldehyde dehydrogenases (ADE) were annotated in the D. citri antennal and abdominal transcriptomes. Our results revealed that a large proportion of chemosensory genes exhibited no distinct differences in their expression patterns in the antennae and terminal abdominal tissues. Notably, RNA sequencing (RNA-seq) data and quantitative real time-PCR (qPCR) analyses showed that 4 DictOBPs, 4 DictCSPs, 4 DictIRs, 1 DictSNMP, and 2 DictCYPs were upregulated in the antennae relative to that in terminal abdominal tissues. Furthermore, 2 DictOBPs (DictOBP8 and DictOBP9), 2 DictCSPs (DictOBP8 and DictOBP12), 4 DictIRs (DictIR3, DictIR6, DictIR10, and DictIR35), and 1 DictCYP (DictCYP57) were expressed at higher levels in the male antennae than in the female antennae. Our study provides the first insights into the molecular basis of chemoreception in this

  7. Plasmid Complement of Lactococcus lactis NCDO712 Reveals a Novel Pilus Gene Cluster.

    Science.gov (United States)

    Tarazanova, Mariya; Beerthuyzen, Marke; Siezen, Roland; Fernandez-Gutierrez, Marcela M; de Jong, Anne; van der Meulen, Sjoerd; Kok, Jan; Bachmann, Herwig

    2016-01-01

    Lactococcus lactis MG1363 is an important gram-positive model organism. It is a plasmid-free and phage-cured derivative of strain NCDO712. Plasmid-cured strains facilitate studies on molecular biological aspects, but many properties which make L. lactis an important organism in the dairy industry are plasmid encoded. We sequenced the total DNA of strain NCDO712 and, contrary to earlier reports, revealed that the strain carries 6 rather than 5 plasmids. A new 50-kb plasmid, designated pNZ712, encodes functional nisin immunity (nisCIP) and copper resistance (lcoRSABC). The copper resistance could be used as a marker for the conjugation of pNZ712 to L. lactis MG1614. A genome comparison with the plasmid cured daughter strain MG1363 showed that the number of single nucleotide polymorphisms that accumulated in the laboratory since the strains diverted more than 30 years ago is limited to 11 of which only 5 lead to amino acid changes. The 16-kb plasmid pSH74 was found to contain a novel 8-kb pilus gene cluster spaCB-spaA-srtC1-srtC2, which is predicted to encode a pilin tip protein SpaC, a pilus basal subunit SpaB, and a pilus backbone protein SpaA. The sortases SrtC1/SrtC2 are most likely involved in pilus polymerization while the chromosomally encoded SrtA could act to anchor the pilus to peptidoglycan in the cell wall. Overexpression of the pilus gene cluster from a multi-copy plasmid in L. lactis MG1363 resulted in cell chaining, aggregation, rapid sedimentation and increased conjugation efficiency of the cells. Electron microscopy showed that the over-expression of the pilus gene cluster leads to appendices on the cell surfaces. A deletion of the gene encoding the putative basal protein spaB, by truncating spaCB, led to more pilus-like structures on the cell surface, but cell aggregation and cell chaining were no longer observed. This is consistent with the prediction that spaB is involved in the anchoring of the pili to the cell.

  8. Comparative and functional triatomine genomics reveals reductions and expansions in insecticide resistance-related gene families.

    Directory of Open Access Journals (Sweden)

    Lucila Traverso

    2017-02-01

    Full Text Available Triatomine insects are vectors of Trypanosoma cruzi, a protozoan parasite that is the causative agent of Chagas' disease. This is a neglected disease affecting approximately 8 million people in Latin America. The existence of diverse pyrethroid resistant populations of at least two species demonstrates the potential of triatomines to develop high levels of insecticide resistance. Therefore, the incorporation of strategies for resistance management is a main concern for vector control programs. Three enzymatic superfamilies are thought to mediate xenobiotic detoxification and resistance: Glutathione Transferases (GSTs), Cytochromes P450 (CYPs) and Carboxyl/Cholinesterases (CCEs). Improving our knowledge of key triatomine detoxification enzymes will strengthen our understanding of insecticide resistance processes in vectors of Chagas' disease. The discovery and description of detoxification gene superfamilies in normalized transcriptomes of three triatomine species: Triatoma dimidiata, Triatoma infestans and Triatoma pallidipennis is presented. Furthermore, a comparative analysis of these superfamilies among the triatomine transcriptomes and the genome of Rhodnius prolixus, also a triatomine vector of Chagas' disease, and other well-studied insect genomes was performed. The expression pattern of detoxification genes in R. prolixus transcriptomes from key organs was analyzed. The comparisons reveal gene expansions in Sigma class GSTs, CYP3 in CYP superfamily and clade E in CCE superfamily. Moreover, several CYP families identified in these triatomines have not yet been described in other insects. Conversely, several groups of insecticide resistance related enzymes within each enzyme superfamily are reduced or lacking in triatomines. Furthermore, our qRT-PCR results showed an increase in the expression of a CYP4 gene in a T. infestans population resistant to pyrethroids. These results could point to an involvement of metabolic detoxification mechanisms

  9. Global gene expression in larval zebrafish (Danio rerio) exposed to selective serotonin reuptake inhibitors (fluoxetine and sertraline) reveals unique expression profiles and potential biomarkers of exposure

    International Nuclear Information System (INIS)

    Park, June-Woo; Heah, Tze Ping; Gouffon, Julia S.; Henry, Theodore B.; Sayler, Gary S.

    2012-01-01

    Larval zebrafish (Danio rerio) were exposed (96 h) to the selective serotonin reuptake inhibitors (SSRIs) fluoxetine and sertraline, and changes in transcriptomes analyzed by Affymetrix GeneChip® Zebrafish Array were evaluated to enhance understanding of biochemical pathways and differences between these SSRIs. The number of genes differentially expressed after fluoxetine exposure was 288 at 25 μg/L and 131 at 250 μg/L; and after sertraline exposure was 33 at 25 μg/L and 52 at 250 μg/L. The same five genes were differentially regulated by both SSRIs, indicating shared molecular pathways. Among these, the gene coding for FK506 binding protein 5, annotated to stress response regulation, was highly down-regulated in all treatments (results confirmed by qRT-PCR). Gene ontology analysis indicated at the gene expression level that regulation of stress response and cholinesterase activities were influenced by these SSRIs, and suggested that changes in transcription of these genes could be used as biomarkers of SSRI exposure. - Highlights: ► Exposure of zebrafish to selective serotonin reuptake inhibitors (SSRIs). ► Fluoxetine and sertraline generate different global gene expression profiles. ► Genes linked to stress response and acetylcholine esterase affected by both SSRIs. - Global gene expression profiles in zebrafish exposed to selective serotonin reuptake inhibitors.

  10. The organization structure and regulatory elements of Chlamydomonas histone genes reveal features linking plant and animal genes.

    Science.gov (United States)

    Fabry, S; Müller, K; Lindauer, A; Park, P B; Cornelius, T; Schmitt, R

    1995-09-01

    The genome of the green alga Chlamydomonas reinhardtii contains approximately 15 gene clusters of the nucleosomal (or core) histone H2A, H2B, H3 and H4 genes and at least one histone H1 gene. Seven non-allelic histone gene loci were isolated from a genomic library, physically mapped, and the nucleotide sequences of three isotypes of each core histone gene species and one linked H1 gene determined. The core histone genes are organized in clusters of H2A-H2B and H3-H4 pairs, in which each gene pair shows outwardly divergent transcription from a short (< 300 bp) intercistronic region. These intercistronic regions contain typically conserved promoter elements, namely a TATA-box and the three motifs TGGCCAG-G(G/C)-CGAG, CGTTGACC and CGGTTG. Different from the genes of higher plants, but like those of animals and the related alga Volvox, the 3' untranslated regions contain no poly A signal, but a palindromic sequence (3' palindrome) essential for mRNA processing is present. One single H1 gene was found in close linkage to a H2A-H2B pair. The H1 upstream region contains the octameric promoter element GGTTGACC (also found upstream of the core histone genes) and two specific sequence motifs that are shared only with the Volvox H1 promoters. This suggests differential transcription of the H1 and the core histone genes. The H1 gene is interrupted by two introns. Unlike Volvox H3 genes, the three sequenced H3 isoforms are intron-free. Primer-directed PCR of genomic DNA demonstrated, however, that at least 8 of the about 15 H3 genes do contain one intron at a conserved position. In synchronized C. reinhardtii cells, H4 mRNA levels (representative of all core histone mRNAs) peak during cell division, suggesting strict replication-dependent gene control. The derived peptide sequences place C. reinhardtii core histones closer to plants than to animals, except that the H2A histones are more animal-like. The peptide sequence of histone H1 is closely related to the V. carteri VH1-II

  11. Transcriptome Analysis Reveals Regulation of Gene Expression for Lipid Catabolism in Young Broilers by Butyrate Glycerides

    Science.gov (United States)

    Yin, Fugui; Yu, Hai; Lepp, Dion; Shi, Xuejiang; Yang, Xiaojian; Hu, Jielun; Leeson, Steve; Yang, Chengbo; Nie, Shaoping; Hou, Yongqing; Gong, Joshua

    2016-01-01

    indicated that dietary BG intervention induced 79 and 205 characterized DEGs in the jejunum and liver, respectively. In addition, 255 and 165 TSEGs were detected in the liver and jejunum of the BG-fed group, while 162 and 211 TSEGs were observed in the liver and jejunum of BD-fed birds, respectively. Bioinformatic analysis with both IPA and DAVID-BR further revealed a significant enrichment of DEGs and TSEGs in the biological processes for reducing the synthesis, storage, transportation and secretion of lipids in the jejunum, while those in the liver were for enhancing the oxidation of ingested lipids and fatty acids. In particular, the transcriptional regulators THRSP and EGR-1, as well as several DEGs involved in the PPAR-α signaling pathway, were significantly induced by dietary BG intervention for lipid catabolism. Conclusions: Our results demonstrate that BG reduces body fat deposition via regulation of gene expression, which is involved in the biological events relating to the reduction of synthesis, storage, transportation and secretion, and improvement of oxidation of lipids and fatty acids. PMID:27508934

  12. Hox gene cluster of the ascidian, Halocynthia roretzi, reveals multiple ancient steps of cluster disintegration during ascidian evolution.

    Science.gov (United States)

    Sekigami, Yuka; Kobayashi, Takuya; Omi, Ai; Nishitsuji, Koki; Ikuta, Tetsuro; Fujiyama, Asao; Satoh, Noriyuki; Saiga, Hidetoshi

    2017-01-01

    Hox gene clusters with at least 13 paralog group (PG) members are common in vertebrate genomes and in that of amphioxus. Ascidians, which belong to the subphylum Tunicata (Urochordata), are phylogenetically positioned between vertebrates and amphioxus, and traditionally divided into two groups: the Pleurogona and the Enterogona. An enterogonan ascidian, Ciona intestinalis (Ci), possesses nine Hox genes localized on two chromosomes; thus, the Hox gene cluster is disintegrated. We examined the Hox gene cluster of a pleurogonan ascidian, Halocynthia roretzi (Hr), to investigate whether Hox gene cluster disintegration is common among ascidians, and if so, how such disintegration occurred during ascidian or tunicate evolution. Our phylogenetic analysis reveals that the Hr Hox gene complement comprises nine members, including one with a relatively divergent Hox homeodomain sequence. Eight of nine Hr Hox genes were orthologous to Ci-Hox1, 2, 3, 4, 5, 10, 12 and 13. Following the phylogenetic classification into 13 PGs, we designated Hr Hox genes as Hox1, 2, 3, 4, 5, 10, 11/12/13.a, 11/12/13.b and HoxX. To address the chromosomal arrangement of the nine Hox genes, we performed two-color chromosomal fluorescent in situ hybridization, which revealed that the nine Hox genes are localized on a single chromosome in Hr, distinct from their arrangement in Ci. We further examined the order of the nine Hox genes on the chromosome by chromosome/scaffold walking. This analysis suggested a gene order of Hox1, 11/12/13.b, 11/12/13.a, 10, 5, X, followed by either Hox4, 3, 2 or Hox2, 3, 4 on the chromosome. Based on the present results and those previously reported in Ci, we discuss the establishment of the Hox gene complement and disintegration of Hox gene clusters during the course of ascidian or tunicate evolution. The Hox gene cluster and the genome must have experienced extensive reorganization during the course of evolution from the ancestral tunicate to Hr and Ci.

  13. Phylogeny of haemosporidian blood parasites revealed by a multi-gene approach.

    Science.gov (United States)

    Borner, Janus; Pick, Christian; Thiede, Jenny; Kolawole, Olatunji Matthew; Kingsley, Manchang Tanyi; Schulze, Jana; Cottontail, Veronika M; Wellinghausen, Nele; Schmidt-Chanasit, Jonas; Bruchhaus, Iris; Burmester, Thorsten

    2016-01-01

    The apicomplexan order Haemosporida is a clade of unicellular blood parasites that infect a variety of reptilian, avian and mammalian hosts. Among them are the agents of human malaria, parasites of the genus Plasmodium, which pose a major threat to human health. Illuminating the evolutionary history of Haemosporida may help us in understanding their enormous biological diversity, as well as tracing the multiple host switches and associated acquisitions of novel life-history traits. However, the deep-level phylogenetic relationships among major haemosporidian clades have remained enigmatic because the datasets employed in phylogenetic analyses were severely limited in either gene coverage or taxon sampling. Using a PCR-based approach that employs a novel set of primers, we sequenced fragments of 21 nuclear genes from seven haemosporidian parasites of the genera Leucocytozoon, Haemoproteus, Parahaemoproteus, Polychromophilus and Plasmodium. After addition of genomic data from 25 apicomplexan species, the unreduced alignment comprised 20,580 bp from 32 species. Phylogenetic analyses were performed based on nucleotide, codon and amino acid data employing Bayesian inference, maximum likelihood and maximum parsimony. All analyses resulted in highly congruent topologies. We found consistent support for a basal position of Leucocytozoon within Haemosporida. In contrast to all previous studies, we recovered a sister group relationship between the genera Polychromophilus and Plasmodium. Within Plasmodium, the sauropsid and mammal-infecting lineages were recovered as sister clades. Support for these relationships was high in nearly all trees, revealing a novel phylogeny of Haemosporida, which is robust to the choice of the outgroup and the method of tree inference. Copyright © 2015 Elsevier Inc. All rights reserved.

  14. RNA-Seq profiling reveals novel hepatic gene expression pattern in aflatoxin B1 treated rats.

    Directory of Open Access Journals (Sweden)

    B Alex Merrick

    Deep sequencing was used to investigate the subchronic effects of 1 ppm aflatoxin B1 (AFB1), a potent hepatocarcinogen, on the male rat liver transcriptome prior to onset of histopathological lesions or tumors. We hypothesized RNA-Seq would reveal more differentially expressed genes (DEG) than microarray analysis, including low copy and novel transcripts related to AFB1's carcinogenic activity compared to feed controls (CTRL). Paired-end reads were mapped to the rat genome (Rn4) with TopHat and further analyzed by DESeq and Cufflinks-Cuffdiff pipelines to identify differentially expressed transcripts, new exons and unannotated transcripts. PCA and cluster analysis of DEGs showed clear separation between AFB1 and CTRL treatments and concordance among group replicates. qPCR of eight high and medium DEGs and three low DEGs showed good comparability among RNA-Seq and microarray transcripts. DESeq analysis identified 1,026 differentially expressed transcripts at greater than two-fold change (p<0.005) compared to 626 transcripts by microarray due to base pair resolution of transcripts by RNA-Seq, probe placement within transcripts or an absence of probes to detect novel transcripts, splice variants and exons. Pathway analysis among DEGs revealed signaling of Ahr, Nrf2, GSH, xenobiotic, cell cycle, extracellular matrix, and cell differentiation networks consistent with pathways leading to AFB1 carcinogenesis, including almost 200 upregulated transcripts controlled by E2f1-related pathways related to kinetochore structure, mitotic spindle assembly and tissue remodeling. We report 49 novel, differentially-expressed transcripts including confirmation by PCR-cloning of two unique, unannotated, hepatic AFB1-responsive transcripts (HAfT's) on chromosomes 1.q55 and 15.q11, overexpressed by 10 to 25-fold. Several potentially novel exons were found and exon refinements were made including AFB1 exon-specific induction of homologous family members, Ugt1a6 and Ugt1a7c

  15. RNA-Seq profiling reveals novel hepatic gene expression pattern in aflatoxin B1 treated rats.

    Science.gov (United States)

    Merrick, B Alex; Phadke, Dhiral P; Auerbach, Scott S; Mav, Deepak; Stiegelmeyer, Suzy M; Shah, Ruchir R; Tice, Raymond R

    2013-01-01

    Deep sequencing was used to investigate the subchronic effects of 1 ppm aflatoxin B1 (AFB1), a potent hepatocarcinogen, on the male rat liver transcriptome prior to onset of histopathological lesions or tumors. We hypothesized RNA-Seq would reveal more differentially expressed genes (DEG) than microarray analysis, including low copy and novel transcripts related to AFB1's carcinogenic activity compared to feed controls (CTRL). Paired-end reads were mapped to the rat genome (Rn4) with TopHat and further analyzed by DESeq and Cufflinks-Cuffdiff pipelines to identify differentially expressed transcripts, new exons and unannotated transcripts. PCA and cluster analysis of DEGs showed clear separation between AFB1 and CTRL treatments and concordance among group replicates. qPCR of eight high and medium DEGs and three low DEGs showed good comparability among RNA-Seq and microarray transcripts. DESeq analysis identified 1,026 differentially expressed transcripts at greater than two-fold change (p<0.005) compared to 626 transcripts by microarray due to base pair resolution of transcripts by RNA-Seq, probe placement within transcripts or an absence of probes to detect novel transcripts, splice variants and exons. Pathway analysis among DEGs revealed signaling of Ahr, Nrf2, GSH, xenobiotic, cell cycle, extracellular matrix, and cell differentiation networks consistent with pathways leading to AFB1 carcinogenesis, including almost 200 upregulated transcripts controlled by E2f1-related pathways related to kinetochore structure, mitotic spindle assembly and tissue remodeling. We report 49 novel, differentially-expressed transcripts including confirmation by PCR-cloning of two unique, unannotated, hepatic AFB1-responsive transcripts (HAfT's) on chromosomes 1.q55 and 15.q11, overexpressed by 10 to 25-fold. Several potentially novel exons were found and exon refinements were made including AFB1 exon-specific induction of homologous family members, Ugt1a6 and Ugt1a7c. We find the
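
The cutoff quoted in this abstract (greater than two-fold change, p<0.005) can be illustrated with a minimal Python sketch. This is not the DESeq statistical model itself, only the final thresholding step applied to already-computed values. The transcript ids, expression means and p-values below are hypothetical, and applying the two-fold cutoff in both directions is an assumption.

```python
import math

# Hypothetical transcript records: (id, mean_control, mean_treated, p_value).
# None of these values come from the study; they only exercise the filter.
records = [
    ("tx_a", 10.0, 45.0, 0.0010),  # 4.5-fold up, significant
    ("tx_b", 20.0, 30.0, 0.0004),  # only 1.5-fold, fails the fold cutoff
    ("tx_c", 50.0, 12.0, 0.0030),  # ~4.2-fold down, significant
    ("tx_d", 8.0, 40.0, 0.0200),   # 5-fold up, fails the p-value cutoff
]

def is_deg(ctrl, treated, p, min_fold=2.0, max_p=0.005):
    """True if |log2 fold change| exceeds log2(min_fold) and p is below max_p."""
    log2_fc = math.log2(treated / ctrl)
    return abs(log2_fc) > math.log2(min_fold) and p < max_p

# Keep the ids of transcripts that pass both thresholds.
degs = [tx_id for tx_id, ctrl, treated, p in records if is_deg(ctrl, treated, p)]
print(degs)
```

Working on the log2 scale treats up- and down-regulation symmetrically, so a four-fold decrease passes the same cutoff as a four-fold increase.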

  16. Genes expressed in grapevine leaves reveal latent wood infection by the fungal pathogen Neofusicoccum parvum.

    Directory of Open Access Journals (Sweden)

    Stefan Czemmel

    Some pathogenic species of the Botryosphaeriaceae have a latent phase, colonizing woody tissues while perennial hosts show no apparent symptoms until conditions for disease development become favorable. Detection of these pathogens is often limited to the later pathogenic phase. The latent phase is poorly characterized, despite the need for non-destructive detection tools and effective quarantine strategies, which would benefit from identification of host-based markers in leaves. Neofusicoccum parvum infects the wood of grapevines and other horticultural crops, killing the fruit-bearing shoots. We used light microscopy and high-resolution computed tomography (HRCT) to examine the spatio-temporal relationship between pathogen colonization and anatomical changes in stem sections. To identify differentially-expressed grape genes, leaves from inoculated and non-inoculated plants were examined using RNA-Seq. The latent phase occurred between 0 and 1.5 months post-inoculation (MPI), during which time the pathogen did not spread significantly beyond the inoculation site nor were there differences in lesion lengths between inoculated and non-inoculated plants. The pathogenic phase occurred between 1.5 and 2 MPI, when recovery beyond the inoculation site increased and lesion lengths of inoculated plants tripled. By 2 MPI, inoculated plants also had decreased starch content in xylem fibers and rays, and increased levels of gel-occluded xylem vessels, the latter of which HRCT revealed at a higher frequency than microscopy. RNA-Seq and screening of 21 grape expression datasets identified 20 candidate genes that were transcriptionally-activated by infection during the latent phase, and confirmed that the four best candidates (galactinol synthase, abscisic acid-induced wheat plasma membrane polypeptide-19 ortholog, embryonic cell protein 63, BURP domain-containing protein) were not affected by a range of common foliar and wood pathogens or abiotic stresses

  17. Genomic Analysis Reveals Contrasting PIFq Contribution to Diurnal Rhythmic Gene Expression in PIF-Induced and -Repressed Genes.

    Science.gov (United States)

    Martin, Guiomar; Soy, Judit; Monte, Elena

    2016-01-01

    Members of the PIF quartet (PIFq; PIF1, PIF3, PIF4, and PIF5) collectively contribute to induce growth in Arabidopsis seedlings under short day (SD) conditions, specifically promoting elongation at dawn. Their action involves the direct regulation of growth-related and hormone-associated genes. However, a comprehensive definition of the PIFq-regulated transcriptome under SD is still lacking. We have recently shown that SD and free-running (LL) conditions correspond to "growth" and "no growth" conditions, respectively, correlating with greater abundance of PIF protein in SD. Here, we present a genomic analysis whereby we first define SD-regulated genes at dawn compared to LL in the wild type, followed by identification of those SD-regulated genes whose expression depends on the presence of PIFq. By using this sequential strategy, we have identified 349 PIF/SD-regulated genes, approximately 55% induced and 42% repressed by both SD and PIFq. Comparison with available databases indicates that PIF/SD-induced and PIF/SD-repressed sets are differently phased at dawn and mid-morning, respectively. In addition, we found that whereas rhythmicity of the PIF/SD-induced gene set is lost in LL, most PIF/SD-repressed genes keep their rhythmicity in LL, suggesting differential regulation of both gene sets by the circadian clock. Moreover, we also uncovered distinct overrepresented functions in the induced and repressed gene sets, in accord with previous studies in other examined PIF-regulated processes. Interestingly, promoter analyses showed that, whereas PIF/SD-induced genes are enriched in direct PIF targets, PIF/SD-repressed genes are mostly indirectly regulated by the PIFs and might be more enriched in ABA-regulated genes.

  18. Transcriptomic analysis reveals the gene expression profile that specifically responds to IBA during adventitious rooting in mung bean seedlings.

    Science.gov (United States)

    Li, Shi-Weng; Shi, Rui-Fang; Leng, Yan; Zhou, Yuan

    2016-01-12

    Auxin plays a critical role in inducing adventitious rooting in many plants. Indole-3-butyric acid (IBA) is the most widely employed auxin for adventitious rooting. However, the molecular mechanisms by which auxin regulates the process of adventitious rooting are less well known. The RNA-Seq data analysis indicated that IBA treatment greatly increased the amount of clean reads and the amount of expressed unigenes by 24.29 % and 27.42 % and by 4.3 % and 5.04 % at two time points, respectively, and significantly increased the numbers of unigenes numbered with RPKM = 10-100 and RPKM = 500-1000 by 13.04 % and 3.12 % and by 24.66 % and 108.2 % at two time points, respectively. Gene Ontology (GO) enrichment analysis indicated that the enrichment of down-regulated GOs was 2.87-fold higher than that of up-regulated GOs at stage 1, suggesting that IBA significantly down-regulated gene expression at 6 h. The GO functional category indicated that IBA significantly up- or down-regulated processes associated with auxin signaling, ribosome assembly and protein synthesis, photosynthesis, oxidoreductase activity and extracellular region, secondary cell wall biogenesis, and the cell wall during the development process. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment indicated that ribosome biogenesis, plant hormone signal transduction, pentose and glucuronate interconversions, photosynthesis, phenylpropanoid biosynthesis, sesquiterpenoid and triterpenoid biosynthesis, ribosome, cutin, flavonoid biosynthesis, and phenylalanine metabolism were the pathways most highly regulated by IBA. A total of 6369 differentially expressed (2-fold change > 2) unigenes (DEGs) with 3693 (58 %) that were up-regulated and 2676 (42 %) down-regulated, 5433 unigenes with 2208 (40.6 %) that were up-regulated and 3225 (59.4 %) down-regulated, and 7664 unigenes with 3187 (41.6 %) that were up-regulated and 4477 (58.4 %) down-regulated were detected at stage 1
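
The up- and down-regulated shares quoted at the end of this abstract follow directly from the reported counts. A minimal Python sketch of that arithmetic is given here as a consistency check. The labels "set 1" to "set 3" are placeholders, since the truncated abstract names only "stage 1"; the counts are the ones stated above.

```python
# Up-/down-regulated unigene counts as reported in the abstract.
# The labels are placeholders, not the study's own stage names.
deg_counts = [
    ("set 1", 3693, 2676),
    ("set 2", 2208, 3225),
    ("set 3", 3187, 4477),
]

def shares(up, down):
    """Return (total, percent up, percent down), percentages to one decimal."""
    total = up + down
    return total, round(100 * up / total, 1), round(100 * down / total, 1)

summary = {label: shares(up, down) for label, up, down in deg_counts}
for label, (total, up_pct, down_pct) in summary.items():
    print(f"{label}: {total} DEGs, {up_pct}% up, {down_pct}% down")
```

The recomputed totals (6369, 5433 and 7664) and percentages agree with the figures in the abstract after rounding.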

  19. Comparative Plasmodium gene overexpression reveals distinct perturbation of sporozoite transmission by profilin.

    Science.gov (United States)

    Sato, Yuko; Hliscs, Marion; Dunst, Josefine; Goosmann, Christian; Brinkmann, Volker; Montagna, Georgina N; Matuschewski, Kai

    2016-07-15

    Plasmodium relies on actin-based motility to migrate from the site of infection and invade target cells. Using a substrate-dependent gliding locomotion, sporozoites are able to move at fast speed (1-3 μm/s). This motility relies on a minimal set of actin regulatory proteins and occurs in the absence of detectable filamentous actin (F-actin). Here we report an overexpression strategy to investigate whether perturbations of F-actin steady-state levels affect gliding locomotion and host invasion. We selected two vital Plasmodium berghei G-actin-binding proteins, C-CAP and profilin, in combination with three stage-specific promoters and mapped the phenotypes afforded by overexpression in all three extracellular motile stages. We show that in merozoites and ookinetes, additional expression does not impair life cycle progression. In marked contrast, overexpression of C-CAP and profilin in sporozoites impairs circular gliding motility and salivary gland invasion. The propensity for productive motility correlates with actin accumulation at the parasite tip, as revealed by combinations of an actin-stabilizing drug and transgenic parasites. Strong expression of profilin, but not C-CAP, resulted in complete life cycle arrest. Comparative overexpression is an alternative experimental genetic strategy to study essential genes and reveals effects of regulatory imbalances that are not uncovered from deletion-mutant phenotyping. © 2016 Sato et al. This article is distributed by The American Society for Cell Biology under license from the author(s). Two months after publication it is available to the public under an Attribution–Noncommercial–Share Alike 3.0 Unported Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/3.0).

  20. RNA-seq reveals more consistent reference genes for gene expression studies in human non-melanoma skin cancers

    Directory of Open Access Journals (Sweden)

    Van L.T. Hoang

    2017-08-01

    Identification of appropriate reference genes (RGs) is critical to accurate data interpretation in quantitative real-time PCR (qPCR) experiments. In this study, we have utilised next generation RNA sequencing (RNA-seq) to analyse the transcriptome of a panel of non-melanoma skin cancer lesions, identifying genes that are consistently expressed across all samples. Genes encoding ribosomal proteins were amongst the most stable in this dataset. Validation of this RNA-seq data was examined using qPCR to confirm the suitability of a set of highly stable genes for use as qPCR RGs. These genes will provide a valuable resource for the normalisation of qPCR data for the analysis of non-melanoma skin cancer.

  1. Weighted gene co-expression network analysis reveals potential genes involved in early metamorphosis process in sea cucumber Apostichopus japonicus.

    Science.gov (United States)

    Li, Yongxin; Kikuchi, Mani; Li, Xueyan; Gao, Qionghua; Xiong, Zijun; Ren, Yandong; Zhao, Ruoping; Mao, Bingyu; Kondo, Mariko; Irie, Naoki; Wang, Wen

    2018-01-01

    Sea cucumbers, one of the main classes of echinoderms, undergo a very fast and drastic metamorphosis during their development. However, the molecular basis underlying this process remains largely unknown. Here we systematically examined the gene expression profiles of Japanese common sea cucumber (Apostichopus japonicus) for the first time by RNA sequencing across 16 developmental time points from fertilized egg to juvenile stage. Based on the weighted gene co-expression network analysis (WGCNA), we identified 21 modules. Among them, MEdarkmagenta was highly expressed and correlated with the early metamorphosis process from late auricularia to doliolaria larva. Furthermore, gene enrichment and differentially expressed gene analysis identified several genes in the module that may play key roles in the metamorphosis process. Our results not only provide a molecular basis for experimentally studying the development and morphological complexity of sea cucumber, but also lay a foundation for improving its emergence rate. Copyright © 2017 Elsevier Inc. All rights reserved.

  2. Use of the CIM Ontology

    Energy Technology Data Exchange (ETDEWEB)

    Neumann, Scott; Britton, Jay; Devos, Arnold N.; Widergren, Steven E.

    2006-02-08

    There are many uses for the Common Information Model (CIM), an ontology that is being standardized through Technical Committee 57 of the International Electrotechnical Commission (IEC TC57). The most common uses to date have included application modeling, information exchanges, information management and systems integration. As one should expect, there are many issues that become apparent when the CIM ontology is applied to any one use. Some of these issues are shortcomings within the current draft of the CIM, and others are a consequence of the different ways in which the CIM can be applied using different technologies. As the CIM ontology will and should evolve, there are several dangers that need to be recognized. One is overall consistency and impact upon applications when extending the CIM for a specific need. Another is that a tight coupling of the CIM to specific technologies could limit the value of the CIM in the longer term as an ontology, which becomes a larger issue over time as new technologies emerge. The integration of systems is one specific area of interest for application of the CIM ontology. This is an area dominated by the use of XML for the definition of messages. While this is certainly true when using Enterprise Application Integration (EAI) products, it is even more true with the movement towards the use of Web Services (WS), Service-Oriented Architectures (SOA) and Enterprise Service Buses (ESB) for integration. This general IT industry trend is consistent with trends seen within the IEC TC57 scope of power system management and associated information exchange. The challenge for TC57 is how to best leverage the CIM ontology using the various XML technologies and standards for integration. This paper will provide examples of how the CIM ontology is used and describe some specific issues that should be addressed within the CIM in order to increase its usefulness as an ontology. It will also describe some of the issues and challenges that will

  3. Changes in cecal microbiota and mucosal gene expression revealed new aspects of epizootic rabbit enteropathy.

    Directory of Open Access Journals (Sweden)

    Christine Bäuerl

    Epizootic Rabbit Enteropathy (ERE) is a severe disease of unknown aetiology that mainly affects post-weaning animals. Its incidence can be prevented by antibiotic treatment, suggesting that bacterial elements are crucial for the development of the disease. Microbial dynamics and host responses during the disease were studied. Cecal microbiota was characterized in three rabbit groups (ERE-affected, healthy and healthy pretreated with antibiotics), followed by transcriptional analysis of cytokines and mucins in the cecal mucosa and vermix by q-rtPCR. In healthy animals, cecal microbiota with or without antibiotic pretreatment was very similar and dominated by Alistipes and Ruminococcus. Proportions of both genera decreased in ERE rabbits whereas Bacteroides, Akkermansia and Rikenella increased, as well as Clostridium, γ-Proteobacteria and other opportunistic and pathogenic species. The ERE group displayed remarkable dysbiosis and reduced taxonomic diversity. Transcription rate of mucins and inflammatory cytokines was very high in ERE rabbits, except IL-2, and its analysis revealed the existence of two clearly different gene expression patterns corresponding to Inflammatory and (mucin) Secretory Profiles. Furthermore, these profiles were associated with different bacterial species, suggesting that they may correspond to different stages of the disease. Other data obtained in this work reinforced the notion that ERE morbidity and mortality is possibly caused by an overgrowth of different pathogens in the gut of animals whose immune defence mechanisms seem not to be adequately responding.

  4. Transcriptome of interstitial cells of Cajal reveals unique and selective gene signatures.

    Directory of Open Access Journals (Sweden)

    Moon Young Lee

    Transcriptome-scale data can reveal essential clues into understanding the underlying molecular mechanisms behind specific cellular functions and biological processes. Transcriptomics is a continually growing field of research utilized in biomarker discovery. The transcriptomic profile of interstitial cells of Cajal (ICC), which serve as slow-wave electrical pacemakers for gastrointestinal (GI) smooth muscle, has yet to be uncovered. Using copGFP-labeled ICC mice and flow cytometry, we isolated ICC populations from the murine small intestine and colon and obtained their transcriptomes. In analyzing the transcriptome, we identified a unique set of ICC-restricted markers including transcription factors, epigenetic enzymes/regulators, growth factors, receptors, protein kinases/phosphatases, and ion channels/transporters. This analysis provides new and unique insights into the cellular and biological functions of ICC in GI physiology. Additionally, we constructed an interactive ICC genome browser (http://med.unr.edu/physio/transcriptome) based on the UCSC genome database. To our knowledge, this is the first online resource that provides a comprehensive library of all known genetic transcripts expressed in primary ICC. Our genome browser offers a new perspective into the alternative expression of genes in ICC and provides a valuable reference for future functional studies.

  5. Complexity of CNC transcription factors as revealed by gene targeting of the Nrf3 locus.

    Science.gov (United States)

    Derjuga, Anna; Gourley, Tania S; Holm, Teresa M; Heng, Henry H Q; Shivdasani, Ramesh A; Ahmed, Rafi; Andrews, Nancy C; Blank, Volker

    2004-04-01

    Cap'n'collar (CNC) family basic leucine zipper transcription factors play crucial roles in the regulation of mammalian gene expression and development. To determine the in vivo function of the CNC protein Nrf3 (NF-E2-related factor 3), we generated mice deficient in this transcription factor. We performed targeted disruption of two Nrf3 exons coding for CNC homology, basic DNA-binding, and leucine zipper dimerization domains. Nrf3 null mice developed normally and revealed no obvious phenotypic differences compared to wild-type animals. Nrf3(-/-) mice were fertile, and gross anatomy as well as behavior appeared normal. The mice showed normal age progression and did not show any apparent additional phenotype during their life span. We observed no differences in various blood parameters and chemistry values. We infected wild-type and Nrf3(-/-) mice with acute lymphocytic choriomeningitis virus and found no differences in these animals with respect to their number of virus-specific CD8 and CD4 T cells as well as their B-lymphocyte response. To determine whether the mild phenotype of Nrf3 null animals is due to functional redundancy, we generated mice deficient in multiple CNC factors. Contrary to our expectations, an absence of Nrf3 does not seem to cause additional lethality in compound Nrf3(-/-)/Nrf2(-/-) and Nrf3(-/-)/p45(-/-) mice. We hypothesize that the role of Nrf3 in vivo may become apparent only after appropriate challenge to the mice.

  6. Suicide Gene-Engineered Stromal Cells Reveal a Dynamic Regulation of Cancer Metastasis

    Science.gov (United States)

    Shen, Keyue; Luk, Samantha; Elman, Jessica; Murray, Ryan; Mukundan, Shilpaa; Parekkadan, Biju

    2016-02-01

    Cancer-associated fibroblasts (CAFs) are a major cancer-promoting component in the tumor microenvironment (TME). The dynamic role of human CAFs in cancer progression has been ill-defined because human CAFs lack a unique marker needed for a cell-specific, promoter-driven knockout model. Here, we developed an engineered human CAF cell line with an inducible suicide gene to enable selective in vivo elimination of human CAFs at different stages of xenograft tumor development, effectively circumventing the challenge of targeting a cell-specific marker. Suicide-engineered CAFs were highly sensitive to apoptosis induction in vitro and in vivo by the addition of a simple small molecule inducer. Selection of timepoints for targeted CAF apoptosis in vivo during the progression of a human breast cancer xenograft model was guided by a bi-phasic host cytokine response that peaked at early timepoints after tumor implantation. Remarkably, we observed that the selective apoptosis of CAFs at these early timepoints did not affect primary tumor growth, but instead increased the presence of tumor-associated macrophages and the metastatic spread of breast cancer cells to the lung and bone. The study revealed a dynamic relationship between CAFs and cancer metastasis that has counter-intuitive ramifications for CAF-targeted therapy.

  7. Autozygosity reveals recessive mutations and novel mechanisms in dominant genes: implications in variant interpretation

    KAUST Repository

    Monies, Dorota; Maddirevula, Sateesh; Kurdi, Wesam; Alanazy, Mohammed H.; Alkhalidi, Hisham; Al-Owain, Mohammed; Sulaiman, Raashda A.; Faqeih, Eissa; Goljan, Ewa; Ibrahim, Niema; Abdulwahab, Firdous; Hashem, Mais; Abouelhoda, Mohamed; Shaheen, Ranad; Arold, Stefan T.; Alkuraya, Fowzan S.

    2017-01-01

    The purpose of this study is to describe recessive alleles in strictly dominant genes. Identifying recessive mutations in genes for which only dominant disease or risk alleles have been reported can expand our understanding of the medical relevance

  8. Genomewide analysis of MATE-type gene family in maize reveals ...

    Indian Academy of Sciences (India)

    Huasheng Zhu and Jiandong Wu contributed equally to this work. As a group of secondary active transporters, the MATE gene family consists of multiple genes that widely exist in ..... Roots of the stress-treated plants were collected at 0,.

  9. The nitrogen responsive transcriptome in potato (Solanum tuberosum L.) reveals significant gene regulatory motifs.

    Science.gov (United States)

    Gálvez, José Héctor; Tai, Helen H; Lagüe, Martin; Zebarth, Bernie J; Strömvik, Martina V

    2016-05-19

    Nitrogen (N) is the most important nutrient for the growth of potato (Solanum tuberosum L.). Foliar gene expression in potato plants with and without N supplementation at 180 kg N ha(-1) was compared at mid-season. Genes with consistent differences in foliar expression due to N supplementation over three cultivars and two developmental time points were examined. In total, thirty genes were found to be over-expressed and nine genes were found to be under-expressed with supplemented N. Functional relationships between over-expressed genes were found. The main metabolic pathway represented among differentially expressed genes was amino acid metabolism. The 1000 bp upstream flanking regions of the differentially expressed genes were analysed and nine overrepresented motifs were found using three motif discovery algorithms (Seeder, Weeder and MEME). These results point to coordinated gene regulation at the transcriptional level controlling steady state potato responses to N sufficiency.

  10. Nitrate-induced genes in tomato roots. Array analysis reveals novel genes that may play a role in nitrogen nutrition.

    Science.gov (United States)

    Wang, Y H; Garvin, D F; Kochian, L V

    2001-09-01

    A subtractive tomato (Lycopersicon esculentum) root cDNA library enriched in genes up-regulated by changes in plant mineral status was screened with labeled mRNA from roots of both nitrate-induced and mineral nutrient-deficient (-nitrogen [N], -phosphorus, -potassium [K], -sulfur, -magnesium, -calcium, -iron, -zinc, and -copper) tomato plants. A subset of cDNAs was selected from this library based on mineral nutrient-related changes in expression. Additional cDNAs were selected from a second mineral-deficient tomato root library based on sequence homology to known genes. These selection processes yielded a set of 1,280 mineral nutrition-related cDNAs that were arrayed on nylon membranes for further analysis. These high-density arrays were hybridized with mRNA from tomato plants exposed to nitrate at different time points after N was withheld for 48 h, for plants that were grown on nitrate/ammonium for 5 weeks prior to the withholding of N. One hundred fifteen genes were found to be up-regulated by nitrate resupply. Among these genes were several previously identified as nitrate responsive, including nitrate transporters, nitrate and nitrite reductase, and metabolic enzymes such as transaldolase, transketolase, malate dehydrogenase, asparagine synthetase, and histidine decarboxylase. We also identified 14 novel nitrate-inducible genes, including: (a) water channels, (b) root phosphate and K⁺ transporters, (c) genes potentially involved in transcriptional regulation, (d) stress response genes, and (e) ribosomal protein genes. In addition, both families of nitrate transporters were also found to be inducible by phosphate, K, and iron deficiencies. The identification of these novel nitrate-inducible genes is providing avenues of research that will yield new insights into the molecular basis of plant N nutrition, as well as possible networking between the regulation of N, phosphorus, and K nutrition.

  11. Protein complex prediction in large ontology attributed protein-protein interaction networks.

    Science.gov (United States)

    Zhang, Yijia; Lin, Hongfei; Yang, Zhihao; Wang, Jian; Li, Yanpeng; Xu, Bo

    2013-01-01

    Protein complexes are important for unraveling the secrets of cellular organization and function. Many computational approaches have been developed to predict protein complexes in protein-protein interaction (PPI) networks. However, most existing approaches focus mainly on the topological structure of PPI networks, and largely ignore the gene ontology (GO) annotation information. In this paper, we constructed ontology attributed PPI networks with PPI data and GO resource. After constructing ontology attributed networks, we proposed a novel approach called CSO (clustering based on network structure and ontology attribute similarity). Structural information and GO attribute information are complementary in ontology attributed networks. CSO can effectively take advantage of the correlation between frequent GO annotation sets and the dense subgraph for protein complex prediction. Our proposed CSO approach was applied to four different yeast PPI data sets and predicted many well-known protein complexes. The experimental results showed that CSO was valuable in predicting protein complexes and achieved state-of-the-art performance.

  12. Toward semantic interoperability with linked foundational ontologies in ROMULUS

    CSIR Research Space (South Africa)

    Khan, ZC

    2013-06-01

    Full Text Available A purpose of a foundational ontology is to solve interoperability issues among ontologies. Many foundational ontologies have been developed, reintroducing the ontology interoperability problem. We address this with the new online foundational...

  13. Drug target ontology to classify and integrate drug discovery data.

    Science.gov (United States)

    Lin, Yu; Mehta, Saurabh; Küçük-McGinty, Hande; Turner, John Paul; Vidovic, Dusica; Forlin, Michele; Koleti, Amar; Nguyen, Dac-Trung; Jensen, Lars Juhl; Guha, Rajarshi; Mathias, Stephen L; Ursu, Oleg; Stathias, Vasileios; Duan, Jianbin; Nabizadeh, Nooshin; Chung, Caty; Mader, Christopher; Visser, Ubbo; Yang, Jeremy J; Bologa, Cristian G; Oprea, Tudor I; Schürer, Stephan C

    2017-11-09

    model for druggable targets including various related information such as protein, gene, protein domain, protein structure, binding site, small molecule drug, mechanism of action, protein tissue localization, disease association, and many other types of information. DTO will further facilitate the otherwise challenging integration and formal linking to biological assays, phenotypes, disease models, drug poly-pharmacology, binding kinetics and many other processes, functions and qualities that are at the core of drug discovery. The first version of DTO is publicly available via the website http://drugtargetontology.org/, GitHub (http://github.com/DrugTargetOntology/DTO), and the NCBO BioPortal (http://bioportal.bioontology.org/ontologies/DTO). The long-term goal of DTO is to provide such an integrative framework and to populate the ontology with this information as a community resource.

  14. Gene expression profile analysis of Ligon lintless-1 (Li1) mutant reveals important genes and pathways in cotton leaf and fiber development.

    Science.gov (United States)

    Ding, Mingquan; Jiang, Yurong; Cao, Yuefen; Lin, Lifeng; He, Shae; Zhou, Wei; Rong, Junkang

    2014-02-10

    Ligon lintless-1 (Li1) is a monogenic dominant mutant of Gossypium hirsutum (upland cotton) with a phenotype of impaired vegetative growth and short lint fibers. Despite years of research involving genetic mapping and gene expression profile analysis of Li1 mutant ovule tissues, the gene remains uncloned and the underlying pathway of cotton fiber elongation is still unclear. In this study, we report the whole genome-level deep-sequencing analysis of leaf tissues of the Li1 mutant. Differentially expressed genes in leaf tissues of mutant versus wild-type (WT) plants are identified, and the underlying pathways and potential genes that control leaf and fiber development are inferred. The results show that transcription factors AS2, YABBY5, and KANDI-like are significantly differentially expressed in mutant tissues compared with WT ones. Interestingly, several fiber development-related genes are found in the downregulated gene list of the mutant leaf transcriptome. These genes include heat shock protein family, cytoskeleton arrangement, cell wall synthesis, energy, H2O2 metabolism-related genes, and WRKY transcription factors. This finding suggests that the genes are involved in leaf morphology determination and fiber elongation. The expression data are also compared with the previously published microarray data of Li1 ovule tissues. Comparative analysis of the ovule transcriptomes of Li1 and WT reveals that a number of pathways important for fiber elongation are enriched in the downregulated gene list at different fiber development stages (0, 6, 9, 12, 15, 18dpa). Differentially expressed genes identified in both leaf and fiber samples are aligned with cotton whole genome sequences and combined with the genetic fine mapping results to identify a list of candidate genes for Li1. Copyright © 2013 Elsevier B.V. All rights reserved.

  15. Genomewide Analysis of Aryl Hydrocarbon Receptor Binding Targets Reveals an Extensive Array of Gene Clusters that Control Morphogenetic and Developmental Programs

    Science.gov (United States)

    Sartor, Maureen A.; Schnekenburger, Michael; Marlowe, Jennifer L.; Reichard, John F.; Wang, Ying; Fan, Yunxia; Ma, Ci; Karyala, Saikumar; Halbleib, Danielle; Liu, Xiangdong; Medvedovic, Mario; Puga, Alvaro

    2009-01-01

    Background: The vertebrate aryl hydrocarbon receptor (AHR) is a ligand-activated transcription factor that regulates cellular responses to environmental polycyclic and halogenated compounds. The naive receptor is believed to reside in an inactive cytosolic complex that translocates to the nucleus and induces transcription of xenobiotic detoxification genes after activation by ligand. Objectives: We conducted an integrative genomewide analysis of AHR gene targets in mouse hepatoma cells and determined whether AHR regulatory functions may take place in the absence of an exogenous ligand. Methods: The network of AHR-binding targets in the mouse genome was mapped through a multipronged approach involving chromatin immunoprecipitation/chip and global gene expression signatures. The findings were integrated into a prior functional knowledge base from Gene Ontology, interaction networks, Kyoto Encyclopedia of Genes and Genomes pathways, sequence motif analysis, and literature molecular concepts. Results: We found the naive receptor in unstimulated cells bound to an extensive array of gene clusters with functions in regulation of gene expression, differentiation, and pattern specification, connecting multiple morphogenetic and developmental programs. Activation by the ligand displaced the receptor from some of these targets toward sites in the promoters of xenobiotic metabolism genes. Conclusions: The vertebrate AHR appears to possess unsuspected regulatory functions that may be potential targets of environmental injury. PMID:19654925

  16. Embryonic stem cell-like features of testicular carcinoma in situ revealed by genome-wide gene expression profiling

    DEFF Research Database (Denmark)

    Almstrup, Kristian; Hoei-Hansen, Christina E; Wirkner, Ute

    2004-01-01

    Carcinoma in situ (CIS) is the common precursor of histologically heterogeneous testicular germ cell tumors (TGCTs), which in recent decades have markedly increased and now are the most common malignancy of young men. Using genome-wide gene expression profiling, we identified >200 genes highly... in their stoichiometry on progression into embryonic carcinoma. We compared the CIS expression profile with patterns reported in embryonic stem cells (ESCs), which revealed a substantial overlap that may be as high as 50%. We also demonstrated an over-representation of expressed genes in regions of 17q and 12, reported...

  17. Complex Topographic Feature Ontology Patterns

    Science.gov (United States)

    Varanka, Dalia E.; Jerris, Thomas J.

    2015-01-01

    Semantic ontologies are examined as effective data models for the representation of complex topographic feature types. Complex feature types are viewed as integrated relations between basic features for a basic purpose. In the context of topographic science, such component assemblages are supported by resource systems and found on the local landscape. Ontologies are organized within six thematic modules of a domain ontology called Topography that includes within its sphere basic feature types, resource systems, and landscape types. Context is constructed not only as a spatial and temporal setting, but a setting also based on environmental processes. Types of spatial relations that exist between components include location, generative processes, and description. An example is offered in a complex feature type ‘mine.’ The identification and extraction of complex feature types are an area for future research.

  18. Geographic Ontologies, Gazetteers and Multilingualism

    Directory of Open Access Journals (Sweden)

    Robert Laurini

    2015-01-01

    Full Text Available Different languages imply different visions of space, so that terminologies are different in geographic ontologies. In addition to their geometric shapes, geographic features have names, sometimes different in diverse languages. In addition, the role of gazetteers, as dictionaries of place names (toponyms), is to maintain relations between place names and locations. The scope of geographic information retrieval is to search for geographic information not against a database, but against the whole Internet; but the Internet stores information in different languages, and it is of paramount importance not to remain stuck with a single language. In this paper, our first step is to clarify the links between geographic objects as computer representations of geographic features, ontologies and gazetteers designed in various languages. Then, we propose some inference rules for matching not only types, but also relations in geographic ontologies with the assistance of gazetteers.

  19. Ontology Matching with Semantic Verification.

    Science.gov (United States)

    Jean-Mary, Yves R; Shironoshita, E Patrick; Kabuka, Mansur R

    2009-09-01

    ASMOV (Automated Semantic Matching of Ontologies with Verification) is a novel algorithm that uses lexical and structural characteristics of two ontologies to iteratively calculate a similarity measure between them, derives an alignment, and then verifies it to ensure that it does not contain semantic inconsistencies. In this paper, we describe the ASMOV algorithm, and then present experimental results that measure its accuracy using the OAEI 2008 tests, and that evaluate its use with two different thesauri: WordNet, and the Unified Medical Language System (UMLS). These results show the increased accuracy obtained by combining lexical, structural and extensional matchers with semantic verification, and demonstrate the advantage of using a domain-specific thesaurus for the alignment of specialized ontologies.

  20. Microarray analysis reveals key genes and pathways in Tetralogy of Fallot

    Science.gov (United States)

    He, Yue-E; Qiu, Hui-Xian; Jiang, Jian-Bing; Wu, Rong-Zhou; Xiang, Ru-Lian; Zhang, Yuan-Hai

    2017-01-01

    The aim of the present study was to identify key genes that may be involved in the pathogenesis of Tetralogy of Fallot (TOF) using bioinformatics methods. The GSE26125 microarray dataset, which includes cardiovascular tissue samples derived from 16 children with TOF and five healthy age-matched control infants, was downloaded from the Gene Expression Omnibus database. Differential expression analysis was performed between TOF and control samples to identify differentially expressed genes (DEGs) using Student's t-test and the R/limma package, with a log2 fold-change of >2 and a false discovery rate of <0.01 set as thresholds. The biological functions of DEGs were analyzed using the ToppGene database. The ReactomeFIViz application was used to construct functional interaction (FI) networks, and the genes in each module were subjected to pathway enrichment analysis. The iRegulon plugin was used to identify transcription factors predicted to regulate the DEGs in the FI network, and the gene-transcription factor pairs were then visualized using Cytoscape software. A total of 878 DEGs were identified, including 848 upregulated genes and 30 downregulated genes. The gene FI network contained seven function modules, all of which consisted of upregulated genes. Genes in Module 1 were enriched in the following three neurological disorder-associated signaling pathways: Parkinson's disease, Alzheimer's disease and Huntington's disease. Genes in Modules 0, 3 and 5 were predominantly enriched in pathways associated with ribosomes and protein translation. The X-box binding protein 1 transcription factor was demonstrated to be involved in the regulation of genes encoding the subunits of cytoplasmic and mitochondrial ribosomes, as well as genes involved in neurodegenerative disorders. Therefore, dysfunction of genes involved in signaling pathways associated with neurodegenerative disorders, ribosome function and protein translation may contribute to the pathogenesis of TOF.
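
The thresholding step described in this abstract (a log2 fold-change of >2 together with a false discovery rate of <0.01) can be sketched as follows. This is an illustrative sketch, not the study's code: the study used R/limma, whereas this sketch applies a plain Benjamini-Hochberg adjustment in Python. It also assumes the fold-change cutoff applies to the absolute value, since the abstract reports both up- and downregulated genes. The names `GeneStat`, `benjamini_hochberg`, `select_degs`, and the example genes are hypothetical.

```python
from typing import List, NamedTuple, Sequence, Tuple


class GeneStat(NamedTuple):
    gene: str        # gene identifier (hypothetical examples below)
    log2_fc: float   # log2 fold-change, case versus control
    p_value: float   # raw p-value from the per-gene test


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(stats: Sequence[GeneStat], lfc_cutoff: float = 2.0,
                fdr_cutoff: float = 0.01) -> Tuple[List[str], List[str]]:
    """Split genes passing both thresholds into (upregulated, downregulated)."""
    fdr = benjamini_hochberg([s.p_value for s in stats])
    up: List[str] = []
    down: List[str] = []
    for s, q in zip(stats, fdr):
        # Assumption: the fold-change cutoff is applied to |log2 FC|.
        if q < fdr_cutoff and abs(s.log2_fc) > lfc_cutoff:
            (up if s.log2_fc > 0 else down).append(s.gene)
    return up, down


if __name__ == "__main__":
    toy = [
        GeneStat("geneA", 3.1, 0.001),
        GeneStat("geneB", -2.5, 0.002),
        GeneStat("geneC", 4.0, 0.5),
        GeneStat("geneD", 2.2, 0.04),
    ]
    print(select_degs(toy))
```

Requiring both thresholds keeps genes whose change is large and statistically supported; a gene with a large fold-change but a weak adjusted p-value (geneC, geneD in the toy data) is excluded.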

  1. Candidate genes revealed by a genome scan for mosquito resistance to a bacterial insecticide: sequence and gene expression variations

    Directory of Open Access Journals (Sweden)

    David Jean-Philippe

    2009-11-01

    Full Text Available Abstract Background: Genome scans are becoming an increasingly popular approach to study the genetic basis of adaptation and speciation, but on their own, they are often unable to identify the specific gene(s) or mutation(s) targeted by selection. This shortcoming is hopefully bound to disappear in the near future, thanks to the wealth of new genomic resources that are currently being developed for many species. In this article, we provide a foretaste of this exciting new era by conducting a genome scan in the mosquito Aedes aegypti with the aim of looking for candidate genes involved in resistance to Bacillus thuringiensis subsp. israelensis (Bti) insecticidal toxins. Results: The genomes of a Bti-resistant and a Bti-susceptible strain were surveyed using about 500 MITE-based molecular markers, and the loci showing the highest inter-strain genetic differentiation were sequenced and mapped on the Aedes aegypti genome sequence. Several good candidate genes for Bti-resistance were identified in the vicinity of these highly differentiated markers. Two of them, coding for a cadherin and a leucine aminopeptidase, were further examined at the sequence and gene expression levels. In the resistant strain, the cadherin gene displayed patterns of nucleotide polymorphisms consistent with the action of positive selection (e.g. an excess of high compared to intermediate frequency mutations), as well as a significant under-expression compared to the susceptible strain. Conclusion: Both sequence and gene expression analyses agree to suggest a role for positive selection in the evolution of this cadherin gene in the resistant strain. However, it is unlikely that resistance to Bti is conferred by this gene alone, and further investigation will be needed to characterize other genes significantly associated with Bti resistance in Ae. aegypti. Beyond these results, this article illustrates how genome scans can build on the body of new genomic information (here, full

  2. Gene Expression Analysis Reveals New Possible Mechanisms of Vancomycin-Induced Nephrotoxicity and Identifies Gene Markers Candidates

    OpenAIRE

    Dieterich, Christine; Puey, Angela; Lyn, Sylvia; Swezey, Robert; Furimsky, Anna; Fairchild, David; Mirsalis, Jon C.; Ng, Hanna H.

    2008-01-01

    Vancomycin, one of few effective treatments against methicillin-resistant Staphylococcus aureus, is nephrotoxic. The goals of this study were to (1) gain insights into molecular mechanisms of nephrotoxicity at the genomic level, (2) evaluate gene markers of vancomycin-induced kidney injury, and (3) compare gene expression responses after iv and ip administration. Groups of six female BALB/c mice were treated with seven daily iv or ip doses of vancomycin (50, 200, and 400 mg/kg) or saline, and...

  3. Meta-analysis of differentiating mouse embryonic stem cell gene expression kinetics reveals early change of a small gene set.

    Directory of Open Access Journals (Sweden)

    Clive H Glover

    2006-11-01

    Full Text Available Stem cell differentiation involves critical changes in gene expression. Identification of these should provide endpoints useful for optimizing stem cell propagation as well as potential clues about mechanisms governing stem cell maintenance. Here we describe the results of a new meta-analysis methodology applied to multiple gene expression datasets from three mouse embryonic stem cell (ESC) lines obtained at specific time points during the course of their differentiation into various lineages. We developed methods to identify genes with expression changes that correlated with the altered frequency of functionally defined, undifferentiated ESC in culture. In each dataset, we computed a novel statistical confidence measure for every gene which captured the certainty that a particular gene exhibited an expression pattern of interest within that dataset. This permitted a joint analysis of the datasets, despite the different experimental designs. Using a ranking scheme that favored genes exhibiting patterns of interest, we focused on the top 88 genes whose expression was consistently changed when ESC were induced to differentiate. Seven of these (103728_at, 8430410A17Rik, Klf2, Nr0b1, Sox2, Tcl1, and Zfp42) showed a rapid decrease in expression concurrent with a decrease in frequency of undifferentiated cells and remained predictive when evaluated in additional maintenance and differentiating protocols. Through a novel meta-analysis, this study identifies a small set of genes whose expression is useful for identifying changes in stem cell frequencies in cultures of mouse ESC. The methods and findings have broader applicability to understanding the regulation of self-renewal of other stem cell types.

  4. Sequencing analysis reveals a unique gene organization in the gyrB region of Mycoplasma hominis

    DEFF Research Database (Denmark)

    Ladefoged, Søren; Christiansen, Gunna

    1994-01-01

    The homolog of the gyrB gene, which has been reported to be present in the vicinity of the initiation site of replication in bacteria, was mapped on the Mycoplasma hominis genome, and the region was subsequently sequenced. Five open reading frames were identified flanking the gyrB gene, one... of which showed similarity to that which encodes the LicA protein of Haemophilus influenzae. The organization of the genes in the region showed no resemblance to that in the corresponding regions of other bacteria sequenced so far. The gyrA gene was mapped 35 kb downstream from the gyrB gene.

  5. Gene expression profiling of mucolipidosis type IV fibroblasts reveals deregulation of genes with relevant functions in lysosome physiology.

    Science.gov (United States)

    Bozzato, Andrea; Barlati, Sergio; Borsani, Giuseppe

    2008-04-01

    Mucolipidosis type IV (MLIV, MIM 252650) is an autosomal recessive lysosomal storage disorder that causes mental and motor retardation as well as visual impairment. The lysosomal storage defect in MLIV is consistent with abnormalities of membrane traffic and organelle dynamics in the late endocytic pathway. MLIV is caused by mutations in the MCOLN1 gene, which codes for mucolipin-1 (MLN1), a member of the large family of transient receptor potential (TRP) cation channels. Although a number of studies have been performed on mucolipin-1, the pathological mechanisms underlying MLIV are not fully understood. To identify genes that characterize pathogenic changes in mucolipidosis type IV, we compared the expression profiles of three MLIV and three normal skin fibroblast cell lines using oligonucleotide microarrays. Genes that were differentially expressed in patients' cells were identified: 231 genes were up-regulated and 116 were down-regulated. Real-time RT-PCR performed on selected genes in six independent MLIV fibroblast cell lines was generally consistent with the microarray findings. This study revealed modulation at the transcriptional level of a discrete number of genes relevant to biological processes that are altered in the disease, such as endosome/lysosome trafficking, lysosome biogenesis, organelle acidification and lipid metabolism.

  6. An ontology approach to comparative phenomics in plants

    KAUST Repository

    Oellrich, Anika

    2015-02-25

    Background: Plant phenotype datasets include many different types of data, formats, and terms from specialized vocabularies. Because these datasets were designed for different audiences, they frequently contain language and details tailored to investigators with different research objectives and backgrounds. Although phenotype comparisons across datasets have long been possible on a small scale, comprehensive queries and analyses that span a broad set of reference species, research disciplines, and knowledge domains continue to be severely limited by the absence of a common semantic framework. Results: We developed a workflow to curate and standardize existing phenotype datasets for six plant species, encompassing both model species and crop plants with established genetic resources. Our effort focused on mutant phenotypes associated with genes of known sequence in Arabidopsis thaliana (L.) Heynh. (Arabidopsis), Zea mays L. subsp. mays (maize), Medicago truncatula Gaertn. (barrel medic or Medicago), Oryza sativa L. (rice), Glycine max (L.) Merr. (soybean), and Solanum lycopersicum L. (tomato). We applied the same ontologies, annotation standards, formats, and best practices across all six species, thereby ensuring that the shared dataset could be used for cross-species querying and semantic similarity analyses. Curated phenotypes were first converted into a common format using taxonomically broad ontologies such as the Plant Ontology, Gene Ontology, and Phenotype and Trait Ontology. We then compared ontology-based phenotypic descriptions with an existing classification system for plant phenotypes and evaluated our semantic similarity dataset for its ability to enhance predictions of gene families, protein functions, and shared metabolic pathways that underlie informative plant phenotypes. 
    Conclusions: The use of ontologies, annotation standards, shared formats, and best practices for cross-taxon phenotype data analyses represents a novel approach to plant phenomics

  7. An ontology approach to comparative phenomics in plants

    KAUST Repository

    Oellrich, Anika; Walls, Ramona L; Cannon, Ethalinda KS; Cannon, Steven B; Cooper, Laurel; Gardiner, Jack; Gkoutos, Georgios V; Harper, Lisa; He, Mingze; Hoehndorf, Robert; Jaiswal, Pankaj; Kalberer, Scott R; Lloyd, John P; Meinke, David; Menda, Naama; Moore, Laura; Nelson, Rex T; Pujar, Anuradha; Lawrence, Carolyn J; Huala, Eva

    2015-01-01

    Background: Plant phenotype datasets include many different types of data, formats, and terms from specialized vocabularies. Because these datasets were designed for different audiences, they frequently contain language and details tailored to investigators with different research objectives and backgrounds. Although phenotype comparisons across datasets have long been possible on a small scale, comprehensive queries and analyses that span a broad set of reference species, research disciplines, and knowledge domains continue to be severely limited by the absence of a common semantic framework. Results: We developed a workflow to curate and standardize existing phenotype datasets for six plant species, encompassing both model species and crop plants with established genetic resources. Our effort focused on mutant phenotypes associated with genes of known sequence in Arabidopsis thaliana (L.) Heynh. (Arabidopsis), Zea mays L. subsp. mays (maize), Medicago truncatula Gaertn. (barrel medic or Medicago), Oryza sativa L. (rice), Glycine max (L.) Merr. (soybean), and Solanum lycopersicum L. (tomato). We applied the same ontologies, annotation standards, formats, and best practices across all six species, thereby ensuring that the shared dataset could be used for cross-species querying and semantic similarity analyses. Curated phenotypes were first converted into a common format using taxonomically broad ontologies such as the Plant Ontology, Gene Ontology, and Phenotype and Trait Ontology. We then compared ontology-based phenotypic descriptions with an existing classification system for plant phenotypes and evaluated our semantic similarity dataset for its ability to enhance predictions of gene families, protein functions, and shared metabolic pathways that underlie informative plant phenotypes. 
    Conclusions: The use of ontologies, annotation standards, shared formats, and best practices for cross-taxon phenotype data analyses represents a novel approach to plant phenomics

  8. Linking human diseases to animal models using ontology-based phenotype annotation.

    Directory of Open Access Journals (Sweden)

    Nicole L Washington

    2009-11-01

    Full Text Available Scientists and clinicians who study genetic alterations and disease have traditionally described phenotypes in natural language. The considerable variation in these free-text descriptions has posed a hindrance to the important task of identifying candidate genes and models for human diseases and indicates the need for a computationally tractable method to mine data resources for mutant phenotypes. In this study, we tested the hypothesis that ontological annotation of disease phenotypes will facilitate the discovery of new genotype-phenotype relationships within and across species. To describe phenotypes using on