WorldWideScience

Sample records for facilitate gene discovery

  1. GWATCH: a web platform for automated gene association discovery analysis

    Science.gov (United States)

    2014-01-01

    Background: As genome-wide sequence analyses for complex human disease determinants are expanding, it is increasingly necessary to develop strategies to promote discovery and validation of potential disease-gene associations. Findings: Here we present a dynamic web-based platform – GWATCH – that automates and facilitates four steps in genetic epidemiological discovery: 1) Rapid gene association search and discovery analysis of large genome-wide datasets; 2) Expanded visual display of gene associations for genome-wide variants (SNPs, indels, CNVs), including Manhattan plots, 2D and 3D snapshots of any gene region, and a dynamic genome browser illustrating gene association chromosomal regions; 3) Real-time validation/replication of candidate or putative genes suggested from other sources, limiting Bonferroni genome-wide association study (GWAS) penalties; 4) Open data release and sharing by eliminating privacy constraints (The National Human Genome Research Institute (NHGRI) Institutional Review Board (IRB), informed consent, The Health Insurance Portability and Accountability Act (HIPAA) of 1996 etc.) on unabridged results, which allows for open access comparative and meta-analysis. Conclusions: GWATCH is suitable for both GWAS and whole genome sequence association datasets. We illustrate the utility of GWATCH with three large genome-wide association studies for HIV-AIDS resistance genes screened in large multicenter cohorts; however, association datasets from any study can be uploaded and analyzed by GWATCH. PMID:25374661
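
    The point about limiting Bonferroni penalties can be illustrated with a minimal sketch. It is not taken from GWATCH; the function name and the test counts (one million variants versus 50 candidate variants) are assumptions chosen only to show how the per-test threshold changes when fewer hypotheses are tested.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Hypothetical test counts, for illustration only (not from the GWATCH paper).
genome_wide = bonferroni_threshold(0.05, 1_000_000)  # scan of all variants
candidate = bonferroni_threshold(0.05, 50)           # replication of a short candidate list

print(f"genome-wide threshold: {genome_wide:.1e}")
print(f"candidate-list threshold: {candidate:.1e}")
```

    Under these assumed counts, a variant tested as part of a short candidate list has to clear a far less stringent threshold than the same variant in an unrestricted genome-wide scan.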

  2. Developing integrated crop knowledge networks to advance candidate gene discovery.

    Science.gov (United States)

    Hassani-Pak, Keywan; Castellote, Martin; Esch, Maria; Hindle, Matthew; Lysenko, Artem; Taubert, Jan; Rawlings, Christopher

    2016-12-01

    The chances of raising crop productivity to enhance global food security would be greatly improved if we had a complete understanding of all the biological mechanisms that underpinned traits such as crop yield, disease resistance or nutrient and water use efficiency. With more crop genomes emerging all the time, we are nearer having the basic information, at the gene-level, to begin assembling crop gene catalogues and using data from other plant species to understand how the genes function and how their interactions govern crop development and physiology. Unfortunately, the task of creating such a complete knowledge base of gene functions, interaction networks and trait biology is technically challenging because the relevant data are dispersed in myriad databases in a variety of data formats with variable quality and coverage. In this paper we present a general approach for building genome-scale knowledge networks that provide a unified representation of heterogeneous but interconnected datasets to enable effective knowledge mining and gene discovery. We describe the datasets and outline the methods, workflows and tools that we have developed for creating and visualising these networks for the major crop species, wheat and barley. We present the global characteristics of such knowledge networks and with an example linking a seed size phenotype to a barley WRKY transcription factor orthologous to TTG2 from Arabidopsis, we illustrate the value of integrated data in biological knowledge discovery. The software we have developed (www.ondex.org) and the knowledge resources (http://knetminer.rothamsted.ac.uk) we have created are all open-source and provide a first step towards systematic and evidence-based gene discovery in order to facilitate crop improvement.

  3. Knowledge Discovery in Biological Databases for Revealing Candidate Genes Linked to Complex Phenotypes.

    Science.gov (United States)

    Hassani-Pak, Keywan; Rawlings, Christopher

    2017-06-13

    Genetics and "omics" studies designed to uncover genotype to phenotype relationships often identify large numbers of potential candidate genes, among which the causal genes are hidden. Scientists generally lack the time and technical expertise to review all relevant information available from the literature, from key model species and from a potentially wide range of related biological databases in a variety of data formats with variable quality and coverage. Computational tools are needed for the integration and evaluation of heterogeneous information in order to prioritise candidate genes and components of interaction networks that, if perturbed through potential interventions, have a positive impact on the biological outcome in the whole organism without producing negative side effects. Here we review several bioinformatics tools and databases that play an important role in biological knowledge discovery and candidate gene prioritization. We conclude with several key challenges that need to be addressed in order to facilitate biological knowledge discovery in the future.

  4. Repurposed transcriptomic data facilitate discovery of innate immunity toll-like receptor (TLR) Genes across Lophotrochozoa.

    Science.gov (United States)

    Halanych, Kenneth M; Kocot, Kevin M

    2014-10-01

    The growing volume of genomic data from across life represents opportunities for deriving valuable biological information from data that were initially collected for another purpose. Here, we use transcriptomes collected for phylogenomic studies to search for toll-like receptor (TLR) genes in poorly sampled lophotrochozoan clades (Annelida, Mollusca, Brachiopoda, Phoronida, and Entoprocta) and one ecdysozoan clade (Priapulida). TLR genes are involved in innate immunity across animals by recognizing potential microbial infection. They have an extracellular leucine-rich repeat (LRR) domain connected to a transmembrane domain and an intracellular toll/interleukin-1 receptor (TIR) domain. Consequently, these genes are important in initiating a signaling pathway to trigger defense. We found at least one TLR ortholog in all but two taxa examined, suggesting that a broad array of lophotrochozoans may have innate immune systems similar to those observed in vertebrates and arthropods. Comparison to the SMART database confirmed the presence of both the LRR and the TIR protein motifs characteristic of TLR genes. Because we looked at only one transcriptome per species, discovery of TLR genes was limited for most taxa. However, several TLR-like genes that vary in the number and placement of LRR domains were found in phoronids. Additionally, several contigs contained LRR domains but lacked TIR domains, suggesting they were not TLRs. Many of these LRR-containing contigs had other domains (e.g., immunoglobulin) and are likely involved in innate immunity. © 2014 Marine Biological Laboratory.

  5. SSHscreen and SSHdb, generic software for microarray based gene discovery: application to the stress response in cowpea

    Directory of Open Access Journals (Sweden)

    Oelofse Dean

    2010-04-01

    Background: Suppression subtractive hybridization is a popular technique for gene discovery from non-model organisms without an annotated genome sequence, such as cowpea (Vigna unguiculata (L.) Walp.). We aimed to use this method to enrich for genes expressed during drought stress in a drought tolerant cowpea line. However, current methods were inefficient in screening libraries and management of the sequence data, and thus there was a need to develop software tools to facilitate the process. Results: Forward and reverse cDNA libraries enriched for cowpea drought response genes were screened on microarrays, and the R software package SSHscreen 2.0.1 was developed (i) to normalize the data effectively using spike-in control spot normalization, and (ii) to select clones for sequencing based on the calculation of enrichment ratios with associated statistics. Enrichment ratio 3 values for each clone showed that 62% of the forward library and 34% of the reverse library clones were significantly differentially expressed by drought stress (adjusted p value 88% of the clones in both libraries were derived from rare transcripts in the original tester samples, thus supporting the notion that suppression subtractive hybridization enriches for rare transcripts. A set of 118 clones were chosen for sequencing, and drought-induced cowpea genes were identified, the most interesting encoding a late embryogenesis abundant Lea5 protein, a glutathione S-transferase, a thaumatin, a universal stress protein, and a wound induced protein. A lipid transfer protein and several components of photosynthesis were down-regulated by the drought stress. Reverse transcriptase quantitative PCR confirmed the enrichment ratio values for the selected cowpea genes. SSHdb, a web-accessible database, was developed to manage the clone sequences and combine the SSHscreen data with sequence annotations derived from BLAST and Blast2GO. The self-BLAST function within SSHdb grouped

  6. A brief history of Alzheimer's disease gene discovery.

    Science.gov (United States)

    Tanzi, Rudolph E

    2013-01-01

    The rich and colorful history of gene discovery in Alzheimer's disease (AD) over the past three decades is as complex and heterogeneous as the disease itself. Twin and family studies indicate that genetic factors are estimated to play a role in at least 80% of AD cases. The inheritance of AD exhibits a dichotomous pattern. On one hand, rare mutations in APP, PSEN1, and PSEN2 are fully penetrant for early-onset (95%) late-onset AD. These four genes account for 30-50% of the inheritability of AD. Genome-wide association studies have recently led to the identification of additional highly confirmed AD candidate genes. Here, I review the past, present, and future of attempts to elucidate the complex and heterogeneous genetic underpinnings of AD along with some of the unique events that made these discoveries possible.

  7. Species-independent MicroRNA Gene Discovery

    KAUST Repository

    Kamanu, Timothy K.

    2012-12-01

    MicroRNA (miRNA) are a class of small endogenous non-coding RNA that are mainly negative transcriptional and post-transcriptional regulators in both plants and animals. Recent studies have shown that miRNA are involved in different types of cancer and other incurable diseases such as autism and Alzheimer’s. Functional miRNAs are excised from hairpin-like sequences that are known as miRNA genes. There are about 21,000 known miRNA genes, most of which have been determined using experimental methods. miRNA genes are classified into different groups (miRNA families). This study reports about 19,000 unknown miRNA genes in nine species whereby approximately 15,300 predictions were computationally validated to contain at least one experimentally verified functional miRNA product. The predictions are based on a novel computational strategy which relies on miRNA family groupings and exploits the physics and geometry of miRNA genes to unveil the hidden palindromic signals and symmetries in miRNA gene sequences. Unlike conventional computational miRNA gene discovery methods, the algorithm developed here is species-independent: it allows prediction at higher accuracy and resolution from arbitrary RNA/DNA sequences in any species and thus enables examination of repeat-prone genomic regions which are thought to be non-informative or ’junk’ sequences. The information non-redundancy of uni-directional RNA sequences compared to information redundancy of bi-directional DNA is demonstrated, a fact that is overlooked by most pattern discovery algorithms. A novel method for computing upstream and downstream miRNA gene boundaries based on mathematical/statistical functions is suggested, as well as cutoffs for annotation of miRNA genes in different miRNA families. Another tool is proposed to allow hypotheses generation and visualization of data matrices, intra- and inter-species chromosomal distribution of miRNA genes or miRNA families. Our results indicate that: miRNA and mi

  8. Biomarker discovery for colon cancer using a 761 gene RT-PCR assay

    Directory of Open Access Journals (Sweden)

    Hackett James R

    2007-08-01

    Background: Reverse transcription PCR (RT-PCR) is widely recognized to be the gold standard method for quantifying gene expression. Studies using RT-PCR technology as a discovery tool have historically been limited to relatively small gene sets compared to other gene expression platforms such as microarrays. We have recently shown that TaqMan® RT-PCR can be scaled up to profile expression for 192 genes in fixed paraffin-embedded (FPE) clinical study tumor specimens. This technology has also been used to develop and commercialize a widely used clinical test for breast cancer prognosis and prediction, the Oncotype DX™ assay. A similar need exists in colon cancer for a test that provides information on the likelihood of disease recurrence in colon cancer (prognosis) and the likelihood of tumor response to standard chemotherapy regimens (prediction). We have now scaled our RT-PCR assay to efficiently screen 761 biomarkers across hundreds of patient samples and applied this process to biomarker discovery in colon cancer. This screening strategy remains attractive due to the inherent advantages of maintaining platform consistency from discovery through clinical application. Results: RNA was extracted from formalin-fixed paraffin-embedded (FPE) tissue, as old as 28 years, from 354 patients enrolled in NSABP C-01 and C-02 colon cancer studies. Multiplexed reverse transcription reactions were performed using a gene specific primer pool containing 761 unique primers. PCR was performed as independent TaqMan® reactions for each candidate gene. Hierarchical clustering demonstrates that genes expected to co-express form obvious, distinct and in certain cases very tightly correlated clusters, validating the reliability of this technical approach to biomarker discovery. Conclusion: We have developed a high throughput, quantitatively precise multi-analyte gene expression platform for biomarker discovery that approaches low density DNA arrays in numbers of

  9. An Evaluation of Active Learning Causal Discovery Methods for Reverse-Engineering Local Causal Pathways of Gene Regulation

    Science.gov (United States)

    Ma, Sisi; Kemmeren, Patrick; Aliferis, Constantin F.; Statnikov, Alexander

    2016-01-01

    Reverse-engineering of causal pathways that implicate diseases and vital cellular functions is a fundamental problem in biomedicine. Discovery of the local causal pathway of a target variable (that consists of its direct causes and direct effects) is essential for effective intervention and can facilitate accurate diagnosis and prognosis. Recent research has provided several active learning methods that can leverage passively observed high-throughput data to draft causal pathways and then refine the inferred relations with a limited number of experiments. The current study provides a comprehensive evaluation of the performance of active learning methods for local causal pathway discovery in real biological data. Specifically, 54 active learning methods/variants from 3 families of algorithms were applied for local causal pathways reconstruction of gene regulation for 5 transcription factors in S. cerevisiae. Four aspects of the methods’ performance were assessed, including adjacency discovery quality, edge orientation accuracy, complete pathway discovery quality, and experimental cost. The results of this study show that some methods provide significant performance benefits over others and therefore should be routinely used for local causal pathway discovery tasks. This study also demonstrates the feasibility of local causal pathway reconstruction in real biological systems with significant quality and low experimental cost. PMID:26939894

  10. Gene set-based module discovery in the breast cancer transcriptome

    Directory of Open Access Journals (Sweden)

    Zhang Michael Q

    2009-02-01

    Background: Although microarray-based studies have revealed a global view of gene expression in cancer cells, we still have little knowledge about the regulatory mechanisms underlying the transcriptome. Several computational methods applied to yeast data have recently succeeded in identifying expression modules, which are defined as co-expressed gene sets under common regulatory mechanisms. However, such module discovery methods have not been applied to cancer transcriptome data. Results: In order to decode oncogenic regulatory programs in cancer cells, we developed a novel module discovery method termed EEM by extending a previously reported module discovery method, and applied it to breast cancer expression data. Starting from seed gene sets prepared based on cis-regulatory elements, ChIP-chip data, and gene locus information, EEM identified 10 principal expression modules in breast cancer based on their expression coherence. Moreover, EEM depicted their activity profiles, which predict regulatory programs in each subtype of breast tumors. For example, our analysis revealed that the expression module regulated by the Polycomb repressive complex 2 (PRC2) is downregulated in triple negative breast cancers, suggesting similarity of transcriptional programs between stem cells and aggressive breast cancer cells. We also found that the activity of the PRC2 expression module is negatively correlated with the expression of EZH2, a component of PRC2 which belongs to the E2F expression module. E2F-driven EZH2 overexpression may be responsible for the repression of the PRC2 expression modules in triple negative tumors. Furthermore, our network analysis predicts regulatory circuits in breast cancer cells. Conclusion: These results demonstrate that the gene set-based module discovery approach is a powerful tool to decode regulatory programs in cancer cells.

  11. Discovery of Cationic Polymers for Non-viral Gene Delivery using Combinatorial Approaches

    Science.gov (United States)

    Barua, Sutapa; Ramos, James; Potta, Thrimoorthy; Taylor, David; Huang, Huang-Chiao; Montanez, Gabriela; Rege, Kaushal

    2015-01-01

    Gene therapy is an attractive treatment option for diseases of genetic origin, including several cancers and cardiovascular diseases. While viruses are effective vectors for delivering exogenous genes to cells, concerns related to insertional mutagenesis, immunogenicity, lack of tropism, decay and high production costs necessitate the discovery of non-viral methods. Significant efforts have been focused on cationic polymers as non-viral alternatives for gene delivery. Recent studies have employed combinatorial syntheses and parallel screening methods for enhancing the efficacy of gene delivery, biocompatibility of the delivery vehicle, and overcoming cellular level barriers as they relate to polymer-mediated transgene uptake, transport, transcription, and expression. This review summarizes and discusses recent advances in combinatorial syntheses and parallel screening of cationic polymer libraries for the discovery of efficient and safe gene delivery systems. PMID:21843141

  12. Comprehensive Clinical Phenotyping and Genetic Mapping for the Discovery of Autism Susceptibility Genes

    Science.gov (United States)

    2013-03-14

    behavioral teaching strategies and best practice for teaching students with autism spectrum disorders 4.52 Learn strategies for incorporating IEP goals... AFRL-SA-WP-TR-2013-0013 Comprehensive Clinical Phenotyping and Genetic Mapping for the Discovery of Autism Susceptibility Genes...

  13. Improving functional modules discovery by enriching interaction networks with gene profiles

    KAUST Repository

    Salem, Saeed

    2013-05-01

    Recent advances in proteomic and transcriptomic technologies have resulted in the accumulation of vast amounts of high-throughput data that span multiple biological processes and characteristics in different organisms. Much of the data come in the form of interaction networks and mRNA expression arrays. An important task in systems biology is functional module discovery, where the goal is to uncover well-connected sub-networks (modules). These discovered modules help to unravel the underlying mechanisms of the observed biological processes. While most of the existing module discovery methods use only the interaction data, in this work we propose CLARM, which discovers biological modules by incorporating gene profile data with protein-protein interaction networks. We demonstrate the effectiveness of CLARM on Yeast and Human interaction datasets, and gene expression and molecular function profiles. Experiments on these real datasets show that the CLARM approach is competitive with well-established functional module discovery methods.

  14. Cross-pollination of research findings, although uncommon, may accelerate discovery of human disease genes

    Directory of Open Access Journals (Sweden)

    Duda Marlena

    2012-11-01

    Background: Technological leaps in genome sequencing have resulted in a surge in discovery of human disease genes. These discoveries have led to increased clarity on the molecular pathology of disease and have also demonstrated considerable overlap in the genetic roots of human diseases. In light of this large genetic overlap, we tested whether cross-disease research approaches lead to faster, more impactful discoveries. Methods: We leveraged several gene-disease association databases to calculate a Mutual Citation Score (MCS) for 10,853 pairs of genetically related diseases to measure the frequency of cross-citation between research fields. To assess the importance of cooperative research, we computed an Individual Disease Cooperation Score (ICS) and the average publication rate for each disease. Results: For all disease pairs with one gene in common, we found that the degree of genetic overlap was a poor predictor of cooperation (r² = 0.3198) and that the vast majority of disease pairs (89.56%) never cited previous discoveries of the same gene in a different disease, irrespective of the level of genetic similarity between the diseases. A fraction (0.25%) of the pairs demonstrated cross-citation in greater than 5% of their published genetic discoveries and 0.037% cross-referenced discoveries more than 10% of the time. We found strong positive correlations between ICS and publication rate (r² = 0.7931), and an even stronger correlation between the publication rate and the number of cross-referenced diseases (r² = 0.8585). These results suggested that cross-disease research may have the potential to yield novel discoveries at a faster pace than singular disease research. Conclusions: Our findings suggest that the frequency of cross-disease study is low despite the high level of genetic similarity among many human diseases, and that collaborative methods may accelerate and increase the impact of new genetic discoveries. Until we have a better

  15. Biomarker Gene Signature Discovery Integrating Network Knowledge

    Directory of Open Access Journals (Sweden)

    Holger Fröhlich

    2012-02-01

    Discovery of prognostic and diagnostic biomarker gene signatures for diseases, such as cancer, is seen as a major step towards better personalized medicine. During the last decade various methods, mainly coming from the machine learning or statistical domain, have been proposed for that purpose. However, one important obstacle to making gene signatures a standard tool in clinical diagnosis is the typically low reproducibility of these signatures combined with the difficulty of achieving a clear biological interpretation. For that reason, in recent years there has been a growing interest in approaches that try to integrate information from molecular interaction networks. Here we review the current state of research in this field by giving an overview of the approaches proposed so far.

  16. Generation of comprehensive transposon insertion mutant library for the model archaeon, Haloferax volcanii, and its use for gene discovery.

    Science.gov (United States)

    Kiljunen, Saija; Pajunen, Maria I; Dilks, Kieran; Storf, Stefanie; Pohlschroder, Mechthild; Savilahti, Harri

    2014-12-09

    Archaea share fundamental properties with bacteria and eukaryotes. Yet, they also possess unique attributes, which largely remain poorly characterized. Haloferax volcanii is an aerobic, moderately halophilic archaeon that can be grown in defined media. It serves as an excellent archaeal model organism to study the molecular mechanisms of biological processes and cellular responses to changes in the environment. Studies on haloarchaea have been impeded by the lack of efficient genetic screens that would facilitate the identification of protein functions and respective metabolic pathways. Here, we devised an insertion mutagenesis strategy that combined Mu in vitro DNA transposition and homologous-recombination-based gene targeting in H. volcanii. We generated an insertion mutant library, in which the clones contained a single genomic insertion. From the library, we isolated pigmentation-defective and auxotrophic mutants, and the respective insertions pinpointed a number of genes previously known to be involved in carotenoid and amino acid biosynthesis pathways, thus validating the performance of the methodologies used. We also identified mutants that had a transposon insertion in a gene encoding a protein of unknown or putative function, demonstrating that novel roles for non-annotated genes could be assigned. We have generated, for the first time, a random genomic insertion mutant library for a halophilic archaeon and used it for efficient gene discovery. The library will facilitate the identification of non-essential genes behind any specific biochemical pathway. It represents a significant step towards achieving a more complete understanding of the unique characteristics of halophilic archaea.

  17. iSyTE 2.0: a database for expression-based gene discovery in the eye

    Science.gov (United States)

    Kakrana, Atul; Yang, Andrian; Anand, Deepti; Djordjevic, Djordje; Ramachandruni, Deepti; Singh, Abhyudai; Huang, Hongzhan

    2018-01-01

    Although successful in identifying new cataract-linked genes, the previous version of the database iSyTE (integrated Systems Tool for Eye gene discovery) was based on expression information on just three mouse lens stages and was functionally limited to visualization by only UCSC-Genome Browser tracks. To increase its efficacy, here we provide an enhanced iSyTE version 2.0 (URL: http://research.bioinformatics.udel.edu/iSyTE) based on well-curated, comprehensive genome-level lens expression data as a one-stop portal for the effective visualization and analysis of candidate genes in lens development and disease. iSyTE 2.0 includes all publicly available lens Affymetrix and Illumina microarray datasets representing a broad range of embryonic and postnatal stages from wild-type and specific gene-perturbation mouse mutants with eye defects. Further, we developed a new user-friendly web interface for direct access and cogent visualization of the curated expression data, which supports convenient searches and a range of downstream analyses. The utility of these new iSyTE 2.0 features is illustrated through examples of established genes associated with lens development and pathobiology, which serve as tutorials for its application by the end-user. iSyTE 2.0 will facilitate the prioritization of eye development and disease-linked candidate genes in studies involving transcriptomics or next-generation sequencing data, linkage analysis and GWAS approaches. PMID:29036527

  18. Use of Heuristics to Facilitate Scientific Discovery Learning in a Simulation Learning Environment in a Physics Domain

    Science.gov (United States)

    Veermans, Koen; van Joolingen, Wouter; de Jong, Ton

    2006-01-01

    This article describes a study into the role of heuristic support in facilitating discovery learning through simulation-based learning. The study compares the use of two such learning environments in the physics domain of collisions. In one learning environment (implicit heuristics) heuristics are only used to provide the learner with guidance…

  19. Computational and Experimental Approaches to Cancer Biomarker Discovery

    DEFF Research Database (Denmark)

    Krzystanek, Marcin

    of a patient’s response to a particular treatment, thus helping to avoid unnecessary treatment and unwanted side effects in non-responding individuals. Currently biomarker discovery is facilitated by recent advances in high-throughput technologies when association between a given biological phenotype...... and the state or level of a large number of molecular entities is investigated. Such associative analysis could be confounded by several factors, leading to false discoveries. For example, it is assumed that with the exception of the true biomarkers most molecular entities such as gene expression levels show...... random distribution in a given cohort. However, gene expression levels may also be affected by technical bias when the actual measurement technology or sample handling may introduce a systematic error. If the distribution of systematic errors correlates with the biological phenotype then the risk...

  20. Watershed and Economic Data InterOperability (WEDO): Facilitating Discovery, Evaluation and Integration through the Sharing of Watershed Modeling Data

    Science.gov (United States)

    Watershed and Economic Data InterOperability (WEDO) is a system of information technologies designed to publish watershed modeling studies for reuse. WEDO facilitates three aspects of interoperability: discovery, evaluation and integration of data. This increased level of interop...

  1. Alternative Polyadenylation Patterns for Novel Gene Discovery and Classification in Cancer

    Directory of Open Access Journals (Sweden)

    Oguzhan Begik

    2017-07-01

    Certain aspects of diagnosis, prognosis, and treatment of cancer patients are still important challenges to be addressed. Therefore, we propose a pipeline to uncover patterns of alternative polyadenylation (APA), a hidden complexity in cancer transcriptomes, to further accelerate efforts to discover novel cancer genes and pathways. Here, we analyzed expression data for 1045 cancer patients and found a significant shift in usage of poly(A) signals in common tumor types (breast, colon, lung, prostate, gastric, and ovarian) compared to normal tissues. Using machine-learning techniques, we further defined specific subsets of APA events to efficiently classify cancer types. Furthermore, APA patterns were associated with altered protein levels in patients, revealed by antibody-based profiling data, suggesting functional significance. Overall, our study offers a computational approach for use of APA in novel gene discovery and classification in common tumor types, with important implications in basic research, biomarker discovery, and precision medicine approaches.

  2. SNP discovery in candidate adaptive genes using exon capture in a free-ranging alpine ungulate

    Science.gov (United States)

    Roffler, Gretchen H.; Amish, Stephen J.; Smith, Seth; Cosart, Ted F.; Kardos, Marty; Schwartz, Michael K.; Luikart, Gordon

    2016-01-01

    Identification of genes underlying genomic signatures of natural selection is key to understanding adaptation to local conditions. We used targeted resequencing to identify SNP markers in 5321 candidate adaptive genes associated with known immunological, metabolic and growth functions in ovids and other ungulates. We selectively targeted 8161 exons in protein-coding and nearby 5′ and 3′ untranslated regions of chosen candidate genes. Targeted sequences were taken from bighorn sheep (Ovis canadensis) exon capture data and directly from the domestic sheep genome (Ovis aries v. 3; oviAri3). The bighorn sheep sequences used in the Dall's sheep (Ovis dalli dalli) exon capture aligned to 2350 genes on the oviAri3 genome with an average of 2 exons each. We developed a microfluidic qPCR-based SNP chip to genotype 476 Dall's sheep from locations across their range and test for patterns of selection. Using multiple corroborating approaches (lositan and bayescan), we detected 28 SNP loci potentially under selection. We additionally identified candidate loci significantly associated with latitude, longitude, precipitation and temperature, suggesting local environmental adaptation. The three methods demonstrated consistent support for natural selection on nine genes with immune and disease-regulating functions (e.g. Ovar-DRA, APC, BATF2, MAGEB18), cell regulation signalling pathways (e.g. KRIT1, PI3K, ORRC3), and respiratory health (CYSLTR1). Characterizing adaptive allele distributions from novel genetic techniques will facilitate investigation of the influence of environmental variation on local adaptation of a northern alpine ungulate throughout its range. This research demonstrated the utility of exon capture for gene-targeted SNP discovery and subsequent SNP chip genotyping using low-quality samples in a nonmodel species.

  3. Discovery of cancer common and specific driver gene sets

    Science.gov (United States)

    2017-01-01

    Cancer is known as a disease mainly caused by gene alterations. Discovery of mutated driver pathways or gene sets is becoming an important step to understand molecular mechanisms of carcinogenesis. However, systematically investigating commonalities and specificities of driver gene sets among multiple cancer types is still a great challenge, but this investigation will undoubtedly benefit deciphering cancers and will be helpful for personalized therapy and precision medicine in cancer treatment. In this study, we propose two optimization models to de novo discover common driver gene sets among multiple cancer types (ComMDP) and specific driver gene sets of one certain or multiple cancer types to other cancers (SpeMDP), respectively. We first apply ComMDP and SpeMDP to simulated data to validate their efficiency. Then, we further apply these methods to 12 cancer types from The Cancer Genome Atlas (TCGA) and obtain several biologically meaningful driver pathways. As examples, we construct a common cancer pathway model for BRCA and OV, infer a complex driver pathway model for BRCA carcinogenesis based on common driver gene sets of BRCA with eight cancer types, and investigate specific driver pathways of the liquid cancer lymphoblastic acute myeloid leukemia (LAML) versus other solid cancer types. In these processes more candidate cancer genes are also found. PMID:28168295

  4. Genomics-Based Discovery of Plant Genes for Synthetic Biology of Terpenoid Fragrances: A Case Study in Sandalwood oil Biosynthesis.

    Science.gov (United States)

    Celedon, J M; Bohlmann, J

    2016-01-01

    Terpenoid fragrances are powerful mediators of ecological interactions in nature and have a long history of traditional and modern industrial applications. Plants produce a great diversity of fragrant terpenoid metabolites, which make them a superb source of biosynthetic genes and enzymes. Advances in fragrance gene discovery have enabled new approaches in synthetic biology of high-value speciality molecules toward applications in the fragrance and flavor, food and beverage, cosmetics, and other industries. Rapid developments in transcriptome and genome sequencing of nonmodel plant species have accelerated the discovery of fragrance biosynthetic pathways. In parallel, advances in metabolic engineering of microbial and plant systems have established platforms for synthetic biology applications of some of the thousands of plant genes that underlie fragrance diversity. While many fragrance molecules (eg, simple monoterpenes) are abundant in readily renewable plant materials, some highly valuable fragrant terpenoids (eg, santalols, ambroxides) are rare in nature and interesting targets for synthetic biology. As a representative example for genomics/transcriptomics enabled gene and enzyme discovery, we describe a strategy used successfully for elucidation of a complete fragrance biosynthetic pathway in sandalwood (Santalum album) and its reconstruction in yeast (Saccharomyces cerevisiae). We address questions related to the discovery of specific genes within large gene families and recovery of rare gene transcripts that are selectively expressed in recalcitrant tissues. To substantiate the validity of the approaches, we describe the combination of methods used in the gene and enzyme discovery of a cytochrome P450 in the fragrant heartwood of tropical sandalwood, responsible for the fragrance defining, final step in the biosynthesis of (Z)-santalols. © 2016 Elsevier Inc. All rights reserved.

  5. Knowledge-based analysis of microarrays for the discovery of transcriptional regulation relationships.

    Science.gov (United States)

    Seok, Junhee; Kaushal, Amit; Davis, Ronald W; Xiao, Wenzhong

    2010-01-18

    The large amount of high-throughput genomic data has facilitated the discovery of the regulatory relationships between transcription factors and their target genes. While early methods for discovery of transcriptional regulation relationships from microarray data often focused on the high-throughput experimental data alone, more recent approaches have explored the integration of external knowledge bases of gene interactions. In this work, we develop an algorithm that provides improved performance in the prediction of transcriptional regulatory relationships by supplementing the analysis of microarray data with a new method of integrating information from an existing knowledge base. Using a well-known dataset of yeast microarrays and the Yeast Proteome Database, a comprehensive collection of known information of yeast genes, we show that knowledge-based predictions demonstrate better sensitivity and specificity in inferring new transcriptional interactions than predictions from microarray data alone. We also show that comprehensive, direct and high-quality knowledge bases provide better prediction performance. Comparison of our results with ChIP-chip data and growth fitness data suggests that our predicted genome-wide regulatory pairs in yeast are reasonable candidates for follow-up biological verification. High quality, comprehensive, and direct knowledge bases, when combined with appropriate bioinformatic algorithms, can significantly improve the discovery of gene regulatory relationships from high throughput gene expression data.

  6. MAGIC Database and Interfaces: An Integrated Package for Gene Discovery and Expression

    Directory of Open Access Journals (Sweden)

    Lee H. Pratt

    2006-03-01

    Full Text Available The rapidly increasing rate at which biological data is being produced requires a corresponding growth in relational databases and associated tools that can help laboratories contend with that data. With this need in mind, we describe here a Modular Approach to a Genomic, Integrated and Comprehensive (MAGIC) Database. This Oracle 9i database derives from an initial focus in our laboratory on gene discovery via production and analysis of expressed sequence tags (ESTs), and subsequently on gene expression as assessed by both EST clustering and microarrays. The MAGIC Gene Discovery portion of the database focuses on information derived from DNA sequences and on its biological relevance. In addition to MAGIC SEQ-LIMS, which is designed to support activities in the laboratory, it contains several additional subschemas. The latter include MAGIC Admin for database administration, MAGIC Sequence for sequence processing as well as sequence and clone attributes, MAGIC Cluster for the results of EST clustering, MAGIC Polymorphism in support of microsatellite and single-nucleotide-polymorphism discovery, and MAGIC Annotation for electronic annotation by BLAST and BLAT. The MAGIC Microarray portion is a MIAME-compliant database with two components at present. These are MAGIC Array-LIMS, which makes possible remote entry of all information into the database, and MAGIC Array Analysis, which provides data mining and visualization. Because all aspects of interaction with the MAGIC Database are via a web browser, it is ideally suited not only for individual research laboratories but also for core facilities that serve clients at any distance.

  7. Usability of Discovery Portals

    NARCIS (Netherlands)

    Bulens, J.D.; Vullings, L.A.E.; Houtkamp, J.M.; Vanmeulebrouk, B.

    2013-01-01

    As INSPIRE progresses to be implemented in the EU, many new discovery portals are built to facilitate finding spatial data. Currently the structure of the discovery portals is determined by the way spatial data experts like to work. However, we argue that the main target group for discovery portals

  8. Gene Fusion Markup Language: a prototype for exchanging gene fusion data.

    Science.gov (United States)

    Kalyana-Sundaram, Shanker; Shanmugam, Achiraman; Chinnaiyan, Arul M

    2012-10-16

    An avalanche of next generation sequencing (NGS) studies has generated an unprecedented amount of genomic structural variation data. These studies have also identified many novel gene fusion candidates with more detailed resolution than previously achieved. However, in the excitement and necessity of publishing the observations from this recently developed cutting-edge technology, no community standardization approach has arisen to organize and represent the data with the essential attributes in an interchangeable manner. As transcriptome studies have been widely used for gene fusion discoveries, the current non-standard mode of data representation could potentially impede data accessibility, critical analyses, and further discoveries in the near future. Here we propose a prototype, Gene Fusion Markup Language (GFML) as an initiative to provide a standard format for organizing and representing the significant features of gene fusion data. GFML will offer the advantage of representing the data in a machine-readable format to enable data exchange, automated analysis interpretation, and independent verification. As this database-independent exchange initiative evolves it will further facilitate the formation of related databases, repositories, and analysis tools. The GFML prototype is made available at http://code.google.com/p/gfml-prototype/. The Gene Fusion Markup Language (GFML) presented here could facilitate the development of a standard format for organizing, integrating and representing the significant features of gene fusion data in an inter-operable and query-able fashion that will enable biologically intuitive access to gene fusion findings and expedite functional characterization. A similar model is envisaged for other NGS data analyses.

  9. TET1-Mediated Hydroxymethylation Facilitates Hypoxic Gene Induction in Neuroblastoma

    Directory of Open Access Journals (Sweden)

    Christopher J. Mariani

    2014-06-01

    Full Text Available The ten-eleven-translocation 5-methylcytosine dioxygenase (TET) family of enzymes catalyzes the conversion of 5-methylcytosine (5-mC) to 5-hydroxymethylcytosine (5-hmC), a modified cytosine base that facilitates gene expression. Cells respond to hypoxia by inducing a transcriptional program regulated in part by oxygen-dependent dioxygenases that require Fe(II) and α-ketoglutarate. Given that the TET enzymes also require these cofactors, we hypothesized that the TETs regulate the hypoxia-induced transcriptional program. Here, we demonstrate that hypoxia increases global 5-hmC levels, with accumulation of 5-hmC density at canonical hypoxia response genes. A subset of 5-hmC gains colocalize with hypoxia response elements facilitating DNA demethylation and HIF binding. Hypoxia results in transcriptional activation of TET1, and full induction of hypoxia-responsive genes and global 5-hmC increases require TET1. Finally, we show that 5-hmC increases and TET1 upregulation in hypoxia are HIF-1 dependent. These findings establish TET1-mediated 5-hmC changes as an important epigenetic component of the hypoxic response.

  10. Culture-independent discovery of natural products from soil metagenomes.

    Science.gov (United States)

    Katz, Micah; Hover, Bradley M; Brady, Sean F

    2016-03-01

    Bacterial natural products have proven to be invaluable starting points in the development of many currently used therapeutic agents. Unfortunately, traditional culture-based methods for natural product discovery have been deemphasized by pharmaceutical companies due in large part to high rediscovery rates. Culture-independent, or "metagenomic," methods, which rely on the heterologous expression of DNA extracted directly from environmental samples (eDNA), have the potential to provide access to metabolites encoded by a large fraction of the earth's microbial biosynthetic diversity. As soil is both ubiquitous and rich in bacterial diversity, it is an appealing starting point for culture-independent natural product discovery efforts. This review provides an overview of the history of soil metagenome-driven natural product discovery studies and elaborates on the recent development of new tools for sequence-based, high-throughput profiling of environmental samples used in discovering novel natural product biosynthetic gene clusters. We conclude with several examples of these new tools being employed to facilitate the recovery of novel secondary metabolite encoding gene clusters from soil metagenomes and the subsequent heterologous expression of these clusters to produce bioactive small molecules.

  11. IMG-ABC: new features for bacterial secondary metabolism analysis and targeted biosynthetic gene cluster discovery in thousands of microbial genomes.

    Science.gov (United States)

    Hadjithomas, Michalis; Chen, I-Min A; Chu, Ken; Huang, Jinghua; Ratner, Anna; Palaniappan, Krishna; Andersen, Evan; Markowitz, Victor; Kyrpides, Nikos C; Ivanova, Natalia N

    2017-01-04

    Secondary metabolites produced by microbes have diverse biological functions, which makes them a great potential source of biotechnologically relevant compounds with antimicrobial, anti-cancer and other activities. The proteins needed to synthesize these natural products are often encoded by clusters of co-located genes called biosynthetic gene clusters (BCs). In order to advance the exploration of microbial secondary metabolism, we developed the largest publicly available database of experimentally verified and predicted BCs, the Integrated Microbial Genomes Atlas of Biosynthetic gene Clusters (IMG-ABC) (https://img.jgi.doe.gov/abc/). Here, we describe an update of IMG-ABC, which includes ClusterScout, a tool for targeted identification of custom biosynthetic gene clusters across 40 000 isolate microbial genomes, and a new search capability to query more than 700 000 BCs from isolate genomes for clusters with similar Pfam composition. Additional features enable fast exploration and analysis of BCs through two new interactive visualization features, a BC function heatmap and a BC similarity network graph. These new tools and features add to the value of IMG-ABC's vast body of BC data, facilitating their in-depth analysis and accelerating secondary metabolite discovery. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  12. Facilitating functional annotation of chicken microarray data

    Directory of Open Access Journals (Sweden)

    Gresham Cathy R

    2009-10-01

    Full Text Available Background: Modeling results from chicken microarray studies is challenging for researchers because little functional annotation is associated with these arrays. The Affymetrix GeneChip chicken genome array, one of the biggest arrays that serve as a key research tool for the study of chicken functional genomics, is among the few arrays that link gene products to Gene Ontology (GO). However, the GO annotation data presented by Affymetrix are incomplete; for example, they do not show references linked to manually annotated functions. In addition, there is no tool that lets microarray researchers directly retrieve functional annotations for their datasets from the annotated arrays. This costs researchers time spent searching multiple GO databases for functional information. Results: We have improved the breadth of functional annotations of the gene products associated with probesets on the Affymetrix chicken genome array by 45% and the quality of annotation by 14%. We have also identified the most significant diseases and disorders, different types of genes, and known drug targets represented on the Affymetrix chicken genome array. To facilitate functional annotation of other arrays and microarray experimental datasets, we developed an Array GO Mapper (AGOM) tool to help researchers quickly retrieve corresponding functional information for their dataset. Conclusion: Results from this study will directly facilitate annotation of other chicken arrays and microarray experimental datasets. Researchers will be able to quickly model their microarray dataset into more reliable biological functional information by using the AGOM tool. The diseases, disorders, gene types and drug targets revealed in the study will allow researchers to learn more about how genes function in complex biological systems and may lead to new drug discovery and development of therapies. The GO annotation data generated will be available for public use via AgBase website and

  13. Recombination facilitates neofunctionalization of duplicate genes via originalization

    Directory of Open Access Journals (Sweden)

    Huang Ren

    2010-06-01

    Full Text Available Background: Recently, originalization was proposed to be an effective way of duplicate-gene preservation, in which recombination provokes a high frequency of the original (or wild-type) allele on both duplicated loci. Because a high frequency of the wild-type allele might drive the arising and accumulating of advantageous mutations, it is hypothesized that recombination might enlarge the probability of neofunctionalization (Pneo) of duplicate genes. In this article this hypothesis has been tested theoretically. Results: Results show that through originalization, recombination might not only shorten the mean time to neofunctionalization, but also enlarge Pneo. Conclusions: Therefore, recombination might facilitate neofunctionalization via originalization. Several extensive applications of these results on genomic evolution have been discussed: 1. Time to nonfunctionalization can be much longer than the few million generations expected before; 2. Homogenization on duplicated loci results from not only gene conversion, but also originalization; 3. Although the rate of advantageous mutation is much smaller than that of degenerative mutation, Pneo cannot be expected to be small.

  14. Usability of Discovery Portals

    OpenAIRE

    Bulens, J.D.; Vullings, L.A.E.; Houtkamp, J.M.; Vanmeulebrouk, B.

    2013-01-01

    As INSPIRE progresses to be implemented in the EU, many new discovery portals are built to facilitate finding spatial data. Currently the structure of the discovery portals is determined by the way spatial data experts like to work. However, we argue that the main target group for discovery portals are not spatial data experts but professionals with limited spatial knowledge, and a focus outside the spatial domain. An exploratory usability experiment was carried out in which three discovery p...

  15. Classification of genes and putative biomarker identification using distribution metrics on expression profiles.

    Directory of Open Access Journals (Sweden)

    Hung-Chung Huang

    Full Text Available BACKGROUND: Identification of genes with switch-like properties will facilitate discovery of regulatory mechanisms that underlie these properties, and will provide knowledge for the appropriate application of Boolean networks in gene regulatory models. As switch-like behavior is likely associated with tissue-specific expression, these gene products are expected to be plausible candidates as tissue-specific biomarkers. METHODOLOGY/PRINCIPAL FINDINGS: In a systematic classification of genes and search for biomarkers, gene expression profiles (GEPs) of more than 16,000 genes from 2,145 mouse array samples were analyzed. Four distribution metrics (mean, standard deviation, kurtosis and skewness) were used to classify GEPs into four categories: predominantly-off, predominantly-on, graded (rheostatic), and switch-like genes. The arrays under study were also grouped and examined by tissue type. For example, arrays were categorized as 'brain group' and 'non-brain group'; the Kolmogorov-Smirnov distance and Pearson correlation coefficient were then used to compare GEPs between brain and non-brain for each gene. We were thus able to identify tissue-specific biomarker candidate genes. CONCLUSIONS/SIGNIFICANCE: The methodology employed here may be used to facilitate disease-specific biomarker discovery.
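    The quantities named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the use of population (divide-by-n) moments, and the toy expression values are all assumptions made for illustration, and the abstract gives no thresholds for assigning a profile to one of the four categories, so none are shown.

```python
import math
from bisect import bisect_right


def distribution_metrics(values):
    """Return mean, standard deviation, skewness and excess kurtosis.

    Population (divide-by-n) moments are used here; this is an assumption,
    since the abstract does not say which estimator was applied.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    sd = math.sqrt(m2)
    # Guard against a constant profile, where the ratios are undefined.
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    kurtosis = m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0
    return {"mean": mean, "sd": sd, "skewness": skewness, "kurtosis": kurtosis}


def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov distance: the largest gap between
    the empirical cumulative distribution functions of the two samples."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    points = sorted(set(a) | set(b))
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in points
    )


# Hypothetical expression values for one gene in two tissue groups.
brain = [8.1, 8.4, 7.9, 8.6, 8.2]
non_brain = [2.0, 2.3, 1.8, 2.1, 2.4]
metrics = distribution_metrics(brain + non_brain)
distance = ks_distance(brain, non_brain)
```

    In this toy case the two groups do not overlap at all, so the distance takes its maximum value of 1. A gene expressed similarly in both groups would give a value near 0.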

  16. Comprehensive Clinical Phenotyping & Genetic Mapping for the Discovery of Autism Susceptibility Genes

    Science.gov (United States)

    2012-12-05

    teaching students with autism spectrum disorders 4.52 Learn strategies for incorporating IEP goals and district standard into daily teaching...W403 Columbus, OH 43205 Final Report Comprehensive Clinical Phenotyping & Genetic Mapping for the Discovery of Autism Susceptibility Genes... 1.0 Summary In 2006, the Central Ohio Registry for Autism (CORA) was initiated as a collaboration between Wright-Patterson Air

  17. Canonical correlation analysis for gene-based pleiotropy discovery.

    Directory of Open Access Journals (Sweden)

    Jose A Seoane

    2014-10-01

    Full Text Available Genome-wide association studies have identified a wealth of genetic variants involved in complex traits and multifactorial diseases. There is now considerable interest in testing variants for association with multiple phenotypes (pleiotropy) and for testing multiple variants for association with a single phenotype (gene-based association tests). Such approaches can increase statistical power by combining evidence for association over multiple phenotypes or genetic variants respectively. Canonical Correlation Analysis (CCA) measures the correlation between two sets of multidimensional variables, and thus offers the potential to combine these two approaches. To apply CCA, we must restrict the number of attributes relative to the number of samples. Hence we consider modules of genetic variation that can comprise a gene, a pathway or another biologically relevant grouping, and/or a set of phenotypes. In order to do this, we use an attribute selection strategy based on a binary genetic algorithm. Applied to a UK-based prospective cohort study of 4286 women (the British Women's Heart and Health Study), we find improved statistical power in the detection of previously reported genetic associations, and identify a number of novel pleiotropic associations between genetic variants and phenotypes. New discoveries include gene-based association of NSF with triglyceride levels and several genes (ACSM3, ERI2, IL18RAP, IL23RAP and NRG1) with left ventricular hypertrophy phenotypes. In multiple-phenotype analyses we find association of NRG1 with left ventricular hypertrophy phenotypes, fibrinogen and urea and pleiotropic relationships of F7 and F10 with Factor VII, Factor IX and cholesterol levels.

  18. An incoherent feedforward loop facilitates adaptive tuning of gene expression.

    Science.gov (United States)

    Hong, Jungeui; Brandt, Nathan; Abdul-Rahman, Farah; Yang, Ally; Hughes, Tim; Gresham, David

    2018-04-05

    We studied adaptive evolution of gene expression using long-term experimental evolution of Saccharomyces cerevisiae in ammonium-limited chemostats. We found repeated selection for non-synonymous variation in the DNA binding domain of the transcriptional activator, GAT1, which functions with the repressor, DAL80 in an incoherent type-1 feedforward loop (I1-FFL) to control expression of the high affinity ammonium transporter gene, MEP2. Missense mutations in the DNA binding domain of GAT1 reduce its binding to the GATAA consensus sequence. However, we show experimentally, and using mathematical modeling, that decreases in GAT1 binding result in increased expression of MEP2 as a consequence of properties of I1-FFLs. Our results show that I1-FFLs, one of the most commonly occurring network motifs in transcriptional networks, can facilitate adaptive tuning of gene expression through modulation of transcription factor binding affinities. Our findings highlight the importance of gene regulatory architectures in the evolution of gene expression. © 2018, Hong et al.

  19. Gene Discovery in the Apicomplexa as Revealed by EST Sequencing and Assembly of a Comparative Gene Database

    Science.gov (United States)

    Li, Li; Brunk, Brian P.; Kissinger, Jessica C.; Pape, Deana; Tang, Keliang; Cole, Robert H.; Martin, John; Wylie, Todd; Dante, Mike; Fogarty, Steven J.; Howe, Daniel K.; Liberator, Paul; Diaz, Carmen; Anderson, Jennifer; White, Michael; Jerome, Maria E.; Johnson, Emily A.; Radke, Jay A.; Stoeckert, Christian J.; Waterston, Robert H.; Clifton, Sandra W.; Roos, David S.; Sibley, L. David

    2003-01-01

    Large-scale EST sequencing projects for several important parasites within the phylum Apicomplexa were undertaken for the purpose of gene discovery. Included were several parasites of medical importance (Plasmodium falciparum, Toxoplasma gondii) and others of veterinary importance (Eimeria tenella, Sarcocystis neurona, and Neospora caninum). A total of 55,192 ESTs, deposited into dbEST/GenBank, were included in the analyses. The resulting sequences have been clustered into nonredundant gene assemblies and deposited into a relational database that supports a variety of sequence and text searches. This database has been used to compare the gene assemblies using BLAST similarity comparisons to the public protein databases to identify putative genes. Of these new entries, ∼15%–20% represent putative homologs with a conservative cutoff of p […] PMID:12618375

  20. Pine Gene Discovery Project - Final Report - 08/31/1997 - 02/28/2001; FINAL

    International Nuclear Information System (INIS)

    Whetten, R. W.; Sederoff, R. R.; Kinlaw, C.; Retzel, E.

    2001-01-01

    Integration of pines into the large scope of plant biology research depends on study of pines in parallel with study of annual plants, and on availability of research materials from pine to plant biologists interested in comparing pine with annual plant systems. The objectives of the Pine Gene Discovery Project were to obtain 10,000 partial DNA sequences of genes expressed in loblolly pine, to determine which of those pine genes were similar to known genes from other organisms, and to make the DNA sequences and isolated pine genes available to plant researchers to stimulate integration of pines into the wider scope of plant biology research. Those objectives have been completed, and the results are available to the public. Requests for pine genes have been received from a number of laboratories that would otherwise not have included pine in their research, indicating that progress is being made toward the goal of integrating pine research into the larger molecular biology research community.

  1. Tools to covisualize and coanalyze proteomic data with genomes and transcriptomes: validation of genes and alternative mRNA splicing

    DEFF Research Database (Denmark)

    Pang, Chi; Tay, Aidan; Aya, Carlos

    2014-01-01

    contigs, along with RNA-seq reads. This is done in the Integrated Genome Viewer (IGV). A Results Analyzer reports the precise base position where LC-MS/MS-derived peptides cover genes or gene isoforms, on the chromosomes or contigs where this occurs. In prokaryotes, the PG Nexus pipeline facilitates...... the validation of genes, where annotation or gene prediction is available, or the discovery of genes using a "virtual protein"-based unbiased approach. We illustrate this with a comprehensive proteogenomics analysis of two strains of Campylobacter concisus. For higher eukaryotes, the PG Nexus facilitates gene...

  2. Discovery of rare protein-coding genes in model methylotroph Methylobacterium extorquens AM1.

    Science.gov (United States)

    Kumar, Dhirendra; Mondal, Anupam Kumar; Yadav, Amit Kumar; Dash, Debasis

    2014-12-01

    Proteogenomics involves the use of MS to refine annotation of protein-coding genes and discover genes in a genome. We carried out comprehensive proteogenomic analysis of Methylobacterium extorquens AM1 (ME-AM1) from publicly available proteomics data with the aim of improving annotation for methylotrophs, organisms capable of surviving on reduced carbon compounds such as methanol. Besides identifying 2482 (50%) proteins, 29 new genes were discovered and 66 annotated gene models were revised in the ME-AM1 genome. One such novel gene is identified with 75 peptides, lacks a homolog in other methylobacteria but has glycosyl transferase and lipopolysaccharide biosynthesis protein domains, indicating its potential role in outer membrane synthesis. Many novel genes are present only in ME-AM1 among methylobacteria. Distant homologs of these genes in unrelated taxonomic classes and the low GC-content of a few genes suggest lateral gene transfer as a potential mode of their origin. Annotations of methylotrophy-related genes were also improved by the discovery of a short gene in the methylotrophy gene island and by redefining a gene important for pyrroloquinoline quinone synthesis, essential for methylotrophy. The combined use of proteogenomics and rigorous bioinformatics analysis greatly enhanced the annotation of protein-coding genes in the model methylotroph ME-AM1 genome. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  3. Exome sequencing for gene discovery in lethal fetal disorders--harnessing the value of extreme phenotypes.

    Science.gov (United States)

    Filges, Isabel; Friedman, Jan M

    2015-10-01

    Massively parallel sequencing has revolutionized our understanding of Mendelian disorders, and many novel genes have been discovered to cause disease phenotypes when mutant. At the same time, next-generation sequencing approaches have enabled non-invasive prenatal testing of free fetal DNA in maternal blood. However, little attention has been paid to using whole exome and genome sequencing strategies for gene identification in fetal disorders that are lethal in utero, because they can appear to be sporadic and Mendelian inheritance may be missed. We present challenges and advantages of applying next-generation sequencing approaches to gene discovery in fetal malformation phenotypes and review recent successful discovery approaches. We discuss the implication and significance of recessive inheritance and cross-species phenotyping in fetal lethal conditions. Whole exome sequencing can be used in individual families with undiagnosed lethal congenital anomaly syndromes to discover causal mutations, provided that prior to data analysis, the fetal phenotype can be correlated to a particular developmental pathway in embryogenesis. Cross-species phenotyping allows providing further evidence for causality of discovered variants in genes involved in those extremely rare phenotypes and will increase our knowledge about normal and abnormal human developmental processes. Ultimately, families will benefit from the option of early prenatal diagnosis. © 2014 John Wiley & Sons, Ltd.

  4. Automated discovery of functional generality of human gene expression programs.

    Directory of Open Access Journals (Sweden)

    Georg K Gerber

    2007-08-01

    Full Text Available An important research problem in computational biology is the identification of expression programs, sets of co-expressed genes orchestrating normal or pathological processes, and the characterization of the functional breadth of these programs. The use of human expression data compendia for discovery of such programs presents several challenges including cellular inhomogeneity within samples, genetic and environmental variation across samples, uncertainty in the numbers of programs and sample populations, and temporal behavior. We developed GeneProgram, a new unsupervised computational framework based on Hierarchical Dirichlet Processes that addresses each of the above challenges. GeneProgram uses expression data to simultaneously organize tissues into groups and genes into overlapping programs with consistent temporal behavior, to produce maps of expression programs, which are sorted by generality scores that exploit the automatically learned groupings. Using synthetic and real gene expression data, we showed that GeneProgram outperformed several popular expression analysis methods. We applied GeneProgram to a compendium of 62 short time-series gene expression datasets exploring the responses of human cells to infectious agents and immune-modulating molecules. GeneProgram produced a map of 104 expression programs, a substantial number of which were significantly enriched for genes involved in key signaling pathways and/or bound by NF-kappaB transcription factors in genome-wide experiments. Further, GeneProgram discovered expression programs that appear to implicate surprising signaling pathways or receptor types in the response to infection, including Wnt signaling and neurotransmitter receptors. We believe the discovered map of expression programs involved in the response to infection will be useful for guiding future biological experiments; genes from programs with low generality scores might serve as new drug targets that exhibit minimal

  5. Gene discovery by chemical mutagenesis and whole-genome sequencing in Dictyostelium.

    Science.gov (United States)

    Li, Cheng-Lin Frank; Santhanam, Balaji; Webb, Amanda Nicole; Zupan, Blaž; Shaulsky, Gad

    2016-09-01

    Whole-genome sequencing is a useful approach for identification of chemical-induced lesions, but previous applications involved tedious genetic mapping to pinpoint the causative mutations. We propose that saturation mutagenesis under low mutagenic loads, followed by whole-genome sequencing, should allow direct implication of genes by identifying multiple independent alleles of each relevant gene. We tested the hypothesis by performing three genetic screens with chemical mutagenesis in the social soil amoeba Dictyostelium discoideum. Through genome sequencing, we successfully identified mutant genes with multiple alleles in near-saturation screens, including resistance to intense illumination and strong suppressors of defects in an allorecognition pathway. We tested the causality of the mutations by comparison to published data and by direct complementation tests, finding both dominant and recessive causative mutations. Therefore, our strategy provides a cost- and time-efficient approach to gene discovery by integrating chemical mutagenesis and whole-genome sequencing. The method should be applicable to many microbial systems, and it is expected to revolutionize the field of functional genomics in Dictyostelium by greatly expanding the mutation spectrum relative to other common mutagenesis methods. © 2016 Li et al.; Published by Cold Spring Harbor Laboratory Press.
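    The core idea of the abstract above, implicating a gene when several independently derived mutants carry different lesions in it, can be sketched roughly as follows. This is an illustrative sketch only, not the authors' pipeline: the tuple layout, the function name genes_with_multiple_alleles, the min_alleles threshold, and the strain and gene names are all assumptions made for the example.

```python
from collections import defaultdict


def genes_with_multiple_alleles(mutations, min_alleles=2):
    """Return {gene: number of distinct alleles} for genes hit repeatedly.

    mutations: iterable of (strain, gene, position, change) tuples
    (a hypothetical layout for variant calls from sequenced mutants).
    A gene qualifies only if it carries at least `min_alleles` distinct
    lesions AND those lesions come from at least `min_alleles` strains.
    """
    alleles = defaultdict(set)   # gene -> {(position, change), ...}
    strains = defaultdict(set)   # gene -> {strain, ...}
    for strain, gene, position, change in mutations:
        alleles[gene].add((position, change))
        strains[gene].add(strain)
    return {
        gene: len(found)
        for gene, found in alleles.items()
        if len(found) >= min_alleles and len(strains[gene]) >= min_alleles
    }


# Made-up variant calls, for illustration only.
calls = [
    ("m1", "geneA", 120, "G>A"),
    ("m2", "geneA", 455, "C>T"),
    ("m3", "geneA", 120, "G>A"),  # same lesion as m1: not a new allele
    ("m1", "geneB", 88, "G>A"),   # single allele: not implicated
    ("m4", "geneC", 10, "C>T"),
    ("m4", "geneC", 300, "G>A"),  # two lesions, but only one strain
]
implicated = genes_with_multiple_alleles(calls)
print(implicated)
```

    Requiring distinct lesions from distinct strains is one simple way to approximate "independent alleles"; a real analysis would also need to filter background variants shared with the parental strain.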

  6. DAVID Knowledgebase: a gene-centered database integrating heterogeneous gene annotation resources to facilitate high-throughput gene functional analysis

    Directory of Open Access Journals (Sweden)

    Baseler Michael W

    2007-11-01

    Background Due to the complex and distributed nature of biological research, our current biological knowledge is spread over many redundant annotation databases maintained by many independent groups. Analysts usually need to visit many of these bioinformatics databases in order to integrate comprehensive annotation information for their genes, which becomes one of the bottlenecks, particularly for analytic tasks associated with a large gene list. Thus, a highly centralized and ready-to-use gene-annotation knowledgebase is in demand for high-throughput gene functional analysis. Description The DAVID Knowledgebase is built around the DAVID Gene Concept, a single-linkage method to agglomerate tens of millions of gene/protein identifiers from a variety of public genomic resources into DAVID gene clusters. The grouping of such identifiers improves the cross-reference capability, particularly across NCBI and UniProt systems, enabling more than 40 publicly available functional annotation sources to be comprehensively integrated and centralized by the DAVID gene clusters. The simple, pair-wise, text format files which make up the DAVID Knowledgebase are freely downloadable for various data analysis uses. In addition, a well-organized web interface allows users to query different types of heterogeneous annotations in a high-throughput manner. Conclusion The DAVID Knowledgebase is designed to facilitate high-throughput gene functional analysis. For a given gene list, it not only provides quick access to a wide range of heterogeneous annotation data in a centralized location, but also enriches the level of biological information for an individual gene. Moreover, the entire DAVID Knowledgebase is freely downloadable or searchable at http://david.abcc.ncifcrf.gov/knowledgebase/.
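    The single-linkage agglomeration described above can be illustrated with a small union-find sketch: any two identifiers joined by a cross-reference end up in the same cluster, and clusters merge transitively. This is a generic sketch and not the DAVID implementation; the function name cluster_identifiers and all identifiers in the example are made up.

```python
def cluster_identifiers(pairs):
    """Group identifiers into clusters by single linkage.

    pairs: iterable of (id_a, id_b) cross-references. Two identifiers
    share a cluster if a chain of cross-references connects them.
    Returns a sorted list of sorted clusters.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = {}
    for ident in list(parent):
        clusters.setdefault(find(ident), set()).add(ident)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical cross-references between made-up identifiers.
xrefs = [
    ("entrez:1001", "uniprot:P00001"),
    ("uniprot:P00001", "refseq:NP_000001"),
    ("entrez:2002", "uniprot:Q00002"),
]
gene_clusters = cluster_identifiers(xrefs)
print(gene_clusters)
```

    Single linkage is permissive by design: one spurious cross-reference is enough to merge two otherwise unrelated clusters, which is worth keeping in mind when interpreting any grouping built this way.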

  7. Evaluation of tools for highly variable gene discovery from single-cell RNA-seq data.

    Science.gov (United States)

    Yip, Shun H; Sham, Pak Chung; Wang, Junwen

    2018-02-21

    Traditional RNA sequencing (RNA-seq) allows the detection of gene expression variations between two or more cell populations through differentially expressed gene (DEG) analysis. However, genes that contribute to cell-to-cell differences are not discoverable with RNA-seq because RNA-seq samples are obtained from a mixture of cells. Single-cell RNA-seq (scRNA-seq) allows the detection of gene expression in each cell. With scRNA-seq, highly variable gene (HVG) discovery allows the detection of genes that contribute strongly to cell-to-cell variation within a homogeneous cell population, such as a population of embryonic stem cells. This analysis is implemented in many software packages. In this study, we compare seven HVG methods from six software packages, including BASiCS, Brennecke, scLVM, scran, scVEGs and Seurat. Our results demonstrate that reproducibility in HVG analysis requires a larger sample size than DEG analysis. Discrepancies between methods and potential issues in these tools are discussed and recommendations are made.

  8. Novel enabling technologies of gene isolation and plant transformation for improved crop protection

    Energy Technology Data Exchange (ETDEWEB)

    Torok, Tamas

    2013-02-04

    Meeting the needs of agricultural producers requires the continued development of improved transgenic crop protection products. The completed project focused on developing novel enabling technologies of gene discovery and plant transformation to facilitate the generation of such products.

  9. Peroxidase gene discovery from the horseradish transcriptome.

    Science.gov (United States)

    Näätsaari, Laura; Krainer, Florian W; Schubert, Michael; Glieder, Anton; Thallinger, Gerhard G

    2014-03-24

    Horseradish peroxidases (HRPs) from Armoracia rusticana have long been utilized as reporters in various diagnostic assays and histochemical stainings. Regardless of their increasing importance in the field of life sciences and suggested uses in medical applications, chemical synthesis and other industrial applications, the HRP isoenzymes, their substrate specificities and enzymatic properties are poorly characterized. Owing to the lack of sequence information for natural isoenzymes and the low levels of HRP expression in heterologous hosts, commercially available HRP is still extracted as a mixture of isoenzymes from the roots of A. rusticana. In this study, a normalized, size-selected A. rusticana transcriptome library was sequenced using 454 Titanium technology. The resulting reads were assembled into 14871 isotigs with an average length of 1133 bp. Sequence databases, ORF finding and ORF characterization were utilized to identify peroxidase genes from the 14871 isotigs generated by de novo assembly. The sequences were manually reviewed and verified with Sanger sequencing of PCR amplified genomic fragments, resulting in the discovery of 28 secretory peroxidases, 23 of them previously unknown. A total of 22 isoenzymes including allelic variants were successfully expressed in Pichia pastoris and showed peroxidase activity with at least one of the substrates tested, thus enabling their development into commercial pure isoenzymes. This study demonstrates that transcriptome sequencing combined with sequence motif search is a powerful concept for the discovery and quick supply of new enzymes and isoenzymes from any plant or other eukaryotic organisms. Identification and manual verification of the sequences of 28 HRP isoenzymes not only contribute a set of peroxidases for industrial, biological and biomedical applications, but also provide valuable information on the reliability of the approach in identifying and characterizing a large group of isoenzymes.

  10. Improving Interpretation of Cardiac Phenotypes and Enhancing Discovery With Expanded Knowledge in the Gene Ontology.

    Science.gov (United States)

    Lovering, Ruth C; Roncaglia, Paola; Howe, Douglas G; Laulederkind, Stanley J F; Khodiyar, Varsha K; Berardini, Tanya Z; Tweedie, Susan; Foulger, Rebecca E; Osumi-Sutherland, David; Campbell, Nancy H; Huntley, Rachael P; Talmud, Philippa J; Blake, Judith A; Breckenridge, Ross; Riley, Paul R; Lambiase, Pier D; Elliott, Perry M; Clapp, Lucie; Tinker, Andrew; Hill, David P

    2018-02-01

    A systems biology approach to cardiac physiology requires a comprehensive representation of how coordinated processes operate in the heart, as well as the ability to interpret relevant transcriptomic and proteomic experiments. The Gene Ontology (GO) Consortium provides structured, controlled vocabularies of biological terms that can be used to summarize and analyze functional knowledge for gene products. In this study, we created a computational resource to facilitate genetic studies of cardiac physiology by integrating literature curation with attention to an improved and expanded ontological representation of heart processes in the Gene Ontology. As a result, the Gene Ontology now contains terms that comprehensively describe the roles of proteins in cardiac muscle cell action potential, electrical coupling, and the transmission of the electrical impulse from the sinoatrial node to the ventricles. Evaluating the effectiveness of this approach to inform data analysis demonstrated that Gene Ontology annotations, analyzed within an expanded ontological context of heart processes, can help to identify candidate genes associated with arrhythmic disease risk loci. We determined that a combination of curation and ontology development for heart-specific genes and processes supports the identification and downstream analysis of genes responsible for the spread of the cardiac action potential through the heart. Annotating these genes and processes in a structured format facilitates data analysis and supports effective retrieval of gene-centric information about cardiac defects. © 2018 The Authors.

  11. Identifying candidate driver genes by integrative ovarian cancer genomics data

    Science.gov (United States)

    Lu, Xinguo; Lu, Jibo

    2017-08-01

    Integrative analysis of molecular mechanics underlying cancer can distinguish interactions that cannot be revealed based on one kind of data for the appropriate diagnosis and treatment of cancer patients. Tumor samples exhibit heterogeneity in omics data, such as somatic mutations, copy number variations (CNVs), gene expression profiles and so on. In this paper we combined gene co-expression modules and mutation modulators separately in tumor patients to obtain the candidate driver genes for resistant and sensitive tumors from the heterogeneous data. The modulators in the final list are well known in biological processes associated with ovarian cancer, such as CCL17, CACTIN, CCL16, CCL22, APOB, KDF1, CCL11, HNF1B, LRG1, MED1 and so on, which can help to facilitate the discovery of biomarkers, molecular diagnostics, and drug discovery.

  12. Construction and evaluation of normalized cDNA libraries enriched with full-length sequences for rapid discovery of new genes from Sisal (Agave sisalana Perr.) different developmental stages.

    Science.gov (United States)

    Zhou, Wen-Zhao; Zhang, Yan-Mei; Lu, Jun-Ying; Li, Jun-Feng

    2012-10-12

    To provide a resource of sisal-specific expressed sequence data and facilitate this powerful approach in new gene research, the preparation of normalized cDNA libraries enriched with full-length sequences is necessary. Four libraries were produced with RNA pooled from multiple Agave sisalana tissues to increase efficiency of normalization and maximize the number of independent genes by the SMART™ method and the duplex-specific nuclease (DSN). This procedure kept the proportion of full-length cDNAs in the subtracted/normalized libraries and dramatically enhanced the discovery of new genes. Sequencing of 3875 cDNA clones from the libraries revealed 3320 unigenes with an average insert length of about 1.2 kb, indicating that the non-redundancy of the libraries was about 85.7%. These unigene functions were predicted by comparing their sequences to functional domain databases and extensively annotated with Gene Ontology (GO) terms. Comparative analysis of sisal unigenes and other plant genomes revealed that four putative MADS-box genes and a knotted-like homeobox (knox) gene were obtained from a total of 1162 full-length transcripts. Furthermore, real-time PCR showed that the characteristics of their transcripts mainly depended on the tight expression regulation of a number of genes during leaf and flower development. Analysis of individual library sequence data indicated that the pooled-tissue approach was highly effective in discovering new genes and preparing libraries for efficient deep sequencing.
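    The non-redundancy figure quoted above follows directly from the two counts in the abstract, taken here as the ratio of unigenes to sequenced clones. A minimal check of that arithmetic (the variable names are chosen for this example):

```python
# Counts as reported in the abstract.
clones_sequenced = 3875
unigenes = 3320

# Non-redundancy taken as the fraction of sequenced clones that are unique.
non_redundancy = unigenes / clones_sequenced
print(f"{non_redundancy:.1%}")  # 85.7%
```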

  13. Representation Discovery using Harmonic Analysis

    CERN Document Server

    Mahadevan, Sridhar

    2008-01-01

    Representations are at the heart of artificial intelligence (AI). This book is devoted to the problem of representation discovery: how can an intelligent system construct representations from its experience? Representation discovery re-parameterizes the state space - prior to the application of information retrieval, machine learning, or optimization techniques - facilitating later inference processes by constructing new task-specific bases adapted to the state space geometry. This book presents a general approach to representation discovery using the framework of harmonic analysis, in particu

  14. Genome Enabled Discovery of Carbon Sequestration Genes in Poplar

    Energy Technology Data Exchange (ETDEWEB)

    Filichkin, Sergei; Etherington, Elizabeth; Ma, Caiping; Strauss, Steve

    2007-02-22

    The goals of the S.H. Strauss laboratory portion of 'Genome-enabled discovery of carbon sequestration genes in poplar' are (1) to explore the functions of candidate genes using Populus transformation by inserting genes provided by Oak Ridge National Laboratory (ORNL) and the University of Florida (UF) into poplar; (2) to expand the poplar transformation toolkit by developing transformation methods for important genotypes; and (3) to allow induced expression, and efficient gene suppression, in roots and other tissues. As part of the transformation improvement effort, OSU developed transformation protocols for Populus trichocarpa 'Nisqually-1' clone and an early flowering P. alba clone, 6K10. Complete descriptions of the transformation systems were published (Ma et al. 2004, Meilan et al. 2004). Twenty-one 'Nisqually-1' and 622 6K10 transgenic plants were generated. To identify root predominant promoters, a set of three promoters was tested for tissue-specific expression patterns in poplar and in Arabidopsis as a model system. A novel gene, ET304, was identified by analyzing a collection of poplar enhancer trap lines generated at OSU (Filichkin et al. 2006a, 2006b). Other promoters include the pGgMT1 root-predominant promoter from Casuarina glauca and the pAtPIN2 promoter from the Arabidopsis root-specific PIN2 gene. OSU tested two induction systems, alcohol- and estrogen-inducible, in multiple poplar transgenics. Ethanol proved to be the more efficient when tested in tissue culture and greenhouse conditions. Two estrogen-inducible systems were evaluated in transgenic Populus, neither of which functioned reliably in tissue culture conditions. GATEWAY-compatible plant binary vectors were designed to compare the silencing efficiency of homologous (direct) RNAi vs. heterologous (transitive) RNAi inverted repeats. A set of genes was targeted for post transcriptional silencing in the model Arabidopsis system; these include the floral

  15. The Matchmaker Exchange: a platform for rare disease gene discovery.

    Science.gov (United States)

    Philippakis, Anthony A; Azzariti, Danielle R; Beltran, Sergi; Brookes, Anthony J; Brownstein, Catherine A; Brudno, Michael; Brunner, Han G; Buske, Orion J; Carey, Knox; Doll, Cassie; Dumitriu, Sergiu; Dyke, Stephanie O M; den Dunnen, Johan T; Firth, Helen V; Gibbs, Richard A; Girdea, Marta; Gonzalez, Michael; Haendel, Melissa A; Hamosh, Ada; Holm, Ingrid A; Huang, Lijia; Hurles, Matthew E; Hutton, Ben; Krier, Joel B; Misyura, Andriy; Mungall, Christopher J; Paschall, Justin; Paten, Benedict; Robinson, Peter N; Schiettecatte, François; Sobreira, Nara L; Swaminathan, Ganesh J; Taschner, Peter E; Terry, Sharon F; Washington, Nicole L; Züchner, Stephan; Boycott, Kym M; Rehm, Heidi L

    2015-10-01

    There are few better examples of the need for data sharing than in the rare disease community, where patients, physicians, and researchers must search for "the needle in a haystack" to uncover rare, novel causes of disease within the genome. Impeding the pace of discovery has been the existence of many small siloed datasets within individual research or clinical laboratory databases and/or disease-specific organizations, hoping for serendipitous occasions when two distant investigators happen to learn they have a rare phenotype in common and can "match" these cases to build evidence for causality. However, serendipity has never proven to be a reliable or scalable approach in science. As such, the Matchmaker Exchange (MME) was launched to provide a robust and systematic approach to rare disease gene discovery through the creation of a federated network connecting databases of genotypes and rare phenotypes using a common application programming interface (API). The core building blocks of the MME have been defined and assembled. Three MME services have now been connected through the API and are available for community use. Additional databases that support internal matching are anticipated to join the MME network as it continues to grow. © 2015 WILEY PERIODICALS, INC.

  16. MobilomeFINDER: web-based tools for in silico and experimental discovery of bacterial genomic islands

    OpenAIRE

    Ou, Hong-Yu; He, Xinyi; Harrison, Ewan M.; Kulasekara, Bridget R.; Thani, Ali Bin; Kadioglu, Aras; Lory, Stephen; Hinton, Jay C. D.; Barer, Michael R.; Deng, Zixin; Rajakumar, Kumar

    2007-01-01

    MobilomeFINDER (http://mml.sjtu.edu.cn/MobilomeFINDER) is an interactive online tool that facilitates bacterial genomic island or ‘mobile genome’ (mobilome) discovery; it integrates the ArrayOme and tRNAcc software packages. ArrayOme utilizes a microarray-derived comparative genomic hybridization input data set to generate ‘inferred contigs’ produced by merging adjacent genes classified as ‘present’. Collectively these ‘fragments’ represent a hypothetical ‘microarray-visualized genome (MVG)’....

  17. A genomics based discovery of secondary metabolite biosynthetic gene clusters in Aspergillus ustus.

    Directory of Open Access Journals (Sweden)

    Borui Pi

    Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, but little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were firstly identified in Aspergillus that were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but much fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in clinic.

  18. A Genomics Based Discovery of Secondary Metabolite Biosynthetic Gene Clusters in Aspergillus ustus

    Science.gov (United States)

    Pi, Borui; Yu, Dongliang; Dai, Fangwei; Song, Xiaoming; Zhu, Congyi; Li, Hongye; Yu, Yunsong

    2015-01-01

    Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, but little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were firstly identified in Aspergillus that were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but much fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in clinic. PMID:25706180

  19. ConGEMs: Condensed Gene Co-Expression Module Discovery Through Rule-Based Clustering and Its Application to Carcinogenesis

    Directory of Open Access Journals (Sweden)

    Saurav Mallik

    2017-12-01

    For transcriptomic analysis, there are numerous microarray-based genomic data, especially those generated for cancer research. The typical analysis measures the difference between a cancer sample-group and a matched control group for each transcript or gene. Association rule mining is used to discover interesting item sets through rule-based methodology. Thus, it has advantages in finding cause-effect relationships between the transcripts. In this work, we introduce two new rule-based similarity measures, the weighted rank-based Jaccard and Cosine measures, and then propose a novel computational framework to detect condensed gene co-expression modules (ConGEMs) through the association rule-based learning system and the weighted similarity scores. In practice, the list of evolved condensed markers that consists of both singular and complex markers in nature depends on the corresponding condensed gene sets in either antecedent or consequent of the rules of the resultant modules. In our evaluation, these markers could be supported by literature evidence, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway and Gene Ontology annotations. Specifically, we preliminarily identified differentially expressed genes using an empirical Bayes test. A recently developed algorithm, RANWAR, was then utilized to determine the association rules from these genes. Based on that, we computed the integrated similarity scores of these rule-based similarity measures between each rule-pair, and the resultant scores were used for clustering to identify the co-expressed rule-modules. We applied our method to a gene expression dataset for lung squamous cell carcinoma and a genome methylation dataset for uterine cervical carcinogenesis. Our proposed module discovery method produced better results than the traditional gene-module discovery measures. In summary, our proposed rule-based method is useful for exploring biomarker modules from transcriptomic data.

  20. ConGEMs: Condensed Gene Co-Expression Module Discovery Through Rule-Based Clustering and Its Application to Carcinogenesis.

    Science.gov (United States)

    Mallik, Saurav; Zhao, Zhongming

    2017-12-28

    For transcriptomic analysis, there are numerous microarray-based genomic data, especially those generated for cancer research. The typical analysis measures the difference between a cancer sample-group and a matched control group for each transcript or gene. Association rule mining is used to discover interesting item sets through rule-based methodology. Thus, it has advantages in finding cause-effect relationships between the transcripts. In this work, we introduce two new rule-based similarity measures, the weighted rank-based Jaccard and Cosine measures, and then propose a novel computational framework to detect condensed gene co-expression modules (ConGEMs) through the association rule-based learning system and the weighted similarity scores. In practice, the list of evolved condensed markers that consists of both singular and complex markers in nature depends on the corresponding condensed gene sets in either antecedent or consequent of the rules of the resultant modules. In our evaluation, these markers could be supported by literature evidence, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway and Gene Ontology annotations. Specifically, we preliminarily identified differentially expressed genes using an empirical Bayes test. A recently developed algorithm, RANWAR, was then utilized to determine the association rules from these genes. Based on that, we computed the integrated similarity scores of these rule-based similarity measures between each rule-pair, and the resultant scores were used for clustering to identify the co-expressed rule-modules. We applied our method to a gene expression dataset for lung squamous cell carcinoma and a genome methylation dataset for uterine cervical carcinogenesis. Our proposed module discovery method produced better results than the traditional gene-module discovery measures. In summary, our proposed rule-based method is useful for exploring biomarker modules from transcriptomic data.
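    For orientation, the plain (unweighted) set-based Jaccard and cosine similarities that the measures named above build on can be sketched as follows. The weighted rank-based variants are not defined in the abstract, so this sketch deliberately shows only the standard textbook forms; the function names and the gene sets in the example are illustrative assumptions.

```python
import math


def jaccard(a, b):
    """Standard Jaccard similarity: |A & B| / |A | B| (0.0 for two empty sets)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(a, b):
    """Set-based cosine similarity: |A & B| / sqrt(|A| * |B|)."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


# Gene sets of two hypothetical association rules (illustration only).
rule_1 = {"TP63", "KRT5", "SOX2"}
rule_2 = {"KRT5", "SOX2", "EGFR"}

print(jaccard(rule_1, rule_2))  # 2 shared genes out of 4 distinct genes
print(cosine(rule_1, rule_2))   # 2 shared genes, both sets of size 3
```

    A pairwise matrix of such scores between rules is the kind of input a clustering step could consume; how the published method weights and ranks items before scoring would have to be taken from the paper itself.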

  1. NASA Reverb: Standards-Driven Earth Science Data and Service Discovery

    Science.gov (United States)

    Cechini, M. F.; Mitchell, A.; Pilone, D.

    2011-12-01

    NASA's Earth Observing System Data and Information System (EOSDIS) is a core capability in NASA's Earth Science Data Systems Program. NASA's EOS ClearingHOuse (ECHO) is a metadata catalog for the EOSDIS, providing a centralized catalog of data products and registry of related data services. Working closely with the EOSDIS community, the ECHO team identified a need to develop the next generation EOS data and service discovery tool. This development effort relied on the following principles:

    + Metadata Driven User Interface - Users should be presented with data and service discovery capabilities based on dynamic processing of metadata describing the targeted data.
    + Integrated Data & Service Discovery - Users should be able to discover data and associated data services that facilitate their research objectives.
    + Leverage Common Standards - Users should be able to discover and invoke services that utilize common interface standards.

    Metadata plays a vital role in facilitating data discovery and access. As data providers enhance their metadata, more advanced search capabilities become available, enriching a user's search experience. Maturing metadata formats such as ISO 19115 provide the necessary depth of metadata that facilitates advanced data discovery capabilities. Data discovery and access is not limited to simply the retrieval of data granules, but is growing into the more complex discovery of data services. These services include, but are not limited to, services facilitating additional data discovery, subsetting, reformatting, and re-projecting. The discovery and invocation of these data services is made significantly simpler through the use of consistent and interoperable standards. Once a standard is adopted, standard-specific adapters can be developed to communicate with multiple services implementing a specific protocol. The emergence of metadata standards such as ISO 19119 plays a similarly important role in discovery as the 19115 standard

  2. IMG-ABC: A Knowledge Base To Fuel Discovery of Biosynthetic Gene Clusters and Novel Secondary Metabolites.

    Science.gov (United States)

    Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Szeto, Ernest; Huang, Jinghua; Reddy, T B K; Cimermančič, Peter; Fischbach, Michael A; Ivanova, Natalia N; Markowitz, Victor M; Kyrpides, Nikos C; Pati, Amrita

    2015-07-14

    In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of "big" genomic data for discovering small molecules. IMG-ABC relies on IMG's comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC's focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG's extensive genomic/metagenomic data and analysis tool kits. As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG-ABC will continue to

  3. Discovery of new candidate genes for rheumatoid arthritis through integration of genetic association data with expression pathway analysis.

    Science.gov (United States)

    Shchetynsky, Klementy; Diaz-Gallo, Lina-Marcella; Folkersen, Lasse; Hensvold, Aase Haj; Catrina, Anca Irinel; Berg, Louise; Klareskog, Lars; Padyukov, Leonid

    2017-02-02

    Here we integrate verified signals from previous genetic association studies with gene expression and pathway analysis for discovery of new candidate genes and signaling networks relevant for rheumatoid arthritis (RA). RNA-sequencing (RNA-seq)-based expression analysis of 377 genes from previously verified RA-associated loci was performed in blood cells from 5 newly diagnosed, non-treated patients with RA, 7 patients with treated RA and 12 healthy controls. Differentially expressed genes sharing a similar expression pattern in treated and untreated RA sub-groups were selected for pathway analysis. A set of "connector" genes derived from pathway analysis was tested for differential expression in the initial discovery cohort and validated in blood cells from 73 patients with RA and in 35 healthy controls. There were 11 qualifying genes selected for pathway analysis and these were grouped into two evidence-based functional networks, containing 29 and 27 additional connector molecules. The expression of genes corresponding to connector molecules was then tested in the initial RNA-seq data. Differences in the expression of ERBB2, TP53 and THOP1 were similar in both treated and non-treated patients with RA and an additional nine genes were differentially expressed in at least one group of patients compared to healthy controls. The ERBB2, TP53 and THOP1 expression profile was successfully replicated in RNA-seq data from peripheral blood mononuclear cells from healthy controls and non-treated patients with RA, in an independent collection of samples. Integration of RNA-seq data with findings from association studies, and consequent pathway analysis, implicate new candidate genes, ERBB2, TP53 and THOP1, in the pathogenesis of RA.

  4. Sugar transporter genes of the brown planthopper, Nilaparvata lugens: A facilitated glucose/fructose transporter.

    Science.gov (United States)

    Kikuta, Shingo; Kikawada, Takahiro; Hagiwara-Komoda, Yuka; Nakashima, Nobuhiko; Noda, Hiroaki

    2010-11-01

    The brown planthopper (BPH), Nilaparvata lugens, attacks rice plants and feeds on their phloem sap, which contains large amounts of sugars. The main sugar component of phloem sap is sucrose, a disaccharide composed of glucose and fructose. Sugars appear to be incorporated into the planthopper body by sugar transporters in the midgut. A total of 93 expressed sequence tags (ESTs) for putative sugar transporters were obtained from a BPH EST database, and 18 putative sugar transporter genes (Nlst1-18) were identified. The most abundantly expressed of these genes was Nlst1. This gene has previously been identified in the BPH as the glucose transporter gene NlHT1, which belongs to the major facilitator superfamily. Nlst1, 4, 6, 9, 12, 16, and 18 were highly expressed in the midgut, and Nlst2, 7, 8, 10, 15, 17, and 18 were highly expressed during the embryonic stages. Functional analyses were performed using Xenopus oocytes expressing NlST1 or 6. This showed that NlST6 is a facilitative glucose/fructose transporter that mediates sugar uptake from rice phloem sap in the BPH midgut in a manner similar to NlST1. Copyright © 2010 Elsevier Ltd. All rights reserved.

  5. Systems-based biological concordance and predictive reproducibility of gene set discovery methods in cardiovascular disease.

    Science.gov (United States)

    Azuaje, Francisco; Zheng, Huiru; Camargo, Anyela; Wang, Haiying

    2011-08-01

    The discovery of novel disease biomarkers is a crucial challenge for translational bioinformatics. Demonstration of both their classification power and reproducibility across independent datasets are essential requirements to assess their potential clinical relevance. Small datasets and multiplicity of putative biomarker sets may explain lack of predictive reproducibility. Studies based on pathway-driven discovery approaches have suggested that, despite such discrepancies, the resulting putative biomarkers tend to be implicated in common biological processes. Investigations of this problem have been mainly focused on datasets derived from cancer research. We investigated the predictive and functional concordance of five methods for discovering putative biomarkers in four independently-generated datasets from the cardiovascular disease domain. A diversity of biosignatures was identified by the different methods. However, we found strong biological process concordance between them, especially in the case of methods based on gene set analysis. With a few exceptions, we observed lack of classification reproducibility using independent datasets. Partial overlaps between our putative sets of biomarkers and the primary studies exist. Despite the observed limitations, pathway-driven or gene set analysis can predict potentially novel biomarkers and can jointly point to biomedically-relevant underlying molecular mechanisms. Copyright © 2011 Elsevier Inc. All rights reserved.

  6. Discovery of Putative Herbicide Resistance Genes and Its Regulatory Network in Chickpea Using Transcriptome Sequencing

    Directory of Open Access Journals (Sweden)

    Mir A. Iquebal

    2017-06-01

    Background: Chickpea (Cicer arietinum L.) contributes 75% of total pulse production. Being cheaper than animal protein makes it important in the dietary requirements of developing countries. Weeds not only compete with chickpea, resulting in drastic yield reduction, but also create the problem of harboring fungi, bacterial diseases and insect pests. The chemical approach of new herbicide discovery is constrained by limited lead molecule options, statutory regulations and environmental clearance. Through the genetic approach, transgenic herbicide-tolerant crops have given successful results but have led to serious concern over ecological safety; thus a non-transgenic approach like marker assisted selection is desirable. Since large variability in the tolerance limit of herbicide already exists in chickpea varieties, the genes offering herbicide tolerance can be introgressed in variety improvement programmes. Transcriptome studies can discover such key genes associated with herbicide tolerance in chickpea. Results: This is the first transcriptomic study of chickpea or even any legume crop using two herbicide susceptible and tolerant genotypes exposed to imidazoline (Imazethapyr). Approximately 90 million paired-end reads generated from four samples were processed and assembled into 30,803 contigs using reference based assembly. We report 6,310 differentially expressed genes (DEGs), of which 3,037 were regulated by 980 miRNAs, 1,528 transcription factors associated with 897 DEGs, 47 Hub proteins, 3,540 putative Simple Sequence Repeat-Functional Domain Marker (SSR-FDM), 13,778 genic Single Nucleotide Polymorphism (SNP) putative markers and 1,174 Indels. Randomly selected 20 DEGs were validated using qPCR. Pathway analysis suggested that xenobiotic degradation related gene, glutathione S-transferase (GST) were only up-regulated in presence of herbicide. Down-regulation of DNA replication genes and up-regulation of abscisic acid pathway genes were observed. Study further reveals

  7. IMG-ABC: An Atlas of Biosynthetic Gene Clusters to Fuel the Discovery of Novel Secondary Metabolites

    Energy Technology Data Exchange (ETDEWEB)

    Chen, I-Min; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Huang, Jinghua; Reddy, T. B.K.; Cimermancic, Peter; Fischbach, Michael; Ivanova, Natalia; Markowitz, Victor; Kyrpides, Nikos; Pati, Amrita

    2014-10-28

    In the discovery of secondary metabolites (SMs), large-scale analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of relevant computational resources. We present IMG-ABC (https://img.jgi.doe.gov/abc/) -- An Atlas of Biosynthetic gene Clusters within the Integrated Microbial Genomes (IMG) system. IMG-ABC is a rich repository of both validated and predicted biosynthetic clusters (BCs) in cultured isolates, single-cells and metagenomes linked with the SM chemicals they produce and enhanced with focused analysis tools within IMG. The underlying scalable framework enables traversal of phylogenetic dark matter and chemical structure space -- serving as a doorway to a new era in the discovery of novel molecules.

  8. Evaluation of gene association methods for coexpression network construction and biological knowledge discovery.

    Directory of Open Access Journals (Sweden)

    Sapna Kumari

    BACKGROUND: Constructing coexpression networks and performing network analysis using large-scale gene expression data sets is an effective way to uncover new biological knowledge; however, the methods used for gene association in constructing these coexpression networks have not been thoroughly evaluated. Since different methods lead to structurally different coexpression networks and provide different information, selecting the optimal gene association method is critical. METHODS AND RESULTS: In this study, we compared eight gene association methods - Spearman rank correlation, Weighted Rank Correlation, Kendall, Hoeffding's D measure, Theil-Sen, Rank Theil-Sen, Distance Covariance, and Pearson - and focused on their true knowledge discovery rates in associating pathway genes and constructing coordination networks of regulatory genes. We also examined the behaviors of different methods on microarray data with different properties, and whether the biological processes affect the efficiency of different methods. CONCLUSIONS: We found that the Spearman, Hoeffding and Kendall methods are effective in identifying coexpressed pathway genes, whereas the Theil-Sen, Rank Theil-Sen, Spearman, and Weighted Rank methods perform well in identifying coordinated transcription factors that control the same biological processes and traits. Surprisingly, the widely used Pearson method is generally less efficient, and so is the Distance Covariance method that can find gene pairs of multiple relationships. Some of our analyses clearly show that the Pearson and Distance Covariance methods have distinct behaviors as compared to the other six methods. The efficiencies of different methods vary with the data properties to some degree and are largely contingent upon the biological processes, which necessitates a pre-analysis to identify the best performing method for gene association and coexpression network construction.
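
    As a rough illustration of why rank-based measures can behave differently from Pearson, the sketch below computes Pearson, Spearman and Kendall (tau-a) coefficients in plain Python for two hypothetical expression profiles. The profile names and values are invented for illustration and are not taken from the study; the other methods compared there (Hoeffding's D, Theil-Sen, Distance Covariance, Weighted Rank) are not shown.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either profile is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (len(x) * (len(x) - 1) / 2)


# Hypothetical profiles: gene_b rises monotonically but non-linearly with gene_a.
gene_a = [1, 2, 3, 4, 5]
gene_b = [1, 4, 9, 16, 100]

print("Pearson :", pearson(gene_a, gene_b))
print("Spearman:", spearman(gene_a, gene_b))
print("Kendall :", kendall_tau_a(gene_a, gene_b))
```

    On a monotone but non-linear pair like this one, the two rank-based coefficients reach their maximum while Pearson does not, which is one reason the choice of measure can change which gene pairs end up linked in a coexpression network.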

  9. Cultivation of hard-to-culture subsurface mercury-resistant bacteria and discovery of new merA gene sequences

    DEFF Research Database (Denmark)

    Rasmussen, L D; Zawadsky, C; Binnerup, S J

    2008-01-01

    different 16S rRNA gene sequences were observed, including Alpha-, Beta-, and Gammaproteobacteria; Actinobacteria; Firmicutes; and Bacteroidetes. The diversity of isolates obtained by direct plating included eight different 16S rRNA gene sequences (Alpha- and Betaproteobacteria and Actinobacteria). Partial...... sequencing of merA of selected isolates led to the discovery of new merA sequences. With phylum-specific merA primers, PCR products were obtained for Alpha- and Betaproteobacteria and Actinobacteria but not for Bacteroidetes and Firmicutes. The similarity to known sequences ranged between 89 and 95%. One...

  10. A comparative review of estimates of the proportion unchanged genes and the false discovery rate

    Directory of Open Access Journals (Sweden)

    Broberg Per

    2005-08-01

    Abstract Background In the analysis of microarray data one generally produces a vector of p-values that for each gene give the likelihood of obtaining equally strong evidence of change by pure chance. The distribution of these p-values is a mixture of two components corresponding to the changed genes and the unchanged ones. The focus of this article is how to estimate the proportion unchanged and the false discovery rate (FDR) and how to make inferences based on these concepts. Six published methods for estimating the proportion unchanged genes are reviewed, two alternatives are presented, and all are tested on both simulated and real data. All estimates but one make do without any parametric assumptions concerning the distributions of the p-values. Furthermore, the estimation and use of the FDR and the closely related q-value is illustrated with examples. Five published estimates of the FDR and one new are presented and tested. Implementations in R code are available. Results A simulation model based on the distribution of real microarray data plus two real data sets were used to assess the methods. The proposed alternative methods for estimating the proportion unchanged fared very well, and gave evidence of low bias and very low variance. Different methods perform well depending upon whether there are few or many regulated genes. Furthermore, the methods for estimating FDR showed a varying performance, and were sometimes misleading. The new method had a very low error. Conclusion The concept of the q-value or false discovery rate is useful in practical research, despite some theoretical and practical shortcomings. However, it seems possible to challenge the performance of the published methods, and there is likely scope for further developing the estimates of the FDR. The new methods provide the scientist with more options to choose a suitable method for any particular experiment. The article advocates the use of the conjoint information
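
    To make the two quantities concrete, here is a minimal Python sketch of one widely used pair of estimators: a Storey-type estimate of the proportion of unchanged genes (pi0) at a single cutoff, and Benjamini-Hochberg-style q-values scaled by pi0. Whether these exact estimators are among those reviewed in the article is an assumption; the cutoff `lam=0.5` and the p-values are hypothetical, and the article's own implementations are in R.

```python
def estimate_pi0(pvalues, lam=0.5):
    """Storey-type estimate of the proportion of unchanged genes.

    P-values of unchanged genes are roughly uniform, so the share of
    p-values above `lam`, rescaled by (1 - lam), approximates pi0.
    """
    m = len(pvalues)
    above = sum(p > lam for p in pvalues)
    return min(1.0, above / (m * (1.0 - lam)))


def q_values(pvalues, pi0=1.0):
    """Step-up q-values; pi0=1.0 gives Benjamini-Hochberg adjusted p-values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum
    # of pi0 * p * m / rank over all ranks at or above its own.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pi0 * pvalues[idx] * m / rank)
        q[idx] = running_min
    return q


# Hypothetical p-values for eight genes.
pvals = [0.001, 0.008, 0.02, 0.04, 0.3, 0.6, 0.7, 0.9]
pi0 = estimate_pi0(pvals)
for p, q in zip(pvals, q_values(pvals, pi0)):
    print(f"p={p:<6} q={q:.4f}")
```

    Calling a gene significant when its q-value is below a chosen level then controls the expected share of false discoveries among the genes called, rather than the chance of any single false positive.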

  11. Technology development for gene discovery and full-length sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Marcelo Bento Soares

    2004-07-19

    In previous years, with support from the U.S. Department of Energy, we developed methods for construction of normalized and subtracted cDNA libraries, and constructed hundreds of high-quality libraries for production of Expressed Sequence Tags (ESTs). Our clones were made widely available to the scientific community through the IMAGE Consortium, and millions of ESTs were produced from our libraries either by collaborators or by our own sequencing laboratory at the University of Iowa. During this grant period, we focused on (1) the development of a method for preferential cloning of tissue-specific and/or rare transcripts, (2) its utilization to expedite EST-based gene discovery for the NIH Mouse Brain Molecular Anatomy Project, (3) further development and optimization of a method for construction of full-length-enriched cDNA libraries, and (4) modification of a plasmid vector to maximize efficiency of full-length cDNA sequencing by the transposon-mediated approach. It is noteworthy that the technology developed for preferential cloning of rare mRNAs enabled identification of over 2,000 mouse transcripts differentially expressed in the hippocampus. In addition, the method that we optimized for construction of full-length-enriched cDNA libraries was successfully utilized for the production of approximately fifty libraries from the developing mouse nervous system, from which over 2,500 full-ORF-containing cDNAs have been identified and accurately sequenced in their entirety either by our group or by the NIH-Mammalian Gene Collection Program Sequencing Team.

  12. TargetMine, an integrated data warehouse for candidate gene prioritisation and target discovery.

    Directory of Open Access Journals (Sweden)

    Yi-An Chen

    Prioritising candidate genes for further experimental characterisation is a non-trivial challenge in drug discovery and biomedical research in general. An integrated approach that combines results from multiple data types is best suited for optimal target selection. We developed TargetMine, a data warehouse for efficient target prioritisation. TargetMine utilises the InterMine framework, with new data models such as protein-DNA interactions integrated in a novel way. It enables complicated searches that are difficult to perform with existing tools and it also offers integration of custom annotations and in-house experimental data. We proposed an objective protocol for target prioritisation using TargetMine and set up a benchmarking procedure to evaluate its performance. The results show that the protocol can identify known disease-associated genes with high precision and coverage. A demonstration version of TargetMine is available at http://targetmine.nibio.go.jp/.

  13. CLARM: An integrative approach for functional modules discovery

    KAUST Repository

    Salem, Saeed M.; Alroobi, Rami; Banitaan, Shadi; Seridi, Loqmane; Brewer, James E.; Aljarah, Ibrahim

    2011-01-01

    Functional module discovery aims to find well-connected subnetworks which can serve as candidate protein complexes. Advances in high-throughput proteomic technologies have enabled the collection of large amounts of interaction data as well as gene expression data. We propose CLARM, a clustering algorithm that integrates gene expression profiles and protein-protein interaction networks for biological module discovery. The main premise is that by enriching the interaction network with interactions between genes which are highly co-expressed over a wide range of biological and environmental conditions, we can improve the quality of the discovered modules. Protein-protein interactions, known protein complexes, and gene expression profiles for diverse environmental conditions from the yeast Saccharomyces cerevisiae were used to evaluate the biological significance of the reported modules. Our experiments show that the CLARM approach is competitive with well-established module discovery methods. Copyright © 2011 ACM.
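
    The enrichment premise can be sketched in a few lines of Python: start from the known interaction edges and add an edge for every gene pair whose expression profiles correlate above a cutoff. This is only an illustration of the idea, not the CLARM algorithm itself; the use of Pearson correlation, the `threshold=0.9` cutoff, the function names and the gene labels `g1` to `g3` are all assumptions, and the clustering step is omitted.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either profile is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def enrich_network(ppi_edges, expression, threshold=0.9):
    """Return the PPI edges plus edges between highly co-expressed genes.

    ppi_edges:  iterable of (gene, gene) pairs
    expression: dict mapping gene -> list of expression values
    """
    edges = {frozenset(edge) for edge in ppi_edges}
    for a, b in combinations(sorted(expression), 2):
        if pearson(expression[a], expression[b]) >= threshold:
            edges.add(frozenset((a, b)))
    return edges


# Hypothetical data: g1 and g2 are strongly co-expressed, g3 is not.
ppi = [("g2", "g3")]
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 1.0, 3.0, 2.0],
}
enriched = enrich_network(ppi, profiles)
print(sorted(sorted(edge) for edge in enriched))
```

    A module-finding step would then run on the enriched edge set instead of the raw interaction network.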

  14. Cancer Biomarker Discovery: Lectin-Based Strategies Targeting Glycoproteins

    Directory of Open Access Journals (Sweden)

    David Clark

    2012-01-01

    Biomarker discovery can identify molecular markers in various cancers that can be used for detection, screening, diagnosis, and monitoring of disease progression. Lectin-affinity is a technique that can be used for the enrichment of glycoproteins from a complex sample, facilitating the discovery of novel cancer biomarkers associated with a disease state.

  15. Independent Gene Discovery and Testing

    Science.gov (United States)

    Palsule, Vrushalee; Coric, Dijana; Delancy, Russell; Dunham, Heather; Melancon, Caleb; Thompson, Dennis; Toms, Jamie; White, Ashley; Shultz, Jeffry

    2010-01-01

    A clear understanding of basic gene structure is critical when teaching molecular genetics, the central dogma and the biological sciences. We sought to create a gene-based teaching project to improve students' understanding of gene structure and to integrate this into a research project that can be implemented by instructors at the secondary level…

  16. RNA-Seq analysis and gene discovery of Andrias davidianus using Illumina short read sequencing.

    Directory of Open Access Journals (Sweden)

    Fenggang Li

    The Chinese giant salamander, Andrias davidianus, is an important species in the course of evolution; however, there is insufficient genomic data in public databases for understanding its immunologic mechanisms. High-throughput transcriptome sequencing is necessary to generate an enormous number of transcript sequences from A. davidianus for gene discovery. In this study, we generated more than 40 million reads from samples of spleen and skin tissue using the Illumina paired-end sequencing technology. De novo assembly yielded 87,297 transcripts with a mean length of 734 base pairs (bp). Based on sequence similarity searches against known proteins, 38,916 genes were identified. Gene enrichment analysis determined that 981 transcripts were assigned to the immune system. Tissue-specific expression analysis indicated that 443 transcripts were specifically expressed in the spleen and skin. Among these transcripts, 147 transcripts were found to be involved in immune responses and inflammatory reactions, such as fucolectin, β-defensins and lymphotoxin beta. Eight tissue-specific genes were selected for validation using real time reverse transcription quantitative PCR (qRT-PCR). The results showed that these genes were significantly more expressed in spleen and skin than in other tissues, suggesting that these genes have vital roles in the immune response. This work provides a comprehensive genomic sequence resource for A. davidianus and lays the foundation for future research on the immunologic and disease resistance mechanisms of A. davidianus and other amphibians.

  17. Target genes discovery through copy number alteration analysis in human hepatocellular carcinoma.

    Science.gov (United States)

    Gu, De-Leung; Chen, Yen-Hsieh; Shih, Jou-Ho; Lin, Chi-Hung; Jou, Yuh-Shan; Chen, Chian-Feng

    2013-12-21

    High-throughput short-read sequencing of exomes and whole cancer genomes in multiple human hepatocellular carcinoma (HCC) cohorts confirmed previously identified frequently mutated somatic genes, such as TP53, CTNNB1 and AXIN1, and identified several novel genes with moderate mutation frequencies, including ARID1A, ARID2, MLL, MLL2, MLL3, MLL4, IRF2, ATM, CDKN2A, FGF19, PIK3CA, RPS6KA3, JAK1, KEAP1, NFE2L2, C16orf62, LEPR, RAC2, and IL6ST. Functional classification of these mutated genes suggested that alterations in pathways participating in chromatin remodeling, Wnt/β-catenin signaling, JAK/STAT signaling, and oxidative stress play critical roles in HCC tumorigenesis. Nevertheless, because there are few druggable genes used in HCC therapy, the identification of new therapeutic targets through integrated genomic approaches remains an important task. Because a large amount of HCC genomic data genotyped by high density single nucleotide polymorphism arrays is deposited in the public domain, copy number alteration (CNA) analysis of these arrays is a cost-effective way to reveal target genes through profiling of recurrent and overlapping amplicons, homozygous deletions and potentially unbalanced chromosomal translocations accumulated during HCC progression. Moreover, integration of CNAs with other high-throughput genomic data, such as aberrantly coding transcriptomes and non-coding gene expression in human HCC tissues and rodent HCC models, provides lines of evidence that can be used to facilitate the identification of novel HCC target genes with the potential of improving the survival of HCC patients.

  18. Cyanobacteria: photosynthetic factories combining biodiversity, radiation resistance, and genetics to facilitate drug discovery.

    Science.gov (United States)

    Cassier-Chauvat, Corinne; Dive, Vincent; Chauvat, Franck

    2017-02-01

    Cyanobacteria are ancient, abundant, and widely diverse photosynthetic prokaryotes, which are viewed as promising cell factories for the ecologically responsible production of chemicals. Natural cyanobacteria synthesize a vast array of biologically active (secondary) metabolites with great potential for human health, while a few genetic models can be engineered for the (low level) production of biofuels. Recently, genome sequencing and mining has revealed that natural cyanobacteria have the capacity to produce many more secondary metabolites than have been characterized. The corresponding panoply of enzymes (polyketide synthases and non-ribosomal peptide synthases) of interest for synthetic biology can still be increased through gene manipulations with the tools available for the few genetically manipulable strains. In this review, we propose to exploit the metabolic diversity and radiation resistance of cyanobacteria, and when required the genetics of model strains, for the production and radioactive (14C) labeling of bioactive products, in order to facilitate the screening for new drugs.

  19. Exploiting Pre-rRNA Processing in Diamond Blackfan Anemia Gene Discovery and Diagnosis

    Science.gov (United States)

    Farrar, Jason E.; Quarello, Paola; Fisher, Ross; O’Brien, Kelly A.; Aspesi, Anna; Parrella, Sara; Henson, Adrianna L.; Seidel, Nancy E.; Atsidaftos, Eva; Prakash, Supraja; Bari, Shahla; Garelli, Emanuela; Arceci, Robert J.; Dianzani, Irma; Ramenghi, Ugo; Vlachos, Adrianna; Lipton, Jeffrey M.; Bodine, David M.; Ellis, Steven R.

    2014-01-01

    Diamond Blackfan anemia (DBA), a syndrome primarily characterized by anemia and physical abnormalities, is one among a group of related inherited bone marrow failure syndromes (IBMFS) which share overlapping clinical features. Heterozygous mutations or single-copy deletions have been identified in 12 ribosomal protein genes in approximately 60% of DBA cases, with the genetic etiology unexplained in most remaining patients. Unlike many IBMFS, for which functional screening assays complement clinical and genetic findings, suspected DBA in the absence of typical alterations of the known genes must frequently be diagnosed after exclusion of other IBMFS. We report here a novel deletion in a child that presented such a diagnostic challenge and prompted development of a novel functional assay that can assist in the diagnosis of a significant fraction of patients with DBA. The ribosomal proteins affected in DBA are required for pre-rRNA processing, a process which can be interrogated to monitor steps in the maturation of 40S and 60S ribosomal subunits. In contrast to prior methods used to assess pre-rRNA processing, the assay reported here, based on capillary electrophoresis measurement of the maturation of rRNA in pre-60S ribosomal subunits, would be readily amenable to use in diagnostic laboratories. In addition to utility as a diagnostic tool, we applied this technique to gene discovery in DBA, resulting in the identification of RPL31 as a novel DBA gene. PMID:25042156

  20. Gene discovery in Triatoma infestans

    Directory of Open Access Journals (Sweden)

    de Burgos Nelia

    2011-03-01

    Abstract Background Triatoma infestans is the most relevant vector of Chagas disease in the southern cone of South America. Since its genome has not yet been studied, sequencing of Expressed Sequence Tags (ESTs) is one of the most powerful tools for efficiently identifying large numbers of expressed genes in this insect vector. Results In this work, we generated 826 ESTs, resulting in an increase of 47% in the number of ESTs available for T. infestans. These ESTs were assembled in 471 unique sequences, 151 of which represent 136 new genes for the Reduviidae family. Conclusions Among the putative new genes for the Reduviidae family, we identified and described an interesting subset of genes involved in development and reproduction, which constitute potential targets for insecticide development.

  1. Discovery of novel bacterial toxins by genomics and computational biology.

    Science.gov (United States)

    Doxey, Andrew C; Mansfield, Michael J; Montecucco, Cesare

    2018-06-01

    Hundreds and hundreds of bacterial protein toxins are presently known. Traditionally, toxin identification begins with pathological studies of bacterial infectious disease. Following identification and cultivation of a bacterial pathogen, the protein toxin is purified from the culture medium and its pathogenic activity is studied using the methods of biochemistry and structural biology, cell biology, tissue and organ biology, and appropriate animal models, supplemented by bioimaging techniques. The ongoing and explosive development of high-throughput DNA sequencing and bioinformatic approaches has set in motion a revolution in many fields of biology, including microbiology. One consequence is that genes encoding novel bacterial toxins can be identified by bioinformatic and computational methods based on previous knowledge accumulated from studies of the biology and pathology of thousands of known bacterial protein toxins. Starting from the paradigmatic cases of diphtheria toxin, tetanus and botulinum neurotoxins, this review discusses traditional experimental approaches as well as bioinformatics and genomics-driven approaches that facilitate the discovery of novel bacterial toxins. We discuss recent work on the identification of novel botulinum-like toxins from genera such as Weissella, Chryseobacterium, and Enterococcus, and the implications of these computationally identified toxins in the field. Finally, we discuss the promise of metagenomics in the discovery of novel toxins and their ecological niches, and present data suggesting the existence of uncharacterized, botulinum-like toxin genes in insect gut metagenomes. Copyright © 2018. Published by Elsevier Ltd.

  2. Helping Students Understand Gene Regulation with Online Tools: A Review of MEME and Melina II, Motif Discovery Tools for Active Learning in Biology

    Directory of Open Access Journals (Sweden)

    David Treves

    2012-08-01

    Review of: MEME and Melina II, which are two free and easy-to-use online motif discovery tools that can be employed to actively engage students in learning about gene regulatory elements.

  3. Arrayed antibody library technology for therapeutic biologic discovery.

    Science.gov (United States)

    Bentley, Cornelia A; Bazirgan, Omar A; Graziano, James J; Holmes, Evan M; Smider, Vaughn V

    2013-03-15

    Traditional immunization and display antibody discovery methods rely on competitive selection amongst a pool of antibodies to identify a lead. While this approach has led to many successful therapeutic antibodies, targets have been limited to proteins which are easily purified. In addition, selection driven discovery has produced a narrow range of antibody functionalities focused on high affinity antagonism. We review the current progress in developing arrayed protein libraries for screening-based, rather than selection-based, discovery. These single molecule per microtiter well libraries have been screened in multiplex formats against both purified antigens and directly against targets expressed on the cell surface. This facilitates the discovery of antibodies against therapeutically interesting targets (GPCRs, ion channels, and other multispanning membrane proteins) and epitopes that have been considered poorly accessible to conventional discovery methods. Copyright © 2013. Published by Elsevier Inc.

  4. Developmental transitions in Arabidopsis are regulated by antisense RNAs resulting from bidirectionally transcribed genes.

    Science.gov (United States)

    Krzyczmonik, Katarzyna; Wroblewska-Swiniarska, Agata; Swiezewski, Szymon

    2017-07-03

    Transcription terminators are DNA elements located at the 3' end of genes that ensure efficient cleavage of nascent RNA generating the 3' end of mRNA, as well as facilitating disengagement of elongating DNA-dependent RNA polymerase II. Surprisingly, terminators are also a potent source of antisense transcription. We have recently described an Arabidopsis antisense transcript originating from the 3' end of a master regulator of Arabidopsis thaliana seed dormancy DOG1. In this review, we discuss the broader implications of our discovery in light of recent developments in yeast and Arabidopsis. We show that, surprisingly, the key features of terminators that give rise to antisense transcription are preserved between Arabidopsis and yeast, suggesting a conserved mechanism. We also compare our discovery to known antisense-based regulatory mechanisms, highlighting the link between antisense-based gene expression regulation and major developmental transitions in plants.

  5. Biomimicry as a basis for drug discovery.

    Science.gov (United States)

    Kolb, V M

    1998-01-01

    Selected works are discussed which clearly demonstrate that mimicking various aspects of the process by which natural products evolved is becoming a powerful tool in contemporary drug discovery. Natural products are an established and rich source of drugs. The term "natural product" is often used synonymously with "secondary metabolite." Knowledge of genetics and molecular evolution helps us understand how biosynthesis of many classes of secondary metabolites evolved. One proposed hypothesis is termed "inventive evolution." It invokes duplication of genes, and mutation of the gene copies, among other genetic events. The modified duplicate genes, per se or in conjunction with other genetic events, may give rise to new enzymes, which, in turn, may generate new products, some of which may be selected for. Steps of the inventive evolution can be mimicked in several ways for the purpose of drug discovery. For example, libraries of chemical compounds of any imaginable structure may be produced by combinatorial synthesis. Out of these libraries new active compounds can be selected. In another example, genetic systems can be manipulated to produce modified natural products ("unnatural natural products"), from which new drugs can be selected. In some instances, similar natural products turn up in species that are not direct descendants of each other. This is presumably due to horizontal gene transfer. The mechanism of this inter-species gene transfer can be mimicked in therapeutic gene delivery. Mimicking specifics or principles of chemical evolution, including experimental and test-tube evolution, also provides leads for new drug discovery.

  6. Interestingness measures and strategies for mining multi-ontology multi-level association rules from gene ontology annotations for the discovery of new GO relationships.

    Science.gov (United States)

    Manda, Prashanti; McCarthy, Fiona; Bridges, Susan M

    2013-10-01

    The Gene Ontology (GO), a set of three sub-ontologies, is one of the most popular bio-ontologies used for describing gene product characteristics. GO annotation data containing terms from multiple sub-ontologies and at different levels in the ontologies is an important source of implicit relationships between terms from the three sub-ontologies. Data mining techniques such as association rule mining that are tailored to mine from multiple ontologies at multiple levels of abstraction are required for effective knowledge discovery from GO annotation data. We present a data mining approach, Multi-ontology data mining at All Levels (MOAL) that uses the structure and relationships of the GO to mine multi-ontology multi-level association rules. We introduce two interestingness measures: Multi-ontology Support (MOSupport) and Multi-ontology Confidence (MOConfidence) customized to evaluate multi-ontology multi-level association rules. We also describe a variety of post-processing strategies for pruning uninteresting rules. We use publicly available GO annotation data to demonstrate our methods with respect to two applications (1) the discovery of co-annotation suggestions and (2) the discovery of new cross-ontology relationships. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
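
    For readers unfamiliar with association rule mining, the following Python sketch shows the standard single-level support and confidence measures on which such interestingness measures are usually built. It does not reproduce MOSupport or MOConfidence, whose definitions are not given in the abstract; the annotation sets and term labels (`MF:kinase`, `BP:signaling` and so on) are invented placeholders rather than real GO identifiers.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = frozenset(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical annotations: one set of term labels per gene product.
annotations = [
    frozenset({"MF:kinase", "BP:signaling"}),
    frozenset({"MF:kinase", "BP:signaling", "CC:membrane"}),
    frozenset({"MF:kinase", "CC:cytosol"}),
    frozenset({"BP:signaling"}),
]

rule_support = support(annotations, {"MF:kinase", "BP:signaling"})
rule_confidence = confidence(annotations, {"MF:kinase"}, {"BP:signaling"})
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

    A rule whose antecedent and consequent come from different sub-ontologies, as in this toy rule from a molecular-function label to a biological-process label, is the kind of cross-ontology relationship the abstract refers to.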

  7. Drosophila TDP-43 RNA-Binding Protein Facilitates Association of Sister Chromatid Cohesion Proteins with Genes, Enhancers and Polycomb Response Elements.

    Directory of Open Access Journals (Sweden)

    Amanda Swain

    2016-09-01

    The cohesin protein complex mediates sister chromatid cohesion and participates in transcriptional control of genes that regulate growth and development. Substantial reduction of cohesin activity alters transcription of many genes without disrupting chromosome segregation. Drosophila Nipped-B protein loads cohesin onto chromosomes, and together Nipped-B and cohesin occupy essentially all active transcriptional enhancers and a large fraction of active genes. It is unknown why some active genes bind high levels of cohesin and some do not. Here we show that the TBPH and Lark RNA-binding proteins influence association of Nipped-B and cohesin with genes and gene regulatory sequences. In vitro, TBPH and Lark proteins specifically bind RNAs produced by genes occupied by Nipped-B and cohesin. By genomic chromatin immunoprecipitation these RNA-binding proteins also bind to chromosomes at cohesin-binding genes, enhancers, and Polycomb response elements (PREs). RNAi depletion reveals that TBPH facilitates association of Nipped-B and cohesin with genes and regulatory sequences. Lark reduces binding of Nipped-B and cohesin at many promoters and aids their association with several large enhancers. Conversely, Nipped-B facilitates TBPH and Lark association with genes and regulatory sequences, and interacts with TBPH and Lark in affinity chromatography and immunoprecipitation experiments. Blocking transcription does not ablate binding of Nipped-B and the RNA-binding proteins to chromosomes, indicating transcription is not required to maintain binding once established. These findings demonstrate that RNA-binding proteins help govern association of sister chromatid cohesion proteins with genes and enhancers.

  8. Promzea: a pipeline for discovery of co-regulatory motifs in maize and other plant species and its application to the anthocyanin and phlobaphene biosynthetic pathways and the Maize Development Atlas.

    Science.gov (United States)

    Liseron-Monfils, Christophe; Lewis, Tim; Ashlock, Daniel; McNicholas, Paul D; Fauteux, François; Strömvik, Martina; Raizada, Manish N

    2013-03-15

    The discovery of genetic networks and cis-acting DNA motifs underlying their regulation is a major objective of transcriptome studies. The recent release of the maize genome (Zea mays L.) has facilitated in silico searches for regulatory motifs. Several algorithms exist to predict cis-acting elements, but none have been adapted for maize. A benchmark data set was used to evaluate the accuracy of three motif discovery programs: BioProspector, Weeder and MEME. Analysis showed that each motif discovery tool had limited accuracy and appeared to retrieve a distinct set of motifs. Therefore, using the benchmark, statistical filters were optimized to reduce the false discovery ratio, and then remaining motifs from all programs were combined to improve motif prediction. These principles were integrated into a user-friendly pipeline for motif discovery in maize called Promzea, available at http://www.promzea.org and on the Discovery Environment of the iPlant Collaborative website. Promzea was subsequently expanded to include rice and Arabidopsis. Within Promzea, a user enters cDNA sequences or gene IDs; corresponding upstream sequences are retrieved from the maize genome. Predicted motifs are filtered, combined and ranked. Promzea searches the chosen plant genome for genes containing each candidate motif, providing the user with the gene list and corresponding gene annotations. Promzea was validated in silico using a benchmark data set: the Promzea pipeline showed a 22% increase in nucleotide sensitivity compared to the best standalone program tool, Weeder, with equivalent nucleotide specificity. Promzea was also validated by its ability to retrieve the experimentally defined binding sites of transcription factors that regulate the maize anthocyanin and phlobaphene biosynthetic pathways. Promzea predicted additional promoter motifs, and genome-wide motif searches by Promzea identified 127 non-anthocyanin/phlobaphene genes that each contained all five predicted promoter

  9. Forever in Bluegenes: BlueGenes / InterMine Poster for BOSC 2017

    OpenAIRE

    Yehudi, Yo; Mine, Inter; Clark-Casey, Justin; Butano, Daniela

    2017-01-01

    The plight of the computational biologist, or indeed any data scientist, often boils down to data: data may be badly formatted, missing information, hard to access and hard to integrate with other sources. Similarly, user interfaces designed for data analysis may not be easy to use, hindering scientific analysis rather than assisting it. BlueGenes is designed to facilitate biological data discovery and analysis in a user-friendly and enjoyable way, without requiring that scientists write code ...

  10. Human transporter database: comprehensive knowledge and discovery tools in the human transporter genes.

    Directory of Open Access Journals (Sweden)

    Adam Y Ye

    Transporters are essential in homeostatic exchange of endogenous and exogenous substances at the systematic, organic, cellular, and subcellular levels. Gene mutations of transporters are often related to pharmacogenetic traits. Recent developments in high throughput technologies on genomics, transcriptomics and proteomics allow in-depth studies of transporter genes in normal cellular processes and diverse disease conditions. The flood of high throughput data has resulted in an urgent need for an updated knowledgebase with curated, organized, and annotated human transporters in an easily accessible way. Using a pipeline with the combination of automated keywords query, sequence similarity search and manual curation on transporters, we collected 1,555 human non-redundant transporter genes to develop the Human Transporter Database (HTD) (http://htd.cbi.pku.edu.cn). Based on the extensive annotations, global properties of the transporter genes were illustrated, such as expression patterns and polymorphisms in relationships with their ligands. We noted that the human transporters were enriched in many fundamental biological processes such as oxidative phosphorylation and cardiac muscle contraction, and significantly associated with Mendelian and complex diseases such as epilepsy and sudden infant death syndrome. Overall, HTD provides a well-organized interface that helps research communities search detailed molecular and genetic information of transporters for development of personalized medicine.

  11. On reliable discovery of molecular signatures

    Directory of Open Access Journals (Sweden)

    Björkegren Johan

    2009-01-01

    Abstract Background Molecular signatures are sets of genes, proteins, genetic variants or other variables that can be used as markers for a particular phenotype. Reliable signature discovery methods could yield valuable insight into cell biology and mechanisms of human disease. However, it is currently not clear how to control error rates such as the false discovery rate (FDR) in signature discovery. Moreover, signatures for cancer gene expression have been shown to be unstable, that is, difficult to replicate in independent studies, casting doubts on their reliability. Results We demonstrate that with modern prediction methods, signatures that yield accurate predictions may still have a high FDR. Further, we show that even signatures with low FDR may fail to replicate in independent studies due to limited statistical power. Thus, neither stability nor predictive accuracy is relevant when FDR control is the primary goal. We therefore develop a general statistical hypothesis testing framework that for the first time provides FDR control for signature discovery. Our method is demonstrated to be correct in simulation studies. When applied to five cancer data sets, the method was able to discover molecular signatures with 5% FDR in three cases, while two data sets yielded no significant findings. Conclusion Our approach enables reliable discovery of molecular signatures from genome-wide data with current sample sizes. The statistical framework developed herein is potentially applicable to a wide range of prediction problems in bioinformatics.
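
    To illustrate what controlling the FDR at a chosen level means in practice, the sketch below implements the standard Benjamini-Hochberg step-up procedure. This is a substitute shown for illustration only, not the hypothesis testing framework developed in the paper; the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the sorted indices of hypotheses rejected at FDR level alpha."""
    m = len(p_values)
    # Rank hypotheses from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank k with p_(k) <= alpha * k / m.
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    return sorted(order[:cutoff_rank])


# Hypothetical per-gene p-values (not taken from the paper).
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
rejected = benjamini_hochberg(p, alpha=0.05)
print(rejected)
```

    With these eight values only the two smallest p-values pass their rank-specific thresholds (0.00625 and 0.0125), so two genes would enter the signature at a nominal 5% FDR.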

  12. Antibiotic discovery throughout the Small World Initiative: A molecular strategy to identify biosynthetic gene clusters involved in antagonistic activity.

    Science.gov (United States)

    Davis, Elizabeth; Sloan, Tyler; Aurelius, Krista; Barbour, Angela; Bodey, Elijah; Clark, Brigette; Dennis, Celeste; Drown, Rachel; Fleming, Megan; Humbert, Allison; Glasgo, Elizabeth; Kerns, Trent; Lingro, Kelly; McMillin, MacKenzie; Meyer, Aaron; Pope, Breanna; Stalevicz, April; Steffen, Brittney; Steindl, Austin; Williams, Carolyn; Wimberley, Carmen; Zenas, Robert; Butela, Kristen; Wildschutte, Hans

    2017-06-01

    The emergence of bacterial pathogens resistant to all known antibiotics is a global health crisis. Adding to this problem is that major pharmaceutical companies have shifted away from antibiotic discovery due to low profitability. As a result, the pipeline of new antibiotics is essentially dry and many bacteria now resist the effects of most commonly used drugs. To address this global health concern, citizen science through the Small World Initiative (SWI) was formed in 2012. As part of SWI, students isolate bacteria from their local environments, characterize the strains, and assay for antibiotic production. During the 2015 fall semester at Bowling Green State University, students isolated 77 soil-derived bacteria and genetically characterized strains using the 16S rRNA gene, identified strains exhibiting antagonistic activity, and performed an expanded SWI workflow using transposon mutagenesis to identify a biosynthetic gene cluster involved in toxigenic compound production. We identified one mutant with loss of antagonistic activity and through subsequent whole-genome sequencing and linker-mediated PCR identified a 24.9 kb biosynthetic gene locus likely involved in inhibitory activity in that mutant. Further assessment against human pathogens demonstrated the inhibition of Bacillus cereus, Listeria monocytogenes, and methicillin-resistant Staphylococcus aureus in the presence of this compound, thus supporting our molecular strategy as an effective research pipeline for SWI antibiotic discovery and genetic characterization. © 2017 The Authors. MicrobiologyOpen published by John Wiley & Sons Ltd.

  13. The Biomedical Resource Ontology (BRO) to enable resource discovery in clinical and translational research.

    Science.gov (United States)

    Tenenbaum, Jessica D; Whetzel, Patricia L; Anderson, Kent; Borromeo, Charles D; Dinov, Ivo D; Gabriel, Davera; Kirschner, Beth; Mirel, Barbara; Morris, Tim; Noy, Natasha; Nyulas, Csongor; Rubenson, David; Saxman, Paul R; Singh, Harpreet; Whelan, Nancy; Wright, Zach; Athey, Brian D; Becich, Michael J; Ginsburg, Geoffrey S; Musen, Mark A; Smith, Kevin A; Tarantal, Alice F; Rubin, Daniel L; Lyster, Peter

    2011-02-01

    The biomedical research community relies on a diverse set of resources, both within their own institutions and at other research centers. In addition, an increasing number of shared electronic resources have been developed. Without effective means to locate and query these resources, it is challenging, if not impossible, for investigators to be aware of the myriad resources available, or to effectively perform resource discovery when the need arises. In this paper, we describe the development and use of the Biomedical Resource Ontology (BRO) to enable semantic annotation and discovery of biomedical resources. We also describe the Resource Discovery System (RDS) which is a federated, inter-institutional pilot project that uses the BRO to facilitate resource discovery on the Internet. Through the RDS framework and its associated Biositemaps infrastructure, the BRO facilitates semantic search and discovery of biomedical resources, breaking down barriers and streamlining scientific research that will improve human health. Copyright © 2010 Elsevier Inc. All rights reserved.

  14. Effector genomics accelerates discovery and functional profiling of potato disease resistance and phytophthora infestans avirulence genes.

    Directory of Open Access Journals (Sweden)

    Vivianne G A A Vleeshouwers

    Potato is the world's fourth largest food crop yet it continues to endure late blight, a devastating disease caused by the Irish famine pathogen Phytophthora infestans. Breeding broad-spectrum disease resistance (R) genes into potato (Solanum tuberosum) is the best strategy for genetically managing late blight but current approaches are slow and inefficient. We used a repertoire of effector genes predicted computationally from the P. infestans genome to accelerate the identification, functional characterization, and cloning of potentially broad-spectrum R genes. An initial set of 54 effectors containing a signal peptide and an RXLR motif was profiled for activation of innate immunity (avirulence or Avr activity) on wild Solanum species and tentative Avr candidates were identified. The RXLR effector family IpiO induced hypersensitive responses (HR) in S. stoloniferum, S. papita and the more distantly related S. bulbocastanum, the source of the R gene Rpi-blb1. Genetic studies with S. stoloniferum showed cosegregation of resistance to P. infestans and response to IpiO. Transient co-expression of IpiO with Rpi-blb1 in a heterologous Nicotiana benthamiana system identified IpiO as Avr-blb1. A candidate gene approach led to the rapid cloning of S. stoloniferum Rpi-sto1 and S. papita Rpi-pta1, which are functionally equivalent to Rpi-blb1. Our findings indicate that effector genomics enables discovery and functional profiling of late blight R genes and Avr genes at an unprecedented rate and promises to accelerate the engineering of late blight resistant potato varieties.

  15. Genetic and epigenetic control of gene expression by CRISPR–Cas systems

    Science.gov (United States)

    Lo, Albert; Qi, Lei

    2017-01-01

    The discovery and adaptation of bacterial clustered regularly interspaced short palindromic repeats (CRISPR)–CRISPR-associated (Cas) systems has revolutionized the way researchers edit genomes. Engineering of catalytically inactivated Cas variants (nuclease-deficient or nuclease-deactivated [dCas]) combined with transcriptional repressors, activators, or epigenetic modifiers enables sequence-specific regulation of gene expression and chromatin state. These CRISPR–Cas-based technologies have contributed to the rapid development of disease models and functional genomics screening approaches, which can facilitate genetic target identification and drug discovery. In this short review, we will cover recent advances of CRISPR–dCas9 systems and their use for transcriptional repression and activation, epigenome editing, and engineered synthetic circuits for complex control of the mammalian genome. PMID:28649363

  16. Cancer in silico drug discovery: a systems biology tool for identifying candidate drugs to target specific molecular tumor subtypes.

    Science.gov (United States)

    San Lucas, F Anthony; Fowler, Jerry; Chang, Kyle; Kopetz, Scott; Vilar, Eduardo; Scheet, Paul

    2014-12-01

    Large-scale cancer datasets such as The Cancer Genome Atlas (TCGA) allow researchers to profile tumors based on a wide range of clinical and molecular characteristics. Subsequently, TCGA-derived gene expression profiles can be analyzed with the Connectivity Map (CMap) to find candidate drugs to target tumors with specific clinical phenotypes or molecular characteristics. This represents a powerful computational approach for candidate drug identification, but due to the complexity of TCGA and technology differences between CMap and TCGA experiments, such analyses are challenging to conduct and reproduce. We present Cancer in silico Drug Discovery (CiDD; scheet.org/software), a computational drug discovery platform that addresses these challenges. CiDD integrates data from TCGA, CMap, and Cancer Cell Line Encyclopedia (CCLE) to perform computational drug discovery experiments, generating hypotheses for the following three general problems: (i) determining whether specific clinical phenotypes or molecular characteristics are associated with unique gene expression signatures; (ii) finding candidate drugs to repress these expression signatures; and (iii) identifying cell lines that resemble the tumors being studied for subsequent in vitro experiments. The primary input to CiDD is a clinical or molecular characteristic. The output is a biologically annotated list of candidate drugs and a list of cell lines for in vitro experimentation. We applied CiDD to identify candidate drugs to treat colorectal cancers harboring mutations in BRAF. CiDD identified EGFR and proteasome inhibitors, while proposing five cell lines for in vitro testing. CiDD facilitates phenotype-driven, systematic drug discovery based on clinical and molecular data from TCGA. ©2014 American Association for Cancer Research.

  17. Discovery of Antibiotics-derived Polymers for Gene Delivery using Combinatorial Synthesis and Cheminformatics Modeling

    Science.gov (United States)

    Potta, Thrimoorthy; Zhen, Zhuo; Grandhi, Taraka Sai Pavan; Christensen, Matthew D.; Ramos, James; Breneman, Curt M.; Rege, Kaushal

    2014-01-01

    We describe the combinatorial synthesis and cheminformatics modeling of aminoglycoside antibiotics-derived polymers for transgene delivery and expression. Fifty-six polymers were synthesized by polymerizing aminoglycosides with diglycidyl ether cross-linkers. Parallel screening resulted in identification of several lead polymers that resulted in high transgene expression levels in cells. The role of polymer physicochemical properties in determining efficacy of transgene expression was investigated using Quantitative Structure-Activity Relationship (QSAR) cheminformatics models based on Support Vector Regression (SVR) and ‘building block’ polymer structures. The QSAR model exhibited high predictive ability, and investigation of descriptors in the model, using molecular visualization and correlation plots, indicated that physicochemical attributes related to both aminoglycosides and diglycidyl ethers facilitated transgene expression. This work synergistically combines combinatorial synthesis and parallel screening with cheminformatics-based QSAR models for discovery and physicochemical elucidation of effective antibiotics-derived polymers for transgene delivery in medicine and biotechnology. PMID:24331709

  18. Spread of a new parasitic B chromosome variant is facilitated by high gene flow.

    Directory of Open Access Journals (Sweden)

    María Inmaculada Manrique-Poyato

    The B24 chromosome variant emerged several decades ago in a Spanish population of the grasshopper Eyprepocnemis plorans and is currently reaching adjacent populations. Here we report, for the first time, how a parasitic B chromosome (a strictly vertically transmitted parasite) expands its geographical range aided by high gene flow in the host species. For six years we analyzed B frequency in several populations to the east and west of the original population and found extensive spatial variation, but only a slight temporal trend. The highest B24 frequency was found in its original population (Torrox) and it decreased closer to both the eastern and the western populations. The analysis of Inter Simple Sequence Repeat (ISSR) markers showed the existence of a low but significant degree of population subdivision, as well as significant isolation by distance (IBD). Pairwise Nem estimates suggested the existence of high gene flow between the four populations located in the Torrox area, with higher values towards the east. No significant barriers to gene flow were found among these four populations, and we conclude that high gene flow is facilitating B24 diffusion both eastward and westward, with a minor role for B24 drive due to the arrival of drive suppressor genes which are also frequent in the donor population.

  19. Discovery of possible gene relationships through the application of self-organizing maps to DNA microarray databases.

    Science.gov (United States)

    Chavez-Alvarez, Rocio; Chavoya, Arturo; Mendez-Vazquez, Andres

    2014-01-01

    DNA microarrays and cell cycle synchronization experiments have made possible the study of the mechanisms of cell cycle regulation of Saccharomyces cerevisiae by simultaneously monitoring the expression levels of thousands of genes at specific time points. On the other hand, pattern recognition techniques can contribute to the analysis of such massive measurements, providing a model of gene expression level evolution through the cell cycle process. In this paper, we propose the use of one such technique, an unsupervised artificial neural network called a Self-Organizing Map (SOM), which has been successfully applied to processes involving very noisy signals, classifying and organizing them, and assisting in the discovery of behavior patterns without requiring prior knowledge about the process under analysis. As a test bed for the use of SOMs in finding possible relationships among genes and their possible contribution in some biological processes, we selected 282 S. cerevisiae genes that have been shown through biological experiments to have an activity during the cell cycle. The expression level of these genes was analyzed in five of the most cited time series DNA microarray databases used in the study of the cell cycle of this organism. With the use of SOM, it was possible to find clusters of genes with similar behavior in the five databases along two cell cycles. This result suggested that some of these genes might be biologically related or might have a regulatory relationship, as was corroborated by comparing some of the clusters obtained with SOMs against a previously reported regulatory network that was generated using biological knowledge, such as protein-protein interactions, gene expression levels, metabolism dynamics, promoter binding, and modification, regulation and transport of proteins. The methodology described in this paper could be applied to the study of gene relationships of other biological processes in different organisms.
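
    The general idea of clustering expression time series with a SOM can be sketched as follows. This is a minimal one-dimensional SOM written from the textbook algorithm, not the authors' configuration; the map size, decay schedules, gene names and expression profiles are all hypothetical.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda u: sum((w - v) ** 2 for w, v in zip(weights[u], x)),
    )


def train_som(data, n_units=4, epochs=100, lr0=0.5, sigma0=1.5, seed=0):
    """Train a one-dimensional SOM; returns one weight vector per map unit."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.1)  # neighbourhood width shrinks
        for x in rng.sample(data, len(data)):    # visit profiles in random order
            bmu = best_matching_unit(weights, x)
            for u in range(n_units):
                # Units close to the winner on the map are pulled towards x too.
                h = math.exp(-((u - bmu) ** 2) / (2.0 * sigma ** 2))
                for d in range(dim):
                    weights[u][d] += lr * h * (x[d] - weights[u][d])
    return weights


# Hypothetical expression profiles over four time points (not real data).
profiles = {
    "geneA": [1.0, 0.8, 0.1, 0.0],
    "geneB": [0.9, 0.9, 0.2, 0.1],
    "geneC": [0.0, 0.1, 0.9, 1.0],
    "geneD": [0.1, 0.2, 0.8, 0.9],
}
som = train_som(list(profiles.values()))
clusters = {gene: best_matching_unit(som, p) for gene, p in profiles.items()}
print(clusters)
```

    Genes assigned to the same or neighbouring units have similar profiles, which is the property the study uses to propose candidate relationships; any such grouping would still need biological corroboration.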

  20. Model-driven discovery of underground metabolic functions in Escherichia coli

    DEFF Research Database (Denmark)

    Guzmán, Gabriela I.; Utrilla, José; Nurk, Sergey

    2015-01-01

    -scale models, which have been widely used for predicting growth phenotypes in various environments or following a genetic perturbation; however, these predictions occasionally fail. Failed predictions of gene essentiality offer an opportunity for targeting biological discovery, suggesting the presence......E, and gltA and prpC. This study demonstrates how a targeted model-driven approach to discovery can systematically fill knowledge gaps, characterize underground metabolism, and elucidate regulatory mechanisms of adaptation in response to gene KO perturbations....

  1. Maximizing biomarker discovery by minimizing gene signatures

    Directory of Open Access Journals (Sweden)

    Chang Chang

    2011-12-01

    Abstract Background The use of gene signatures can potentially be of considerable value in the field of clinical diagnosis. However, gene signatures defined with different methods can be quite various even when applied to the same disease and the same endpoint. Previous studies have shown that the correct selection of subsets of genes from microarray data is key for the accurate classification of disease phenotypes, and a number of methods have been proposed for the purpose. However, these methods refine the subsets by only considering each single feature, and they do not confirm the association between the genes identified in each gene signature and the phenotype of the disease. We proposed an innovative new method termed Minimize Feature's Size (MFS) based on multiple level similarity analyses and association between the genes and disease for breast cancer endpoints by comparing classifier models generated from the second phase of MicroArray Quality Control (MAQC-II), trying to develop effective meta-analysis strategies to transform the MAQC-II signatures into a robust and reliable set of biomarkers for clinical applications. Results We analyzed the similarity of the multiple gene signatures in an endpoint and between the two endpoints of breast cancer at probe and gene levels; the results indicate that disease-related genes can be preferably selected as the components of gene signature, and that the gene signatures for the two endpoints could be interchangeable. The minimized signatures were built at probe level by using MFS for each endpoint. By applying the approach, we generated a much smaller set of gene signatures with similar predictive power compared with those gene signatures from MAQC-II. Conclusions Our results indicate that gene signatures of both large and small sizes could perform equally well in clinical applications. Besides, consistency and biological significances can be detected among different gene signatures, reflecting the

  2. Ataxin1L is a regulator of HSC function highlighting the utility of cross-tissue comparisons for gene discovery.

    Directory of Open Access Journals (Sweden)

    Juliette J Kahle

    2013-03-01

    Hematopoietic stem cells (HSCs) are rare quiescent cells that continuously replenish the cellular components of the peripheral blood. Observing that the ataxia-associated gene Ataxin-1-like (Atxn1L) was highly expressed in HSCs, we examined its role in HSC function through in vitro and in vivo assays. Mice lacking Atxn1L had greater numbers of HSCs that regenerated the blood more quickly than their wild-type counterparts. Molecular analyses indicated Atxn1L null HSCs had gene expression changes that regulate a program consistent with their higher level of proliferation, suggesting that Atxn1L is a novel regulator of HSC quiescence. To determine if additional brain-associated genes were candidates for hematologic regulation, we examined genes encoding proteins from autism- and ataxia-associated protein-protein interaction networks for their representation in hematopoietic cell populations. The interactomes were found to be highly enriched for proteins encoded by genes specifically expressed in HSCs relative to their differentiated progeny. Our data suggest a heretofore unappreciated similarity between regulatory modules in the brain and HSCs, offering a new strategy for novel gene discovery in both systems.

  3. Biomedical Information Extraction: Mining Disease Associated Genes from Literature

    Science.gov (United States)

    Huang, Zhong

    2014-01-01

    Disease associated gene discovery is a critical step to realize the future of personalized medicine. However, empirical and clinical validation of disease associated genes is time consuming and expensive. In silico discovery of disease associated genes from literature is therefore becoming the first essential step for biomarker discovery to…

  4. Candidate Essential Genes in Burkholderia cenocepacia J2315 Identified by Genome-Wide TraDIS

    KAUST Repository

    Wong, Yee-Chin

    2016-08-22

    Burkholderia cenocepacia infection often leads to fatal cepacia syndrome in cystic fibrosis patients. However, antibiotic therapy rarely results in complete eradication of the pathogen due to its intrinsic resistance to many clinically available antibiotics. Recent attention has turned to the identification of essential genes as the proteins encoded by these genes may serve as potential targets for development of novel antimicrobials. In this study, we utilized TraDIS (Transposon Directed Insertion-site Sequencing) as a genome-wide screening tool to facilitate the identification of B. cenocepacia genes essential for its growth and viability. A transposon mutant pool consisting of approximately 500,000 mutants was successfully constructed, with more than 400,000 unique transposon insertion sites identified by computational analysis of TraDIS datasets. The saturated library allowed for the identification of 383 genes that were predicted to be essential in B. cenocepacia. We extended the application of TraDIS to identify conditionally essential genes required for in vitro growth and revealed an additional repertoire of 439 genes to be crucial for B. cenocepacia growth under nutrient-depleted conditions. The library of B. cenocepacia mutants can subsequently be subjected to various biologically related conditions to facilitate the discovery of genes involved in niche adaptation as well as pathogenicity and virulence.
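
    One common way to turn transposon insertion sites into essentiality calls is to compute a per-gene insertion index (unique insertion sites divided by gene length) and flag genes that fall below a cutoff. The sketch below assumes that approach; the abstract does not state how the 383 genes were called, and the gene names, coordinates, insertion sites and the fixed threshold are hypothetical. In practice the cutoff is usually estimated from the distribution of insertion indices rather than fixed by hand.

```python
from bisect import bisect_left, bisect_right


def insertion_index(gene_start, gene_end, sorted_sites):
    """Unique insertion sites inside the gene divided by gene length in bp.

    Coordinates are 1-based and inclusive; `sorted_sites` must be sorted
    and free of duplicates.
    """
    length = gene_end - gene_start + 1
    n_sites = bisect_right(sorted_sites, gene_end) - bisect_left(sorted_sites, gene_start)
    return n_sites / length


def candidate_essential(genes, insertion_sites, threshold):
    """Names of genes whose insertion index falls below `threshold`."""
    sites = sorted(set(insertion_sites))
    return [
        name
        for name, (start, end) in genes.items()
        if insertion_index(start, end, sites) < threshold
    ]


# Hypothetical gene coordinates and insertion sites (not from the study).
genes = {"geneA": (100, 1099), "geneB": (2000, 2999)}
sites = [150, 300, 450, 600, 750, 900, 1050, 2500]
essential = candidate_essential(genes, sites, threshold=0.002)
print(essential)
```

    Here geneA tolerates seven insertions per kilobase while geneB carries only one, so only geneB is flagged as a candidate essential gene under the assumed cutoff.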

  5. Candidate Essential Genes in Burkholderia cenocepacia J2315 Identified by Genome-Wide TraDIS

    KAUST Repository

    Wong, Yee-Chin; Abd El Ghany, Moataz; Naeem, Raeece; Lee, Kok-Wei; Tan, Yung-Chie; Pain, Arnab; Nathan, Sheila

    2016-01-01

    Burkholderia cenocepacia infection often leads to fatal cepacia syndrome in cystic fibrosis patients. However, antibiotic therapy rarely results in complete eradication of the pathogen due to its intrinsic resistance to many clinically available antibiotics. Recent attention has turned to the identification of essential genes as the proteins encoded by these genes may serve as potential targets for development of novel antimicrobials. In this study, we utilized TraDIS (Transposon Directed Insertion-site Sequencing) as a genome-wide screening tool to facilitate the identification of B. cenocepacia genes essential for its growth and viability. A transposon mutant pool consisting of approximately 500,000 mutants was successfully constructed, with more than 400,000 unique transposon insertion sites identified by computational analysis of TraDIS datasets. The saturated library allowed for the identification of 383 genes that were predicted to be essential in B. cenocepacia. We extended the application of TraDIS to identify conditionally essential genes required for in vitro growth and revealed an additional repertoire of 439 genes to be crucial for B. cenocepacia growth under nutrient-depleted conditions. The library of B. cenocepacia mutants can subsequently be subjected to various biologically related conditions to facilitate the discovery of genes involved in niche adaptation as well as pathogenicity and virulence.

  6. Candidate essential genes in Burkholderia cenocepacia J2315 identified by genome-wide TraDIS

    Directory of Open Access Journals (Sweden)

    Yee-Chin Wong

    2016-08-01

    Burkholderia cenocepacia infection often leads to fatal cepacia syndrome in cystic fibrosis patients. However, antibiotic therapy rarely results in complete eradication of the pathogen due to its intrinsic resistance to many clinically available antibiotics. Recent attention has turned to the identification of essential genes as the proteins encoded by these genes may serve as potential targets for development of novel antimicrobials. In this study, we utilized TraDIS (Transposon Directed Insertion-site Sequencing) as a genome-wide screening tool to facilitate the identification of B. cenocepacia genes essential for its growth and viability. A transposon mutant pool consisting of approximately 500,000 mutants was successfully constructed, with more than 400,000 unique transposon insertion sites identified by computational analysis of TraDIS datasets. The saturated library allowed for the identification of 383 genes that were predicted to be essential in B. cenocepacia. We extended the application of TraDIS to identify conditionally essential genes required for in vitro growth and revealed an additional repertoire of 439 genes to be crucial for B. cenocepacia growth under nutrient-depleted conditions. The library of B. cenocepacia mutants can subsequently be subjected to various biologically related conditions to facilitate the discovery of genes involved in niche adaptation as well as pathogenicity and virulence.

  7. A large-scale chromosome-specific SNP discovery guideline.

    Science.gov (United States)

    Akpinar, Bala Ani; Lucas, Stuart; Budak, Hikmet

    2017-01-01

    Single-nucleotide polymorphisms (SNPs) are the most prevalent type of variation in genomes that are increasingly being used as molecular markers in diversity analyses, mapping and cloning of genes, and germplasm characterization. However, only a few studies reported large-scale SNP discovery in Aegilops tauschii, restricting their potential use as markers for the low-polymorphic D genome. Here, we report 68,592 SNPs found on the gene-related sequences of the 5D chromosome of Ae. tauschii genotype MvGB589 using genomic and transcriptomic sequences from seven Ae. tauschii accessions, including AL8/78, the only genotype for which a draft genome sequence is available at present. We also suggest a workflow to compare SNP positions in homologous regions on the 5D chromosome of Triticum aestivum, bread wheat, to mark single nucleotide variations between these closely related species. Overall, the identified SNPs define a density of 4.49 SNPs per kilobase, among the highest reported for the genic regions of Ae. tauschii so far. To our knowledge, this study also presents the first chromosome-specific SNP catalog in Ae. tauschii that should facilitate the association of these SNPs with morphological traits on chromosome 5D to be ultimately targeted for wheat improvement.
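
The two figures quoted in the abstract (68,592 SNPs and a density of 4.49 SNPs per 1,000 bp) imply a total length of surveyed gene-related sequence that the abstract does not state. The short calculation below back-computes it, assuming the density is simply the SNP count divided by the sequence length; the function name is illustrative.

```python
def implied_length_kb(snp_count, snps_per_kb):
    """Sequence length in kb implied by a SNP count and a per-kb density."""
    return snp_count / snps_per_kb


length_kb = implied_length_kb(68_592, 4.49)
length_mb = length_kb / 1000  # roughly 15.3 Mb of gene-related sequence
```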

  8. Systematic discovery of unannotated genes in 11 yeast species using a database of orthologous genomic segments

    LENUS (Irish Health Repository)

    OhEigeartaigh, Sean S

    2011-07-26

    Background: In standard BLAST searches, no information other than the sequences of the query and the database entries is considered. However, in situations where two genes from different species have only borderline similarity in a BLAST search, the discovery that the genes are located within a region of conserved gene order (synteny) can provide additional evidence that they are orthologs. Thus, for interpreting borderline search results, it would be useful to know whether the syntenic context of a database hit is similar to that of the query. This principle has often been used in investigations of particular genes or genomic regions, but to our knowledge it has never been implemented systematically. Results: We made use of the synteny information contained in the Yeast Gene Order Browser database for 11 yeast species to carry out a systematic search for protein-coding genes that were overlooked in the original annotations of one or more yeast genomes but which are syntenic with their orthologs. Such genes tend to have been overlooked because they are short, highly divergent, or contain introns. The key features of our software, called SearchDOGS, are that the database entries are classified into sets of genomic segments that are already known to be orthologous, and that very weak BLAST hits are retained for further analysis if their genomic location is similar to that of the query. Using SearchDOGS we identified 595 additional protein-coding genes among the 11 yeast species, including two new genes in Saccharomyces cerevisiae. We found additional genes for the mating pheromone a-factor in six species including Kluyveromyces lactis. Conclusions: SearchDOGS has proven highly successful for identifying overlooked genes in the yeast genomes. We anticipate that our approach can be adapted for study of further groups of species, such as bacterial genomes. More generally, the concept of doing sequence similarity searches against databases to which external
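
The retention rule described in this abstract (keep a very weak hit when its genomic location matches that of the query) can be illustrated with a small filter. This is a sketch of the idea only, not the SearchDOGS implementation: the `Hit` record, the segment identifiers and both E-value thresholds are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    query: str
    subject_segment: str  # ID of the orthologous genomic segment the hit lies in
    evalue: float


def retain_hits(hits, query_segment, strong_evalue=1e-5, weak_evalue=10.0):
    """Keep strong hits outright; keep weak hits only when they are syntenic.

    query_segment maps each query name to the segment ID it belongs to.
    Both thresholds are illustrative defaults, not published values.
    """
    kept = []
    for hit in hits:
        if hit.evalue <= strong_evalue:
            kept.append(hit)  # convincing on sequence similarity alone
        elif (hit.evalue <= weak_evalue
              and hit.subject_segment == query_segment.get(hit.query)):
            kept.append(hit)  # borderline, but in the expected location
    return kept


hits = [
    Hit("geneX", "seg12", 1e-30),  # strong hit
    Hit("geneX", "seg12", 2.5),    # weak hit in the query's own segment
    Hit("geneX", "seg40", 2.5),    # weak hit elsewhere in the genome
]
kept = retain_hits(hits, {"geneX": "seg12"})
```

The design point is that location acts as independent evidence: the same weak score is accepted in one place and rejected in another.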

  9. InFusion: Advancing Discovery of Fusion Genes and Chimeric Transcripts from Deep RNA-Sequencing Data.

    Directory of Open Access Journals (Sweden)

    Konstantin Okonechnikov

    Analysis of fusion transcripts has become increasingly important due to their link with cancer development. Since high-throughput sequencing approaches survey fusion events exhaustively, several computational methods for the detection of gene fusions from RNA-seq data have been developed. This kind of analysis, however, is complicated by native trans-splicing events, the splicing-induced complexity of the transcriptome and biases and artefacts introduced in experiments and data analysis. There are a number of tools available for the detection of fusions from RNA-seq data; however, certain differences in specificity and sensitivity between commonly used approaches have been found. The ability to detect gene fusions of different types, including isoform fusions and fusions involving non-coding regions, has not been thoroughly studied yet. Here, we propose a novel computational toolkit called InFusion for fusion gene detection from RNA-seq data. InFusion introduces several unique features, such as discovery of fusions involving intergenic regions, and detection of anti-sense transcription in chimeric RNAs based on strand-specificity. Our approach demonstrates superior detection accuracy on simulated data and several public RNA-seq datasets. This improved performance was also evident when evaluating data from RNA deep-sequencing of two well-established prostate cancer cell lines. InFusion identified 26 novel fusion events that were validated in vitro, including alternatively spliced gene fusion isoforms and chimeric transcripts that include intergenic regions. The toolkit is freely available to download from http://bitbucket.org/kokonech/infusion.

  10. MobilomeFINDER: web-based tools for in silico and experimental discovery of bacterial genomic islands

    Science.gov (United States)

    Ou, Hong-Yu; He, Xinyi; Harrison, Ewan M.; Kulasekara, Bridget R.; Thani, Ali Bin; Kadioglu, Aras; Lory, Stephen; Hinton, Jay C. D.; Barer, Michael R.; Rajakumar, Kumar

    2007-01-01

    MobilomeFINDER (http://mml.sjtu.edu.cn/MobilomeFINDER) is an interactive online tool that facilitates bacterial genomic island or ‘mobile genome’ (mobilome) discovery; it integrates the ArrayOme and tRNAcc software packages. ArrayOme utilizes a microarray-derived comparative genomic hybridization input data set to generate ‘inferred contigs’ produced by merging adjacent genes classified as ‘present’. Collectively these ‘fragments’ represent a hypothetical ‘microarray-visualized genome (MVG)’. ArrayOme permits recognition of discordances between physical genome and MVG sizes, thereby enabling identification of strains rich in microarray-elusive novel genes. Individual tRNAcc tools facilitate automated identification of genomic islands by comparative analysis of the contents and contexts of tRNA sites and other integration hotspots in closely related sequenced genomes. Accessory tools facilitate design of hotspot-flanking primers for in silico and/or wet-science-based interrogation of cognate loci in unsequenced strains and analysis of islands for features suggestive of foreign origins; island-specific and genome-contextual features are tabulated and represented in schematic and graphical forms. To date we have used MobilomeFINDER to analyse several Enterobacteriaceae, Pseudomonas aeruginosa and Streptococcus suis genomes. MobilomeFINDER enables high-throughput island identification and characterization through increased exploitation of emerging sequence data and PCR-based profiling of unsequenced test strains; subsequent targeted yeast recombination-based capture permits full-length sequencing and detailed functional studies of novel genomic islands. PMID:17537813
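
The 'inferred contig' step described above (merging adjacent genes classified as present, then summing the fragments into a microarray-visualized genome size) can be sketched as a single pass over genes in chromosomal order. The tuple layout, the function names and the choice to span intergenic gaps between adjacent present genes are assumptions for illustration; they are not taken from the ArrayOme documentation.

```python
def inferred_contigs(calls):
    """Merge runs of adjacent 'present' genes into (start, end) contigs.

    calls: list of (gene_name, start, end, present) tuples, already sorted
    by chromosomal position. An 'absent' gene closes the current contig.
    """
    contigs = []
    current = None
    for _name, start, end, present in calls:
        if present:
            if current is None:
                current = [start, end]            # open a new contig
            else:
                current[1] = max(current[1], end)  # extend the open contig
        elif current is not None:
            contigs.append(tuple(current))         # absent gene ends the run
            current = None
    if current is not None:
        contigs.append(tuple(current))
    return contigs


def mvg_size(contigs):
    """Total length of all inferred contigs, in bp."""
    return sum(end - start + 1 for start, end in contigs)


calls = [
    ("g1", 1, 100, True),
    ("g2", 150, 300, True),
    ("g3", 350, 500, False),
    ("g4", 600, 700, True),
]
contigs = inferred_contigs(calls)
size = mvg_size(contigs)
```

Comparing `size` with a physically measured genome size is the kind of discordance check the abstract attributes to ArrayOme.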

  11. Functional Gene Discovery and Characterization of Genes and Alleles Affecting Wood Biomass Yield and Quality in Populus

    Energy Technology Data Exchange (ETDEWEB)

    Busov, Victor [Michigan Technological Univ., Houghton, MI (United States)

    2017-02-12

    Adoption of biofuels as an economically and environmentally viable alternative to fossil fuels would require development of specialized bioenergy varieties. A major goal in the breeding of such varieties is the improvement of lignocellulosic biomass yield and quality. These are complex traits and understanding the underpinning molecular mechanism can assist and accelerate their improvement. This is particularly important for tree bioenergy crops like poplars (species and hybrids from the genus Populus), for which breeding progress is extremely slow due to long generation cycles. A variety of approaches have been already undertaken to better understand the molecular bases of biomass yield and quality in poplar. An obvious void in these undertakings has been the application of mutagenesis. Mutagenesis has been instrumental in the discovery and characterization of many plant traits including those that affect biomass yield and quality. In this proposal we use activation tagging to discover genes that can significantly affect biomass-associated traits directly in poplar, a premier bioenergy crop. We screened a population of 5,000 independent poplar activation tagging lines under greenhouse conditions for a battery of biomass yield traits. These same plants were then analyzed for changes in wood chemistry using pyMBMS. As a result of these screens we have identified nearly 800 mutants, which are significantly (P<0.05) different when compared to wild type. Of these, the majority (~700) are affected in one of ten different biomass yield traits and 100 in biomass quality traits (e.g., lignin, S/G ratio and C6/C5 sugars). We successfully recovered the position of the tag in approximately 130 lines, showed activation in nearly half of them and performed recapitulation experiments with 20 genes prioritized by the significance of the phenotype. Recapitulation experiments are still ongoing for many of the genes but the results are encouraging. For example, we have shown successful

  12. Species-independent MicroRNA Gene Discovery

    KAUST Repository

    Kamanu, Timothy K.

    2012-01-01

    and other incurable diseases such as autism and Alzheimer’s. Functional miRNAs are excised from hairpin-like sequences that are known as miRNA genes. There are about 21,000 known miRNA genes, most of which have been determined using experimental methods. mi

  13. Semantic Approaches for Knowledge Discovery and Retrieval in Biomedicine

    DEFF Research Database (Denmark)

    Wilkowski, Bartlomiej

    This thesis discusses potential applications of semantics to the recent literature-based informatics systems to facilitate knowledge discovery, hypothesis generation, and literature retrieval in the domain of biomedicine. The approaches presented herein make use of semantic information extracted...

  14. Gene/QTL discovery for Anthracnose in common bean (Phaseolus vulgaris L.) from North-western Himalayas.

    Science.gov (United States)

    Choudhary, Neeraj; Bawa, Vanya; Paliwal, Rajneesh; Singh, Bikram; Bhat, Mohd Ashraf; Mir, Javid Iqbal; Gupta, Moni; Sofi, Parvaze A; Thudi, Mahendar; Varshney, Rajeev K; Mir, Reyazul Rouf

    2018-01-01

    Common bean (Phaseolus vulgaris L.) is one of the most important grain legume crops in the world. The beans grown in north-western Himalayas possess huge diversity for seed color, shape and size but are mostly susceptible to Anthracnose disease caused by the seed-borne fungus Colletotrichum lindemuthianum. Dozens of QTLs/genes have been already identified for this disease in common bean world-wide. However, this is the first report of gene/QTL discovery for Anthracnose using bean germplasm from north-western Himalayas of state Jammu & Kashmir, India. A core set of 96 bean lines comprising 54 indigenous local landraces from 11 hot-spots and 42 exotic lines from 10 different countries were phenotyped at two locations (SKUAST-Jammu and Bhaderwah, Jammu) for Anthracnose resistance. The core set was also genotyped with genome-wide (91) random and trait-linked SSR markers. The study of marker-trait associations (MTAs) led to the identification of 10 QTLs/genes for Anthracnose resistance. Among the 10 QTLs/genes identified, two MTAs are stable (BM45 & BM211), two MTAs (PVctt1 & BM211) are major explaining more than 20% phenotypic variation for Anthracnose and one MTA (BM211) is both stable and major. Six (06) genomic regions are reported for the first time, whereas four (04) genomic regions validated the already known QTL/gene regions/clusters for Anthracnose. The major, stable and validated markers reported during the present study associated with Anthracnose resistance will prove useful in common bean molecular breeding programs aimed at enhancing Anthracnose resistance of local bean landraces grown in north-western Himalayas of state Jammu and Kashmir.

  15. Ancient horizontal gene transfer from bacteria enhances biosynthetic capabilities of fungi.

    Directory of Open Access Journals (Sweden)

    Imke Schmitt

    Polyketides are natural products with a wide range of biological functions and pharmaceutical applications. Discovery and utilization of polyketides can be facilitated by understanding the evolutionary processes that gave rise to the biosynthetic machinery and the natural product potential of extant organisms. Gene duplication and subfunctionalization, as well as horizontal gene transfer, are proposed mechanisms in the evolution of biosynthetic gene clusters. To explain the amount of homology in some polyketide synthases in unrelated organisms such as bacteria and fungi, interkingdom horizontal gene transfer has been invoked as the most likely evolutionary scenario. However, the origin of the genes and the direction of the transfer remained elusive. We used comparative phylogenetics to infer the ancestor of a group of polyketide synthase genes involved in antibiotic and mycotoxin production. We aligned keto synthase domain sequences of all available fungal 6-methylsalicylic acid (6-MSA)-type PKSs and their closest bacterial relatives. To assess the role of symbiotic fungi in the evolution of this gene we generated 24 6-MSA synthase sequence tags from lichen-forming fungi. Our results support an ancient horizontal gene transfer event from an actinobacterial source into ascomycete fungi, followed by gene duplication. Given that actinobacteria are unrivaled producers of biologically active compounds, such as antibiotics, it appears particularly promising to study biosynthetic genes of actinobacterial origin in fungi. The large number of 6-MSA-type PKS sequences found in lichen-forming fungi leads us to hypothesize that the evolution of typical lichen compounds, such as orsellinic acid derivatives, was facilitated by the gain of this bacterial polyketide synthase.

  16. Crowdsourcing the nodulation gene network discovery environment.

    Science.gov (United States)

    Li, Yupeng; Jackson, Scott A

    2016-05-26

    The Legumes (Fabaceae) are an economically and ecologically important group of plant species with the conspicuous capacity for symbiotic nitrogen fixation in root nodules, specialized plant organs containing symbiotic microbes. With the aim of understanding the underlying molecular mechanisms leading to nodulation, many efforts are underway to identify nodulation-related genes and determine how these genes interact with each other. In order to accurately and efficiently reconstruct the nodulation gene network, a crowdsourcing platform, CrowdNodNet, was created. The platform implements the jQuery and vis.js JavaScript libraries, so that users are able to interactively visualize and edit the gene network, and easily access the information about the network, e.g. gene lists, gene interactions and gene functional annotations. In addition, all the gene information is written on MediaWiki pages, enabling users to edit and contribute to the network curation. Utilizing the continuously updated, collaboratively written, and community-reviewed Wikipedia model, the platform could, in a short time, become a comprehensive knowledge base of nodulation-related pathways. The platform could also be used for other biological processes, and thus has great potential for integrating and advancing our understanding of the functional genomics and systems biology of any process for any species. The platform is available at http://crowd.bioops.info/, and the source code can be openly accessed at https://github.com/bioops/crowdnodnet under the MIT License.

  17. The limits of de novo DNA motif discovery.

    Directory of Open Access Journals (Sweden)

    David Simcha

    A major challenge in molecular biology is reverse-engineering the cis-regulatory logic that plays a major role in the control of gene expression. This program includes searching through DNA sequences to identify "motifs" that serve as the binding sites for transcription factors or, more generally, are predictive of gene expression across cellular conditions. Several approaches have been proposed for de novo motif discovery, that is, searching sequences without prior knowledge of binding sites or nucleotide patterns. However, unbiased validation is not straightforward. We consider two approaches to unbiased validation of discovered motifs: testing the statistical significance of a motif using a DNA "background" sequence model to represent the null hypothesis and measuring performance in predicting membership in gene clusters. We demonstrate that the background models typically used are "too null," resulting in overly optimistic assessments of significance, and argue that performance in predicting TF binding or expression patterns from DNA motifs should be assessed by held-out data, as in predictive learning. Applying this criterion to common motif discovery methods resulted in universally poor performance, although there is a marked improvement when motifs are statistically significant against real background sequences. Moreover, on synthetic data where "ground truth" is known, discriminative performance of all algorithms is far below the theoretical upper bound, with pronounced "over-fitting" in training. A key conclusion from this work is that the failure of de novo discovery approaches to accurately identify motifs is basically due to statistical intractability resulting from the fixed size of co-regulated gene clusters, and thus such failures do not necessarily provide evidence that unfound motifs are not active biologically. Consequently, the use of prior knowledge to enhance motif discovery is not just advantageous but necessary. An implementation of

  18. Sequencing genes in silico using single nucleotide polymorphisms

    Directory of Open Access Journals (Sweden)

    Zhang Xinyi

    2012-01-01

    Background: The advent of high throughput sequencing technology has enabled the 1000 Genomes Project Pilot 3 to generate complete sequence data for more than 906 genes and 8,140 exons representing 697 subjects. The 1000 Genomes database provides a critical opportunity for further interpreting disease associations with single nucleotide polymorphisms (SNPs) discovered from genetic association studies. Currently, direct sequencing of candidate genes or regions on a large number of subjects remains both cost- and time-prohibitive. Results: To accelerate the translation from discovery to functional studies, we propose an in silico gene sequencing method (ISS), which predicts phased sequences of intragenic regions, using SNPs. The key underlying idea of our method is to infer diploid sequences (a pair of phased sequences/alleles) at every functional locus utilizing the deep sequencing data from the 1000 Genomes Project and SNP data from the HapMap Project, and to build prediction models using flanking SNPs. Using this method, we have developed a database of prediction models for 611 known genes. Sequence prediction accuracy for these genes is 96.26% on average (range 79%-100%). This database of prediction models can be enhanced and scaled up to include new genes as the 1000 Genomes Project sequences additional genes on additional individuals. Applying our predictive model for the KCNJ11 gene to the Wellcome Trust Case Control Consortium (WTCCC) Type 2 diabetes cohort, we demonstrate how the prediction of phased sequences inferred from GWAS SNP genotype data can be used to facilitate interpretation and identify a probable functional mechanism such as protein changes. Conclusions: Prior to the general availability of routine sequencing of all subjects, the ISS method proposed here provides a time- and cost-effective approach to broadening the characterization of disease associated SNPs and regions, and facilitating the prioritization of candidate

  19. A dual transcript-discovery approach to improve the delimitation of gene features from RNA-seq data in the chicken model

    Directory of Open Access Journals (Sweden)

    Mickael Orgeur

    2018-01-01

    The sequence of the chicken genome, like several other draft genome sequences, is presently not fully covered. Gaps, contigs assigned with low confidence and uncharacterized chromosomes result in gene fragmentation and imprecise gene annotation. Transcript abundance estimation from RNA sequencing (RNA-seq) data relies on read quality, library complexity and expression normalization. In addition, the quality of the genome sequence used to map sequencing reads, and the gene annotation that defines gene features, must also be taken into account. A partially covered genome sequence causes the loss of sequencing reads from the mapping step, while an inaccurate definition of gene features induces imprecise read counts from the assignment step. Both steps can significantly bias interpretation of RNA-seq data. Here, we describe a dual transcript-discovery approach combining a genome-guided gene prediction and a de novo transcriptome assembly. This dual approach enabled us to increase the assignment rate of RNA-seq data by nearly 20% as compared to when using only the chicken reference annotation, contributing therefore to a more accurate estimation of transcript abundance. More generally, this strategy could be applied to any organism with partial genome sequence and/or lacking a manually-curated reference annotation in order to improve the accuracy of gene expression studies.

  20. Using Phenomic Analysis of Photosynthetic Function for Abiotic Stress Response Gene Discovery

    KAUST Repository

    Rungrat, Tepsuda

    2016-09-09

    Monitoring the photosynthetic performance of plants is a major key to understanding how plants adapt to their growth conditions. Stress tolerance traits have a high genetic complexity as plants are constantly, and unavoidably, exposed to numerous stress factors, which limits their growth rates in the natural environment. Arabidopsis thaliana, with its broad genetic diversity and wide climatic range, has been shown to successfully adapt to stressful conditions to ensure the completion of its life cycle. As a result, A. thaliana has become a robust and renowned plant model system for studying natural variation and conducting gene discovery studies. Genome wide association studies (GWAS) in restructured populations combining natural and recombinant lines is a particularly effective way to identify the genetic basis of complex traits. As most abiotic stresses affect photosynthetic activity, chlorophyll fluorescence measurements are a potential phenotyping technique for monitoring plant performance under stress conditions. This review focuses on the use of chlorophyll fluorescence as a tool to study genetic variation underlying the stress tolerance responses to abiotic stress in A. thaliana.

  1. The Utility of Next Generation Sequencing in Gene Discovery for Mutation-negative Patients with Rett Syndrome

    Directory of Open Access Journals (Sweden)

    Wendy Anne Gold

    2015-07-01

    Rett syndrome (RTT) is a rare, severe disorder of neuronal plasticity that predominantly affects girls. Girls with RTT usually appear asymptomatic in the first 6-18 months of life, but gradually develop severe motor, cognitive and behavioural abnormalities that persist for life. A predominance of neuronal and synaptic dysfunction, with altered excitatory-inhibitory neuronal synaptic transmission and synaptic plasticity are overarching features of RTT in children and in mouse models. Approximately 95% of patients with classical RTT have mutations in the X-linked methyl-CpG-binding protein 2 (MECP2) gene, whilst other genes, including cyclin-dependent kinase-like 5 (CDKL5), Forkhead box protein G1 (FOXG1), Myocyte-specific enhancer factor 2C (MEF2C) and Transcription factor 4 (TCF4), have been associated with phenotypes overlapping with RTT. However, there remain a proportion of patients who carry a clinical diagnosis of RTT, but who are mutation negative. In recent years, next-generation sequencing (NGS) technologies have revolutionized approaches to genetic studies, making whole-exome and even whole-genome sequencing possible strategies for the detection of rare and de novo mutations, aiding the discovery of novel disease genes. Here, we review the recent progress that is emerging in identifying pathogenic variations, specifically from exome sequencing in RTT patients, and emphasize the need for the use of this technology to identify known and new disease genes in RTT patients.

  2. Bioinformatics for discovery of microbiome variation

    DEFF Research Database (Denmark)

    Brejnrod, Asker Daniel

    of various molecular methods to build hypotheses about the impact of a copper contaminated soil. The introduction is a broad introduction to the field of microbiome research with a focus on the technologies that enable these discoveries and how some of the broader issues have related to this thesis......Sequencing based tools have revolutionized microbiology in recent years. High-throughput DNA sequencing has allowed high-resolution studies on microbial life in many different environments and at unprecedented low cost. These culture-independent methods have helped the discovery of novel bacteria...... 1, "Large-scale benchmarking reveals false discoveries and count transformation sensitivity in 16S rRNA gene amplicon data analysis methods used in microbiome studies", benchmarked the performance of a variety of popular statistical methods for discovering differentially abundant bacteria between...

  3. Genome-wide target profiling of piggyBac and Tol2 in HEK 293: pros and cons for gene discovery and gene therapy

    Science.gov (United States)

    2011-01-01

    Background: DNA transposons have emerged as indispensable tools for manipulating vertebrate genomes with applications ranging from insertional mutagenesis and transgenesis to gene therapy. To fully explore the potential of two highly active DNA transposons, piggyBac and Tol2, as mammalian genetic tools, we have conducted a side-by-side comparison of the two transposon systems in the same setting to evaluate their advantages and disadvantages for use in gene therapy and gene discovery. Results: We have observed that (1) the Tol2 transposase (but not piggyBac) is highly sensitive to molecular engineering; (2) the piggyBac donor with only the 40 bp 3'- and 67 bp 5'-terminal repeat domain is sufficient for effective transposition; and (3) a small amount of piggyBac transposases results in robust transposition suggesting the piggyBac transposase is highly active. Performing genome-wide target profiling on data sets obtained by retrieving chromosomal targeting sequences from individual clones, we have identified several piggyBac and Tol2 hotspots and observed that (4) piggyBac and Tol2 display a clear difference in targeting preferences in the human genome. Finally, we have observed that (5) only sites with a particular sequence context can be targeted by either piggyBac or Tol2. Conclusions: The non-overlapping targeting preference of piggyBac and Tol2 makes them complementary research tools for manipulating mammalian genomes. PiggyBac is the most promising transposon-based vector system for achieving site-specific targeting of therapeutic genes due to the flexibility of its transposase for being molecularly engineered. Insights from this study will provide a basis for engineering piggyBac transposases to achieve site-specific therapeutic gene targeting. PMID:21447194

  4. Using Just-in-Time Information to Support Scientific Discovery Learning in a Computer-Based Simulation

    Science.gov (United States)

    Hulshof, Casper D.; de Jong, Ton

    2006-01-01

    Students encounter many obstacles during scientific discovery learning with computer-based simulations. It is hypothesized that an effective type of support, that does not interfere with the scientific discovery learning process, should be delivered on a "just-in-time" base. This study explores the effect of facilitating access to…

  5. STAT3 Target Genes Relevant to Human Cancers

    International Nuclear Information System (INIS)

    Carpenter, Richard L.; Lo, Hui-Wen

    2014-01-01

    Since its discovery, the STAT3 transcription factor has been extensively studied for its function as a transcriptional regulator and its role as a mediator of development, normal physiology, and pathology of many diseases, including cancers. These efforts have uncovered an array of genes that can be positively and negatively regulated by STAT3, alone and in cooperation with other transcription factors. Through regulating gene expression, STAT3 has been demonstrated to play a pivotal role in many cellular processes including oncogenesis, tumor growth and progression, and stemness. Interestingly, recent studies suggest that STAT3 may behave as a tumor suppressor by activating expression of genes known to inhibit tumorigenesis. Additional evidence suggested that STAT3 may elicit opposing effects depending on cellular context and tumor types. These mixed results signify the need for a deeper understanding of STAT3, including its upstream regulators, parallel transcription co-regulators, and downstream target genes. To help facilitate fulfilling this unmet need, this review will be primarily focused on STAT3 downstream target genes that have been validated to associate with tumorigenesis and/or malignant biology of human cancers

  6. Arid5b facilitates chondrogenesis by recruiting the histone demethylase Phf2 to Sox9-regulated genes

    Science.gov (United States)

    Hata, Kenji; Takashima, Rikako; Amano, Katsuhiko; Ono, Koichiro; Nakanishi, Masako; Yoshida, Michiko; Wakabayashi, Makoto; Matsuda, Akio; Maeda, Yoshinobu; Suzuki, Yutaka; Sugano, Sumio; Whitson, Robert H.; Nishimura, Riko; Yoneda, Toshiyuki

    2013-11-01

    Histone modification, a critical step for epigenetic regulation, is an important modulator of biological events. Sox9 is a transcription factor critical for endochondral ossification; however, proof of its epigenetic regulation remains elusive. Here we identify AT-rich interactive domain 5b (Arid5b) as a transcriptional co-regulator of Sox9. Arid5b physically associates with Sox9 and synergistically induces chondrogenesis. Growth of Arid5b-/- mice is retarded with delayed endochondral ossification. Sox9-dependent chondrogenesis is attenuated in Arid5b-deficient cells. Arid5b recruits Phf2, a histone lysine demethylase, to the promoter region of Sox9 target genes and stimulates H3K9me2 demethylation of these genes. In the promoters of chondrogenic marker genes, H3K9me2 levels are increased in Arid5b-/- chondrocytes. Finally, we show that Phf2 knockdown inhibits Sox9-induced chondrocyte differentiation. Our findings establish an epigenomic mechanism of skeletal development, whereby Arid5b promotes chondrogenesis by facilitating Phf2-mediated histone demethylation of Sox9-regulated chondrogenic gene promoters.

  7. The gene trap resource: a treasure trove for hemopoiesis research.

    Science.gov (United States)

    Forrai, Ariel; Robb, Lorraine

    2005-08-01

    The laboratory mouse is an invaluable tool for functional gene discovery because of its genetic malleability and a biological similarity to human systems that facilitates identification of human models of disease. A number of mutagenic technologies are being used to elucidate gene function in the mouse. Gene trapping is an insertional mutagenesis strategy that is being undertaken by multiple research groups, both academic and private, in an effort to introduce mutations across the mouse genome. Large-scale, publicly funded gene trap programs have been initiated in several countries with the International Gene Trap Consortium coordinating certain efforts and resources. We outline the methodology of mammalian gene trapping and how it can be used to identify genes expressed in both primitive and definitive blood cells and to discover hemopoietic regulator genes. Mouse mutants with hematopoietic phenotypes derived using gene trapping are described. The efforts of the large-scale gene trapping consortia have now led to the availability of libraries of mutagenized ES cell clones. The identity of the trapped locus in each of these clones can be identified by sequence-based searching via the world wide web. This resource provides an extraordinary tool for all researchers wishing to use mouse genetics to understand gene function.

  8. Targeted discovery of glycoside hydrolases from a switchgrass-adapted compost community

    Energy Technology Data Exchange (ETDEWEB)

    Allgaier, M.; Reddy, A.; Park, J. I.; Ivanova, N.; D'haeseleer, P.; Lowry, S.; Sapra, R.; Hazen, T.C.; Simmons, B.A.; VanderGheynst, J. S.; Hugenholtz, P.

    2009-11-15

    Development of cellulosic biofuels from non-food crops is currently an area of intense research interest. Tailoring depolymerizing enzymes to particular feedstocks and pretreatment conditions is one promising avenue of research in this area. Here we added a green-waste compost inoculum to switchgrass (Panicum virgatum) and simulated thermophilic composting in a bioreactor to select for a switchgrass-adapted community and to facilitate targeted discovery of glycoside hydrolases. Small-subunit (SSU) rRNA-based community profiles revealed that the microbial community changed dramatically between the initial and switchgrass-adapted compost (SAC) with some bacterial populations being enriched over 20-fold. We obtained 225 Mbp of 454-titanium pyrosequence data from the SAC community and conservatively identified 800 genes encoding glycoside hydrolase domains that were biased toward depolymerizing grass cell wall components. Of these, ~10% were putative cellulases mostly belonging to families GH5 and GH9. We synthesized two SAC GH9 genes with codon optimization for heterologous expression in Escherichia coli and observed activity for one on carboxymethyl cellulose. The active GH9 enzyme has a temperature optimum of 50 °C and pH range of 5.5 to 8 consistent with the composting conditions applied. We demonstrate that microbial communities adapt to switchgrass decomposition using simulated composting conditions and that full-length genes can be identified from complex metagenomic sequence data, synthesized and expressed resulting in active enzyme.

  9. Targeted Discovery of Glycoside Hydrolases from a Switchgrass-Adapted Compost Community

    Energy Technology Data Exchange (ETDEWEB)

    Reddy, Amitha; Allgaier, Martin; Park, Joshua I.; Ivanova, Natalia; D'haeseleer, Patrik; Lowry, Steve; Sapra, Rajat; Hazen, Terry C.; Simmons, Blake A.; VanderGheynst, Jean S.; Hugenholtz, Philip

    2011-05-11

    Development of cellulosic biofuels from non-food crops is currently an area of intense research interest. Tailoring depolymerizing enzymes to particular feedstocks and pretreatment conditions is one promising avenue of research in this area. Here we added a green-waste compost inoculum to switchgrass (Panicum virgatum) and simulated thermophilic composting in a bioreactor to select for a switchgrass-adapted community and to facilitate targeted discovery of glycoside hydrolases. Small-subunit (SSU) rRNA-based community profiles revealed that the microbial community changed dramatically between the initial and switchgrass-adapted compost (SAC) with some bacterial populations being enriched over 20-fold. We obtained 225 Mbp of 454-titanium pyrosequence data from the SAC community and conservatively identified 800 genes encoding glycoside hydrolase domains that were biased toward depolymerizing grass cell wall components. Of these, ~10% were putative cellulases mostly belonging to families GH5 and GH9. We synthesized two SAC GH9 genes with codon optimization for heterologous expression in Escherichia coli and observed activity for one on carboxymethyl cellulose. The active GH9 enzyme has a temperature optimum of 50 °C and pH range of 5.5 to 8 consistent with the composting conditions applied. We demonstrate that microbial communities adapt to switchgrass decomposition using simulated composting conditions and that full-length genes can be identified from complex metagenomic sequence data, synthesized and expressed resulting in active enzyme.

  10. Orphan diseases: state of the drug discovery art.

    Science.gov (United States)

    Volmar, Claude-Henry; Wahlestedt, Claes; Brothers, Shaun P

    2017-06-01

    Since 1983 more than 300 drugs have been developed and approved for orphan diseases. However, considering the development of novel diagnosis tools, the number of rare diseases vastly outpaces therapeutic discovery. Academic centers and nonprofit institutes are now at the forefront of rare disease R&D, partnering with pharmaceutical companies when academic researchers discover novel drugs or targets for specific diseases, thus reducing the failure risk and cost for pharmaceutical companies. Considerable progress has occurred in the art of orphan drug discovery, and a symbiotic relationship now exists between pharmaceutical industry, academia, and philanthropists that provides a useful framework for orphan disease therapeutic discovery. Here, the current state-of-the-art of drug discovery for orphan diseases is reviewed. Current technological approaches and challenges for drug discovery are considered, some of which can present somewhat unique challenges and opportunities in orphan diseases, including the potential for personalized medicine, gene therapy, and phenotypic screening.

  11. A Cbx8-containing polycomb complex facilitates the transition to gene activation during ES cell differentiation.

    Directory of Open Access Journals (Sweden)

    Catherine Creppe

    2014-12-01

    Polycomb proteins play an essential role in maintaining the repression of developmental genes in self-renewing embryonic stem cells. The exact mechanism allowing the derepression of polycomb target genes during cell differentiation remains unclear. Our project aimed to identify Cbx8 binding sites in differentiating mouse embryonic stem cells. Therefore, we used a genome-wide chromatin immunoprecipitation of endogenous Cbx8 coupled to direct massive parallel sequencing (ChIP-Seq). Our analysis identified 171 high confidence peaks. By crossing our data with previously published microarray analysis, we show that several differentiation genes transiently recruit Cbx8 during their early activation. Depletion of Cbx8 partially impairs the transcriptional activation of these genes. Both interaction analysis and chromatin immunoprecipitation experiments support the idea that activating Cbx8 acts in the context of an intact PRC1 complex. Prolonged gene activation results in eviction of PRC1 despite persisting H3K27me3 and H2A ubiquitination. The composition of PRC1 is highly modular and changes when embryonic stem cells commit to differentiation. We further demonstrate that the exchange of Cbx7 for Cbx8 is required for the effective activation of differentiation genes. Taken together, our results establish a function for a Cbx8-containing complex in facilitating the transition from a Polycomb-repressed chromatin state to an active state. As this affects several key regulatory differentiation genes, this mechanism is likely to contribute to the robust execution of differentiation programs.

  12. Gene2Function: An Integrated Online Resource for Gene Function Discovery

    Directory of Open Access Journals (Sweden)

    Yanhui Hu

    2017-08-01

    One of the most powerful ways to develop hypotheses regarding the biological functions of conserved genes in a given species, such as humans, is to first look at what is known about their function in another species. Model organism databases and other resources are rich with functional information but difficult to mine. Gene2Function addresses a broad need by integrating information about conserved genes in a single online resource.

  13. Leveraging gene-environment interactions and endotypes for asthma gene discovery

    DEFF Research Database (Denmark)

    Bønnelykke, Klaus; Ober, Carole

    2016-01-01

    Asthma is a heterogeneous clinical syndrome that includes subtypes of disease with different underlying causes and disease mechanisms. Asthma is caused by a complex interaction between genes and environmental exposures; early-life exposures in particular play an important role. Asthma is also heritable, and a number of susceptibility variants have been discovered in genome-wide association studies, although the known risk alleles explain only a small proportion of the heritability. In this review, we present evidence supporting the hypothesis that focusing on more specific asthma phenotypes, such as childhood asthma with severe exacerbations, and on relevant exposures that are involved in gene-environment interactions (GEIs), such as rhinovirus infections, will improve detection of asthma genes and our understanding of the underlying mechanisms. We will discuss the challenges of considering GEIs...

  14. Gene expression profiling leads to discovery of correlation of matrix metalloproteinase 11 and heparanase 2 in breast cancer progression

    International Nuclear Information System (INIS)

    Fu, Junjie; Khaybullin, Ravil; Zhang, Yanping; Xia, Amy; Qi, Xin

    2015-01-01

    In order to identify biomarkers involved in breast cancer, gene expression profiling was conducted using human breast cancer tissues. Total RNAs were extracted from 150 clinical patient tissues covering three breast cancer subtypes (Luminal A, Luminal B, and Triple negative) as well as normal tissues. The expression profiles of a total of 50,739 genes were established from a training set of 32 samples using the Agilent SurePrint G3 Human Gene Expression Microarray technology. Data were analyzed using Agilent GeneSpring GX 12.6 software. The expression of several genes was validated using real-time RT-qPCR. Data analysis with Agilent GeneSpring GX 12.6 software showed distinct expression patterns between cancer and normal tissue samples. A group of 28 promising genes was identified with ≥ 10-fold changes of expression level and p-values < 0.05. In particular, MMP11 and HPSE2 were closely examined due to the important roles they play in cancer cell growth and migration. Real-time RT-qPCR analyses of both training and testing sets validated the gene expression profiles of MMP11 and HPSE2. Our findings identified these 2 genes as a novel breast cancer biomarker gene set, which may facilitate diagnosis and treatment in breast cancer clinical therapies.

  15. A novel algorithm for simplification of complex gene classifiers in cancer

    Science.gov (United States)

    Wilson, Raphael A.; Teng, Ling; Bachmeyer, Karen M.; Bissonnette, Mei Lin Z.; Husain, Aliya N.; Parham, David M.; Triche, Timothy J.; Wing, Michele R.; Gastier-Foster, Julie M.; Barr, Frederic G.; Hawkins, Douglas S.; Anderson, James R.; Skapek, Stephen X.; Volchenboum, Samuel L.

    2013-01-01

    The clinical application of complex molecular classifiers as diagnostic or prognostic tools has been limited by the time and cost needed to apply them to patients. Using an existing fifty-gene expression signature known to separate two molecular subtypes of the pediatric cancer rhabdomyosarcoma, we show that an exhaustive iterative search algorithm can distill this complex classifier down to two or three features with equal discrimination. We validated the two-gene signatures using three separate and distinct data sets, including one that uses degraded RNA extracted from formalin-fixed, paraffin-embedded material. Finally, to demonstrate the generalizability of our algorithm, we applied it to a lung cancer data set to find minimal gene signatures that can distinguish survival. Our approach can easily be generalized and coupled to existing technical platforms to facilitate the discovery of simplified signatures that are ready for routine clinical use. PMID:23913937
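
    As an illustration only, the idea of exhaustively searching a large signature for a minimal subset can be sketched as below. The abstract does not specify the classifier or the scoring used, so the nearest-centroid rule, the training-set accuracy score, the toy expression values, and every function name here are assumptions made for the sketch, not the authors' method.

```python
from itertools import combinations
from statistics import mean


def centroid(rows, genes):
    """Mean expression of each selected gene across the given samples."""
    return [mean(row[g] for row in rows) for g in genes]


def accuracy(samples, labels, genes):
    """Training-set accuracy of a nearest-centroid rule restricted to `genes`."""
    classes = sorted(set(labels))
    centroids = {
        c: centroid([s for s, l in zip(samples, labels) if l == c], genes)
        for c in classes
    }
    correct = 0
    for sample, label in zip(samples, labels):
        point = [sample[g] for g in genes]
        predicted = min(
            classes,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += predicted == label
    return correct / len(samples)


def best_signature(samples, labels, size=2):
    """Score every gene subset of the given size and return (score, subset)."""
    genes = sorted(samples[0])
    return max(
        ((accuracy(samples, labels, subset), subset)
         for subset in combinations(genes, size)),
        key=lambda scored: scored[0],
    )


# Hypothetical toy data: g1 and g2 track the subtype, g3 does not.
samples = [
    {"g1": 9.0, "g2": 1.0, "g3": 5.0},
    {"g1": 8.5, "g2": 1.5, "g3": 2.0},
    {"g1": 1.0, "g2": 9.0, "g3": 5.5},
    {"g1": 1.5, "g2": 8.0, "g3": 2.5},
]
labels = ["A", "A", "B", "B"]
score, signature = best_signature(samples, labels)
print(score, signature)
```

    The number of subsets grows combinatorially with signature size, which is why a search of this kind is practical for two or three features drawn from a fifty-gene signature but not for much larger subsets. A real application would score subsets on held-out data rather than on the training set.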

  16. Text mining-based in silico drug discovery in oral mucositis caused by high-dose cancer therapy.

    Science.gov (United States)

    Kirk, Jon; Shah, Nirav; Noll, Braxton; Stevens, Craig B; Lawler, Marshall; Mougeot, Farah B; Mougeot, Jean-Luc C

    2018-08-01

    Oral mucositis (OM) is a major dose-limiting side effect of chemotherapy and radiation used in cancer treatment. Due to the complex nature of OM, currently available drug-based treatments are of limited efficacy. Our objectives were (i) to determine genes and molecular pathways associated with OM and wound healing using computational tools and publicly available data and (ii) to identify drugs formulated for topical use targeting the relevant OM molecular pathways. OM and wound healing-associated genes were determined by text mining, and the intersection of the two gene sets was selected for gene ontology analysis using the GeneCodis program. Protein interaction network analysis was performed using STRING-db. Enriched gene sets belonging to the identified pathways were queried against the Drug-Gene Interaction database to find drug candidates for topical use in OM. Our analysis identified 447 genes common to both the "OM" and "wound healing" text mining concepts. Gene enrichment analysis yielded 20 genes representing six pathways and targetable by a total of 32 drugs which could possibly be formulated for topical application. A manual search on ClinicalTrials.gov confirmed no relevant pathway/drug candidate had been overlooked. Twenty-five of the 32 drugs can directly affect the PTGS2 (COX-2) pathway, the pathway that has been targeted in previous clinical trials with limited success. Drug discovery using in silico text mining and pathway analysis tools can facilitate the identification of existing drugs that have the potential of topical administration to improve OM treatment.
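
    The core set logic of the workflow above (intersect two text-mined gene sets, then look up drugs for the shared genes) can be sketched as follows. This is a simplified illustration: the enrichment and protein-network steps are omitted, the gene and drug names are placeholders, and the dictionary stands in for a drug-gene interaction resource rather than reproducing any real database or API.

```python
def candidate_drugs(om_genes, wound_healing_genes, drug_targets):
    """Intersect two gene sets, then list known drugs for each shared gene."""
    shared = set(om_genes) & set(wound_healing_genes)
    hits = {
        gene: sorted(drug_targets[gene])
        for gene in sorted(shared)
        if gene in drug_targets
    }
    return shared, hits


# Placeholder identifiers, not real genes or drugs.
om = {"GENE_A", "GENE_B", "GENE_C"}
wound = {"GENE_B", "GENE_C", "GENE_D"}
targets = {"GENE_B": {"drug_2", "drug_1"}, "GENE_D": {"drug_3"}}

shared, hits = candidate_drugs(om, wound, targets)
print(sorted(shared))
print(hits)
```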

  17. The first set of EST resource for gene discovery and marker development in pigeonpea (Cajanus cajan L.)

    Directory of Open Access Journals (Sweden)

    Byregowda Munishamappa

    2010-03-01

    Background Pigeonpea (Cajanus cajan (L.) Millsp.) is one of the major grain legume crops of the tropics and subtropics, but biotic stresses [Fusarium wilt (FW), sterility mosaic disease (SMD), etc.] are serious challenges for sustainable crop production. Modern genomic tools such as molecular markers and candidate genes associated with resistance to these stresses offer the possibility of facilitating pigeonpea breeding for improving biotic stress resistance. Availability of limited genomic resources, however, is a serious bottleneck to undertake molecular breeding in pigeonpea to develop superior genotypes with enhanced resistance to above mentioned biotic stresses. With an objective of enhancing genomic resources in pigeonpea, this study reports generation and analysis of comprehensive resource of FW- and SMD-responsive expressed sequence tags (ESTs). Results A total of 16 cDNA libraries were constructed from four pigeonpea genotypes that are resistant and susceptible to FW ('ICPL 20102' and 'ICP 2376') and SMD ('ICP 7035' and 'TTB 7') and a total of 9,888 (9,468 high quality) ESTs were generated and deposited in dbEST of GenBank under accession numbers GR463974 to GR473857 and GR958228 to GR958231. Clustering and assembly analyses of these ESTs resulted in 4,557 unique sequences (unigenes) including 697 contigs and 3,860 singletons. BLASTN analysis of 4,557 unigenes showed a significant identity with ESTs of different legumes (23.2-60.3%), rice (28.3%), Arabidopsis (33.7%) and poplar (35.4%). As expected, pigeonpea ESTs are more closely related to soybean (60.3%) and cowpea ESTs (43.6%) than other plant ESTs. Similarly, BLASTX similarity results showed that only 1,603 (35.1%) out of 4,557 total unigenes correspond to known proteins in the UniProt database (≤ 1E-08). Functional categorization of the annotated unigenes sequences showed that 153 (3.3%) genes were assigned to cellular component category, 132 (2.8%) to biological process, and 132 (2

  18. Transcriptomics Analysis of Crassostrea hongkongensis for the Discovery of Reproduction-Related Genes

    Science.gov (United States)

    Tong, Ying; Zhang, Yang; Huang, Jiaomei; Xiao, Shu; Zhang, Yuehuan; Li, Jun; Chen, Jinhui; Yu, Ziniu

    2015-01-01

    Background The reproductive mechanisms of mollusk species have been interesting targets in biological research because of the diverse reproductive strategies observed in this phylum. These species have also been studied for the development of fishery technologies in molluscan aquaculture. Although the molecular mechanisms underlying the reproductive process have been well studied in animal models, the relevant information from mollusks remains limited, particularly in species of great commercial interest. Crassostrea hongkongensis is the dominant oyster species that is distributed along the coast of the South China Sea and little genomic information on this species is available. Currently, high-throughput sequencing techniques have been widely used for investigating the basis of physiological processes and facilitating the establishment of adequate genetic selection programs. Results The C. hongkongensis transcriptome included a total of 1,595,855 reads, which were generated by 454 sequencing and were assembled into 41,472 contigs using de novo methods. Contigs were clustered into 33,920 isotigs and further grouped into 22,829 isogroups. Approximately 77.6% of the isogroups were successfully annotated by the Nr database. More than 1,910 genes were identified as being related to reproduction. Some key genes involved in germline development, sex determination and differentiation were identified for the first time in C. hongkongensis (nanos, piwi, ATRX, FoxL2, β-catenin, etc.). Gene expression analysis indicated that vasa, nanos, piwi, ATRX, FoxL2, β-catenin and SRD5A1 were highly or specifically expressed in C. hongkongensis gonads. Additionally, 94,056 single nucleotide polymorphisms (SNPs) and 1,699 simple sequence repeats (SSRs) were compiled. Conclusions Our study significantly increased C. hongkongensis genomic information based on transcriptomics analysis. The group of reproduction-related genes identified in the present study constitutes a new tool for research

  19. Transcriptomics Analysis of Crassostrea hongkongensis for the Discovery of Reproduction-Related Genes.

    Directory of Open Access Journals (Sweden)

    Ying Tong

    The reproductive mechanisms of mollusk species have been interesting targets in biological research because of the diverse reproductive strategies observed in this phylum. These species have also been studied for the development of fishery technologies in molluscan aquaculture. Although the molecular mechanisms underlying the reproductive process have been well studied in animal models, the relevant information from mollusks remains limited, particularly in species of great commercial interest. Crassostrea hongkongensis is the dominant oyster species that is distributed along the coast of the South China Sea and little genomic information on this species is available. Currently, high-throughput sequencing techniques have been widely used for investigating the basis of physiological processes and facilitating the establishment of adequate genetic selection programs. The C. hongkongensis transcriptome included a total of 1,595,855 reads, which were generated by 454 sequencing and were assembled into 41,472 contigs using de novo methods. Contigs were clustered into 33,920 isotigs and further grouped into 22,829 isogroups. Approximately 77.6% of the isogroups were successfully annotated by the Nr database. More than 1,910 genes were identified as being related to reproduction. Some key genes involved in germline development, sex determination and differentiation were identified for the first time in C. hongkongensis (nanos, piwi, ATRX, FoxL2, β-catenin, etc.). Gene expression analysis indicated that vasa, nanos, piwi, ATRX, FoxL2, β-catenin and SRD5A1 were highly or specifically expressed in C. hongkongensis gonads. Additionally, 94,056 single nucleotide polymorphisms (SNPs) and 1,699 simple sequence repeats (SSRs) were compiled. Our study significantly increased C. hongkongensis genomic information based on transcriptomics analysis. The group of reproduction-related genes identified in the present study constitutes a new tool for research on bivalve

  20. Mass spectrometry-driven drug discovery for development of herbal medicine.

    Science.gov (United States)

    Zhang, Aihua; Sun, Hui; Wang, Xijun

    2018-05-01

    Herbal medicine (HM) has made a major contribution to the drug discovery process with regard to identifying product compounds. Currently, more attention has been focused on drug discovery from natural compounds of HM. Despite the rapid advancement of modern analytical techniques, drug discovery is still a difficult and lengthy process. Fortunately, mass spectrometry (MS) can provide useful structural information for drug discovery and has been recognized as a sensitive, rapid, and high-throughput technology for advancing drug discovery from HM in the post-genomic era. It is essential to develop an efficient, high-quality, high-throughput screening method integrated with an MS platform for early screening of candidate drug molecules from natural products. We have developed a new chinmedomics strategy reliant on MS that is capable of capturing candidate molecules and facilitating the identification of novel chemical structures in the early phase; chinmedomics-guided natural product discovery based on MS may provide an effective tool that addresses challenges in early screening of effective constituents of herbs against disease. This critical review covers the use of MS with related techniques and methodologies for natural product discovery, biomarker identification, and determination of mechanisms of action. It also highlights high-throughput chinmedomics screening methods suitable for lead compound discovery illustrated by recent successes. © 2016 Wiley Periodicals, Inc.

  1. Generation of cell lines for drug discovery through random activation of gene expression: application to the human histamine H3 receptor.

    Science.gov (United States)

    Song, J; Doucette, C; Hanniford, D; Hunady, K; Wang, N; Sherf, B; Harrington, J J; Brunden, K R; Stricker-Krongrad, A

    2005-06-01

    Target-based high-throughput screening (HTS) plays an integral role in drug discovery. The implementation of HTS assays generally requires high expression levels of the target protein, and this is typically accomplished using recombinant cDNA methodologies. However, the isolated gene sequences for many drug targets have intellectual property claims that restrict the ability to implement drug discovery programs. The present study describes the pharmacological characterization of the human histamine H3 receptor that was expressed using random activation of gene expression (RAGE), a technology that over-expresses proteins by up-regulating endogenous genes rather than introducing cDNA expression vectors into the cell. Saturation binding analysis using [125I]iodoproxyfan and RAGE-H3 membranes revealed a single class of binding sites with a Kd value of 0.77 nM and a Bmax equal to 756 fmol/mg of protein. Competition binding studies showed that the rank order of potency for H3 agonists was Nα-methylhistamine ≈ (R)-α-methylhistamine > histamine and that the rank order of potency for H3 antagonists was clobenpropit > iodophenpropit > thioperamide. The same rank order of potency for H3 agonists and antagonists was observed in the functional assays as in the binding assays. The Fluorometric Imaging Plate Reader assays in RAGE-H3 cells gave high Z' values for agonist and antagonist screening, respectively. These results reveal that the human H3 receptor expressed with the RAGE technology is pharmacologically comparable to that expressed through recombinant methods. Moreover, the level of expression of the H3 receptor in the RAGE-H3 cells is suitable for HTS and secondary assays.
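
    To make the reported binding constants concrete, the short sketch below evaluates the standard one-site saturation binding relation, B = Bmax × [L] / (Kd + [L]), using the Kd (0.77 nM) and Bmax (756 fmol/mg) quoted in the abstract. The function name and the ligand concentrations chosen are illustrative assumptions; the abstract does not describe how the authors fitted their data.

```python
def specific_binding(ligand_nm, bmax=756.0, kd_nm=0.77):
    """One-site saturation binding: B = Bmax * [L] / (Kd + [L]).

    ligand_nm is the free radioligand concentration in nM; the result is
    in the units of bmax (fmol/mg of protein here).
    """
    return bmax * ligand_nm / (kd_nm + ligand_nm)


# Illustrative concentrations at 0.1x, 1x and 10x the reported Kd.
for conc in (0.077, 0.77, 7.7):
    print(f"{conc:6.3f} nM -> {specific_binding(conc):6.1f} fmol/mg")
```

    At a ligand concentration equal to Kd the model predicts half-maximal occupancy, that is, 378 fmol/mg for the values above.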

  2. Output ordering and prioritisation system (OOPS): ranking biosynthetic gene clusters to enhance bioactive metabolite discovery.

    Science.gov (United States)

    Peña, Alejandro; Del Carratore, Francesco; Cummings, Matthew; Takano, Eriko; Breitling, Rainer

    2017-12-18

    The rapid increase of publicly available microbial genome sequences has highlighted the presence of hundreds of thousands of biosynthetic gene clusters (BGCs) encoding valuable secondary metabolites. The experimental characterization of new BGCs is extremely laborious and struggles to keep pace with the in silico identification of potential BGCs. Therefore, the prioritisation of promising candidates among computationally predicted BGCs represents a pressing need. Here, we propose an output ordering and prioritisation system (OOPS) which helps sort identified BGCs by a wide variety of custom-weighted biological and biochemical criteria in a flexible and user-friendly interface. OOPS facilitates a judicious prioritisation of BGCs using G+C content, coding sequence length, gene number, cluster self-similarity and codon bias parameters, as well as enabling the user to rank BGCs based upon BGC type, novelty, and taxonomic distribution. Effective prioritisation of BGCs will help to reduce experimental attrition rates and improve the breadth of bioactive metabolites characterized.
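
    A custom-weighted ranking of the kind described can be sketched generically as below. This is not the OOPS scoring scheme, which the abstract does not detail; the min-max normalisation, the weighted sum, the field names, and the example clusters are all assumptions chosen to illustrate the general idea.

```python
def rank_clusters(clusters, weights):
    """Rank clusters by a weighted sum of min-max normalised criteria.

    clusters: list of dicts with a "name" key plus one numeric value per
    criterion. weights: dict mapping criterion name to its weight; a
    negative weight penalises high values of that criterion.
    """
    def normalise(key):
        values = [c[key] for c in clusters]
        low, high = min(values), max(values)
        span = high - low
        return [0.0 if span == 0 else (v - low) / span for v in values]

    columns = {key: normalise(key) for key in weights}
    scored = [
        (sum(weights[k] * columns[k][i] for k in weights), c["name"])
        for i, c in enumerate(clusters)
    ]
    return sorted(scored, reverse=True)


# Hypothetical clusters and weights, for illustration only.
clusters = [
    {"name": "bgc_1", "gc_content": 0.72, "gene_number": 12, "novelty": 0.2},
    {"name": "bgc_2", "gc_content": 0.65, "gene_number": 30, "novelty": 0.9},
    {"name": "bgc_3", "gc_content": 0.58, "gene_number": 18, "novelty": 0.5},
]
weights = {"gc_content": 1.0, "gene_number": 0.5, "novelty": 2.0}

ranking = rank_clusters(clusters, weights)
for score, name in ranking:
    print(f"{name}: {score:.3f}")
```

    Normalising each criterion before weighting keeps quantities on very different scales (a G+C fraction versus a gene count) from dominating the score purely because of their units.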

  3. Oral Gram-negative anaerobic bacilli as a reservoir of β-lactam resistance genes facilitating infections with multiresistant bacteria.

    Science.gov (United States)

    Dupin, Clarisse; Tamanai-Shacoori, Zohreh; Ehrmann, Elodie; Dupont, Anais; Barloy-Hubler, Frédérique; Bousarghin, Latifa; Bonnaure-Mallet, Martine; Jolivet-Gougeon, Anne

    2015-02-01

    Many β-lactamases have been described in various Gram-negative bacilli (Capnocytophaga, Prevotella, Fusobacterium, etc.) of the oral cavity, belonging to class A of the Ambler classification (CepA, CblA, CfxA, CSP-1 and TEM), class B (CfiA) or class D in Fusobacterium nucleatum (FUS-1). The minimum inhibitory concentrations of β-lactams are variable and this variation is often related to the presence of plasmids or other mobile genetic elements (MGEs) that modulate the expression of resistance genes. DNA persistence and bacterial promiscuity in oral biofilms also contribute to genetic transformation and conjugation in this particular microcosm. Overexpression of efflux pumps is facilitated because the encoding genes are located on MGEs, in some multidrug-resistant clinical isolates, similar to conjugative transposons harbouring genes encoding β-lactamases. All these facts lead us to consider the oral cavity as an important reservoir of β-lactam resistance genes and a privileged place for genetic exchange, especially in commensal strictly anaerobic Gram-negative bacilli. Copyright © 2014 Elsevier B.V. and the International Society of Chemotherapy. All rights reserved.

  4. Tools to covisualize and coanalyze proteomic data with genomes and transcriptomes: validation of genes and alternative mRNA splicing.

    Science.gov (United States)

    Pang, Chi Nam Ignatius; Tay, Aidan P; Aya, Carlos; Twine, Natalie A; Harkness, Linda; Hart-Smith, Gene; Chia, Samantha Z; Chen, Zhiliang; Deshpande, Nandan P; Kaakoush, Nadeem O; Mitchell, Hazel M; Kassem, Moustapha; Wilkins, Marc R

    2014-01-03

    Direct links between proteomic and genomic/transcriptomic data are not frequently made, partly because of lack of appropriate bioinformatics tools. To help address this, we have developed the PG Nexus pipeline. The PG Nexus allows users to covisualize peptides in the context of genomes or genomic contigs, along with RNA-seq reads. This is done in the Integrative Genomics Viewer (IGV). A Results Analyzer reports the precise base position where LC-MS/MS-derived peptides cover genes or gene isoforms, on the chromosomes or contigs where this occurs. In prokaryotes, the PG Nexus pipeline facilitates the validation of genes, where annotation or gene prediction is available, or the discovery of genes using a "virtual protein"-based unbiased approach. We illustrate this with a comprehensive proteogenomics analysis of two strains of Campylobacter concisus. For higher eukaryotes, the PG Nexus facilitates gene validation and supports the identification of mRNA splice junction boundaries and splice variants that are protein-coding. This is illustrated with an analysis of splice junctions covered by human phosphopeptides, and other examples of relevance to the Chromosome-Centric Human Proteome Project. The PG Nexus is open-source and available from https://github.com/IntersectAustralia/ap11_Samifier. It has been integrated into Galaxy and made available in the Galaxy tool shed.

  5. An integrative multi-dimensional genetic and epigenetic strategy to identify aberrant genes and pathways in cancer

    Directory of Open Access Journals (Sweden)

    Lockwood William W

    2010-05-01

    Background Genomics has substantially changed our approach to cancer research. Gene expression profiling, for example, has been utilized to delineate subtypes of cancer, and facilitated derivation of predictive and prognostic signatures. The emergence of technologies for the high resolution and genome-wide description of genetic and epigenetic features has enabled the identification of a multitude of causal DNA events in tumors. This has afforded the potential for large scale integration of genome and transcriptome data generated from a variety of technology platforms to acquire a better understanding of cancer. Results Here we show how multi-dimensional genomics data analysis would enable the deciphering of mechanisms that disrupt regulatory/signaling cascades and downstream effects. Since not all gene expression changes observed in a tumor are causal to cancer development, we demonstrate an approach based on multiple concerted disruption (MCD) analysis of genes that facilitates the rational deduction of aberrant genes and pathways, which otherwise would be overlooked in single genomic dimension investigations. Conclusions Notably, this is the first comprehensive study of breast cancer cells by parallel integrative genome wide analyses of DNA copy number, LOH, and DNA methylation status to interpret changes in gene expression pattern. Our findings demonstrate the power of a multi-dimensional approach to elucidate events which would escape conventional single dimensional analysis and as such, reduce the cohort sample size for cancer gene discovery.

  6. Metadata Effectiveness in Internet Discovery: An Analysis of Digital Collection Metadata Elements and Internet Search Engine Keywords

    Science.gov (United States)

    Yang, Le

    2016-01-01

    This study analyzed digital item metadata and keywords from Internet search engines to learn what metadata elements actually facilitate discovery of digital collections through Internet keyword searching and how significantly each metadata element affects the discovery of items in a digital repository. The study found that keywords from Internet…

  7. Facilitating NCAR Data Discovery by Connecting Related Resources

    Science.gov (United States)

    Rosati, A.

    2012-12-01

    Linking datasets, creators, and users by employing the proper standards helps to increase the impact of funded research. In order for users to find a dataset, it must first be named. Data citations play the important role of giving datasets a persistent presence by assigning a formal "name" and location. This project focuses on the next step of the "name-find-use" sequence: enhancing discoverability of NCAR data by connecting related resources on the web. By examining metadata schemas that document datasets, I explored how Semantic Web approaches can help to ensure the widest possible range of data users. The focus was to move from search engine optimization (SEO) to information connectivity. Two main markup types are very visible in the Semantic Web and applicable to scientific dataset discovery: The Open Archives Initiative-Object Reuse and Exchange (OAI-ORE - www.openarchives.org) and Microdata (HTML5 and www.schema.org). My project creates pilot aggregations of related resources using both markup types for three case studies: The North American Regional Climate Change Assessment Program (NARCCAP) dataset and related publications, the Palmer Drought Severity Index (PDSI) animation and image files from NCAR's Visualization Lab (VisLab), and the multidisciplinary data types and formats from the Advanced Cooperative Arctic Data and Information Service (ACADIS). This project documents the differences between these markups and how each creates connectedness on the web. My recommendations point toward the most efficient and effective markup schema for aggregating resources within the three case studies based on the following assessment criteria: ease of use, current state of support and adoption of technology, integration with typical web tools, available vocabularies and geoinformatic standards, interoperability with current repositories and access portals (e.g. ESG, Java), and relation to data citation tools and methods.
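
    As a rough sketch of what Microdata markup for a dataset and its related publications might look like, the snippet below emits an HTML fragment using the schema.org Dataset type with its name, url, and citation properties. The dataset name, the example.org URL, the publication titles, and the helper function are placeholders invented for illustration; they are not taken from the project described above.

```python
from html import escape


def dataset_microdata(name, url, related_publications):
    """Build an HTML Microdata fragment describing a dataset.

    Uses the schema.org Dataset type; each related publication is attached
    through the `citation` property so crawlers can link the resources.
    """
    lines = [
        '<div itemscope itemtype="https://schema.org/Dataset">',
        f'  <a itemprop="url" href="{escape(url, quote=True)}">'
        f'<span itemprop="name">{escape(name)}</span></a>',
    ]
    for title in related_publications:
        lines.append(f'  <cite itemprop="citation">{escape(title)}</cite>')
    lines.append("</div>")
    return "\n".join(lines)


# Placeholder values, for illustration only.
fragment = dataset_microdata(
    "Example climate dataset",
    "https://example.org/datasets/example",
    ["Example related paper A", "Example related paper B & C"],
)
print(fragment)
```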

  8. Translational Research 2.0: a framework for accelerating collaborative discovery.

    Science.gov (United States)

    Asakiewicz, Chris

    2014-05-01

    The world wide web has revolutionized the conduct of global, cross-disciplinary research. In the life sciences, interdisciplinary approaches to problem solving and collaboration are becoming increasingly important in facilitating knowledge discovery and integration. Web 2.0 technologies promise to have a profound impact - enabling reproducibility, aiding in discovery, and accelerating and transforming medical and healthcare research across the healthcare ecosystem. However, knowledge integration and discovery require a consistent foundation upon which to operate. Such a foundation should be capable of addressing some of the critical issues associated with how research is conducted within the ecosystem today and how it should be conducted in the future. This article discusses a framework for enhancing collaborative knowledge discovery across the medical and healthcare research ecosystem, one that could serve as a foundation upon which ecosystem stakeholders can enhance the way data, information and knowledge are created, shared and used to accelerate the translation of knowledge from one area of the ecosystem to another.

  9. SNP Discovery for mapping alien introgressions in wheat

    Science.gov (United States)

    2014-01-01

    Background Monitoring alien introgressions in crop plants is difficult due to the lack of genetic and molecular mapping information on the wild crop relatives. The tertiary gene pool of wheat is a very important source of genetic variability for wheat improvement against biotic and abiotic stresses. By exploring the 5Mg short arm (5MgS) of Aegilops geniculata, we can apply chromosome genomics for the discovery of SNP markers and their use for monitoring alien introgressions in wheat (Triticum aestivum L). Results The short arm of chromosome 5Mg of Ae. geniculata Roth (syn. Ae. ovata L.; 2n = 4x = 28, UgUgMgMg) was flow-sorted from a wheat line in which it is maintained as a telocentric chromosome. DNA of the sorted arm was amplified and sequenced using an Illumina Hiseq 2000 with ~45x coverage. The sequence data was used for SNP discovery against wheat homoeologous group-5 assemblies. A total of 2,178 unique, 5MgS-specific SNPs were discovered. Randomly selected samples of 59 5MgS-specific SNPs were tested (44 by KASPar assay and 15 by Sanger sequencing) and 84% were validated. Of the selected SNPs, 97% mapped to a chromosome 5Mg addition to wheat (the source of t5MgS), and 94% to 5Mg introgressed from a different accession of Ae. geniculata substituting for chromosome 5D of wheat. The validated SNPs also identified chromosome segments of 5MgS origin in a set of T5D-5Mg translocation lines; eight SNPs (25%) mapped to TA5601 [T5DL · 5DS-5MgS(0.75)] and three (8%) to TA5602 [T5DL · 5DS-5MgS (0.95)]. SNPs (gsnp_5ms83 and gsnp_5ms94), tagging chromosome T5DL · 5DS-5MgS(0.95) with the smallest introgression carrying resistance to leaf rust (Lr57) and stripe rust (Yr40), were validated in two released germplasm lines with Lr57 and Yr40 genes. Conclusion This approach should be widely applicable for the identification of species/genome-specific SNPs. The development of a large number of SNP markers will facilitate the precise introgression and

  10. SNP Discovery for mapping alien introgressions in wheat.

    Science.gov (United States)

    Tiwari, Vijay K; Wang, Shichen; Sehgal, Sunish; Vrána, Jan; Friebe, Bernd; Kubaláková, Marie; Chhuneja, Praveen; Doležel, Jaroslav; Akhunov, Eduard; Kalia, Bhanu; Sabir, Jamal; Gill, Bikram S

    2014-04-10

    Monitoring alien introgressions in crop plants is difficult due to the lack of genetic and molecular mapping information on the wild crop relatives. The tertiary gene pool of wheat is a very important source of genetic variability for wheat improvement against biotic and abiotic stresses. By exploring the 5Mg short arm (5MgS) of Aegilops geniculata, we can apply chromosome genomics for the discovery of SNP markers and their use for monitoring alien introgressions in wheat (Triticum aestivum L). The short arm of chromosome 5Mg of Ae. geniculata Roth (syn. Ae. ovata L.; 2n = 4x = 28, UgUgMgMg) was flow-sorted from a wheat line in which it is maintained as a telocentric chromosome. DNA of the sorted arm was amplified and sequenced using an Illumina Hiseq 2000 with ~45x coverage. The sequence data was used for SNP discovery against wheat homoeologous group-5 assemblies. A total of 2,178 unique, 5MgS-specific SNPs were discovered. Randomly selected samples of 59 5MgS-specific SNPs were tested (44 by KASPar assay and 15 by Sanger sequencing) and 84% were validated. Of the selected SNPs, 97% mapped to a chromosome 5Mg addition to wheat (the source of t5MgS), and 94% to 5Mg introgressed from a different accession of Ae. geniculata substituting for chromosome 5D of wheat. The validated SNPs also identified chromosome segments of 5MgS origin in a set of T5D-5Mg translocation lines; eight SNPs (25%) mapped to TA5601 [T5DL · 5DS-5MgS(0.75)] and three (8%) to TA5602 [T5DL · 5DS-5MgS (0.95)]. SNPs (gsnp_5ms83 and gsnp_5ms94), tagging chromosome T5DL · 5DS-5MgS(0.95) with the smallest introgression carrying resistance to leaf rust (Lr57) and stripe rust (Yr40), were validated in two released germplasm lines with Lr57 and Yr40 genes. This approach should be widely applicable for the identification of species/genome-specific SNPs. The development of a large number of SNP markers will facilitate the precise introgression and monitoring of alien segments in crop

  11. Use of arbitrary DNA primers, polyacrylamide gel electrophoresis and silver staining for identity testing, gene discovery and analysis of gene expression

    International Nuclear Information System (INIS)

    Gresshoff, P.

    1998-01-01

    To understand chemically-induced genomic differences in soybean mutants differing in their ability to enter the nitrogen-fixing symbiosis involving Bradyrhizobium japonicum, molecular techniques were developed to aid map-based, or positional, cloning. DNA marker technology involving single arbitrary primers was used to enrich regional RFLP linkage data. Molecular techniques, including two-dimensional pulse field gel electrophoresis, were developed to ascertain the first physical mapping in soybean, leading to the conclusion that in the region of marker pA-36 on linkage group H, 1 cM equals about 500 kb. High molecular weight DNA was isolated and cloned into yeast or bacterial artificial chromosomes (YACs/BACs). YACs were used to analyze soybean genome structure, revealing that over half of the genome contains repetitive DNA. Genetic and molecular tools are now available to facilitate the isolation of plant genes directly involved in symbiosis. The further characterization of these genes, along with the determination of the mechanisms that lead to the mutation, will be of value to other plants and induced mutation research. (author)

  12. Challenges of the information age: the impact of false discovery on pathway identification.

    Science.gov (United States)

    Rog, Colin J; Chekuri, Srinivasa C; Edgerton, Mary E

    2012-11-21

    Pathways with members that have known relevance to a disease are used to support hypotheses generated from analyses of gene expression and proteomic studies. Using cancer as an example, the pitfalls of searching pathways databases as support for genes and proteins that could represent false discoveries are explored. The frequency with which networks could be generated from 100 instances each of randomly selected five- and ten-gene sets as input to MetaCore, a commercial pathways database, was measured. A PubMed search enumerated cancer-related literature published for any gene in the networks. Using three, two, and one maximum intervening steps between input genes to populate the network, networks were generated with frequencies of 97%, 77%, and 7% using ten-gene sets and 73%, 27%, and 1% using five-gene sets. PubMed reported an average of 4225 cancer-related articles per network gene. This can be attributed to the richly populated pathways databases and the interest in the molecular basis of cancer. As information sources become enriched, they are more likely to generate plausible mechanisms for false discoveries.

  13. Targeted discovery of glycoside hydrolases from a switchgrass-adapted compost community.

    Directory of Open Access Journals (Sweden)

    Martin Allgaier

    Development of cellulosic biofuels from non-food crops is currently an area of intense research interest. Tailoring depolymerizing enzymes to particular feedstocks and pretreatment conditions is one promising avenue of research in this area. Here we added a green-waste compost inoculum to switchgrass (Panicum virgatum) and simulated thermophilic composting in a bioreactor to select for a switchgrass-adapted community and to facilitate targeted discovery of glycoside hydrolases. Small-subunit (SSU) rRNA-based community profiles revealed that the microbial community changed dramatically between the initial and switchgrass-adapted compost (SAC), with some bacterial populations being enriched over 20-fold. We obtained 225 Mbp of 454-titanium pyrosequence data from the SAC community and conservatively identified 800 genes encoding glycoside hydrolase domains that were biased toward depolymerizing grass cell wall components. Of these, approximately 10% were putative cellulases, mostly belonging to families GH5 and GH9. We synthesized two SAC GH9 genes with codon optimization for heterologous expression in Escherichia coli and observed activity for one on carboxymethyl cellulose. The active GH9 enzyme has a temperature optimum of 50 °C and a pH range of 5.5 to 8, consistent with the composting conditions applied. We demonstrate that microbial communities adapt to switchgrass decomposition under simulated composting conditions and that full-length genes can be identified from complex metagenomic sequence data, synthesized and expressed, resulting in active enzyme.

  14. GOexpress: an R/Bioconductor package for the identification and visualisation of robust gene ontology signatures through supervised learning of gene expression data.

    Science.gov (United States)

    Rue-Albrecht, Kévin; McGettigan, Paul A; Hernández, Belinda; Nalpas, Nicolas C; Magee, David A; Parnell, Andrew C; Gordon, Stephen V; MacHugh, David E

    2016-03-11

    Identification of gene expression profiles that differentiate experimental groups is critical for discovery and analysis of key molecular pathways and also for selection of robust diagnostic or prognostic biomarkers. While integration of differential expression statistics has been used to refine gene set enrichment analyses, such approaches are typically limited to single gene lists resulting from simple two-group comparisons or time-series analyses. In contrast, functional class scoring and machine learning approaches provide powerful alternative methods to leverage molecular measurements for pathway analyses, and to compare continuous and multi-level categorical factors. We introduce GOexpress, a software package for scoring and summarising the capacity of gene ontology features to simultaneously classify samples from multiple experimental groups. GOexpress integrates normalised gene expression data (e.g., from microarray and RNA-seq experiments) and phenotypic information of individual samples with gene ontology annotations to derive a ranking of genes and gene ontology terms using a supervised learning approach. The default random forest algorithm allows interactions between all experimental factors, and competitive scoring of expressed genes to evaluate their relative importance in classifying predefined groups of samples. GOexpress enables rapid identification and visualisation of ontology-related gene panels that robustly classify groups of samples and supports both categorical (e.g., infection status, treatment) and continuous (e.g., time-series, drug concentrations) experimental factors. The use of standard Bioconductor extension packages and publicly available gene ontology annotations facilitates straightforward integration of GOexpress within existing computational biology pipelines.

  15. Dichotomy of major genes and polygenes

    International Nuclear Information System (INIS)

    Jain, S.

    1989-01-01

    In order to facilitate domestication and breeding of new or underexploited crop species, the genetic basis of many traits must be critically investigated, and both naturally occurring and induced mutations should be utilized. Classically, most breeding procedures have invoked the dichotomy of major genes versus polygenes (or discrete versus continuously varying traits) which is briefly reviewed here from several viewpoints. Clearly, the evidence for two distinct classes of genes (or gene effects on phenotype) and traits is largely a product of different forms of genetic analyses and their primary objectives as well as of researchers' expectations. Superimposed on the simplest Mendelian ratios and genome maps are numerous sources of molecular variation and gene expression at many levels of phenotypic description. Many attempts to delineate developmental pathways and to identify genes controlling discrete vs. quantitative phenotypic variation have resulted in emphasis on multigenic models with specific gene effects at mappable loci but nonetheless modified by small effects. Thus, quantitative genetic variation may arise from multi-genic and multi-allelic systems of both structural and regulatory gene action and gene interactions which, from an empirical breeding perspective, might be adequately described by the biometrical and evolutionary models. Polygenic analyses were conceptually based on genetic parameters in these models (as caricatures of reality) but efforts to modify or reject them by identifying and mapping sources of phenotypic variation through newer genetic methods are likely to enrich and not displace biometrical methods. Domestication programmes, in particular, should employ the entire array of genetic discoveries and methodologies. (author). 71 refs, 1 fig., 1 tab

  16. High-throughput genotyping-by-sequencing facilitates molecular tagging of a novel rust resistance gene, R15, in sunflower (Helianthus annuus L.).

    Science.gov (United States)

    Ma, G J; Song, Q J; Markell, S G; Qi, L L

    2018-03-21

    A novel rust resistance gene, R15, derived from the cultivated sunflower HA-R8 was assigned to linkage group 8 of the sunflower genome using a genotyping-by-sequencing approach. SNP markers closely linked to R15 were identified, facilitating marker-assisted selection of resistance genes. The rust virulence gene is co-evolving with the resistance gene in sunflower, leading to the emergence of new physiologic pathotypes. This presents a continuous threat to the sunflower crop, necessitating the development of resistant sunflower hybrids that provide more efficient, durable, and environmentally friendly host plant resistance. The inbred line HA-R8 carries a gene conferring resistance to all known races of the rust pathogen in North America and can be used as a broad-spectrum resistance resource. Based on phenotypic assessments of 140 F2 individuals derived from a cross of HA 89 with HA-R8, rust resistance in the population was found to be conferred by a single dominant gene (R15) originating from HA-R8. Genotypic analysis with the currently available SSR markers failed to find any association between rust resistance and any markers. Therefore, we used genotyping-by-sequencing (GBS) analysis to achieve better genomic coverage. The GBS data showed that R15 was located at the top end of linkage group (LG) 8. Saturation with 71 previously mapped SNP markers selected within this region further showed that it was located in a resistance gene cluster on LG8, and mapped to a 1.0-cM region between three co-segregating SNP markers SFW01920, SFW00128, and SFW05824 as well as the NSA_008457 SNP marker. These closely linked markers will facilitate marker-assisted selection and breeding in sunflower.
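
    The inference of a single dominant gene from an F2 population is conventionally checked with a chi-square goodness-of-fit test against a 3:1 resistant-to-susceptible ratio. The abstract above does not report the phenotype counts, so the sketch below uses hypothetical counts (108 resistant, 32 susceptible, totalling the 140 individuals mentioned) purely to illustrate the arithmetic; the function name and the counts are assumptions, not data from the study.

```python
# Chi-square goodness-of-fit test for a 3:1 Mendelian segregation ratio.
# The counts used below are HYPOTHETICAL; the abstract does not report them.

# Critical value of the chi-square distribution for 1 degree of freedom
# at the 0.05 significance level (standard table value).
CRITICAL_DF1_ALPHA_05 = 3.841


def chi_square_3_to_1(resistant, susceptible):
    """Return the chi-square statistic for observed counts against 3:1."""
    total = resistant + susceptible
    expected_resistant = 0.75 * total
    expected_susceptible = 0.25 * total
    return (
        (resistant - expected_resistant) ** 2 / expected_resistant
        + (susceptible - expected_susceptible) ** 2 / expected_susceptible
    )


# Hypothetical split of 140 F2 plants.
statistic = chi_square_3_to_1(resistant=108, susceptible=32)

# A statistic below the critical value means the 3:1 hypothesis
# (one dominant gene) is not rejected at the 5% level.
fits_single_dominant_gene = statistic < CRITICAL_DF1_ALPHA_05
print(round(statistic, 3), fits_single_dominant_gene)
```

    With real counts substituted in, a statistic above 3.841 would argue against single-gene dominant inheritance.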

  17. A new essential protein discovery method based on the integration of protein-protein interaction and gene expression data

    Directory of Open Access Journals (Sweden)

    Li Min

    2012-03-01

    Background Identification of essential proteins is always a challenging task since it requires experimental approaches that are time-consuming and laborious. With the advances in high throughput technologies, a large number of protein-protein interactions are available, which have produced unprecedented opportunities for detecting proteins' essentialities from the network level. There have been a series of computational approaches proposed for predicting essential proteins based on network topologies. However, the network topology-based centrality measures are very sensitive to the robustness of the network. Therefore, a new robust essential protein discovery method would be of great value. Results In this paper, we propose a new centrality measure, named PeC, based on the integration of protein-protein interaction and gene expression data. The performance of PeC is validated based on the protein-protein interaction network of Saccharomyces cerevisiae. The experimental results show that the predicted precision of PeC clearly exceeds that of the other fifteen previously proposed centrality measures: Degree Centrality (DC), Betweenness Centrality (BC), Closeness Centrality (CC), Subgraph Centrality (SC), Eigenvector Centrality (EC), Information Centrality (IC), Bottle Neck (BN), Density of Maximum Neighborhood Component (DMNC), Local Average Connectivity-based method (LAC), Sum of ECC (SoECC), Range-Limited Centrality (RL), L-index (LI), Leader Rank (LR), Normalized α-Centrality (NC), and Moduland-Centrality (MC). In particular, the improvement of PeC over the classic centrality measures (BC, CC, SC, EC, and BN) is more than 50% when predicting no more than 500 proteins. Conclusions We demonstrate that the integration of protein-protein interaction network and gene expression data can help improve the precision of predicting essential proteins. The new centrality measure, PeC, is an effective essential protein discovery method.
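
    The abstract above says that PeC integrates interaction topology with gene expression but does not give the formula. The sketch below shows one plausible way to combine the two signals: each edge is weighted by its edge clustering coefficient multiplied by the Pearson correlation of the two proteins' expression profiles, and a protein's score is the sum over its edges. This combination, the function names, and the toy data are assumptions for illustration and should not be read as the published definition.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def edge_clustering(adj, u, v):
    """Triangles through edge (u, v) divided by the maximum possible."""
    possible = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if possible <= 0:
        return 0.0
    return len(adj[u] & adj[v]) / possible


def pec_like_score(adj, expr, u):
    """Assumed score: sum over neighbours of clustering x co-expression."""
    return sum(
        edge_clustering(adj, u, v) * pearson(expr[u], expr[v]) for v in adj[u]
    )


# Toy network (hypothetical): a triangle A-B-C with a pendant node D on A.
adj = {"A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
# Toy expression profiles (hypothetical), four time points each.
expr = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8],
    "C": [1, 3, 2, 4],
    "D": [1, 2, 3, 4],
}

for protein in sorted(adj):
    print(protein, round(pec_like_score(adj, expr, protein), 3))
```

    In this toy network the pendant node D scores zero even though its expression matches A exactly, because its only edge lies in no triangle; the topology term and the expression term must both be non-zero for an edge to contribute.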

  18. cudaMap: a GPU accelerated program for gene expression connectivity mapping.

    Science.gov (United States)

    McArt, Darragh G; Bankhead, Peter; Dunne, Philip D; Salto-Tellez, Manuel; Hamilton, Peter; Zhang, Shu-Dong

    2013-10-11

    Modern cancer research often involves large datasets and the use of sophisticated statistical techniques. Together these add a heavy computational load to the analysis, which is often coupled with issues surrounding data accessibility. Connectivity mapping is an advanced bioinformatic and computational technique dedicated to therapeutics discovery and drug re-purposing around differential gene expression analysis. On a normal desktop PC, it is common for the connectivity mapping task with a single gene signature to take > 2h to complete using sscMap, a popular Java application that runs on standard CPUs (Central Processing Units). Here, we describe new software, cudaMap, which has been implemented using CUDA C/C++ to harness the computational power of NVIDIA GPUs (Graphics Processing Units) to greatly reduce processing times for connectivity mapping. cudaMap can identify candidate therapeutics from the same signature in just over thirty seconds when using an NVIDIA Tesla C2050 GPU. Results from the analysis of multiple gene signatures, which would previously have taken several days, can now be obtained in as little as 10 minutes, greatly facilitating candidate therapeutics discovery with high throughput. We are able to demonstrate dramatic speed differentials between GPU assisted performance and CPU executions as the computational load increases for high accuracy evaluation of statistical significance. Emerging 'omics' technologies are constantly increasing the volume of data and information to be processed in all areas of biomedical research. Embracing the multicore functionality of GPUs represents a major avenue of local accelerated computing. cudaMap will make a strong contribution in the discovery of candidate therapeutics by enabling speedy execution of heavy duty connectivity mapping tasks, which are increasingly required in modern cancer research. cudaMap is open source and can be freely downloaded from http://purl.oclc.org/NET/cudaMap.

  19. Antisense gene silencing

    DEFF Research Database (Denmark)

    Nielsen, Troels T; Nielsen, Jørgen E

    2013-01-01

    Since the first reports that double-stranded RNAs can efficiently silence gene expression in C. elegans, the technology of RNA interference (RNAi) has been intensively exploited as an experimental tool to study gene function. With the subsequent discovery that RNAi could also be applied...

  20. Discovery of time-delayed gene regulatory networks based on temporal gene expression profiling

    Directory of Open Access Journals (Sweden)

    Guo Zheng

    2006-01-01

    Background It is one of the ultimate goals of modern biological research to fully elucidate the intricate interplays and the regulations of the molecular determinants that propel and characterize the progression of versatile life phenomena, to name a few, cell cycling, developmental biology, aging, and the progressive and recurrent pathogenesis of complex diseases. The vast amount of large-scale and genome-wide time-resolved data is becoming increasingly available, which provides the golden opportunity to unravel the challenging reverse-engineering problem of time-delayed gene regulatory networks. Results In particular, this methodological paper aims to reconstruct regulatory networks from temporal gene expression data by using delayed correlations between genes, i.e., pairwise overlaps of expression levels shifted in time relative to each other. We have thus developed a novel model-free computational toolbox termed TdGRN (Time-delayed Gene Regulatory Network) to address the underlying regulations of genes that can span any unit(s) of time intervals. This bioinformatics toolbox has provided a unified approach to uncovering time trends of gene regulations through decision analysis of the newly designed time-delayed gene expression matrix. We have applied the proposed method to yeast cell cycling and human HeLa cell cycling and have discovered most of the underlying time-delayed regulations that are supported by multiple lines of experimental evidence and that are remarkably consistent with the current knowledge on phase characteristics for the cell cyclings. Conclusion We established a usable and powerful model-free approach to dissecting high-order dynamic trends of gene-gene interactions. We have carefully validated the proposed algorithm by applying it to two publicly available cell cycling datasets. In addition to uncovering the time trends of gene regulations for cell cycling, this unified approach can also be used to study the complex
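
    The "delayed correlation" idea described in the abstract above can be illustrated with a small sketch: the regulator's profile is shifted in time against the target's, and the Pearson correlation is computed over the overlapping window for each candidate lag. This is a generic lagged-correlation example under stated assumptions, not the TdGRN toolbox itself; the function names and the toy profiles are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def delayed_correlation(regulator, target, lag):
    """Correlate regulator[t] with target[t + lag] over the overlap."""
    if len(regulator) != len(target):
        raise ValueError("profiles must have the same number of time points")
    if lag < 0 or lag > len(regulator) - 2:
        raise ValueError("lag must leave at least two overlapping points")
    if lag == 0:
        return pearson(regulator, target)
    return pearson(regulator[:-lag], target[lag:])


def best_lag(regulator, target, max_lag):
    """Return (lag, correlation) with the largest absolute correlation."""
    scores = {
        lag: delayed_correlation(regulator, target, lag)
        for lag in range(max_lag + 1)
    }
    lag = max(scores, key=lambda k: abs(scores[k]))
    return lag, scores[lag]


# Toy profiles (hypothetical): the target repeats the regulator two
# time points later, with zeros padding the first two measurements.
regulator = [2, 9, 4, 7, 1, 8, 3, 6, 5, 10]
target = [0, 0, 2, 9, 4, 7, 1, 8, 3, 6]

lag, score = best_lag(regulator, target, max_lag=3)
print(lag, round(score, 3))
```

    Taking the absolute value when ranking lags lets the same search pick up delayed repression (strong negative correlation) as well as delayed activation. Longer lags leave fewer overlapping points, so in practice the correlation at large lags is less reliable.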

  1. Gene Overexpression Resources in Cereals for Functional Genomics and Discovery of Useful Genes

    Directory of Open Access Journals (Sweden)

    Kiyomi Abe

    2016-09-01

    Identification and elucidation of functions of plant genes is valuable for both basic and applied research. In addition to natural variation in model plants, numerous loss-of-function resources have been produced by mutagenesis with chemicals, irradiation, or insertions of transposable elements or T-DNA. However, we may be unable to observe loss-of-function phenotypes for genes with functionally redundant homologs, and for those essential for growth and development. To offset such disadvantages, gain-of-function transgenic resources have been exploited. Activation-tagged lines have been generated using obligatory overexpression of endogenous genes by random insertion of an enhancer. Recent progress in DNA sequencing technology and bioinformatics has enabled the preparation of genomewide collections of full-length cDNAs (fl-cDNAs) in some model species. Using the fl-cDNA clones, a novel gain-of-function strategy, the Fl-cDNA OvereXpressor gene (FOX)-hunting system, has been developed. A mutant phenotype in a FOX line can be directly attributed to the overexpressed fl-cDNA. Investigating a large population of FOX lines could reveal important genes conferring favorable phenotypes for crop breeding. Alternatively, a unique loss-of-function approach, Chimeric REpressor gene Silencing Technology (CRES-T), has been developed. In CRES-T, overexpression of a chimeric repressor, composed of the coding sequence of a transcription factor (TF) and a short peptide designated as the repression domain, could interfere with the action of the endogenous TF in plants. Although plant TFs usually consist of gene families, CRES-T is effective, in principle, even for TFs with functional redundancy. In this review, we focus on the current status of the gene-overexpression strategies and resources for identifying and elucidating novel functions of cereal genes. We discuss the potential of these research tools for identifying useful genes and phenotypes for application in crop

  2. Accelerating scientific discovery : 2007 annual report.

    Energy Technology Data Exchange (ETDEWEB)

    Beckman, P.; Dave, P.; Drugan, C.

    2008-11-14

    As a gateway for scientific discovery, the Argonne Leadership Computing Facility (ALCF) works hand in hand with the world's best computational scientists to advance research in a diverse span of scientific domains, ranging from chemistry, applied mathematics, and materials science to engineering physics and life sciences. Sponsored by the U.S. Department of Energy's (DOE) Office of Science, researchers are using the IBM Blue Gene/L supercomputer at the ALCF to study and explore key scientific problems that underlie important challenges facing our society. For instance, a research team at the University of California-San Diego/ SDSC is studying the molecular basis of Parkinson's disease. The researchers plan to use the knowledge they gain to discover new drugs to treat the disease and to identify risk factors for other diseases that are equally prevalent. Likewise, scientists from Pratt & Whitney are using the Blue Gene to understand the complex processes within aircraft engines. Expanding our understanding of jet engine combustors is the secret to improved fuel efficiency and reduced emissions. Lessons learned from the scientific simulations of jet engine combustors have already led Pratt & Whitney to newer designs with unprecedented reductions in emissions, noise, and cost of ownership. ALCF staff members provide in-depth expertise and assistance to those using the Blue Gene/L and optimizing user applications. Both the Catalyst and Applications Performance Engineering and Data Analytics (APEDA) teams support the users projects. In addition to working with scientists running experiments on the Blue Gene/L, we have become a nexus for the broader global community. In partnership with the Mathematics and Computer Science Division at Argonne National Laboratory, we have created an environment where the world's most challenging computational science problems can be addressed. Our expertise in high-end scientific computing enables us to provide

  3. Identification of candidate genes for human pituitary development by EST analysis

    Directory of Open Access Journals (Sweden)

    Xiao Huasheng

    2009-03-01

    Background The pituitary is a critical neuroendocrine gland comprising five hormone-secreting cell types, which develop in tandem during the embryonic stage. Some essential genes have been identified in the early stage of adenohypophysial development, such as PITX1, FGF8, BMP4 and SF-1. However, it is likely that a large number of signaling molecules and transcription factors essential for determination and terminal differentiation of specific cell types remain unidentified. High-throughput methods such as microarray analysis may facilitate the measurement of gene transcriptional levels, while expressed sequence tag (EST) sequencing, an efficient method for gene discovery and expression level analysis, may non-redundantly help to understand gene expression patterns during development. Results A total of 9,271 ESTs were generated from both fetal and adult pituitaries, and assigned into 961 gene/EST clusters in fetal and 2,747 in adult pituitary by homology analysis. The transcription maps derived from these data indicated that developmentally relevant genes, such as Sox4, ST13 and ZNF185, were dominant in the cDNA library of fetal pituitary, while hormones and hormone-associated genes, such as GH1, GH2, POMC, LHβ, CHGA and CHGB, were dominant in adult pituitary. Furthermore, by using RT-PCR and in situ hybridization, Sox4 was found for the first time to be one of the main transcription factors expressed in fetal pituitary. It was expressed at least at E12.5, but decreased after E17.5. In addition, 40 novel ESTs were identified specifically in this tissue. Conclusion The significant changes in gene expression in both tissues suggest a distinct and dynamic switch between embryonic and adult pituitaries. All these data along with Sox4 should be confirmed to further understand the community of multiple signaling pathways that act as a cooperative network that regulates maturation of the pituitary. It was also suggested that EST

  4. Computational Identification of the Paralogs and Orthologs of Human Cytochrome P450 Superfamily and the Implication in Drug Discovery

    Directory of Open Access Journals (Sweden)

    Shu-Ting Pan

    2016-06-01

    The human cytochrome P450 (CYP) superfamily, consisting of 57 functional genes, is the most important group of Phase I drug metabolizing enzymes that oxidize a large number of xenobiotics and endogenous compounds, including therapeutic drugs and environmental toxicants. The CYP superfamily has been shown to expand itself through gene duplication, and some of its members have become pseudogenes due to gene mutations. Orthologs and paralogs are homologous genes resulting from speciation or duplication, respectively. To explore the evolutionary and functional relationships of human CYPs, we conducted this bioinformatic study to identify their corresponding paralogs, homologs, and orthologs. The functional implications and implications in drug discovery and evolutionary biology were then discussed. GeneCards and Ensembl were used to identify the paralogs of human CYPs. We have used a panel of online databases to identify the orthologs of human CYP genes: NCBI, Ensembl Compara, GeneCards, OMA (“Orthologous MAtrix”) Browser, PANTHER, TreeFam, EggNOG, and Roundup. The results show that each human CYP has various numbers of paralogs and orthologs using GeneCards and Ensembl. For example, the paralogs of CYP2A6 include CYP2A7, 2A13, 2B6, 2C8, 2C9, 2C18, 2C19, 2D6, 2E1, 2F1, 2J2, 2R1, 2S1, 2U1, and 2W1; CYP11A1 has 6 paralogs including CYP11B1, 11B2, 24A1, 27A1, 27B1, and 27C1; CYP51A1 has only three paralogs: CYP26A1, 26B1, and 26C1; while CYP20A1 has no paralog. The majority of human CYPs are well conserved from plants, amphibians, fishes, or mammals to humans due to their important functions in physiology and xenobiotic disposition. The data from different approaches are also cross-validated and validated when experimental data are available. These findings facilitate our understanding of the evolutionary relationships and functional implications of the human CYP superfamily in drug discovery.

  5. Gene Discovery through Genomic Sequencing of Brucella abortus

    Science.gov (United States)

    Sánchez, Daniel O.; Zandomeni, Ruben O.; Cravero, Silvio; Verdún, Ramiro E.; Pierrou, Ester; Faccio, Paula; Diaz, Gabriela; Lanzavecchia, Silvia; Agüero, Fernán; Frasch, Alberto C. C.; Andersson, Siv G. E.; Rossetti, Osvaldo L.; Grau, Oscar; Ugalde, Rodolfo A.

    2001-01-01

    Brucella abortus is the etiological agent of brucellosis, a disease that affects bovines and humans. We generated DNA random sequences from the genome of B. abortus strain 2308 in order to characterize molecular targets that might be useful for developing immunological or chemotherapeutic strategies against this pathogen. The partial sequencing of 1,899 clones allowed the identification of 1,199 genomic sequence surveys (GSSs) with high homology (BLAST expect value < 10⁻⁵) to sequences deposited in the GenBank databases. Among them, 925 represent putative novel genes for the Brucella genus. Out of 925 nonredundant GSSs, 470 were classified in 15 categories based on cellular function. Seven hundred GSSs showed no significant database matches and remain available for further studies in order to identify their function. A high number of GSSs with homology to Agrobacterium tumefaciens and Rhizobium meliloti proteins were observed, thus confirming their close phylogenetic relationship. Among them, several GSSs showed high similarity with genes related to nodule nitrogen fixation, synthesis of nod factors, nodulation protein symbiotic plasmid, and nodule bacteroid differentiation. We have also identified several B. abortus homologs of virulence and pathogenesis genes from other pathogens, including a homolog to both the Shda gene from Salmonella enterica serovar Typhimurium and the AidA-1 gene from Escherichia coli. Other GSSs displayed significant homologies to genes encoding components of the type III and type IV secretion machineries, suggesting that Brucella might also have an active type III secretion machinery. PMID:11159979
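
    The homology criterion quoted in this record (a BLAST expect value below 1e-5) can be illustrated with a small filter. This is a minimal sketch, not the authors' pipeline: the `Hit` record layout, the identifiers, and the choice to keep only the best hit per query are all assumptions made for illustration.

```python
from typing import Dict, Iterable, NamedTuple

E_VALUE_CUTOFF = 1e-5  # threshold quoted in the abstract


class Hit(NamedTuple):
    query_id: str    # hypothetical GSS identifier
    subject_id: str  # hypothetical database accession
    e_value: float   # BLAST expect value of the alignment


def best_significant_hits(hits: Iterable[Hit],
                          cutoff: float = E_VALUE_CUTOFF) -> Dict[str, Hit]:
    """Keep, for each query, the lowest-E-value hit that is below the cutoff."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.e_value >= cutoff:
            continue  # not significant under the chosen threshold
        current = best.get(hit.query_id)
        if current is None or hit.e_value < current.e_value:
            best[hit.query_id] = hit
    return best
```

    Queries that do not appear in the returned mapping correspond to sequences with no significant database match under this threshold.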

  6. The Matchmaker Exchange: a platform for rare disease gene discovery

    NARCIS (Netherlands)

    Philippakis, A.A.; Azzariti, D.R.; Beltran, S.; Brookes, A.J.; Brownstein, C.A.; Brudno, M.; Brunner, H.G.; Buske, O.J.; Carey, K.; Doll, C.; Dumitriu, S.; Dyke, S.O.M.; Dunnen, J.T. den; Firth, H.V.; Gibbs, R.A.; Girdea, M.; Gonzalez, M.; Haendel, M.A.; Hamosh, A.; Holm, I.A.; Huang, L.; Hurles, M.E.; Hutton, B.; Krier, J.B.; Misyura, A.; Mungall, C.J.; Paschall, J.; Paten, B.; Robinson, P.N.; Schiettecatte, F.; Sobreira, N.L.; Swaminathan, G.J.; Taschner, P.E.M.; Terry, S.F.; Washington, N.L.; Zuchner, S.; Boycott, K.M.; Rehm, H.L.

    2015-01-01

    There are few better examples of the need for data sharing than in the rare disease community, where patients, physicians, and researchers must search for "the needle in a haystack" to uncover rare, novel causes of disease within the genome. Impeding the pace of discovery has been the existence of

  7. Genomic resources for gene discovery, functional genome annotation, and evolutionary studies of maize and its close relatives.

    Science.gov (United States)

    Wang, Chao; Shi, Xue; Liu, Lin; Li, Haiyan; Ammiraju, Jetty S S; Kudrna, David A; Xiong, Wentao; Wang, Hao; Dai, Zhaozhao; Zheng, Yonglian; Lai, Jinsheng; Jin, Weiwei; Messing, Joachim; Bennetzen, Jeffrey L; Wing, Rod A; Luo, Meizhong

    2013-11-01

    Maize is one of the most important food crops and a key model for genetics and developmental biology. A genetically anchored and high-quality draft genome sequence of maize inbred B73 has been obtained to serve as a reference sequence. To facilitate evolutionary studies in maize and its close relatives, much like the Oryza Map Alignment Project (OMAP) (www.OMAP.org) bacterial artificial chromosome (BAC) resource did for the rice community, we constructed BAC libraries for maize inbred lines Zheng58, Chang7-2, and Mo17 and maize wild relatives Zea mays ssp. parviglumis and Tripsacum dactyloides. Furthermore, to extend functional genomic studies to maize and sorghum, we also constructed binary BAC (BIBAC) libraries for the maize inbred B73 and the sorghum landrace Nengsi-1. The BAC/BIBAC vectors facilitate transfer of large intact DNA inserts from BAC clones to the BIBAC vector and functional complementation of large DNA fragments. These seven Zea Map Alignment Project (ZMAP) BAC/BIBAC libraries have average insert sizes ranging from 92 to 148 kb, organellar DNA from 0.17 to 2.3%, empty vector rates between 0.35 and 5.56%, and genome equivalents of 4.7- to 8.4-fold. The usefulness of the Parviglumis and Tripsacum BAC libraries was demonstrated by mapping clones to the reference genome. Novel genes and alleles present in these ZMAP libraries can now be used for functional complementation studies and positional or homology-based cloning of genes for translational genomics.
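
    The genome-equivalent figures reported for these libraries follow from the usual coverage arithmetic: usable clones times average insert size, divided by genome size. The sketch below is illustrative only; the function name, the way empty-vector and organellar fractions are discounted, and the example numbers are assumptions, not values from the record.

```python
def genome_equivalents(n_clones: int,
                       mean_insert_kb: float,
                       genome_size_mb: float,
                       empty_fraction: float = 0.0,
                       organellar_fraction: float = 0.0) -> float:
    """Estimate library coverage in genome equivalents.

    Assumption: empty-vector clones carry no insert, and the organellar
    fraction applies to the clones that do carry an insert.
    """
    if genome_size_mb <= 0:
        raise ValueError("genome_size_mb must be positive")
    usable_clones = n_clones * (1.0 - empty_fraction) * (1.0 - organellar_fraction)
    total_insert_kb = usable_clones * mean_insert_kb
    return total_insert_kb / (genome_size_mb * 1000.0)  # 1 Mb = 1000 kb


# Hypothetical library: 100,000 clones, 100 kb inserts, 2,000 Mb genome.
coverage = genome_equivalents(100_000, 100.0, 2_000.0)
```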

  8. Cracking the regulatory code of biosynthetic gene clusters as a strategy for natural product discovery.

    Science.gov (United States)

    Rigali, Sébastien; Anderssen, Sinaeda; Naômé, Aymeric; van Wezel, Gilles P

    2018-01-05

    The World Health Organization (WHO) describes antibiotic resistance as "one of the biggest threats to global health, food security, and development today", as the number of multi- and pan-resistant bacteria is rising dangerously. Acquired resistance phenomena also impair antifungal, antiviral, and anti-cancer drug therapy, while herbicide resistance in weeds threatens the crop industry. On the positive side, it is likely that the chemical space of natural products goes far beyond what has currently been discovered. This idea is fueled by genome sequencing of microorganisms which unveiled numerous so-called cryptic biosynthetic gene clusters (BGCs), many of which are transcriptionally silent under laboratory culture conditions, and by the fact that most bacteria cannot yet be cultivated in the laboratory. However, brute force antibiotic discovery does not yield the same results as it did in the past, and researchers have had to develop creative strategies in order to unravel the hidden potential of microorganisms such as Streptomyces and other antibiotic-producing microorganisms. Identifying the cis elements and their corresponding transcription factor(s) involved in the control of BGCs through bioinformatic approaches is a promising strategy. Theoretically, we are a few 'clicks' away from unveiling the culturing conditions or genetic changes needed to activate the production of cryptic metabolites or increase the production yield of known compounds to make them economically viable. In this opinion article, we describe and illustrate the idea behind 'cracking' the regulatory code for natural product discovery, by presenting a series of proofs of concept, and discuss what still should be achieved to increase the rate of success of this strategy. Copyright © 2018 Elsevier Inc. All rights reserved.

  9. Comprehensive annotation of secondary metabolite biosynthetic genes and gene clusters of Aspergillus nidulans, A. fumigatus, A. niger and A. oryzae

    Science.gov (United States)

    2013-01-01

    Background Secondary metabolite production, a hallmark of filamentous fungi, is an expanding area of research for the Aspergilli. These compounds are potent chemicals, ranging from deadly toxins to therapeutic antibiotics to potential anti-cancer drugs. The genome sequences for multiple Aspergilli have been determined, and provide a wealth of predictive information about secondary metabolite production. Sequence analysis and gene overexpression strategies have enabled the discovery of novel secondary metabolites and the genes involved in their biosynthesis. The Aspergillus Genome Database (AspGD) provides a central repository for gene annotation and protein information for Aspergillus species. These annotations include Gene Ontology (GO) terms, phenotype data, gene names and descriptions and they are crucial for interpreting both small- and large-scale data and for aiding in the design of new experiments that further Aspergillus research. Results We have manually curated Biological Process GO annotations for all genes in AspGD with recorded functions in secondary metabolite production, adding new GO terms that specifically describe each secondary metabolite. We then leveraged these new annotations to predict roles in secondary metabolism for genes lacking experimental characterization. As a starting point for manually annotating Aspergillus secondary metabolite gene clusters, we used antiSMASH (antibiotics and Secondary Metabolite Analysis SHell) and SMURF (Secondary Metabolite Unknown Regions Finder) algorithms to identify potential clusters in A. nidulans, A. fumigatus, A. niger and A. oryzae, which we subsequently refined through manual curation. Conclusions This set of 266 manually curated secondary metabolite gene clusters will facilitate the investigation of novel Aspergillus secondary metabolites. PMID:23617571

  10. Using the iPlant collaborative discovery environment.

    Science.gov (United States)

    Oliver, Shannon L; Lenards, Andrew J; Barthelson, Roger A; Merchant, Nirav; McKay, Sheldon J

    2013-06-01

    The iPlant Collaborative is an academic consortium whose mission is to develop an informatics and social infrastructure to address the "grand challenges" in plant biology. Its cyberinfrastructure supports the computational needs of the research community and facilitates solving major challenges in plant science. The Discovery Environment provides a powerful and rich graphical interface to the iPlant Collaborative cyberinfrastructure by creating an accessible virtual workbench that enables all levels of expertise, ranging from students to traditional biology researchers and computational experts, to explore, analyze, and share their data. By providing access to iPlant's robust data-management system and high-performance computing resources, the Discovery Environment also creates a unified space in which researchers can access scalable tools. Researchers can use available Applications (Apps) to execute analyses on their data, as well as customize or integrate their own tools to better meet the specific needs of their research. These Apps can also be used in workflows that automate more complicated analyses. This module describes how to use the main features of the Discovery Environment, using bioinformatics workflows for high-throughput sequence data as examples. © 2013 by John Wiley & Sons, Inc.

  11. Sex-specific associations between particulate matter exposure and gene expression in independent discovery and validation cohorts of middle-aged men and women

    DEFF Research Database (Denmark)

    Vrijens, Karen; Winckelmans, Ellen; Tsamou, Maria

    2017-01-01

    Background: Particulate matter (PM) exposure leads to premature death, mainly due to respiratory and cardiovascular diseases. Objectives: Identification of transcriptomic biomarkers of air pollution exposure and effect in a healthy adult population. Methods: Microarray analyses were performed in 98...... healthy volunteers (48 men, 50 women). The expression of eight sex-specific candidate biomarker genes (significantly associated with PM10 in the discovery cohort and with a reported link to air pollution-related disease) was measured with qPCR in an independent validation cohort (75 men, 94 women...

  12. Using the TIGR gene index databases for biological discovery.

    Science.gov (United States)

    Lee, Yuandan; Quackenbush, John

    2003-11-01

    The TIGR Gene Index web pages provide access to analyses of ESTs and gene sequences for nearly 60 species, as well as a number of resources derived from these. Each species-specific database is presented using a common format with a homepage. A variety of methods exist that allow users to search each species-specific database. Methods implemented currently include nucleotide or protein sequence queries using WU-BLAST, text-based searches using various sequence identifiers, searches by gene, tissue and library name, and searches using functional classes through Gene Ontology assignments. This protocol provides guidance for using the Gene Index Databases to extract information.

  13. SNP discovery in the bovine milk transcriptome using RNA-Seq technology.

    Science.gov (United States)

    Cánovas, Angela; Rincon, Gonzalo; Islas-Trejo, Alma; Wickramasinghe, Saumya; Medrano, Juan F

    2010-12-01

    High-throughput sequencing of RNA (RNA-Seq) was developed primarily to analyze global gene expression in different tissues. However, it also is an efficient way to discover coding SNPs. The objective of this study was to perform a SNP discovery analysis in the milk transcriptome using RNA-Seq. Seven milk samples from Holstein cows were analyzed by sequencing cDNAs using the Illumina Genome Analyzer system. We detected 19,175 genes expressed in milk samples corresponding to approximately 70% of the total number of genes analyzed. The SNP detection analysis revealed 100,734 SNPs in Holstein samples, and a large number of those corresponded to differences between the Holstein breed and the Hereford bovine genome assembly Btau4.0. The number of polymorphic SNPs within Holstein cows was 33,045. The accuracy of RNA-Seq SNP discovery was tested by comparing SNPs detected in a set of 42 candidate genes expressed in milk that had been resequenced earlier using Sanger sequencing technology. Seventy of 86 SNPs were detected using both RNA-Seq and Sanger sequencing technologies. The KASPar Genotyping System was used to validate unique SNPs found by RNA-Seq but not observed by Sanger technology. Our results confirm that analyzing the transcriptome using RNA-Seq technology is an efficient and cost-effective method to identify SNPs in transcribed regions. This study creates guidelines to maximize the accuracy of SNP discovery and prevention of false-positive SNP detection, and provides more than 33,000 SNPs located in coding regions of genes expressed during lactation that can be used to develop genotyping platforms to perform marker-trait association studies in Holstein cattle.
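
    The validation figure in this record (70 of 86 SNPs detected by both technologies) can be expressed as a concordance rate. The sketch below is only a worked calculation; the function name is hypothetical, and the abstract does not state whether 86 is the Sanger total or the union of both call sets, so the denominator is simply taken as given.

```python
def concordance_rate(n_shared: int, n_reference: int) -> float:
    """Fraction of reference SNPs that were also found by the second method."""
    if n_reference <= 0:
        raise ValueError("n_reference must be positive")
    if not 0 <= n_shared <= n_reference:
        raise ValueError("n_shared must lie between 0 and n_reference")
    return n_shared / n_reference


# Counts quoted in the abstract: 70 of 86 SNPs found by both methods.
rate = concordance_rate(70, 86)
print(f"{rate:.1%}")  # prints "81.4%"
```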

  14. Automated cell type discovery and classification through knowledge transfer

    Science.gov (United States)

    Lee, Hao-Chih; Kosoy, Roman; Becker, Christine E.

    2017-01-01

    Motivation: Recent advances in mass cytometry allow simultaneous measurements of up to 50 markers at single-cell resolution. However, the high dimensionality of mass cytometry data introduces computational challenges for automated data analysis and hinders translation of new biological understanding into clinical applications. Previous studies have applied machine learning to facilitate processing of mass cytometry data. However, manual inspection is still inevitable and is becoming the barrier to reliable large-scale analysis. Results: We present a new algorithm called Automated Cell-type Discovery and Classification (ACDC) that fully automates the classification of canonical cell populations and highlights novel cell types in mass cytometry data. Evaluations on real-world data show ACDC provides accurate and reliable estimations compared to manual gating results. Additionally, ACDC automatically classifies previously ambiguous cell types to facilitate discovery. Our findings suggest that ACDC substantially improves both reliability and interpretability of results obtained from high-dimensional mass cytometry profiling data. Availability and Implementation: A Python package (Python 3) and analysis scripts for reproducing the results are available at https://bitbucket.org/dudleylab/acdc. Contact: brian.kidd@mssm.edu or joel.dudley@mssm.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28158442

  15. Computational method for discovery of estrogen responsive genes

    DEFF Research Database (Denmark)

    Tang, Suisheng; Tan, Sin Lam; Ramadoss, Suresh Kumar

    2004-01-01

    Estrogen has a profound impact on human physiology and affects numerous genes. The classical estrogen reaction is mediated by its receptors (ERs), which bind to the estrogen response elements (EREs) in target gene's promoter region. Due to tedious and expensive experiments, a limited number of hu...

  16. A platform for rapid prototyping of synthetic gene networks in mammalian cells

    Science.gov (United States)

    Duportet, Xavier; Wroblewska, Liliana; Guye, Patrick; Li, Yinqing; Eyquem, Justin; Rieders, Julianne; Rimchala, Tharathorn; Batt, Gregory; Weiss, Ron

    2014-01-01

    Mammalian synthetic biology may provide novel therapeutic strategies, help decipher new paths for drug discovery and facilitate synthesis of valuable molecules. Yet, our capacity to genetically program cells is currently hampered by the lack of efficient approaches to streamline the design, construction and screening of synthetic gene networks. To address this problem, here we present a framework for modular and combinatorial assembly of functional (multi)gene expression vectors and their efficient and specific targeted integration into a well-defined chromosomal context in mammalian cells. We demonstrate the potential of this framework by assembling and integrating different functional mammalian regulatory networks including the largest gene circuit built and chromosomally integrated to date (6 transcription units, 27kb) encoding an inducible memory device. Using a library of 18 different circuits as a proof of concept, we also demonstrate that our method enables one-pot/single-flask chromosomal integration and screening of circuit libraries. This rapid and powerful prototyping platform is well suited for comparative studies of genetic regulatory elements, genes and multi-gene circuits as well as facile development of libraries of isogenic engineered cell lines. PMID:25378321

  17. Recent development of computational resources for new antibiotics discovery

    DEFF Research Database (Denmark)

    Kim, Hyun Uk; Blin, Kai; Lee, Sang Yup

    2017-01-01

    Understanding a complex working mechanism of biosynthetic gene clusters (BGCs) encoding secondary metabolites is a key to discovery of new antibiotics. Computational resources continue to be developed in order to better process increasing volumes of genome and chemistry data, and thereby better...

  18. Scientific Discovery in Deep Social Space: Sociology without Borders

    OpenAIRE

    Joseph Michalski

    2008-01-01

    Globalization affords an excellent opportunity to develop a genuinely universal, scientific sociology. In recent decades the politicization of the discipline has undermined the central mission of sociology: scientific discovery and explanation. The paper identifies several intellectual shifts that will facilitate the expansion and communication of such a science in an emerging global village of sociological analysts: 1) breaking with classical sociology to build upon innovative theoretical id...

  19. Mining the Proteome of subsp. ATCC 25586 for Potential Therapeutics Discovery: An Approach

    Directory of Open Access Journals (Sweden)

    Abdul Musaweer Habib

    2016-12-01

    The plethora of genome sequence information of bacteria in recent times has ushered in many novel strategies for antibacterial drug discovery and has helped medical science take up the challenge of the increasing resistance of pathogenic bacteria to current antibiotics. In this study, we adopted a subtractive genomics approach to analyze the whole genome sequence of Fusobacterium nucleatum, a human oral pathogen associated with colorectal cancer. Our study identified 1,499 proteins of F. nucleatum that have no homologs in the human genome. These proteins were screened further using the Database of Essential Genes (DEG), which resulted in the identification of 32 vitally important proteins for the bacterium. Subsequent analysis of the identified pivotal proteins using the Kyoto Encyclopedia of Genes and Genomes (KEGG) Automated Annotation Server (KAAS) yielded 3 key enzymes of F. nucleatum that may be good candidates as potential drug targets, since they are unique to the bacterium and absent in humans. In addition, we have demonstrated the three-dimensional structure of these three proteins. Finally, the ligand binding sites of the 2 key proteins were determined, and functional inhibitors that best fitted the ligand sites were screened, to discover effective novel therapeutic compounds against F. nucleatum.
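
    The subtractive genomics screen described in this record reduces, at its core, to two set operations: discard pathogen proteins that have a human homolog, then keep those flagged as essential. The sketch below assumes the homology and essentiality calls have already been made upstream (for example by sequence searches against the human proteome and DEG); the function name and identifiers are hypothetical.

```python
from typing import Iterable, List


def subtractive_filter(proteome_ids: Iterable[str],
                       human_homolog_ids: Iterable[str],
                       essential_ids: Iterable[str]) -> List[str]:
    """Return pathogen proteins that lack a human homolog and are essential."""
    non_homologous = set(proteome_ids) - set(human_homolog_ids)
    candidates = non_homologous & set(essential_ids)
    return sorted(candidates)  # sorted for a reproducible candidate list


# Toy example with made-up identifiers.
targets = subtractive_filter(
    proteome_ids=["p1", "p2", "p3", "p4"],
    human_homolog_ids=["p2"],
    essential_ids=["p2", "p3"],
)
```

    In the toy example only `p3` survives both steps: `p2` is essential but has a human homolog, while `p1` and `p4` are not flagged as essential.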

  20. Horizontal acquisition of multiple mitochondrial genes from a parasitic plant followed by gene conversion with host mitochondrial genes

    Science.gov (United States)

    2010-01-01

    Background Horizontal gene transfer (HGT) is relatively common in plant mitochondrial genomes but the mechanisms, extent and consequences of transfer remain largely unknown. Previous results indicate that parasitic plants are often involved as either transfer donors or recipients, suggesting that direct contact between parasite and host facilitates genetic transfer among plants. Results In order to uncover the mechanistic details of plant-to-plant HGT, the extent and evolutionary fate of transfer was investigated between two groups: the parasitic genus Cuscuta and a small clade of Plantago species. A broad polymerase chain reaction (PCR) survey of mitochondrial genes revealed that at least three genes (atp1, atp6 and matR) were recently transferred from Cuscuta to Plantago. Quantitative PCR assays show that these three genes have a mitochondrial location in the one species line of Plantago examined. Patterns of sequence evolution suggest that these foreign genes degraded into pseudogenes shortly after transfer and reverse transcription (RT)-PCR analyses demonstrate that none are detectably transcribed. Three cases of gene conversion were detected between native and foreign copies of the atp1 gene. The identical phylogenetic distribution of the three foreign genes within Plantago and the retention of cytidines at ancestral positions of RNA editing indicate that these genes were probably acquired via a single, DNA-mediated transfer event. However, samplings of multiple individuals from two of the three species in the recipient Plantago clade revealed complex and perplexing phylogenetic discrepancies and patterns of sequence divergence for all three of the foreign genes. Conclusions This study reports the best evidence to date that multiple mitochondrial genes can be transferred via a single HGT event and that transfer occurred via a strictly DNA-level intermediate. 
    The discovery of gene conversion between co-resident foreign and native mitochondrial copies suggests

  1. Horizontal acquisition of multiple mitochondrial genes from a parasitic plant followed by gene conversion with host mitochondrial genes

    Directory of Open Access Journals (Sweden)

    Hao Weilong

    2010-12-01

    Background Horizontal gene transfer (HGT) is relatively common in plant mitochondrial genomes but the mechanisms, extent and consequences of transfer remain largely unknown. Previous results indicate that parasitic plants are often involved as either transfer donors or recipients, suggesting that direct contact between parasite and host facilitates genetic transfer among plants. Results In order to uncover the mechanistic details of plant-to-plant HGT, the extent and evolutionary fate of transfer was investigated between two groups: the parasitic genus Cuscuta and a small clade of Plantago species. A broad polymerase chain reaction (PCR) survey of mitochondrial genes revealed that at least three genes (atp1, atp6 and matR) were recently transferred from Cuscuta to Plantago. Quantitative PCR assays show that these three genes have a mitochondrial location in the one species line of Plantago examined. Patterns of sequence evolution suggest that these foreign genes degraded into pseudogenes shortly after transfer and reverse transcription (RT)-PCR analyses demonstrate that none are detectably transcribed. Three cases of gene conversion were detected between native and foreign copies of the atp1 gene. The identical phylogenetic distribution of the three foreign genes within Plantago and the retention of cytidines at ancestral positions of RNA editing indicate that these genes were probably acquired via a single, DNA-mediated transfer event. However, samplings of multiple individuals from two of the three species in the recipient Plantago clade revealed complex and perplexing phylogenetic discrepancies and patterns of sequence divergence for all three of the foreign genes. Conclusions This study reports the best evidence to date that multiple mitochondrial genes can be transferred via a single HGT event and that transfer occurred via a strictly DNA-level intermediate. The discovery of gene conversion between co-resident foreign and native

  2. Sea Level Rise Data Discovery

    Science.gov (United States)

    Quach, N.; Huang, T.; Boening, C.; Gill, K. M.

    2016-12-01

    Research related to sea level rise crosses multiple disciplines from sea ice to land hydrology. The NASA Sea Level Change Portal (SLCP) is a one-stop source for current sea level change information and data, including interactive tools for accessing and viewing regional data, a virtual dashboard of sea level indicators, and ongoing updates through a suite of editorial products that include content articles, graphics, videos, and animations. The architecture behind the SLCP makes it possible to integrate web content and data relevant to sea level change that are archived across various data centers as well as new data generated by sea level change principal investigators. The Extensible Data Gateway Environment (EDGE) is incorporated into the SLCP architecture to provide a unified platform for web content and science data discovery. EDGE is a data integration platform designed to facilitate high-performance geospatial data discovery and access with the ability to support multi-metadata standard specifications. EDGE has the capability to retrieve data from one or more sources and package the resulting sets into a single response to the requestor. With this unified endpoint, the Data Analysis Tool that is available on the SLCP can retrieve dataset and granule level metadata as well as perform geospatial search on the data. This talk focuses on the architecture that makes it possible to seamlessly integrate and enable discovery of disparate data relevant to sea level rise.

  3. A new approach to the rationale discovery of polymeric biomaterials

    Science.gov (United States)

    Kohn, Joachim; Welsh, William J.; Knight, Doyle

    2007-01-01

    This paper attempts to illustrate both the need for new approaches to biomaterials discovery as well as the significant promise inherent in the use of combinatorial and computational design strategies. The key observation of this Leading Opinion Paper is that the biomaterials community has been slow to embrace advanced biomaterials discovery tools such as combinatorial methods, high throughput experimentation, and computational modeling in spite of the significant promise shown by these discovery tools in materials science, medicinal chemistry and the pharmaceutical industry. It seems that the complexity of living cells and their interactions with biomaterials has been a conceptual as well as a practical barrier to the use of advanced discovery tools in biomaterials science. However, with the continued increase in computer power, the goal of predicting the biological response of cells in contact with biomaterials surfaces is within reach. Once combinatorial synthesis, high throughput experimentation, and computational modeling are integrated into the biomaterials discovery process, a significant acceleration is possible in the pace of development of improved medical implants, tissue regeneration scaffolds, and gene/drug delivery systems. PMID:17644176

  4. Gene discovery for the carcinogenic human liver fluke, Opisthorchis viverrini

    Directory of Open Access Journals (Sweden)

    Gasser Robin B

    2007-06-01

    Background Cholangiocarcinoma (CCA) – cancer of the bile ducts – is associated with chronic infection with the liver fluke, Opisthorchis viverrini. Despite being the only eukaryote that is designated as a 'class I carcinogen' by the International Agency for Research on Cancer, little is known about its genome. Results Approximately 5,000 randomly selected cDNAs from the adult stage of O. viverrini were characterized and accounted for 1,932 contigs, representing ~14% of the entire transcriptome, and, presently, the largest sequence dataset for any species of liver fluke. Twenty percent of contigs were assigned GO classifications. Abundantly represented protein families included those involved in physiological functions that are essential to parasitism, such as anaerobic respiration, reproduction, detoxification, surface maintenance and feeding. GO assignments were well conserved in relation to other parasitic flukes; however, some categories were over-represented in O. viverrini, such as structural and motor proteins. An assessment of evolutionary relationships showed that O. viverrini was more similar to other parasitic (Clonorchis sinensis and Schistosoma japonicum) than to free-living (Schmidtea mediterranea) flatworms, and 105 sequences had close homologues in both parasitic species but not in S. mediterranea. A total of 164 O. viverrini contigs contained ORFs with signal sequences, many of which were platyhelminth-specific. Examples of convergent evolution between host and parasite secreted/membrane proteins were identified as were homologues of vaccine antigens from other helminths. Finally, ORFs representing secreted proteins with known roles in tumorigenesis were identified, and these might play roles in the pathogenesis of O. viverrini-induced CCA. Conclusion This gene discovery effort for O. viverrini should expedite molecular studies of cholangiocarcinogenesis and accelerate research focused on developing new interventions

  5. Speeding disease gene discovery by sequence based candidate prioritization

    Directory of Open Access Journals (Sweden)

    Porteous David J

    2005-03-01

    Background Regions of interest identified through genetic linkage studies regularly exceed 30 centimorgans in size and can contain hundreds of genes. Traditionally this number is reduced by matching functional annotation to knowledge of the disease or phenotype in question. However, here we show that disease genes share patterns of sequence-based features that can provide a good basis for automatic prioritization of candidates by machine learning. Results We examined a variety of sequence-based features and found that for many of them there are significant differences between the sets of genes known to be involved in human hereditary disease and those not known to be involved in disease. We have created an automatic classifier called PROSPECTR based on those features using the alternating decision tree algorithm which ranks genes in the order of likelihood of involvement in disease. On average, PROSPECTR enriches lists for disease genes two-fold 77% of the time, five-fold 37% of the time and twenty-fold 11% of the time. Conclusion PROSPECTR is a simple and effective way to identify genes involved in Mendelian and oligogenic disorders. It performs markedly better than the single existing sequence-based classifier on novel data. PROSPECTR could save investigators looking at large regions of interest time and effort by prioritizing positional candidate genes for mutation detection and case-control association studies.
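
    The "two-fold", "five-fold" and "twenty-fold" figures in this record describe how much a ranked candidate list is enriched for disease genes. One common way to compute such a figure is sketched below; it is an assumed definition for illustration (hit rate in the top k divided by the hit rate in the whole list), not necessarily the exact measure used for PROSPECTR, and the gene names are made up.

```python
from typing import Iterable, Sequence


def fold_enrichment(ranked_genes: Sequence[str],
                    disease_genes: Iterable[str],
                    top_k: int) -> float:
    """Hit rate among the top_k ranked genes relative to the whole list."""
    if not 0 < top_k <= len(ranked_genes):
        raise ValueError("top_k must be between 1 and the list length")
    disease = set(disease_genes)
    total_hits = sum(gene in disease for gene in ranked_genes)
    if total_hits == 0:
        raise ValueError("no disease genes present in the ranked list")
    top_hits = sum(gene in disease for gene in ranked_genes[:top_k])
    background_rate = total_hits / len(ranked_genes)
    return (top_hits / top_k) / background_rate


# Toy region of 10 candidates with one known disease gene ranked second.
ranked = [f"g{i}" for i in range(1, 11)]
enrichment = fold_enrichment(ranked, {"g2"}, top_k=5)
```

    Under this definition, finding the single disease gene within the top half of a ten-gene region corresponds to a two-fold enrichment.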

  6. Advanced biological and chemical discovery (ABCD): centralizing discovery knowledge in an inherently decentralized world.

    Science.gov (United States)

    Agrafiotis, Dimitris K; Alex, Simson; Dai, Heng; Derkinderen, An; Farnum, Michael; Gates, Peter; Izrailev, Sergei; Jaeger, Edward P; Konstant, Paul; Leung, Albert; Lobanov, Victor S; Marichal, Patrick; Martin, Douglas; Rassokhin, Dmitrii N; Shemanarev, Maxim; Skalkin, Andrew; Stong, John; Tabruyn, Tom; Vermeiren, Marleen; Wan, Jackson; Xu, Xiang Yang; Yao, Xiang

    2007-01-01

    We present ABCD, an integrated drug discovery informatics platform developed at Johnson & Johnson Pharmaceutical Research & Development, L.L.C. ABCD is an attempt to bridge multiple continents, data systems, and cultures using modern information technology and to provide scientists with tools that allow them to analyze multifactorial SAR and make informed, data-driven decisions. The system consists of three major components: (1) a data warehouse, which combines data from multiple chemical and pharmacological transactional databases, designed for supreme query performance; (2) a state-of-the-art application suite, which facilitates data upload, retrieval, mining, and reporting, and (3) a workspace, which facilitates collaboration and data sharing by allowing users to share queries, templates, results, and reports across project teams, campuses, and other organizational units. Chemical intelligence, performance, and analytical sophistication lie at the heart of the new system, which was developed entirely in-house. ABCD is used routinely by more than 1000 scientists around the world and is rapidly expanding into other functional areas within the J&J organization.

  7. Discovery and development of new antibacterial drugs: learning from experience?

    Science.gov (United States)

    Jackson, Nicole; Czaplewski, Lloyd; Piddock, Laura J V

    2018-06-01

    Antibiotic (antibacterial) resistance is a serious global problem and the need for new treatments is urgent. The current antibiotic discovery model is not delivering new agents at a rate that is sufficient to combat present levels of antibiotic resistance. This has led to fears of the arrival of a 'post-antibiotic era'. Scientific difficulties, an unfavourable regulatory climate, multiple company mergers and the low financial returns associated with antibiotic drug development have led to the withdrawal of many pharmaceutical companies from the field. The regulatory climate has now begun to improve, but major scientific hurdles still impede the discovery and development of novel antibacterial agents. To facilitate discovery activities there must be increased understanding of the scientific problems experienced by pharmaceutical companies. This must be coupled with addressing the current antibiotic resistance crisis so that compounds and ultimately drugs are delivered to treat the most urgent clinical challenges. By understanding the causes of the failures and successes of the pharmaceutical industry's research history, duplication of discovery programmes will be reduced, increasing the productivity of the antibiotic drug discovery pipeline by academia and small companies. The most important scientific issues to address are getting molecules into the Gram-negative bacterial cell and avoiding their efflux. Hence screening programmes should focus their efforts on whole bacterial cells rather than cell-free systems. Despite falling out of favour with pharmaceutical companies, natural product research still holds promise for providing new molecules as a basis for discovery.

  8. Service Demand Discovery Mechanism for Mobile Social Networks.

    Science.gov (United States)

    Wu, Dapeng; Yan, Junjie; Wang, Honggang; Wang, Ruyan

    2016-11-23

    In the last few years, the service demand for wireless data over mobile networks has been soaring. In Mobile Social Networks (MSNs), users can discover adjacent users in order to establish temporary local connections and share already downloaded content with each other, offloading the service demand. Due to the partitioned topology, intermittent connections and social features of such a network, service demand discovery is challenging. In particular, service demand discovery is exploited to identify the best relay user through service registration, service selection and service activation. In order to maximize the utilization of limited network resources, a hybrid service demand discovery architecture, such as a Virtual Dictionary User (VDU), is proposed in this paper. Based on historical movement data, users can discover their relationships with others. Subsequently, according to the users' activity, a VDU is selected to facilitate the service registration procedure. Further, service information from outside a home community can be obtained through the Global Active User (GAU) to support service selection. To provide Quality of Service (QoS), the Service Providing User (SPU) is chosen among multiple candidates. Numerical results show that, when compared with other classical service algorithms, the proposed scheme can improve the successful service demand discovery ratio by 25% under reduced overheads.

  9. Recent development in software and automation tools for high-throughput discovery bioanalysis.

    Science.gov (United States)

    Shou, Wilson Z; Zhang, Jun

    2012-05-01

    Bioanalysis with LC-MS/MS has been established as the method of choice for quantitative determination of drug candidates in biological matrices in drug discovery and development. The LC-MS/MS bioanalytical support for drug discovery, especially for early discovery, often requires high-throughput (HT) analysis of large numbers of samples (hundreds to thousands per day) generated from many structurally diverse compounds (tens to hundreds per day) with a very quick turnaround time, in order to provide important activity and liability data to move discovery projects forward. Another important consideration for discovery bioanalysis is its fit-for-purpose quality requirement depending on the particular experiments being conducted at this stage, and it is usually not as stringent as those required in bioanalysis supporting drug development. These aforementioned attributes of HT discovery bioanalysis made it an ideal candidate for using software and automation tools to eliminate manual steps, remove bottlenecks, improve efficiency and reduce turnaround time while maintaining adequate quality. In this article we will review various recent developments that facilitate automation of individual bioanalytical procedures, such as sample preparation, MS/MS method development, sample analysis and data review, as well as fully integrated software tools that manage the entire bioanalytical workflow in HT discovery bioanalysis. In addition, software tools supporting the emerging high-resolution accurate MS bioanalytical approach are also discussed.

  10. Minipig and beagle animal model genomes aid species selection in pharmaceutical discovery and development

    Energy Technology Data Exchange (ETDEWEB)

    Vamathevan, Jessica J., E-mail: jessica.j.vamathevan@gsk.com [Computational Biology, Quantitative Sciences, GlaxoSmithKline, Stevenage (United Kingdom); Hall, Matthew D.; Hasan, Samiul; Woollard, Peter M. [Computational Biology, Quantitative Sciences, GlaxoSmithKline, Stevenage (United Kingdom); Xu, Meng; Yang, Yulan; Li, Xin; Wang, Xiaoli [BGI-Shenzen, Shenzhen (China); Kenny, Steve [Safety Assessment, PTS, GlaxoSmithKline, Ware (United Kingdom); Brown, James R. [Computational Biology, Quantitative Sciences, GlaxoSmithKline, Collegeville, PA (United States); Huxley-Jones, Julie [UK Platform Technology Sciences (PTS) Operations and Planning, PTS, GlaxoSmithKline, Stevenage (United Kingdom); Lyon, Jon; Haselden, John [Safety Assessment, PTS, GlaxoSmithKline, Ware (United Kingdom); Min, Jiumeng [BGI-Shenzen, Shenzhen (China); Sanseau, Philippe [Computational Biology, Quantitative Sciences, GlaxoSmithKline, Stevenage (United Kingdom)

    2013-07-15

    Improving drug attrition remains a challenge in pharmaceutical discovery and development. A major cause of early attrition is the demonstration of safety signals which can negate any therapeutic index previously established. Safety attrition needs to be put in context of clinical translation (i.e. human relevance) and is negatively impacted by differences between animal models and human. In order to minimize such an impact, an earlier assessment of pharmacological target homology across animal model species will enhance understanding of the context of animal safety signals and aid species selection during later regulatory toxicology studies. Here we sequenced the genomes of the Sus scrofa Göttingen minipig and the Canis familiaris beagle, two widely used animal species in regulatory safety studies. Comparative analyses of these new genomes with other key model organisms, namely mouse, rat, cynomolgus macaque, rhesus macaque, two related breeds (S. scrofa Duroc and C. familiaris boxer) and human reveal considerable variation in gene content. Key genes in toxicology and metabolism studies, such as the UGT2 family, CYP2D6, and SLCO1A2, displayed unique duplication patterns. Comparisons of 317 known human drug targets revealed surprising variation such as species-specific positive selection, duplication and higher occurrences of pseudogenized targets in beagle (41 genes) relative to minipig (19 genes). These data will facilitate the more effective use of animals in biomedical research. - Highlights: • Genomes of the minipig and beagle dog, two species used in pharmaceutical studies. • First systematic comparative genome analysis of human and six experimental animals. • Key drug toxicology genes display unique duplication patterns across species. • Comparison of 317 drug targets show species-specific evolutionary patterns.

  11. Minipig and beagle animal model genomes aid species selection in pharmaceutical discovery and development

    International Nuclear Information System (INIS)

    Vamathevan, Jessica J.; Hall, Matthew D.; Hasan, Samiul; Woollard, Peter M.; Xu, Meng; Yang, Yulan; Li, Xin; Wang, Xiaoli; Kenny, Steve; Brown, James R.; Huxley-Jones, Julie; Lyon, Jon; Haselden, John; Min, Jiumeng; Sanseau, Philippe

    2013-01-01

    Improving drug attrition remains a challenge in pharmaceutical discovery and development. A major cause of early attrition is the demonstration of safety signals which can negate any therapeutic index previously established. Safety attrition needs to be put in context of clinical translation (i.e. human relevance) and is negatively impacted by differences between animal models and human. In order to minimize such an impact, an earlier assessment of pharmacological target homology across animal model species will enhance understanding of the context of animal safety signals and aid species selection during later regulatory toxicology studies. Here we sequenced the genomes of the Sus scrofa Göttingen minipig and the Canis familiaris beagle, two widely used animal species in regulatory safety studies. Comparative analyses of these new genomes with other key model organisms, namely mouse, rat, cynomolgus macaque, rhesus macaque, two related breeds (S. scrofa Duroc and C. familiaris boxer) and human reveal considerable variation in gene content. Key genes in toxicology and metabolism studies, such as the UGT2 family, CYP2D6, and SLCO1A2, displayed unique duplication patterns. Comparisons of 317 known human drug targets revealed surprising variation such as species-specific positive selection, duplication and higher occurrences of pseudogenized targets in beagle (41 genes) relative to minipig (19 genes). These data will facilitate the more effective use of animals in biomedical research. - Highlights: • Genomes of the minipig and beagle dog, two species used in pharmaceutical studies. • First systematic comparative genome analysis of human and six experimental animals. • Key drug toxicology genes display unique duplication patterns across species. • Comparison of 317 drug targets show species-specific evolutionary patterns

  12. miRvestigator: web application to identify miRNAs responsible for co-regulated gene expression patterns discovered through transcriptome profiling.

    Science.gov (United States)

    Plaisier, Christopher L; Bare, J Christopher; Baliga, Nitin S

    2011-07-01

    Transcriptome profiling studies have produced staggering numbers of gene co-expression signatures for a variety of biological systems. A significant fraction of these signatures will be partially or fully explained by miRNA-mediated targeted transcript degradation. miRvestigator takes as input lists of co-expressed genes from Caenorhabditis elegans, Drosophila melanogaster, G. gallus, Homo sapiens, Mus musculus or Rattus norvegicus and identifies the specific miRNAs that are likely to bind to 3' un-translated region (UTR) sequences to mediate the observed co-regulation. The novelty of our approach is the miRvestigator hidden Markov model (HMM) algorithm which systematically computes a similarity P-value for each unique miRNA seed sequence from the miRNA database miRBase to an overrepresented sequence motif identified within the 3'-UTR of the query genes. We have made this miRNA discovery tool accessible to the community by integrating our HMM algorithm with a proven algorithm for de novo discovery of miRNA seed sequences and wrapping these algorithms into a user-friendly interface. Additionally, the miRvestigator web server also produces a list of putative miRNA binding sites within 3'-UTRs of the query transcripts to facilitate the design of validation experiments. The miRvestigator is freely available at http://mirvestigator.systemsbiology.net.
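    The idea of relating a miRNA seed to sequence in a 3'-UTR can be illustrated with plain exact matching. This is a simplified stand-in, not the miRvestigator HMM, which scores similarity between a seed and a discovered motif; the function names are hypothetical, the seed is assumed to be nucleotides 2-8 of the mature miRNA, and the UTR fragment is invented for the example.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_target_site(mirna):
    """Return the RNA site that pairs with the seed (nucleotides 2-8).

    Assumes a 7-nt seed and perfect Watson-Crick pairing, which is a
    simplification of how real target sites behave.
    """
    seed = mirna[1:8]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_seed_sites(utr, mirna):
    """Return 0-based start positions of exact seed matches in a 3'-UTR."""
    utr = utr.upper().replace("T", "U")  # accept DNA or RNA input
    site = seed_target_site(mirna)
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)
    return positions


let7a = "UGAGGUAGUAGGUUGUAUAGUU"      # mature let-7a sequence
toy_utr = "AAACTACCTCAAAGGGCTACCTCTT"  # invented UTR fragment (DNA alphabet)
print(seed_target_site(let7a))
print(find_seed_sites(toy_utr, let7a))
```

    Exact matching misses sites with mismatches or G:U wobble pairs, which is one reason a probabilistic model such as an HMM is attractive for this task.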

  13. An online conserved SSR discovery through cross-species comparison

    Directory of Open Access Journals (Sweden)

    Tun-Wen Pai

    2009-02-01

    Full Text Available Tun-Wen Pai (1), Chien-Ming Chen (1), Meng-Chang Hsiao (1), Ronshan Cheng (2), Wen-Shyong Tzou (3), Chin-Hua Hu (3). (1) Department of Computer Science and Engineering; (2) Department of Aquaculture; (3) Institute of Bioscience and Biotechnology, National Taiwan Ocean University, Keelung, Taiwan, Republic of China. Abstract: Simple sequence repeats (SSRs) play important roles in gene regulation and genome evolution. Although there exist several online resources for SSR mining, most of them only extract general SSR patterns without providing functional information. Here, an online search tool, CG-SSR (Comparative Genomics SSR discovery), has been developed for discovering potential functional SSRs from vertebrate genomes through cross-species comparison. In addition to revealing SSR candidates in conserved regions among various species, it also combines accurate coordinate and functional genomics information. CG-SSR is the first comprehensive and efficient online tool for conserved SSR discovery. Keywords: microsatellites, genome, comparative genomics, functional SSR, gene ontology, conserved region
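    The "general SSR pattern" extraction that the abstract contrasts with CG-SSR can be sketched with a regular expression. This covers only basic repeat detection, not the cross-species conservation or functional annotation that CG-SSR adds; the function name and the thresholds (unit length 1-6, at least three copies) are assumptions chosen for illustration.

```python
import re


def find_ssrs(sequence, min_unit=1, max_unit=6, min_repeats=3):
    """Find perfect tandem repeats in a DNA sequence.

    Returns a list of (start, unit, copies) tuples with 0-based starts.
    The lazy quantifier makes the regex prefer the shortest repeat unit,
    so "ATATATAT" is reported as (AT)x4 rather than (ATAT)x2.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    results = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        results.append((match.start(), unit, copies))
    return results


print(find_ssrs("GGCATATATATCCG"))
```

    A conservation-aware tool would then compare the flanking coordinates of each hit across aligned genomes, a step this sketch leaves out.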

  14. De-novo discovery of differentially abundant transcription factor binding sites including their positional preference.

    Science.gov (United States)

    Keilwagen, Jens; Grau, Jan; Paponov, Ivan A; Posch, Stefan; Strickert, Marc; Grosse, Ivo

    2011-02-10

    Transcription factors are a main component of gene regulation as they activate or repress gene expression by binding to specific binding sites in promoters. The de-novo discovery of transcription factor binding sites in target regions obtained by wet-lab experiments is a challenging problem in computational biology, which has not been fully solved yet. Here, we present a de-novo motif discovery tool called Dispom for finding differentially abundant transcription factor binding sites that models existing positional preferences of binding sites and adjusts the length of the motif in the learning process. Evaluating Dispom, we find that its prediction performance is superior to existing tools for de-novo motif discovery for 18 benchmark data sets with planted binding sites, and for a metazoan compendium based on experimental data from micro-array, ChIP-chip, ChIP-DSL, and DamID as well as Gene Ontology data. Finally, we apply Dispom to find binding sites differentially abundant in promoters of auxin-responsive genes extracted from Arabidopsis thaliana microarray data, and we find a motif that can be interpreted as a refined auxin responsive element predominately positioned in the 250-bp region upstream of the transcription start site. Using an independent data set of auxin-responsive genes, we find in genome-wide predictions that the refined motif is more specific for auxin-responsive genes than the canonical auxin-responsive element. In general, Dispom can be used to find differentially abundant motifs in sequences of any origin. However, the positional distribution learned by Dispom is especially beneficial if all sequences are aligned to some anchor point like the transcription start site in case of promoter sequences. We demonstrate that the combination of searching for differentially abundant motifs and inferring a position distribution from the data is beneficial for de-novo motif discovery. Hence, we make the tool freely available as a component of the open

  15. Evolutionary signatures amongst disease genes permit novel methods for gene prioritization and construction of informative gene-based networks.

    Directory of Open Access Journals (Sweden)

    Nolan Priedigkeit

    2015-02-01

    Full Text Available Genes involved in the same function tend to have similar evolutionary histories, in that their rates of evolution covary over time. This coevolutionary signature, termed Evolutionary Rate Covariation (ERC), is calculated using only gene sequences from a set of closely related species and has demonstrated potential as a computational tool for inferring functional relationships between genes. To further define applications of ERC, we first established that roughly 55% of genetic diseases possess an ERC signature between their contributing genes. At a false discovery rate of 5% we report 40 such diseases including cancers, developmental disorders and mitochondrial diseases. Given these coevolutionary signatures between disease genes, we then assessed ERC's ability to prioritize known disease genes out of a list of unrelated candidates. We found that in the presence of an ERC signature, the true disease gene is effectively prioritized to the top 6% of candidates on average. We then apply this strategy to a melanoma-associated region on chromosome 1 and identify MCL1 as a potential causative gene. Furthermore, to gain global insight into disease mechanisms, we used ERC to predict molecular connections between 310 nominally distinct diseases. The resulting "disease map" network associates several diseases with related pathogenic mechanisms and unveils many novel relationships between clinically distinct diseases, such as between Hirschsprung's disease and melanoma. Taken together, these results demonstrate the utility of molecular evolution as a gene discovery platform and show that evolutionary signatures can be used to build informative gene-based networks.
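    The notion that two genes' evolutionary rates "covary" can be illustrated as a correlation over the branches of a shared species tree. The abstract does not spell out the computation, so this sketch assumes a plain Pearson correlation between per-branch relative rates; the function name and the rate values are hypothetical.

```python
from math import sqrt


def rate_covariation(rates_a, rates_b):
    """Pearson correlation between two genes' per-branch evolutionary rates.

    Each list holds one rate per branch of the same species tree, in the
    same branch order. This is an assumed simplification of ERC.
    """
    if len(rates_a) != len(rates_b) or len(rates_a) < 2:
        raise ValueError("need two equal-length lists with at least 2 branches")
    mean_a = sum(rates_a) / len(rates_a)
    mean_b = sum(rates_b) / len(rates_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(rates_a, rates_b))
    var_a = sum((a - mean_a) ** 2 for a in rates_a)
    var_b = sum((b - mean_b) ** 2 for b in rates_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("rates must vary across branches")
    return cov / sqrt(var_a * var_b)


# Invented relative rates on four branches for two genes.
gene_x = [1.0, 2.0, 3.0, 4.0]
gene_y = [2.0, 4.0, 6.0, 8.0]
print(rate_covariation(gene_x, gene_y))
```

    Candidate genes in a region could then be ranked by their mean covariation with known disease genes, which is the spirit of the prioritization described above.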

  16. Bayesian centroid estimation for motif discovery.

    Science.gov (United States)

    Carvalho, Luis

    2013-01-01

    Biological sequences may contain patterns that signal important biomolecular functions; a classical example is regulation of gene expression by transcription factors that bind to specific patterns in genomic promoter regions. In motif discovery we are given a set of sequences that share a common motif and aim to identify not only the motif composition, but also the binding sites in each sequence of the set. We propose a new centroid estimator that arises from a refined and meaningful loss function for binding site inference. We discuss the main advantages of centroid estimation for motif discovery, including computational convenience, and how its principled derivation offers further insights about the posterior distribution of binding site configurations. We also illustrate, using simulated and real datasets, that the centroid estimator can differ from the traditional maximum a posteriori or maximum likelihood estimators.
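    The claim that a centroid estimator can differ from the maximum a posteriori (MAP) estimator is easy to see on a toy posterior. The sketch below uses plain Hamming loss, under which the centroid keeps each position whose marginal posterior probability exceeds 0.5; the paper proposes a refined loss function, so this is a stand-in, and the posterior values are invented.

```python
def map_estimate(posterior):
    """Return the single most probable binding-site configuration."""
    return max(posterior, key=posterior.get)


def centroid_estimate(posterior):
    """Centroid under Hamming loss: threshold each marginal at 0.5.

    `posterior` maps configurations (tuples of 0/1, one entry per
    candidate position) to probabilities that sum to 1.
    """
    n_positions = len(next(iter(posterior)))
    marginals = [
        sum(prob for config, prob in posterior.items() if config[i])
        for i in range(n_positions)
    ]
    return tuple(1 if m > 0.5 else 0 for m in marginals)


# Invented posterior over three candidate binding positions.
toy_posterior = {
    (1, 0, 0): 0.4,
    (0, 1, 1): 0.3,
    (0, 1, 0): 0.3,
}
print(map_estimate(toy_posterior))
print(centroid_estimate(toy_posterior))
```

    Here the MAP estimate selects the first position because its configuration has the largest single probability, while the centroid selects the second position because most of the posterior mass agrees on it.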

  17. Bayesian centroid estimation for motif discovery.

    Directory of Open Access Journals (Sweden)

    Luis Carvalho

    Full Text Available Biological sequences may contain patterns that signal important biomolecular functions; a classical example is regulation of gene expression by transcription factors that bind to specific patterns in genomic promoter regions. In motif discovery we are given a set of sequences that share a common motif and aim to identify not only the motif composition, but also the binding sites in each sequence of the set. We propose a new centroid estimator that arises from a refined and meaningful loss function for binding site inference. We discuss the main advantages of centroid estimation for motif discovery, including computational convenience, and how its principled derivation offers further insights about the posterior distribution of binding site configurations. We also illustrate, using simulated and real datasets, that the centroid estimator can differ from the traditional maximum a posteriori or maximum likelihood estimators.

  18. Venomics-Accelerated Cone Snail Venom Peptide Discovery

    Science.gov (United States)

    Himaya, S. W. A.

    2018-01-01

    Cone snail venoms are considered a treasure trove of bioactive peptides. Despite over 800 species of cone snails being known, each producing over 1000 venom peptides, only about 150 unique venom peptides are structurally and functionally characterized. To overcome the limitations of the traditional low-throughput bio-discovery approaches, multi-omics systems approaches have been introduced to accelerate venom peptide discovery and characterisation. This “venomic” approach is starting to unravel the full complexity of cone snail venoms and to provide new insights into their biology and evolution. The main challenge for venomics is the effective integration of transcriptomics, proteomics, and pharmacological data and the efficient analysis of big datasets. Novel database search tools and visualisation techniques are now being introduced that facilitate data exploration, with ongoing advances in related omics fields being expected to further enhance venomics studies. Despite these challenges and future opportunities, cone snail venomics has already exponentially expanded the number of novel venom peptide sequences identified from the species investigated, although most novel conotoxins remain to be pharmacologically characterised. Therefore, efficient high-throughput peptide production systems and/or banks of miniaturized discovery assays are required to overcome this bottleneck and thus enhance cone snail venom bioprospecting and accelerate the identification of novel drug leads. PMID:29522462

  19. Systems Pharmacology in Small Molecular Drug Discovery

    Directory of Open Access Journals (Sweden)

    Wei Zhou

    2016-02-01

    Full Text Available Drug discovery is a risky, costly and time-consuming process depending on multidisciplinary methods to create safe and effective medicines. Although considerable progress has been made by high-throughput screening methods in drug design, the cost of developing contemporary approved drugs did not match that in the past decade. The major reason is the late-stage clinical failures in Phases II and III because of the complicated interactions between drug-specific, human body and environmental aspects affecting the safety and efficacy of a drug. There is a growing hope that systems-level consideration may provide a new perspective to overcome such current difficulties of drug discovery and development. The systems pharmacology method emerged as a holistic approach and has attracted more and more attention recently. The applications of systems pharmacology not only provide the pharmacodynamic evaluation and target identification of drug molecules, but also give a systems-level understanding of the interaction mechanisms between drugs and complex diseases. Therefore, the present review is an attempt to introduce how holistic systems pharmacology that integrates in silico ADME/T (i.e., absorption, distribution, metabolism, excretion and toxicity), target fishing and network pharmacology facilitates the discovery of small molecular drugs at the system level.

  20. Venomics-Accelerated Cone Snail Venom Peptide Discovery

    Directory of Open Access Journals (Sweden)

    S. W. A. Himaya

    2018-03-01

    Full Text Available Cone snail venoms are considered a treasure trove of bioactive peptides. Despite over 800 species of cone snails being known, each producing over 1000 venom peptides, only about 150 unique venom peptides are structurally and functionally characterized. To overcome the limitations of the traditional low-throughput bio-discovery approaches, multi-omics systems approaches have been introduced to accelerate venom peptide discovery and characterisation. This “venomic” approach is starting to unravel the full complexity of cone snail venoms and to provide new insights into their biology and evolution. The main challenge for venomics is the effective integration of transcriptomics, proteomics, and pharmacological data and the efficient analysis of big datasets. Novel database search tools and visualisation techniques are now being introduced that facilitate data exploration, with ongoing advances in related omics fields being expected to further enhance venomics studies. Despite these challenges and future opportunities, cone snail venomics has already exponentially expanded the number of novel venom peptide sequences identified from the species investigated, although most novel conotoxins remain to be pharmacologically characterised. Therefore, efficient high-throughput peptide production systems and/or banks of miniaturized discovery assays are required to overcome this bottleneck and thus enhance cone snail venom bioprospecting and accelerate the identification of novel drug leads.

  1. A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus

    Directory of Open Access Journals (Sweden)

    Alamar Santiago

    2009-09-01

    Full Text Available Abstract Background Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. Conclusion The new

  2. A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus

    Science.gov (United States)

    Marques, M Carmen; Alonso-Cantabrana, Hugo; Forment, Javier; Arribas, Raquel; Alamar, Santiago; Conejero, Vicente; Perez-Amador, Miguel A

    2009-01-01

    Background Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. Conclusion The new EST collection denotes an

  3. Knowledge-Based Topic Model for Unsupervised Object Discovery and Localization.

    Science.gov (United States)

    Niu, Zhenxing; Hua, Gang; Wang, Le; Gao, Xinbo

    Unsupervised object discovery and localization aims to discover the dominant object classes in a given image collection and localize all object instances without any supervision. Previous work has attempted to tackle this problem with vanilla topic models, such as latent Dirichlet allocation (LDA). However, in those methods no prior knowledge for the given image collection is exploited to facilitate object discovery. On the other hand, the topic models used in those methods suffer from the topic coherence issue: some inferred topics do not have a clear meaning, which limits the final performance of object discovery. In this paper, prior knowledge in terms of the so-called must-links is exploited from Web images on the Internet. Furthermore, a novel knowledge-based topic model, called LDA with mixture of Dirichlet trees, is proposed to incorporate the must-links into topic modeling for object discovery. In particular, to better deal with the polysemy phenomenon of visual words, the must-link is re-defined so that one must-link only constrains one or some topic(s) instead of all topics, which leads to significantly improved topic coherence. Moreover, the must-links are built and grouped with respect to specific object classes, thus the must-links in our approach are semantic-specific, which allows discriminative prior knowledge from Web images to be exploited more efficiently. Extensive experiments validated the efficiency of our proposed approach on several data sets. It is shown that our method significantly improves topic coherence and outperforms the unsupervised methods for object discovery and localization. In addition, compared with discriminative methods, the naturally existing object classes in the given image collection can be subtly discovered, which makes our approach well suited for realistic applications of unsupervised object discovery.

  4. Gene discovery and molecular marker development, based on high-throughput transcript sequencing of Paspalum dilatatum Poir.

    Directory of Open Access Journals (Sweden)

    Andrea Giordano

    Full Text Available BACKGROUND: Paspalum dilatatum Poir. (common name dallisgrass) is a native grass species of South America, with special relevance to dairy and red meat production. P. dilatatum exhibits higher forage quality than other C4 forage grasses and is tolerant to frost and water stress. This species is predominantly cultivated in an apomictic monoculture, with an inherent high risk that biotic and abiotic stresses could potentially devastate productivity. Therefore, advanced breeding strategies that characterise and use available genetic diversity, or assess germplasm collections effectively, are required to deliver advanced cultivars for production systems. However, there are limited genomic resources available for this forage grass species. RESULTS: Transcriptome sequencing using second-generation sequencing platforms has been employed using pooled RNA from different tissues (stems, roots, leaves and inflorescences) at the final reproductive stage of P. dilatatum cultivar Primo. A total of 324,695 sequence reads were obtained, corresponding to c. 102 Mbp. The sequences were assembled, generating 20,169 contigs of a combined length of 9,336,138 nucleotides. The contigs were BLAST analysed against the fully sequenced grass species of Oryza sativa subsp. japonica, Brachypodium distachyon, the closely related Sorghum bicolor and foxtail millet (Setaria italica) genomes as well as against the UniRef 90 protein database, allowing a comprehensive gene ontology analysis to be performed. The contigs generated from the transcript sequencing were also analysed for the presence of simple sequence repeats (SSRs). A total of 2,339 SSR motifs were identified within 1,989 contigs and corresponding primer pairs were designed. Empirical validation of a cohort of 96 SSRs was performed, with 34% being polymorphic between sexual and apomictic biotypes. CONCLUSIONS: The development of genetic and genomic resources for P. dilatatum will contribute to gene discovery and expression
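    The SSR screening step mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `find_ssrs`, the motif length range (2 to 6 bases) and the minimum of five tandem copies are illustrative assumptions, and only perfect repeats are detected.

```python
import re

def find_ssrs(seq, min_repeats=5, min_unit=2, max_unit=6):
    """Return (start, unit, copies) for each perfect tandem repeat in seq.

    Thresholds are illustrative assumptions, not values from the abstract.
    """
    seq = seq.upper()
    # Lazy unit match so the shortest repeating motif is preferred,
    # followed by at least (min_repeats - 1) further copies of that motif.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs reported as longer motifs
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits

contig = "TTG" + "AC" * 6 + "GGT"
print(find_ssrs(contig))  # → [(3, 'AC', 6)]
```

    In practice, primer design around each hit would require flanking sequence on both sides, which this sketch does not check.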

  5. A constrained polynomial regression procedure for estimating the local False Discovery Rate

    Directory of Open Access Journals (Sweden)

    Broët Philippe

    2007-06-01

    Full Text Available Abstract Background In the context of genomic association studies, for which a large number of statistical tests are performed simultaneously, the local False Discovery Rate (lFDR), which quantifies the evidence of a specific gene association with a clinical or biological variable of interest, is a relevant criterion for taking into account the multiple testing problem. The lFDR not only allows an inference to be made for each gene through its specific value, but also an estimate of Benjamini-Hochberg's False Discovery Rate (FDR) for subsets of genes. Results In the framework of estimating procedures without any distributional assumption under the alternative hypothesis, a new and efficient procedure for estimating the lFDR is described. The results of a simulation study indicated good performances for the proposed estimator in comparison to four published ones. The five different procedures were applied to real datasets. Conclusion A novel and efficient procedure for estimating lFDR was developed and evaluated.
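    For orientation, the Benjamini-Hochberg FDR that this abstract refers to can be computed as adjusted p-values with the short Python sketch below. This shows only the classical tail-area procedure; it is not the constrained polynomial regression lFDR estimator proposed in the paper, whose details the abstract does not give. The function name and example p-values are illustrative.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values for four genes.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

    A gene is declared significant at FDR level q when its adjusted value is at most q.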

  6. Mining disease genes using integrated protein-protein interaction and gene-gene co-regulation information.

    Science.gov (United States)

    Li, Jin; Wang, Limei; Guo, Maozu; Zhang, Ruijie; Dai, Qiguo; Liu, Xiaoyan; Wang, Chunyu; Teng, Zhixia; Xuan, Ping; Zhang, Mingming

    2015-01-01

    In humans, despite the rapid increase in disease-associated gene discovery, a large proportion of disease-associated genes are still unknown. Many network-based approaches have been used to prioritize disease genes. Many networks, such as the protein-protein interaction (PPI), KEGG, and gene co-expression networks, have been used. Expression quantitative trait loci (eQTLs) have been successfully applied for the determination of genes associated with several diseases. In this study, we constructed an eQTL-based gene-gene co-regulation network (GGCRN) and used it to mine for disease genes. We adopted the random walk with restart (RWR) algorithm to mine for genes associated with Alzheimer disease. Compared to the Human Protein Reference Database (HPRD) PPI network alone, the integrated HPRD PPI and GGCRN networks provided faster convergence and revealed new disease-related genes. Therefore, using the RWR algorithm for integrated PPI and GGCRN is an effective method for disease-associated gene mining.
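    The random walk with restart (RWR) algorithm named in this abstract can be sketched in a few lines of Python. This is a generic textbook formulation, p(t+1) = (1 - r) W p(t) + r p(0) with a column-normalized adjacency matrix W, not the authors' implementation; the restart probability of 0.3, the tolerance and the toy network are assumptions made for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for each node.

    adj   : square adjacency matrix (list of lists) of an undirected network
    seeds : indices of known disease genes used as restart nodes
    """
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    # Column-normalize so each column is a transition distribution.
    w = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(w[i][j] * p[j] for j in range(n))
               + restart * p0[i] for i in range(n)]
        delta = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if delta < tol:  # L1 change between iterations is negligible
            break
    return p

# Toy path network 0-1-2-3 with gene 0 as the only seed.
network = [[0, 1, 0, 0],
           [1, 0, 1, 0],
           [0, 1, 0, 1],
           [0, 0, 1, 0]]
scores = random_walk_with_restart(network, seeds=[0])
```

    Candidate genes are then ranked by their score; in an integrated setting, `adj` would hold the combined PPI and co-regulation edges.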

  7. Integrative subtype discovery in glioblastoma using iCluster.

    Directory of Open Access Journals (Sweden)

    Ronglai Shen

    Full Text Available Large-scale cancer genome projects, such as the Cancer Genome Atlas (TCGA) project, are comprehensive molecular characterization efforts to accelerate our understanding of cancer biology and the discovery of new therapeutic targets. The accumulating wealth of multidimensional data provides a new paradigm for important research problems including cancer subtype discovery. The current standard approach relies on separate clustering analyses followed by manual integration. Results can be highly data type dependent, restricting the ability to discover new insights from multidimensional data. In this study, we present an integrative subtype analysis of the TCGA glioblastoma (GBM) data set. Our analysis revealed new insights through integrated subtype characterization. We found three distinct integrated tumor subtypes. Subtype 1 lacks the classical GBM events of chr 7 gain and chr 10 loss. This subclass is enriched for the G-CIMP phenotype and shows hypermethylation of genes involved in brain development and neuronal differentiation. The tumors in this subclass display a Proneural expression profile. Subtype 2 is characterized by a near complete association with EGFR amplification, overrepresentation of promoter methylation of homeobox and G-protein signaling genes, and a Classical expression profile. Subtype 3 is characterized by NF1 and PTEN alterations and exhibits a Mesenchymal-like expression profile. The data analysis workflow we propose provides a unified and computationally scalable framework to harness the full potential of large-scale integrated cancer genomic data for integrative subtype discovery.

  8. User needs analysis and usability assessment of DataMed - a biomedical data discovery index.

    Science.gov (United States)

    Dixit, Ram; Rogith, Deevakar; Narayana, Vidya; Salimi, Mandana; Gururaj, Anupama; Ohno-Machado, Lucila; Xu, Hua; Johnson, Todd R

    2017-11-30

    To present user needs and usability evaluations of DataMed, a Data Discovery Index (DDI) that allows searching for biomedical data from multiple sources. We conducted 2 phases of user studies. Phase 1 was a user needs analysis conducted before the development of DataMed, consisting of interviews with researchers. Phase 2 involved iterative usability evaluations of DataMed prototypes. We analyzed data qualitatively to document researchers' information and user interface needs. Biomedical researchers' information needs in data discovery are complex, multidimensional, and shaped by their context, domain knowledge, and technical experience. User needs analyses validate the need for a DDI, while usability evaluations of DataMed show that even though aggregating metadata into a common search engine and applying traditional information retrieval tools are promising first steps, there remain challenges for DataMed due to incomplete metadata and the complexity of data discovery. Biomedical data poses distinct problems for search when compared to websites or publications. Making data available is not enough to facilitate biomedical data discovery: new retrieval techniques and user interfaces are necessary for dataset exploration. Consistent, complete, and high-quality metadata are vital to enable this process. While available data and researchers' information needs are complex and heterogeneous, a successful DDI must meet those needs and fit into the processes of biomedical researchers. Research directions include formalizing researchers' information needs, standardizing overviews of data to facilitate relevance judgments, implementing user interfaces for concept-based searching, and developing evaluation methods for open-ended discovery systems such as DDIs. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  9. A high-density transcript linkage map with 1,845 expressed genes positioned by microarray-based Single Feature Polymorphisms (SFP) in Eucalyptus

    Science.gov (United States)

    2011-01-01

    Background Technological advances are progressively increasing the application of genomics to a wider array of economically and ecologically important species. High-density maps enriched for transcribed genes facilitate the discovery of connections between genes and phenotypes. We report the construction of a high-density linkage map of expressed genes for the heterozygous genome of Eucalyptus using Single Feature Polymorphism (SFP) markers. Results SFP discovery and mapping was achieved using pseudo-testcross screening and selective mapping to simultaneously optimize linkage mapping and microarray costs. SFP genotyping was carried out by hybridizing complementary RNA prepared from the xylem of 4.5-year-old trees to an SFP array containing 103,000 25-mer oligonucleotide probes representing 20,726 unigenes derived from a modest-size collection of expressed sequence tags. An SFP-mapping microarray with 43,777 selected candidate SFP probes representing 15,698 genes was subsequently designed and used to genotype SFPs in a larger subset of the segregating population drawn by selective mapping. A total of 1,845 genes were mapped, with 884 of them ordered with high likelihood support on a framework map anchored to 180 microsatellites with an average density of 1.2 cM. Using more probes per unigene increased two-fold the likelihood of detecting segregating SFPs, eventually resulting in more genes mapped. In silico validation showed that 87% of the SFPs map to the expected location on the 4.5X draft sequence of the Eucalyptus grandis genome. Conclusions The Eucalyptus 1,845 gene map is the most highly enriched map for transcriptional information for any forest tree species to date. It represents a major improvement on the number of genes previously positioned on Eucalyptus maps and provides an initial glimpse at the gene space for this global tree genome. A general protocol is proposed to build high-density transcript linkage maps in less characterized plant species by SFP genotyping

  10. Performance Evaluation of a Cluster-Based Service Discovery Protocol for Heterogeneous Wireless Sensor Networks

    NARCIS (Netherlands)

    Marin Perianu, Raluca; Scholten, Johan; Havinga, Paul J.M.; Hartel, Pieter H.

    2006-01-01

    Abstract—This paper evaluates the performance in terms of resource consumption of a service discovery protocol proposed for heterogeneous Wireless Sensor Networks (WSNs). The protocol is based on a clustering structure, which facilitates the construction of a distributed directory. Nodes with higher

  11. Systems Biology Modeling of the Radiation Sensitivity Network: A Biomarker Discovery Platform

    International Nuclear Information System (INIS)

    Eschrich, Steven; Zhang Hongling; Zhao Haiyan; Boulware, David; Lee, Ji-Hyun; Bloom, Gregory; Torres-Roca, Javier F.

    2009-01-01

    Purpose: The discovery of effective biomarkers is a fundamental goal of molecular medicine. Developing a systems-biology understanding of radiosensitivity can enhance our ability to identify radiation-specific biomarkers. Methods and Materials: Radiosensitivity, as represented by the survival fraction at 2 Gy, was modeled in 48 human cancer cell lines. We applied a linear regression algorithm that integrates gene expression with biological variables, including ras status (mut/wt), tissue of origin and p53 status (mut/wt). Results: The biomarker discovery platform is a network representation of the top 500 genes identified by linear regression analysis. This network was reduced to a 10-hub network that includes c-Jun, HDAC1, RELA (p65 subunit of NFKB), PKC-beta, SUMO-1, c-Abl, STAT1, AR, CDK1, and IRF1. Nine targets associated with radiosensitization drugs are linked to the network, demonstrating clinical relevance. Furthermore, the model identified four significant radiosensitivity clusters of terms and genes. Ras was a dominant variable in the analysis, as was the tissue of origin, and their interaction with gene expression but not p53. Overrepresented biological pathways differed between clusters but included DNA repair, cell cycle, apoptosis, and metabolism. The c-Jun network hub was validated using a knockdown approach in 8 human cell lines representing lung, colon, and breast cancers. Conclusion: We have developed a novel radiation-biomarker discovery platform using a systems biology modeling approach. We believe this platform will play a central role in the integration of biology into clinical radiation oncology practice.

  12. Volatility Discovery

    DEFF Research Database (Denmark)

    Dias, Gustavo Fruet; Scherrer, Cristina; Papailias, Fotis

    The price discovery literature investigates how homogenous securities traded on different markets incorporate information into prices. We take this literature one step further and investigate how these markets contribute to stochastic volatility (volatility discovery). We formally show...... that the realized measures from homogenous securities share a fractional stochastic trend, which is a combination of the price and volatility discovery measures. Furthermore, we show that volatility discovery is associated with the way that market participants process information arrival (market sensitivity......). Finally, we compute volatility discovery for 30 actively traded stocks in the U.S. and report that Nyse and Arca dominate Nasdaq....

  13. Discovery of dominant and dormant genes from expression data using a novel generalization of SNR for multi-class problems

    Directory of Open Access Journals (Sweden)

    Chung I-Fang

    2008-10-01

    Full Text Available Abstract Background The Signal-to-Noise-Ratio (SNR) is often used for identification of biomarkers for two-class problems and no formal and useful generalization of SNR is available for multiclass problems. We propose innovative generalizations of SNR for multiclass cancer discrimination through introduction of two indices, Gene Dominant Index and Gene Dormant Index (GDIs). These two indices lead to the concepts of dominant and dormant genes with biological significance. We use these indices to develop methodologies for discovery of dominant and dormant biomarkers with interesting biological significance. The dominancy and dormancy of the identified biomarkers and their excellent discriminating power are also demonstrated pictorially using the scatterplot of individual gene and 2-D Sammon's projection of the selected set of genes. Using information from the literature we have shown that the GDI based method can identify dominant and dormant genes that play significant roles in cancer biology. These biomarkers are also used to design diagnostic prediction systems. Results and discussion To evaluate the effectiveness of the GDIs, we have used four multiclass cancer data sets (Small Round Blue Cell Tumors, Leukemia, Central Nervous System Tumors, and Lung Cancer). For each data set we demonstrate that the new indices can find biologically meaningful genes that can act as biomarkers. We then use six machine learning tools: Nearest Neighbor Classifier (NNC), Nearest Mean Classifier (NMC), Support Vector Machine (SVM) classifier with linear kernel, and SVM classifier with Gaussian kernel, where both SVMs are used in conjunction with one-vs-all (OVA) and one-vs-one (OVO) strategies. We found GDIs to be very effective in identifying biomarkers with strong class specific signatures. With all six tools and for all data sets we could achieve better or comparable prediction accuracies usually with fewer marker genes than results reported in the literature using the
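    As background for this abstract, the two-class SNR it generalizes is commonly written as the difference of class means divided by the sum of class standard deviations. The Python sketch below assumes that common form and uses sample standard deviations; the abstract does not define the multiclass Gene Dominant and Gene Dormant indices, so they are not reproduced here. The function name and expression values are hypothetical.

```python
from statistics import mean, stdev

def two_class_snr(class_a, class_b):
    """Signal-to-noise ratio of one gene between two sample classes.

    Assumes the common form (mean_a - mean_b) / (sd_a + sd_b); each class
    needs at least two values and a nonzero combined spread.
    """
    return (mean(class_a) - mean(class_b)) / (stdev(class_a) + stdev(class_b))

# Hypothetical expression values of one gene in two tumor classes.
print(two_class_snr([10, 12, 14], [4, 5, 6]))  # 7 / 3, about 2.33
```

    Genes with a large absolute SNR separate the two classes well and are candidate markers.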

  14. GENOME-ENABLED DISCOVERY OF CARBON SEQUESTRATION GENES IN POPLAR

    Energy Technology Data Exchange (ETDEWEB)

    DAVIS J M

    2007-10-11

    Plants utilize carbon by partitioning the reduced carbon obtained through photosynthesis into different compartments and into different chemistries within a cell and subsequently allocating such carbon to sink tissues throughout the plant. Since the phytohormones auxin and cytokinin are known to influence sink strength in tissues such as roots (Skoog & Miller 1957, Nordstrom et al. 2004), we hypothesized that altering the expression of genes that regulate auxin-mediated (e.g., AUX/IAA or ARF transcription factors) or cytokinin-mediated (e.g., RR transcription factors) control of root growth and development would impact carbon allocation and partitioning belowground (Fig. 1 - Renewal Proposal). Specifically, the ARF, AUX/IAA and RR transcription factor gene families mediate the effects of the growth regulators auxin and cytokinin on cell expansion, cell division and differentiation into root primordia. Invertases (IVR), whose transcript abundance is enhanced by both auxin and cytokinin, are critical components of carbon movement and therefore of carbon allocation. Thus, we initiated comparative genomic studies to identify the AUX/IAA, ARF, RR and IVR gene families in the Populus genome that could impact carbon allocation and partitioning. Bioinformatics searches using Arabidopsis gene sequences as queries identified regions with high degrees of sequence similarities in the Populus genome. These Populus sequences formed the basis of our transgenic experiments. Transgenic modification of gene expression involving members of these gene families was hypothesized to have profound effects on carbon allocation and partitioning.

  15. Predictive Power Estimation Algorithm (PPEA)--a new algorithm to reduce overfitting for genomic biomarker discovery.

    Directory of Open Access Journals (Sweden)

    Jiangang Liu

    Full Text Available Toxicogenomics promises to aid in predicting adverse effects, understanding the mechanisms of drug action or toxicity, and uncovering unexpected or secondary pharmacology. However, modeling adverse effects using high dimensional and high noise genomic data is prone to over-fitting. Models constructed from such data sets often consist of a large number of genes with no obvious functional relevance to the biological effect the model intends to predict, which can make it challenging to interpret the modeling results. To address these issues, we developed a novel algorithm, Predictive Power Estimation Algorithm (PPEA), which estimates the predictive power of each individual transcript through an iterative two-way bootstrapping procedure. By repeatedly enforcing that the sample number is larger than the transcript number in each iteration of modeling and testing, PPEA reduces the potential risk of overfitting. We show with three different case studies that: (1) PPEA can quickly derive a reliable rank order of predictive power of individual transcripts in a relatively small number of iterations, (2) the top ranked transcripts tend to be functionally related to the phenotype they are intended to predict, (3) using only the most predictive top ranked transcripts greatly facilitates development of multiplex assays such as qRT-PCR as a biomarker, and (4) more importantly, we were able to demonstrate that a small number of genes identified from the top-ranked transcripts are highly predictive of phenotype as their expression changes distinguished adverse from nonadverse effects of compounds in completely independent tests. Thus, we believe that the PPEA model effectively addresses the over-fitting problem and can be used to facilitate genomic biomarker discovery for predictive toxicology and drug responses.

  16. Next-Generation Sequencing Approaches in Genome-Wide Discovery of Single Nucleotide Polymorphism Markers Associated with Pungency and Disease Resistance in Pepper.

    Science.gov (United States)

    Manivannan, Abinaya; Kim, Jin-Hee; Yang, Eun-Young; Ahn, Yul-Kyun; Lee, Eun-Su; Choi, Sena; Kim, Do-Sun

    2018-01-01

    Pepper is an economically important horticultural plant that has been widely used for its pungency and spicy taste in cuisines worldwide. Therefore, the domestication of pepper has been carried out since antiquity. To meet the growing demand for pepper with high quality, organoleptic properties, nutraceutical content, and disease tolerance, genomics-assisted breeding techniques can be incorporated to develop novel pepper varieties with desired traits. The application of next-generation sequencing (NGS) approaches has reformed plant breeding technology, especially in the area of molecular marker-assisted breeding. The availability of genomic information aids in the deeper understanding of several molecular mechanisms behind vital physiological processes. In addition, NGS methods facilitate the genome-wide discovery of DNA-based markers linked to key genes involved in important biological phenomena. Among the molecular markers, single nucleotide polymorphisms (SNPs) offer various benefits in comparison with other existing DNA-based markers. The present review concentrates on the impact of NGS approaches on the discovery of useful SNP markers associated with pungency and disease resistance in pepper. The information provided here can be used to improve pepper breeding in the future.

  17. Next-Generation Sequencing Approaches in Genome-Wide Discovery of Single Nucleotide Polymorphism Markers Associated with Pungency and Disease Resistance in Pepper

    Directory of Open Access Journals (Sweden)

    Abinaya Manivannan

    2018-01-01

    Full Text Available Pepper is an economically important horticultural plant that has been widely used for its pungency and spicy taste in cuisines worldwide. Therefore, the domestication of pepper has been carried out since antiquity. To meet the growing demand for pepper with high quality, organoleptic properties, nutraceutical content, and disease tolerance, genomics-assisted breeding techniques can be incorporated to develop novel pepper varieties with desired traits. The application of next-generation sequencing (NGS) approaches has reformed plant breeding technology, especially in the area of molecular marker-assisted breeding. The availability of genomic information aids in the deeper understanding of several molecular mechanisms behind vital physiological processes. In addition, NGS methods facilitate the genome-wide discovery of DNA-based markers linked to key genes involved in important biological phenomena. Among the molecular markers, single nucleotide polymorphisms (SNPs) offer various benefits in comparison with other existing DNA-based markers. The present review concentrates on the impact of NGS approaches on the discovery of useful SNP markers associated with pungency and disease resistance in pepper. The information provided here can be used to improve pepper breeding in the future.

  18. Development of a quantitative competitive reverse transcriptase polymerase chain reaction for the quantification of growth hormone gene expression in pigs

    Directory of Open Access Journals (Sweden)

    Maurício Machaim Franco

    2003-01-01

    Full Text Available After the advent of the genome projects, followed by the discovery of DNA polymorphisms, basic understanding of gene expression is the next focus to explain the association between polymorphisms and the level of gene expression, as well as to demonstrate the interaction among genes. Among the various techniques for the investigation of transcriptional profiling involving patterns of gene expression, quantitative PCR is the simplest analytical laboratory technique. The objective of this work was to analyze two strategies of a competitive PCR technique for the quantification of the pig growth hormone (GH) gene expression. A pair of primers was designed targeting exons 3 and 5, and two competitive PCR strategies were performed, one utilizing a specific amplicon as a competitor, and the other utilizing a low-stringency PCR amplicon as a competitor. The latter strategy proved to be easier and more efficient, offering an accessible tool that can be used in any kind of competitive reaction, facilitating the study of gene expression patterns for both genetics and diagnostics of infectious diseases.

  19. Gene discovery in EST sequences from the wheat leaf rust fungus Puccinia triticina sexual spores, asexual spores and haustoria, compared to other rust and corn smut fungi

    Science.gov (United States)

    2011-01-01

    Background Rust fungi are biotrophic basidiomycete plant pathogens that cause major diseases on plants and trees world-wide, affecting agriculture and forestry. Their biotrophic nature precludes many established molecular genetic manipulations and lines of research. The generation of genomic resources for these microbes is leading to novel insights into biology such as interactions with the hosts and guiding directions for breakthrough research in plant pathology. Results To support gene discovery and gene model verification in the genome of the wheat leaf rust fungus, Puccinia triticina (Pt), we have generated Expressed Sequence Tags (ESTs) by sampling several life cycle stages. We focused on several spore stages and isolated haustorial structures from infected wheat, generating 17,684 ESTs. We produced sequences from both the sexual (pycniospores, aeciospores and teliospores) and asexual (germinated urediniospores) stages of the life cycle. From pycniospores and aeciospores, produced by infecting the alternate host, meadow rue (Thalictrum speciosissimum), 4,869 and 1,292 reads were generated, respectively. We generated 3,703 ESTs from teliospores produced on the senescent primary wheat host. Finally, we generated 6,817 reads from haustoria isolated from infected wheat as well as 1,003 sequences from germinated urediniospores. Along with 25,558 previously generated ESTs, we compiled a database of 13,328 non-redundant sequences (4,506 singlets and 8,822 contigs). Fungal genes were predicted using the EST version of the self-training GeneMarkS algorithm. To refine the EST database, we compared EST sequences by BLASTN to a set of 454 pyrosequencing-generated contigs and Sanger BAC-end sequences derived both from the Pt genome, and to ESTs and genome reads from wheat. A collection of 6,308 fungal genes was identified and compared to sequences of the cereal rusts, Puccinia graminis f. sp. tritici (Pgt) and stripe rust, P. striiformis f. sp. tritici (Pst), and poplar

  20. Gene discovery in EST sequences from the wheat leaf rust fungus Puccinia triticina sexual spores, asexual spores and haustoria, compared to other rust and corn smut fungi

    Directory of Open Access Journals (Sweden)

    Wynhoven Brian

    2011-03-01

    Full Text Available Abstract Background Rust fungi are biotrophic basidiomycete plant pathogens that cause major diseases on plants and trees world-wide, affecting agriculture and forestry. Their biotrophic nature precludes many established molecular genetic manipulations and lines of research. The generation of genomic resources for these microbes is leading to novel insights into biology such as interactions with the hosts and guiding directions for breakthrough research in plant pathology. Results To support gene discovery and gene model verification in the genome of the wheat leaf rust fungus, Puccinia triticina (Pt), we have generated Expressed Sequence Tags (ESTs) by sampling several life cycle stages. We focused on several spore stages and isolated haustorial structures from infected wheat, generating 17,684 ESTs. We produced sequences from both the sexual (pycniospores, aeciospores and teliospores) and asexual (germinated urediniospores) stages of the life cycle. From pycniospores and aeciospores, produced by infecting the alternate host, meadow rue (Thalictrum speciosissimum), 4,869 and 1,292 reads were generated, respectively. We generated 3,703 ESTs from teliospores produced on the senescent primary wheat host. Finally, we generated 6,817 reads from haustoria isolated from infected wheat as well as 1,003 sequences from germinated urediniospores. Along with 25,558 previously generated ESTs, we compiled a database of 13,328 non-redundant sequences (4,506 singlets and 8,822 contigs). Fungal genes were predicted using the EST version of the self-training GeneMarkS algorithm. To refine the EST database, we compared EST sequences by BLASTN to a set of 454 pyrosequencing-generated contigs and Sanger BAC-end sequences derived both from the Pt genome, and to ESTs and genome reads from wheat. A collection of 6,308 fungal genes was identified and compared to sequences of the cereal rusts, Puccinia graminis f. sp. tritici (Pgt) and stripe rust, P. striiformis f. sp

  1. De novo assembly, gene annotation, and marker discovery in stored-product pest Liposcelis entomophila (Enderlein) using transcriptome sequences.

    Directory of Open Access Journals (Sweden)

    Dan-Dan Wei

    Full Text Available BACKGROUND: As a major stored-product pest insect, Liposcelis entomophila has developed high levels of resistance to various insecticides in grain storage systems. However, the molecular mechanisms underlying resistance and environmental stress have not been characterized. To date, there is a lack of genomic information for this species. Therefore, studies aimed at profiling the L. entomophila transcriptome would provide a better understanding of the biological functions at the molecular level. METHODOLOGY/PRINCIPAL FINDINGS: We applied Illumina sequencing technology to sequence the transcriptome of L. entomophila. A total of 54,406,328 clean reads were obtained and de novo assembled into 54,220 unigenes, with an average length of 571 bp. Through a similarity search, 33,404 (61.61%) unigenes were matched to known proteins in the NCBI non-redundant (Nr) protein database. These unigenes were further functionally annotated with gene ontology (GO), cluster of orthologous groups of proteins (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. A large number of genes potentially involved in insecticide resistance were manually curated, including 68 putative cytochrome P450 genes, 37 putative glutathione S-transferase (GST) genes, 19 putative carboxyl/cholinesterase (CCE) genes, and 126 other transcripts containing target site sequences or encoding detoxification genes representing eight types of resistance enzymes. Furthermore, to gain insight into the molecular basis of the L. entomophila response to thermal stresses, 25 heat shock protein (Hsp) genes were identified. In addition, 1,100 SSRs and 57,757 SNPs were detected and 231 pairs of SSR primers were designed for investigating genetic diversity in the future. CONCLUSIONS/SIGNIFICANCE: We developed a comprehensive transcriptomic database for L. entomophila. These sequences and putative molecular markers would further promote our understanding of the molecular mechanisms underlying

  2. DDMGD: the database of text-mined associations between genes methylated in diseases from different species

    KAUST Repository

    Raies, A. B.

    2014-11-14

    Gathering information about associations between methylated genes and diseases is important for disease diagnosis and treatment decisions. Recent advancements in epigenetics research allow for large-scale discoveries of associations of genes methylated in diseases in different species. Searching manually for such information is not easy, as it is scattered across a large number of electronic publications and repositories. Therefore, we developed the DDMGD database (http://www.cbrc.kaust.edu.sa/ddmgd/) to provide a comprehensive repository of information related to genes methylated in diseases that can be found through text mining. DDMGD's scope is not limited to a particular group of genes, diseases or species. Using the text mining system DEMGD that we developed earlier, together with additional post-processing, we extracted associations of genes methylated in different diseases from PubMed Central articles and PubMed abstracts. The accuracy of extracted associations is 82% as estimated on 2500 hand-curated entries. DDMGD provides a user-friendly interface facilitating retrieval of these associations ranked according to confidence scores. Submission of new associations to DDMGD is supported. A comparison of DDMGD with several other databases focused on genes methylated in diseases shows that DDMGD is comprehensive and includes most of the recent information on genes methylated in diseases.

  3. Can biochemistry drive drug discovery beyond simple potency measurements?

    Science.gov (United States)

    Chène, Patrick

    2012-04-01

    Among the fields of expertise required to develop drugs successfully, biochemistry holds a key position in drug discovery at the interface between chemistry, structural biology and cell biology. However, taking the example of protein kinases, it appears that biochemical assays are mostly used in the pharmaceutical industry to measure compound potency and/or selectivity. This limited use of biochemistry is surprising, given that detailed biochemical analyses are commonly used in academia to unravel molecular recognition processes. In this article, I show that biochemistry can provide invaluable information on the dynamics and energetics of compound-target interactions that cannot be obtained on the basis of potency measurements and structural data. Therefore, an extensive use of biochemistry in drug discovery could facilitate the identification and/or development of new drugs. Copyright © 2012 Elsevier Ltd. All rights reserved.

  4. Schizophrenia genomics and proteomics: are we any closer to biomarker discovery?

    Directory of Open Access Journals (Sweden)

    Kramer Alon

    2009-01-01

    The field of proteomics has advanced by leaps and bounds in the last 10 years, particularly in the fields of oncology and cardiovascular medicine. In comparison, neuroproteomics is still playing catch-up, mainly due to the relative complexity of neurological disorders. Schizophrenia is one such disorder, believed to be the result of multiple factors, both genetic and environmental. Affecting over 2 million people in the US alone, it has become a major clinical and public health concern worldwide. This paper gives an update of schizophrenia biomarker research as reviewed by Lakhan in 2006 and a rundown of the progress made during the last two years. Several studies demonstrate the potential of cerebrospinal fluid as a source of neuro-specific biomarkers. Genetic association studies are making headway in identifying candidate genes for schizophrenia. In addition, metabonomics, bioinformatics, and neuroimaging techniques are aiming to complete the picture by filling in knowledge gaps. International cooperation in the form of genomics and protein databases and brain banks is facilitating research efforts. While none of the recent developments described herein qualifies as biomarker discovery, many are likely to be stepping stones towards that goal.

  5. The web server of IBM's Bioinformatics and Pattern Discovery group.

    Science.gov (United States)

    Huynh, Tien; Rigoutsos, Isidore; Parida, Laxmi; Platt, Daniel; Shibuya, Tetsuo

    2003-07-01

    We herein present and discuss the services and content which are available on the web server of IBM's Bioinformatics and Pattern Discovery group. The server is operational around the clock and provides access to a variety of methods that have been published by the group's members and collaborators. The available tools correspond to applications ranging from the discovery of patterns in streams of events and the computation of multiple sequence alignments, to the discovery of genes in nucleic acid sequences and the interactive annotation of amino acid sequences. Additionally, annotations for more than 70 archaeal, bacterial, eukaryotic and viral genomes are available on-line and can be searched interactively. The tools and code bundles can be accessed beginning at http://cbcsrv.watson.ibm.com/Tspd.html whereas the genomics annotations are available at http://cbcsrv.watson.ibm.com/Annotations/.

  6. What Neural Substrates Trigger the Adept Scientific Pattern Discovery by Biologists?

    Science.gov (United States)

    Lee, Jun-Ki; Kwon, Yong-Ju

    2011-04-01

    This study investigated the neural correlates of experts and novices during biological object pattern detection using an fMRI approach in order to reveal the neural correlates of a biologist's superior pattern discovery ability. Sixteen healthy male participants (8 biologists and 8 non-biologists) volunteered for the study. Participants were shown fifteen series of organism pictures and asked to detect patterns amid stimulus pictures. Primary findings showed significant activations in the right middle temporal gyrus and inferior parietal lobule amongst participants in the biologist (expert) group. Interestingly, the left superior temporal gyrus was activated in participants from the non-biologist (novice) group. These results suggested that superior pattern discovery ability could be related to a functional facilitation of the parieto-temporal network, which is particularly driven by the right middle temporal gyrus and inferior parietal lobule in addition to the recruitment of additional brain regions. Furthermore, the functional facilitation of the network might actually pertain to high coherent processing skills and visual working memory capacity. Hence, study results suggested that adept scientific thinking ability can be detected by neuronal substrates, which may be used as criteria for developing and evaluating a brain-based science curriculum and test instrument.

  7. ncISO Facilitating Metadata and Scientific Data Discovery

    Science.gov (United States)

    Neufeld, D.; Habermann, T.

    2011-12-01

    Increasing the usability and availability of climate and oceanographic datasets for environmental research requires improved metadata and tools to rapidly locate and access relevant information for an area of interest. Because of the distributed nature of most environmental geospatial data, a common approach is to use catalog services that support queries on metadata harvested from remote map and data services. A key component to effectively using these catalog services is the availability of high quality metadata associated with the underlying data sets. In this presentation, we examine the use of ncISO and Geoportal as open source tools that can be used to document and facilitate access to ocean and climate data available from Thematic Realtime Environmental Distributed Data Services (THREDDS) data services. Many atmospheric and oceanographic spatial data sets are stored in the Network Common Data Format (netCDF) and served through the Unidata THREDDS Data Server (TDS). NetCDF and THREDDS are becoming increasingly accepted in both the scientific and geographic research communities, as demonstrated by the recent adoption of netCDF as an Open Geospatial Consortium (OGC) standard. One important source for ocean and atmospheric based data sets is NOAA's Unified Access Framework (UAF), which serves over 3000 gridded data sets from across NOAA and NOAA-affiliated partners. Due to the large number of datasets, browsing the data holdings to locate data is impractical. Working with Unidata, we have created a new service for the TDS called "ncISO", which allows automatic generation of ISO 19115-2 metadata from attributes and variables in TDS datasets. The ncISO metadata records can be harvested by catalog services such as ESSI-labs GI-Cat catalog service and ESRI's Geoportal, which supports query through a number of services, including OpenSearch and Catalog Services for the Web (CSW). ESRI's Geoportal Server provides a number of user friendly search capabilities for end users

  8. Epithelial-Mesenchymal Transition (EMT) Gene Variants and Epithelial Ovarian Cancer (EOC) Risk.

    Science.gov (United States)

    Amankwah, Ernest K; Lin, Hui-Yi; Tyrer, Jonathan P; Lawrenson, Kate; Dennis, Joe; Chornokur, Ganna; Aben, Katja K H; Anton-Culver, Hoda; Antonenkova, Natalia; Bruinsma, Fiona; Bandera, Elisa V; Bean, Yukie T; Beckmann, Matthias W; Bisogna, Maria; Bjorge, Line; Bogdanova, Natalia; Brinton, Louise A; Brooks-Wilson, Angela; Bunker, Clareann H; Butzow, Ralf; Campbell, Ian G; Carty, Karen; Chen, Zhihua; Chen, Y Ann; Chang-Claude, Jenny; Cook, Linda S; Cramer, Daniel W; Cunningham, Julie M; Cybulski, Cezary; Dansonka-Mieszkowska, Agnieszka; du Bois, Andreas; Despierre, Evelyn; Dicks, Ed; Doherty, Jennifer A; Dörk, Thilo; Dürst, Matthias; Easton, Douglas F; Eccles, Diana M; Edwards, Robert P; Ekici, Arif B; Fasching, Peter A; Fridley, Brooke L; Gao, Yu-Tang; Gentry-Maharaj, Aleksandra; Giles, Graham G; Glasspool, Rosalind; Goodman, Marc T; Gronwald, Jacek; Harrington, Patricia; Harter, Philipp; Hasmad, Hanis N; Hein, Alexander; Heitz, Florian; Hildebrandt, Michelle A T; Hillemanns, Peter; Hogdall, Claus K; Hogdall, Estrid; Hosono, Satoyo; Iversen, Edwin S; Jakubowska, Anna; Jensen, Allan; Ji, Bu-Tian; Karlan, Beth Y; Jim, Heather; Kellar, Melissa; Kiemeney, Lambertus A; Krakstad, Camilla; Kjaer, Susanne K; Kupryjanczyk, Jolanta; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D; Lee, Alice W; Lele, Shashi; Leminen, Arto; Lester, Jenny; Levine, Douglas A; Liang, Dong; Lim, Boon Kiong; Lissowska, Jolanta; Lu, Karen; Lubinski, Jan; Lundvall, Lene; Massuger, Leon F A G; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R; McNeish, Ian; Menon, Usha; Milne, Roger L; Modugno, Francesmary; Moysich, Kirsten B; Ness, Roberta B; Nevanlinna, Heli; Eilber, Ursula; Odunsi, Kunle; Olson, Sara H; Orlow, Irene; Orsulic, Sandra; Weber, Rachel Palmieri; Paul, James; Pearce, Celeste L; Pejovic, Tanja; Pelttari, Liisa M; Permuth-Wey, Jennifer; Pike, Malcolm C; Poole, Elizabeth M; Risch, Harvey A; Rosen, Barry; Rossing, Mary Anne; Rothstein, Joseph H; Rudolph, Anja; Runnebaum, Ingo B; 
Rzepecka, Iwona K; Salvesen, Helga B; Schernhammer, Eva; Schwaab, Ira; Shu, Xiao-Ou; Shvetsov, Yurii B; Siddiqui, Nadeem; Sieh, Weiva; Song, Honglin; Southey, Melissa C; Spiewankiewicz, Beata; Sucheston-Campbell, Lara; Teo, Soo-Hwang; Terry, Kathryn L; Thompson, Pamela J; Thomsen, Lotte; Tangen, Ingvild L; Tworoger, Shelley S; van Altena, Anne M; Vierkant, Robert A; Vergote, Ignace; Walsh, Christine S; Wang-Gohrke, Shan; Wentzensen, Nicolas; Whittemore, Alice S; Wicklund, Kristine G; Wilkens, Lynne R; Wu, Anna H; Wu, Xifeng; Woo, Yin-Ling; Yang, Hannah; Zheng, Wei; Ziogas, Argyrios; Kelemen, Linda E; Berchuck, Andrew; Schildkraut, Joellen M; Ramus, Susan J; Goode, Ellen L; Monteiro, Alvaro N A; Gayther, Simon A; Narod, Steven A; Pharoah, Paul D P; Sellers, Thomas A; Phelan, Catherine M

    2015-12-01

    Epithelial-mesenchymal transition (EMT) is a process whereby epithelial cells assume mesenchymal characteristics to facilitate cancer metastasis. However, EMT also contributes to the initiation and development of primary tumors. Prior studies that explored the hypothesis that EMT gene variants contribute to epithelial ovarian carcinoma (EOC) risk have been based on small sample sizes and none have sought replication in an independent population. We screened 15,816 single-nucleotide polymorphisms (SNPs) in 296 genes in a discovery phase using data from a genome-wide association study of EOC among women of European ancestry (1,947 cases and 2,009 controls) and identified 793 variants in 278 EMT-related genes that were nominally (P < 0.05) associated with invasive EOC. These SNPs were then genotyped in a larger study of 14,525 invasive-cancer patients and 23,447 controls. A P-value <0.05 and a false discovery rate (FDR) <0.2 were considered statistically significant. In the larger dataset, GPC6/GPC5 rs17702471 was associated with the endometrioid subtype among Caucasians (odds ratio (OR) = 1.16, 95% CI = 1.07-1.25, P = 0.0003, FDR = 0.19), whereas F8 rs7053448 (OR = 1.69, 95% CI = 1.27-2.24, P = 0.0003, FDR = 0.12), F8 rs7058826 (OR = 1.69, 95% CI = 1.27-2.24, P = 0.0003, FDR = 0.12), and CAPN13 rs1983383 (OR = 0.79, 95% CI = 0.69-0.90, P = 0.0005, FDR = 0.12) were associated with combined invasive EOC among Asians. In silico functional analyses revealed that GPC6/GPC5 rs17702471 coincided with DNA regulatory elements. These results suggest that EMT gene variants do not appear to play a significant role in the susceptibility to EOC. © 2015 WILEY PERIODICALS, INC.
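    The two-threshold significance rule described in this abstract (P < 0.05 and FDR < 0.2) can be illustrated with a minimal sketch. The abstract does not state which FDR procedure was used; the Benjamini-Hochberg step-up adjustment below is an assumption, and the function names and the example P-values are hypothetical, chosen only for illustration.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: BH is used here as a common FDR procedure; the abstract
    does not say which method the authors applied.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping adjusted values monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def is_significant(p_values: List[float],
                   p_cutoff: float = 0.05,
                   fdr_cutoff: float = 0.2) -> List[bool]:
    """Flag tests that pass both the nominal P cutoff and the FDR cutoff."""
    q_values = benjamini_hochberg(p_values)
    return [p < p_cutoff and q < fdr_cutoff
            for p, q in zip(p_values, q_values)]


if __name__ == "__main__":
    # Hypothetical p-values, not taken from the study.
    example = [0.01, 0.04, 0.03, 0.5]
    print(benjamini_hochberg(example))
    print(is_significant(example))
```

    Requiring both cutoffs, as the abstract does, means a variant with a small nominal P-value is still rejected when the adjusted value exceeds the FDR limit.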

  9. Cogena, a novel tool for co-expressed gene-set enrichment analysis, applied to drug repositioning and drug mode of action discovery.

    Science.gov (United States)

    Jia, Zhilong; Liu, Ying; Guan, Naiyang; Bo, Xiaochen; Luo, Zhigang; Barnes, Michael R

    2016-05-27

    Drug repositioning, finding new indications for existing drugs, has gained much recent attention as a potentially efficient and economical strategy for accelerating new therapies into the clinic. Although improvement in the sensitivity of computational drug repositioning methods has identified numerous credible repositioning opportunities, few have been progressed. Arguably the "black box" nature of drug action in a new indication is one of the main blocks to progression, highlighting the need for methods that inform on the broader target mechanism in the disease context. We demonstrate that the analysis of co-expressed genes may be a critical first step towards illumination of both disease pathology and mode of drug action. We achieve this using a novel framework, co-expressed gene-set enrichment analysis (cogena) for co-expression analysis of gene expression signatures and gene set enrichment analysis of co-expressed genes. The cogena framework enables simultaneous, pathway driven, disease and drug repositioning analysis. Cogena can be used to illuminate coordinated changes within disease transcriptomes and identify drugs acting mechanistically within this framework. We illustrate this using a psoriatic skin transcriptome, as an exemplar, and recover two widely used Psoriasis drugs (Methotrexate and Ciclosporin) with distinct modes of action. Cogena out-performs the results of Connectivity Map and NFFinder webservers in similar disease transcriptome analyses. Furthermore, we investigated the literature support for the other top-ranked compounds to treat psoriasis and showed how the outputs of cogena analysis can contribute new insight to support the progression of drugs into the clinic. We have made cogena freely available within Bioconductor or https://github.com/zhilongjia/cogena . In conclusion, by targeting co-expressed genes within disease transcriptomes, cogena offers novel biological insight, which can be effectively harnessed for drug discovery and

  10. DeepBase: annotation and discovery of microRNAs and other noncoding RNAs from deep-sequencing data.

    Science.gov (United States)

    Yang, Jian-Hua; Qu, Liang-Hu

    2012-01-01

    Recent advances in high-throughput deep-sequencing technology have produced large numbers of short and long RNA sequences and enabled the detection and profiling of known and novel microRNAs (miRNAs) and other noncoding RNAs (ncRNAs) at unprecedented sensitivity and depth. In this chapter, we describe the use of deepBase, a database that we have developed to integrate all public deep-sequencing data and to facilitate the comprehensive annotation and discovery of miRNAs and other ncRNAs from these data. deepBase provides an integrative, interactive, and versatile web graphical interface to evaluate miRBase-annotated miRNA genes and other known ncRNAs, explores the expression patterns of miRNAs and other ncRNAs, and discovers novel miRNAs and other ncRNAs from deep-sequencing data. deepBase also provides a deepView genome browser to comparatively analyze these data at multiple levels. deepBase is available at http://deepbase.sysu.edu.cn/.

  11. How Facilitation May Interfere with Ecological Speciation

    Directory of Open Access Journals (Sweden)

    P. Liancourt

    2012-01-01

    Compared to the vast literature linking competitive interactions and speciation, attempts to understand the role of facilitation for evolutionary diversification remain scarce. Yet, community ecologists now recognize the importance of positive interactions within plant communities. Here, we examine how facilitation may interfere with the mechanisms of ecological speciation. We argue that facilitation is likely to (1) maintain gene flow among incipient species by enabling co-occurrence of adapted and maladapted forms in marginal habitats and (2) increase fitness of introgressed forms and limit reinforcement in secondary contact zones. Alternatively, we present how facilitation may favour colonization of marginal habitats and thus enhance local adaptation and ecological speciation. Therefore, facilitation may impede or pave the way for ecological speciation. Using a simple spatially and genetically explicit modelling framework, we illustrate and propose some first testable ideas about how, when, and where facilitation may act as a cohesive force for ecological speciation. These hypotheses and the modelling framework proposed should stimulate further empirical and theoretical research examining the role of both competitive and positive interactions in the formation of incipient species.

  12. Vorinostat, a histone deacetylase inhibitor, facilitates fear extinction and enhances expression of the hippocampal NR2B-containing NMDA receptor gene.

    Science.gov (United States)

    Fujita, Yosuke; Morinobu, Shigeru; Takei, Shiro; Fuchikami, Manabu; Matsumoto, Tomoya; Yamamoto, Shigeto; Yamawaki, Shigeto

    2012-05-01

    Histone acetylation, which alters the compact chromatin structure and changes the accessibility of DNA to regulatory proteins, is emerging as a fundamental mechanism for regulating gene expression. Histone deacetylase (HDAC) inhibitors increase histone acetylation and enhance fear extinction. In this study, we examined whether vorinostat, an HDAC inhibitor, facilitates fear extinction, using a contextual fear conditioning (FC) paradigm, in Sprague-Dawley rats. We found that vorinostat facilitated fear extinction. Next, the levels of global acetylated histone H3 and H4 were measured by Western blotting. We also assessed the effect of vorinostat on the hippocampal levels of NMDA receptor mRNA by real-time quantitative PCR (RT-PCR) and protein by Western blotting. Two hours after vorinostat administration, the levels of acetylated histones and NR2B mRNA, but not NR1 or NR2A mRNA, were elevated in the hippocampus. The NR2B protein level was elevated 4 h after vorinostat administration. Last, we investigated the levels of acetylated histones and phospho-CREB (p-CREB) binding at the promoter of the NR2B gene using the chromatin immunoprecipitation (ChIP) assay followed by RT-PCR. The ChIP assay revealed increases in the levels of acetylated histones, accompanied by enhanced binding of p-CREB to its binding site at the promoter of the NR2B gene 2 h after vorinostat administration. These findings suggest that vorinostat increases the expression of NR2B in the hippocampus by enhancing histone acetylation, and this process may be implicated in fear extinction. Copyright © 2012 Elsevier Ltd. All rights reserved.

  13. Identification of rat lung-specific microRNAs by microRNA microarray: valuable discoveries for the facilitation of lung research

    Directory of Open Access Journals (Sweden)

    Chintagari Narendranath

    2007-01-01

    Background: An important mechanism for gene regulation utilizes small non-coding RNAs called microRNAs (miRNAs). These small RNAs play important roles in tissue development, cell differentiation and proliferation, lipid and fat metabolism, stem cells, exocytosis, diseases and cancers. To date, relatively little is known about functions of miRNAs in the lung except lung cancer. Results: In this study, we utilized a rat miRNA microarray containing 216 miRNA probes, printed in-house, to detect the expression of miRNAs in the rat lung compared to the rat heart, brain, liver, kidney and spleen. Statistical analysis using Significant Analysis of Microarray (SAM) and Tukey Honestly Significant Difference (HSD) revealed 2 miRNAs (miR-195 and miR-200c) expressed specifically in the lung and 9 miRNAs co-expressed in the lung and another organ. 12 selected miRNAs were verified by Northern blot analysis. Conclusion: The identified lung-specific miRNAs from this work will facilitate functional studies of miRNAs during normal physiological and pathophysiological processes of the lung.

  14. Identification of rat lung-specific microRNAs by micoRNA microarray: valuable discoveries for the facilitation of lung research.

    Science.gov (United States)

    Wang, Yang; Weng, Tingting; Gou, Deming; Chen, Zhongming; Chintagari, Narendranath Reddy; Liu, Lin

    2007-01-24

    An important mechanism for gene regulation utilizes small non-coding RNAs called microRNAs (miRNAs). These small RNAs play important roles in tissue development, cell differentiation and proliferation, lipid and fat metabolism, stem cells, exocytosis, diseases and cancers. To date, relatively little is known about functions of miRNAs in the lung except lung cancer. In this study, we utilized a rat miRNA microarray containing 216 miRNA probes, printed in-house, to detect the expression of miRNAs in the rat lung compared to the rat heart, brain, liver, kidney and spleen. Statistical analysis using Significant Analysis of Microarray (SAM) and Tukey Honestly Significant Difference (HSD) revealed 2 miRNAs (miR-195 and miR-200c) expressed specifically in the lung and 9 miRNAs co-expressed in the lung and another organ. 12 selected miRNAs were verified by Northern blot analysis. The identified lung-specific miRNAs from this work will facilitate functional studies of miRNAs during normal physiological and pathophysiological processes of the lung.

  15. Characterization of Capsicum annuum genetic diversity and population structure based on parallel polymorphism discovery with a 30K unigene Pepper GeneChip.

    Science.gov (United States)

    Hill, Theresa A; Ashrafi, Hamid; Reyes-Chin-Wo, Sebastian; Yao, JiQiang; Stoffel, Kevin; Truco, Maria-Jose; Kozik, Alexander; Michelmore, Richard W; Van Deynze, Allen

    2013-01-01

    The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position, allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines; however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm, which was confirmed with statistical support by principal components analysis (PCA) and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and application of genome

  16. Characterization of Capsicum annuum genetic diversity and population structure based on parallel polymorphism discovery with a 30K unigene Pepper GeneChip.

    Directory of Open Access Journals (Sweden)

    Theresa A Hill

    The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position, allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines; however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm, which was confirmed with statistical support by principal components analysis (PCA) and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and

  17. Construction of functional linkage gene networks by data integration.

    Science.gov (United States)

    Linghu, Bolan; Franzosa, Eric A; Xia, Yu

    2013-01-01

    Networks of functional associations between genes have recently been successfully used for gene function and disease-related research. A typical approach for constructing such functional linkage gene networks (FLNs) is based on the integration of diverse high-throughput functional genomics datasets. Data integration is a nontrivial task due to the heterogeneous nature of the different data sources and their variable accuracy and completeness. The presence of correlations between data sources also adds another layer of complexity to the integration process. In this chapter we discuss an approach for constructing a human FLN from data integration and a subsequent application of the FLN to novel disease gene discovery. Similar approaches can be applied to nonhuman species and other discovery tasks.

  18. The development of high-content screening (HCS) technology and its importance to drug discovery.

    Science.gov (United States)

    Fraietta, Ivan; Gasparri, Fabio

    2016-01-01

    High-content screening (HCS) was introduced about twenty years ago as a promising analytical approach to facilitate some critical aspects of drug discovery. Its application has spread progressively within the pharmaceutical industry and academia to the point that it today represents a fundamental tool in supporting drug discovery and development. Here, the authors review some of the significant progress in the HCS field in terms of biological models and assay readouts. They highlight the importance of high-content screening in drug discovery, as testified by its numerous applications in a variety of therapeutic areas: oncology, infective diseases, cardiovascular and neurodegenerative diseases. They also dissect the role of HCS technology in different phases of the drug discovery pipeline: target identification, primary compound screening, secondary assays, mechanism of action studies and in vitro toxicology. Recent advances in cellular assay technologies, such as the introduction of three-dimensional (3D) cultures, induced pluripotent stem cells (iPSCs) and genome editing technologies (e.g., CRISPR/Cas9), have tremendously expanded the potential of high-content assays to contribute to the drug discovery process. Increasingly predictive cellular models and readouts, together with the development of more sophisticated and affordable HCS readers, will further consolidate the role of HCS technology in drug discovery.

  19. Knowledge Discovery from Vibration Measurements

    Directory of Open Access Journals (Sweden)

    Jun Deng

    2014-01-01

    The framework, as well as the particular algorithms, of the pattern recognition process is widely adopted in structural health monitoring (SHM). However, as a part of the overall process of knowledge discovery from databases (KDD), the results of pattern recognition are only changes and patterns of changes of data features. In this paper, based on the similarity between KDD and SHM and considering the particularity of SHM problems, a four-step framework of SHM is proposed which extends the final goal of SHM from detecting damages to extracting knowledge to facilitate decision making. The purposes and proper methods of each step of this framework are discussed. To demonstrate the proposed SHM framework, a specific SHM method which is composed of second order structural parameter identification, statistical control chart analysis, and system reliability analysis is then presented. To examine the performance of this SHM method, real sensor data measured from a lab size steel bridge model structure are used. The developed four-step framework of SHM has the potential to clarify the process of SHM to facilitate the further development of SHM techniques.

  20. GeoSearch: A lightweight broking middleware for geospatial resources discovery

    Science.gov (United States)

    Gui, Z.; Yang, C.; Liu, K.; Xia, J.

    2012-12-01

    With petabytes of geodata and thousands of geospatial web services available over the Internet, it is critical to support geoscience research and applications by finding the best-fit geospatial resources from the massive and heterogeneous resources. Past decades' developments witnessed the operation of many service components to facilitate geospatial resource management and discovery. However, efficient and accurate geospatial resource discovery is still a big challenge due to the following reasons: 1) The entry barriers (also called "learning curves") hinder the usability of discovery services to end users. Different portals and catalogues always adopt various access protocols, metadata formats and GUI styles to organize, present and publish metadata. It is hard for end users to learn all these technical details and differences. 2) The cost for federating heterogeneous services is high. To provide sufficient resources and facilitate data discovery, many registries adopt a periodic harvesting mechanism to retrieve metadata from other federated catalogues. These time-consuming processes lead to network and storage burdens, data redundancy, and also the overhead of maintaining data consistency. 3) Data discovery faces heterogeneous semantics issues. Since keyword matching is still the primary search method in many operational discovery services, the search accuracy (precision and recall) is hard to guarantee. Semantic technologies (such as semantic reasoning and similarity evaluation) offer a solution to these issues. However, integrating semantic technologies with existing services is challenging due to the expandability limitations on the service frameworks and metadata templates. 4) The capabilities to help users make a final selection are inadequate. Most of the existing search portals lack intuitive and diverse information visualization methods and functions (sort, filter) to present, explore and analyze search results. Furthermore, the presentation of the value

  1. Facilitating Students' Interaction with Real Gas Properties Using a Discovery-Based Approach and Molecular Dynamics Simulations

    Science.gov (United States)

    Sweet, Chelsea; Akinfenwa, Oyewumi; Foley, Jonathan J., IV

    2018-01-01

    We present an interactive discovery-based approach to studying the properties of real gases using simple, yet realistic, molecular dynamics software. Use of this approach opens up a variety of opportunities for students to interact with the behaviors and underlying theories of real gases. Students can visualize gas behavior under a variety of…

  2. Insights into inner ear-specific gene regulation: epigenetics and non-coding RNAs in inner ear development and regeneration

    Science.gov (United States)

    Avraham, Karen B.

    2016-01-01

    The vertebrate inner ear houses highly specialized sensory organs, tuned to detect and encode sound, head motion and gravity. Gene expression programs under the control of transcription factors orchestrate the formation and specialization of the non-sensory inner ear labyrinth and its sensory constituents. More recently, epigenetic factors and non-coding RNAs emerged as an additional layer of gene regulation, both in inner ear development and disease. In this review, we provide an overview on how epigenetic modifications and non-coding RNAs, in particular microRNAs (miRNAs), influence gene expression and summarize recent discoveries that highlight their critical role in the proper formation of the inner ear labyrinth and its sensory organs. In contrast to non-mammalian vertebrates, adult mammals lack the ability to regenerate inner ear mechano-sensory hair cells. Finally, we discuss recent insights into how epigenetic factors and miRNAs may facilitate, or in the case of mammals, restrict sensory hair cell regeneration. PMID:27836639

  3. AKT phosphorylates H3-threonine 45 to facilitate termination of gene transcription in response to DNA damage.

    Science.gov (United States)

    Lee, Jong-Hyuk; Kang, Byung-Hee; Jang, Hyonchol; Kim, Tae Wan; Choi, Jinmi; Kwak, Sojung; Han, Jungwon; Cho, Eun-Jung; Youn, Hong-Duk

    2015-05-19

    Post-translational modifications of core histones affect various cellular processes, primarily through transcription. However, their relationship with the termination of transcription has remained largely unknown. In this study, we show that DNA damage-activated AKT phosphorylates threonine 45 of core histone H3 (H3-T45). Genome-wide chromatin immunoprecipitation sequencing (ChIP-seq) analysis showed that H3-T45 phosphorylation was distributed throughout DNA damage-responsive gene loci, particularly immediately after the transcription termination site. The H3-T45 phosphorylation pattern closely resembled that of RNA polymerase II C-terminal domain (CTD) serine 2 phosphorylation, which establishes the transcription termination signal. AKT1 was more effective than AKT2 in phosphorylating H3-T45. Blocking H3-T45 phosphorylation by inhibiting AKT or through amino acid substitution limited RNA decay downstream of mRNA cleavage sites and decreased RNA polymerase II release from chromatin. Our findings suggest that AKT-mediated phosphorylation of H3-T45 regulates the processing of the 3' end of DNA damage-activated genes to facilitate transcriptional termination. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. SOA Modeling Patterns for Service Oriented Discovery and Analysis

    CERN Document Server

    Bell, Michael

    2010-01-01

    Learn the essential tools for developing a sound service-oriented architecture. SOA Modeling Patterns for Service-Oriented Discovery and Analysis introduces a universal, easy-to-use, and nimble SOA modeling language to facilitate the service identification and examination life cycle stage. This business and technological vocabulary will benefit your service development endeavors and foster organizational software asset reuse and consolidation, and reduction of expenditure. Whether you are a developer, business architect, technical architect, modeler, business analyst, team leader, or manager,

  5. Gene discovery from Jatropha curcas by sequencing of ESTs from normalized and full-length enriched cDNA library from developing seeds

    Directory of Open Access Journals (Sweden)

    Sugantham Priyanka Annabel

    2010-10-01

    Background: Jatropha curcas L. is promoted as an important non-edible biodiesel crop worldwide. Jatropha oil, which is a triacylglycerol, can be directly blended with petro-diesel or transesterified with methanol and used as biodiesel. Genetic improvement in jatropha is needed to increase the seed yield, oil content, drought and pest resistance, and to modify oil composition so that it becomes a technically and economically preferred source for biodiesel production. However, genetic improvement efforts in jatropha could not take advantage of genetic engineering methods due to the lack of cloned genes from this species. To overcome this hurdle, the current gene discovery project was initiated with the objective of isolating as many functional genes as possible from J. curcas by large-scale sequencing of expressed sequence tags (ESTs). Results: A normalized and full-length enriched cDNA library was constructed from developing seeds of J. curcas. The cDNA library contained about 1 × 10^6 clones, and the average insert size of the clones was 2.1 kb. In total, 12,084 ESTs were sequenced to an average high-quality read length of 576 bp. Contig analysis revealed 2258 contigs and 4751 singletons. Contig size ranged from 2 to 23, and there were 7333 ESTs in the contigs. This resulted in 7009 unigenes, which were annotated by BLASTX. The annotation showed 3982 unigenes with significant similarity to known genes and 2836 unigenes with significant similarity to genes of unknown, hypothetical and putative proteins. The remaining 191 unigenes, which did not show similarity with any genes in the public database, may encode unique genes. Functional classification revealed unigenes related to a broad range of cellular, molecular and biological functions. Among the 7009 unigenes, 6233 were identified as potential full-length genes. Conclusions: A high-quality normalized cDNA library was constructed from developing seeds of J. curcas for the first time and 7009 unigenes coding
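    The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below only re-adds the figures reported above; the variable names are illustrative, and the assumption that each contig and each singleton counts as one unigene is inferred from the totals, not stated by the authors.

```python
# Figures quoted in the abstract above.
total_ests = 12_084
contigs = 2_258
singletons = 4_751
ests_in_contigs = 7_333

# Assumption: each contig and each singleton counts as one unigene.
unigenes = contigs + singletons

# Every sequenced EST is either assembled into a contig or left as a singleton.
assert ests_in_contigs + singletons == total_ests

# BLASTX annotation classes reported for the unigenes.
known, hypothetical, no_hit = 3_982, 2_836, 191
assert known + hypothetical + no_hit == unigenes

# Share of unigenes reported as potential full-length genes.
full_length_fraction = 6_233 / unigenes
print(unigenes, round(full_length_fraction, 3))
```

    Under that assumption the reported totals are mutually consistent, and roughly 89% of the unigenes are potential full-length genes.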

  6. Discovery of a novel gene involved in autolysis of Clostridium cells.

    Science.gov (United States)

    Yang, Liejian; Bao, Guanhui; Zhu, Yan; Dong, Hongjun; Zhang, Yanping; Li, Yin

    2013-06-01

    Cell autolysis plays important physiological roles in the life cycle of clostridial cells. Understanding the genetic basis of autolysis in pathogenic or solvent-producing Clostridium cells might provide new insights into this important species. Genes that might be involved in autolysis of Clostridium acetobutylicum, a model clostridial species, were investigated in this study. Twelve putative autolysin genes were predicted in the C. acetobutylicum DSM 1731 genome through bioinformatics analysis. Of these 12 genes, SMB_G3117 was selected for testing intracellular autolysin activity, growth profile, viable cell numbers, and cellular morphology. We found that overexpression of SMB_G3117 led to earlier cessation of growth, a significantly increased number of dead cells, and clear electrolucent cavities, while disruption of SMB_G3117 resulted in remarkably reduced intracellular autolysin activity. These results indicate that SMB_G3117 is a novel gene involved in cellular autolysis of C. acetobutylicum.

  7. Building A NGS Genomic Resource: Towards Molecular Breeding In L. Perenne

    DEFF Research Database (Denmark)

    Ruttink, Tom; Roldán-Ruiz, Isabel; Asp, Torben

    To advance the application of molecular breeding in Lolium perenne, we have generated a sequence resource to facilitate gene discovery and SNP marker development. Illumina GAII transcriptome sequencing was performed on meristem-enriched samples of 14 Lolium genotypes. De novo assemblies for indiv ... of SNP markers in selected candidate genes. In parallel, a germplasm collection of 602 Lolium genotypes was established and is being phenotyped for plant architecture, reproductive characteristics, flowering time, and forage quality traits. We will test through association genetics whether phenotypic...

  8. Using transcriptomics to guide lead optimization in drug discovery projects: Lessons learned from the QSTAR project.

    Science.gov (United States)

    Verbist, Bie; Klambauer, Günter; Vervoort, Liesbet; Talloen, Willem; Shkedy, Ziv; Thas, Olivier; Bender, Andreas; Göhlmann, Hinrich W H; Hochreiter, Sepp

    2015-05-01

    The pharmaceutical industry is faced with steadily declining R&D efficiency, which results in fewer drugs reaching the market despite increased investment. A major cause of this low efficiency is the failure of drug candidates in late-stage development owing to safety issues or previously undiscovered side-effects. We analyzed to what extent gene expression data can help to de-risk drug development in early phases by detecting the biological effects of compounds across disease areas, targets and scaffolds. For eight drug discovery projects within a global pharmaceutical company, gene expression data were informative and able to support go/no-go decisions. Our studies show that gene expression profiling can detect adverse effects of compounds, and is a valuable tool in early-stage drug discovery decision making. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  9. Polar Domain Discovery with Sparkler

    Science.gov (United States)

    Duerr, R.; Khalsa, S. J. S.; Mattmann, C. A.; Ottilingam, N. K.; Singh, K.; Lopez, L. A.

    2017-12-01

    The scientific web is vast and ever growing. It encompasses millions of textual, scientific and multimedia documents describing research in a multitude of scientific streams. Most of these documents are hidden behind forms that require user action to retrieve and thus cannot be directly accessed by content crawlers. These documents are hosted on web servers across the world, most often on outdated hardware and network infrastructure. Hence it is difficult and time-consuming to aggregate documents from the scientific web, especially those relevant to a specific domain, and generating meaningful domain-specific insights is currently difficult. We present an automated discovery system (Figure 1) using Sparkler, an open-source, extensible, horizontally scalable crawler which facilitates high-throughput and focused crawling of documents pertinent to a particular domain, such as information about polar regions. With this set of highly domain-relevant documents, we show that it is possible to answer analytical questions about that domain. Our domain discovery algorithm leverages prior domain knowledge to reach out to commercial/scientific search engines to generate seed URLs. Subject matter experts then annotate these seed URLs manually on a scale from highly relevant to irrelevant. We leverage this annotated dataset to train a machine learning model which predicts the 'domain relevance' of a given document. We extend Sparkler with this model to focus crawling on documents relevant to that domain. Sparkler avoids disruption of service by 1) partitioning URLs by hostname such that every node gets a different host to crawl and by 2) inserting delays between subsequent requests. Using Wrangler, an NSF-funded supercomputer, we scaled our domain discovery pipeline to crawl about 200k polar-specific documents from the scientific web within a day.
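    The two politeness measures named at the end of this abstract (partitioning URLs by hostname and spacing out requests) can be illustrated in plain Python. This is a minimal sketch, not Sparkler's actual code or API: the function names, the CRC32-based node assignment and the example hosts are all assumptions made for illustration. Hash partitioning guarantees that all URLs of one host land on the same node, although two different hosts may still share a node.

```python
from collections import defaultdict
from urllib.parse import urlsplit
import zlib


def partition_by_host(urls, num_nodes):
    """Assign every URL to a node so that all URLs of one host share a node."""
    buckets = defaultdict(list)
    for url in urls:
        host = urlsplit(url).hostname or ""
        # crc32 is stable across runs, unlike Python's salted hash().
        node = zlib.crc32(host.encode("utf-8")) % num_nodes
        buckets[node].append(url)
    return dict(buckets)


def schedule_with_delay(urls, delay_seconds):
    """Plan fetch times so that hits on the same host are at least
    `delay_seconds` apart; returns (planned_time, url) pairs in input order."""
    next_free = defaultdict(float)  # host -> earliest permitted fetch time
    plan = []
    for url in urls:
        host = urlsplit(url).hostname or ""
        start = next_free[host]
        plan.append((start, url))
        next_free[host] = start + delay_seconds
    return plan


if __name__ == "__main__":
    # Hypothetical seed URLs.
    seeds = [
        "http://a.example/1",
        "http://a.example/2",
        "http://b.example/1",
        "http://a.example/3",
    ]
    print(partition_by_host(seeds, num_nodes=4))
    print(schedule_with_delay(seeds, delay_seconds=2.0))
```

    The schedule is only a plan of offsets in seconds; a real crawler would sleep or use a timer queue to honour it.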

  10. DDMGD: the database of text-mined associations between genes methylated in diseases from different species.

    Science.gov (United States)

    Bin Raies, Arwa; Mansour, Hicham; Incitti, Roberto; Bajic, Vladimir B

    2015-01-01

    Gathering information about associations between methylated genes and diseases is important for disease diagnosis and treatment decisions. Recent advancements in epigenetics research allow for large-scale discoveries of associations of genes methylated in diseases in different species. Searching manually for such information is not easy, as it is scattered across a large number of electronic publications and repositories. Therefore, we developed the DDMGD database (http://www.cbrc.kaust.edu.sa/ddmgd/) to provide a comprehensive repository of information related to genes methylated in diseases that can be found through text mining. DDMGD's scope is not limited to a particular group of genes, diseases or species. Using the text mining system DEMGD, which we developed earlier, and additional post-processing, we extracted associations of genes methylated in different diseases from PubMed Central articles and PubMed abstracts. The accuracy of extracted associations is 82%, as estimated on 2500 hand-curated entries. DDMGD provides a user-friendly interface facilitating retrieval of these associations ranked according to confidence scores. Users can also submit new associations to DDMGD. A comparative analysis of DDMGD with several other databases focused on genes methylated in diseases shows that DDMGD is comprehensive and includes most of the recent information on genes methylated in diseases. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. Identification of Phosphoglycerate Kinase 1 (PGK1) as a reference gene for quantitative gene expression measurements in human blood RNA

    Directory of Open Access Journals (Sweden)

    Unger Elizabeth R

    2011-09-01

    gene expression results from blood RNA collected and processed by different methods with the intention of biomarker discovery. Results of this study should facilitate large-scale molecular epidemiologic studies using blood RNA as the target of quantitative gene expression measurements.

  12. Transient transformation meets gene function discovery: the strawberry fruit case

    Directory of Open Access Journals (Sweden)

    Michela Guidarelli

    2015-06-01

    Besides its well-known nutritional and health benefits, the strawberry (Fragaria X ananassa) crop draws increasing attention as a plant model system for the Rosaceae family, owing to its short generation time, rapid in vitro regeneration, and the availability of the genome sequence of F. X ananassa and of the closely related species F. vesca. In recent years, high-throughput sequencing technologies have provided large amounts of molecular information on genes possibly related to several biological processes of this crop. Nevertheless, the function of most genes or gene products is still poorly understood and needs investigation. Transient transformation technology provides a powerful tool to study gene function in vivo, avoiding the drawbacks that typically affect stable transformation protocols, such as transformation efficiency, transformant selection and regeneration. In this review we provide an overview of the use of transient expression to investigate the function of genes important for strawberry fruit development, defence and nutritional properties. The technical aspects of using this technique efficiently are described, and its possible impact and application in strawberry crop improvement are discussed.

  13. Empirical study of supervised gene screening

    Directory of Open Access Journals (Sweden)

    Ma Shuangge

    2006-12-01

    Background: Microarray studies provide a way of linking variations in phenotypes with their genetic causes. Constructing predictive models using high-dimensional microarray measurements usually consists of three steps: (1) unsupervised gene screening; (2) supervised gene screening; and (3) statistical model building. Supervised gene screening based on marginal gene ranking is commonly used to reduce the number of genes in model building. Various simple statistics, such as the t-statistic or signal-to-noise ratio, have been used to rank genes in supervised screening. Despite its extensive usage, statistical study of supervised gene screening remains scarce. Our study is partly motivated by the differences in gene discovery results caused by using different supervised gene screening methods. Results: We investigate the concordance and reproducibility of supervised gene screening based on eight commonly used marginal statistics. Concordance is assessed by the relative fractions of overlap between top-ranked genes screened using different marginal statistics. We propose a Bootstrap Reproducibility Index, which measures the reproducibility of individual genes under supervised screening. Empirical studies are based on four public microarray datasets. We consider the cases where the top 20%, 40% and 60% of genes are screened. Conclusion: From a gene discovery point of view, the effect of supervised gene screening based on different marginal statistics cannot be ignored. Empirical studies show that (1) genes that pass different supervised screenings may be considerably different; (2) concordance may vary, depending on the underlying data structure and the percentage of selected genes; (3) evaluated with the Bootstrap Reproducibility Index, genes that pass supervised screening are only moderately reproducible; and (4) concordance cannot be improved by supervised screening based on reproducibility.
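    To make the notion of concordance concrete, the sketch below ranks genes by two of the marginal statistics named in the abstract (a t-statistic and a signal-to-noise ratio) and reports the fraction of overlap between the two top-ranked lists. It is an illustration under stated assumptions, not the authors' code: the Welch form of the t-statistic, the particular signal-to-noise definition, the function names and the toy expression values are all chosen here for the example, and the Bootstrap Reproducibility Index is not implemented because the abstract does not define it.

```python
from statistics import mean, stdev


def t_statistic(a, b):
    """Welch two-sample t-statistic."""
    va, vb = stdev(a) ** 2, stdev(b) ** 2
    return (mean(a) - mean(b)) / ((va / len(a) + vb / len(b)) ** 0.5)


def signal_to_noise(a, b):
    """Mean difference divided by the sum of the two standard deviations."""
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b))


def top_genes(expr, labels, statistic, fraction):
    """Return the genes whose absolute statistic falls in the top `fraction`."""
    scores = {}
    for gene, values in expr.items():
        a = [v for v, y in zip(values, labels) if y == 1]
        b = [v for v, y in zip(values, labels) if y == 0]
        scores[gene] = abs(statistic(a, b))
    k = max(1, round(fraction * len(scores)))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def concordance(set_a, set_b):
    """Fraction of overlap between two equally sized gene lists."""
    return len(set_a & set_b) / len(set_a)


if __name__ == "__main__":
    # Toy data: three samples of class 1 followed by three of class 0.
    labels = [1, 1, 1, 0, 0, 0]
    expr = {
        "g1": [5.0, 5.2, 4.8, 1.0, 1.1, 0.9],
        "g2": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
        "g3": [16.0, 19.0, 22.0, 1.0, 1.1, 0.9],
        "g4": [1.0, 1.5, 0.5, 1.2, 0.8, 1.0],
        "g5": [7.0, 7.1, 6.9, 6.0, 6.1, 5.9],
    }
    by_t = top_genes(expr, labels, t_statistic, 0.4)
    by_snr = top_genes(expr, labels, signal_to_noise, 0.4)
    print(sorted(by_t), sorted(by_snr), concordance(by_t, by_snr))
```

    In this toy data the two statistics agree on g1 but disagree on the second gene, because g3 has very unequal variances in the two classes, so the concordance of the top 40% is one half.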

  14. RNA Editing and Drug Discovery for Cancer Therapy

    Directory of Open Access Journals (Sweden)

    Wei-Hsuan Huang

    2013-01-01

    RNA editing is vital for generating the RNA and protein complexity that regulates gene expression. Correct RNA editing maintains cell function and organism development. Imbalance of the RNA editing machinery may lead to diseases and cancers. Recently, RNA editing has been recognized as a target for drug discovery, although few studies targeting RNA editing for disease and cancer therapy have been reported in the field of natural products. Therefore, RNA editing may be a potential target for therapeutic natural products. In this review, we provide a literature overview of the biological functions of RNA editing in gene expression, diseases, cancers, and drugs. The bioinformatics resources for RNA editing are also summarized.

  15. Flightless I (Drosophila) homolog facilitates chromatin accessibility of the estrogen receptor α target genes in MCF-7 breast cancer cells

    Energy Technology Data Exchange (ETDEWEB)

    Jeong, Kwang Won, E-mail: kwjeong@gachon.ac.kr

    2014-04-04

    Highlights: • H3K4me3 and Pol II binding at TFF1 promoter were reduced in FLII-depleted MCF-7 cells. • FLII is required for chromatin accessibility of the enhancer of ERalpha target genes. • Depletion of FLII causes inhibition of proliferation of MCF-7 cells. - Abstract: The coordinated activities of multiple protein complexes are essential to the remodeling of chromatin structure and for the recruitment of RNA polymerase II (Pol II) to the promoter in order to facilitate the initiation of transcription in nuclear receptor-mediated gene expression. Flightless I (Drosophila) homolog (FLII), a nuclear receptor coactivator, is associated with the SWI/SNF-chromatin remodeling complex during estrogen receptor (ER)α-mediated transcription. However, the function of FLII in estrogen-induced chromatin opening has not been fully explored. Here, we show that FLII plays a critical role in establishing active histone modification marks and generating the open chromatin structure of ERα target genes. We observed that the enhancer regions of ERα target genes are heavily occupied by FLII, and histone H3K4me3 and Pol II binding induced by estrogen are decreased in FLII-depleted MCF-7 cells. Furthermore, formaldehyde-assisted isolation of regulatory elements (FAIRE)-quantitative polymerase chain reaction (qPCR) experiments showed that depletion of FLII resulted in reduced chromatin accessibility of multiple ERα target genes. These data suggest FLII as a key regulator of ERα-mediated transcription through its role in regulating chromatin accessibility for the binding of RNA Polymerase II and possibly other transcriptional coactivators.

  16. Flightless I (Drosophila) homolog facilitates chromatin accessibility of the estrogen receptor α target genes in MCF-7 breast cancer cells

    International Nuclear Information System (INIS)

    Jeong, Kwang Won

    2014-01-01

    Highlights: • H3K4me3 and Pol II binding at TFF1 promoter were reduced in FLII-depleted MCF-7 cells. • FLII is required for chromatin accessibility of the enhancer of ERalpha target genes. • Depletion of FLII causes inhibition of proliferation of MCF-7 cells. - Abstract: The coordinated activities of multiple protein complexes are essential to the remodeling of chromatin structure and for the recruitment of RNA polymerase II (Pol II) to the promoter in order to facilitate the initiation of transcription in nuclear receptor-mediated gene expression. Flightless I (Drosophila) homolog (FLII), a nuclear receptor coactivator, is associated with the SWI/SNF-chromatin remodeling complex during estrogen receptor (ER)α-mediated transcription. However, the function of FLII in estrogen-induced chromatin opening has not been fully explored. Here, we show that FLII plays a critical role in establishing active histone modification marks and generating the open chromatin structure of ERα target genes. We observed that the enhancer regions of ERα target genes are heavily occupied by FLII, and histone H3K4me3 and Pol II binding induced by estrogen are decreased in FLII-depleted MCF-7 cells. Furthermore, formaldehyde-assisted isolation of regulatory elements (FAIRE)-quantitative polymerase chain reaction (qPCR) experiments showed that depletion of FLII resulted in reduced chromatin accessibility of multiple ERα target genes. These data suggest FLII as a key regulator of ERα-mediated transcription through its role in regulating chromatin accessibility for the binding of RNA Polymerase II and possibly other transcriptional coactivators

  17. The biological knowledge discovery by PCCF measure and PCA-F projection.

    Science.gov (United States)

    Jia, Xingang; Zhu, Guanqun; Han, Qiuhong; Lu, Zuhong

    2017-01-01

    In biological knowledge discovery, PCA is commonly used to complement clustering analysis, but PCA typically gives poor visualizations for most gene expression data sets. Here, we propose a PCCF measure and use PCA-F to display clusters of PCCF, where PCCF and PCA-F are modeled from the modified cumulative probabilities of genes. From the analysis of simulated and experimental data sets, we demonstrate that PCCF is more appropriate and reliable for analyzing gene expression data than other commonly used distance or similarity measures, and that PCA-F is a good visualization technique for identifying clusters of PCCF. We target data sets in which the expression values of genes are collected at different time points.

  18. Topology Discovery Using Cisco Discovery Protocol

    OpenAIRE

    Rodriguez, Sergio R.

    2009-01-01

    In this paper we address the problem of discovering network topology in proprietary networks. Namely, we investigate topology discovery in Cisco-based networks. Cisco devices run Cisco Discovery Protocol (CDP) which holds information about these devices. We first compare properties of topologies that can be obtained from networks deploying CDP versus Spanning Tree Protocol (STP) and Management Information Base (MIB) Forwarding Database (FDB). Then we describe a method of discovering topology ...

  19. Yeast surface display platform for rapid discovery of conformationally selective nanobodies

    DEFF Research Database (Denmark)

    McMahon, Conor; Baier, Alexander S.; Pascolutti, Roberta

    2018-01-01

    this problem, we report a fully in vitro platform for nanobody discovery based on yeast surface display. We provide a blueprint for identifying nanobodies, demonstrate the utility of the library by crystallizing a nanobody with its antigen, and most importantly, we utilize the platform to discover conformationally selective nanobodies to two distinct human GPCRs. To facilitate broad deployment of this platform, the library and associated protocols are freely available for nonprofit research.

  20. GSEH: A Novel Approach to Select Prostate Cancer-Associated Genes Using Gene Expression Heterogeneity.

    Science.gov (United States)

    Kim, Hyunjin; Choi, Sang-Min; Park, Sanghyun

    2018-01-01

    When a gene shows varying levels of expression among normal people but similar levels in disease patients, or shows similar levels of expression among normal people but different levels in disease patients, we can assume that the gene is associated with the disease. By utilizing this gene expression heterogeneity, we can obtain additional information that aids the discovery of disease-associated genes. In this study, we used collaborative filtering to calculate the degree of gene expression heterogeneity between classes and then scored the genes on the basis of the degree of gene expression heterogeneity to find "differentially predicted" genes. Through the proposed method, we discovered more prostate cancer-associated genes than 10 comparable methods. The genes prioritized by the proposed method are potentially significant to biological processes of a disease and can provide insight into them.

  1. Improved accuracy of supervised CRM discovery with interpolated Markov models and cross-species comparison.

    Science.gov (United States)

    Kazemian, Majid; Zhu, Qiyun; Halfon, Marc S; Sinha, Saurabh

    2011-12-01

    Despite recent advances in experimental approaches for identifying transcriptional cis-regulatory modules (CRMs, 'enhancers'), direct empirical discovery of CRMs for all genes in all cell types and environmental conditions is likely to remain an elusive goal. Effective methods for computational CRM discovery are thus a critically needed complement to empirical approaches. However, existing computational methods that search for clusters of putative binding sites are ineffective if the relevant TFs and/or their binding specificities are unknown. Here, we provide a significantly improved method for 'motif-blind' CRM discovery that does not depend on knowledge or accurate prediction of TF-binding motifs and is effective when limited knowledge of functional CRMs is available to 'supervise' the search. We propose a new statistical method, based on 'Interpolated Markov Models', for motif-blind, genome-wide CRM discovery. It captures the statistical profile of variable length words in known CRMs of a regulatory network and finds candidate CRMs that match this profile. The method also uses orthologs of the known CRMs from closely related genomes. We perform in silico evaluation of predicted CRMs by assessing whether their neighboring genes are enriched for the expected expression patterns. This assessment uses a novel statistical test that extends the widely used Hypergeometric test of gene set enrichment to account for variability in intergenic lengths. We find that the new CRM prediction method is superior to existing methods. Finally, we experimentally validate 12 new CRM predictions by examining their regulatory activity in vivo in Drosophila; 10 of the tested CRMs were found to be functional, while 6 of the top 7 predictions showed the expected activity patterns. We make our program available as downloadable source code, and as a plugin for a genome browser installed on our servers. © The Author(s) 2011. Published by Oxford University Press.
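    For readers unfamiliar with the hypergeometric test of gene set enrichment mentioned in this abstract, the following sketch computes the standard one-sided p-value. It deliberately shows only the textbook test; the extension that accounts for variability in intergenic lengths is the authors' contribution and is not reproduced here. The function name and the numbers in the usage example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn without replacement from
    `population` genes, of which `annotated` show the expected pattern."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 10 genes in total, 3 with the expected expression
    # pattern, 2 genes neighbouring predicted CRMs, at least 1 of them a hit.
    print(hypergeom_enrichment_pvalue(10, 3, 2, 1))
```

    A small p-value would indicate that genes near predicted CRMs carry the expected expression pattern more often than chance alone would explain.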

  2. Discovery of seven novel Mammalian and avian coronaviruses in the genus deltacoronavirus supports bat coronaviruses as the gene source of alphacoronavirus and betacoronavirus and avian coronaviruses as the gene source of gammacoronavirus and deltacoronavirus.

    Science.gov (United States)

    Woo, Patrick C Y; Lau, Susanna K P; Lam, Carol S F; Lau, Candy C Y; Tsang, Alan K L; Lau, John H N; Bai, Ru; Teng, Jade L L; Tsang, Chris C C; Wang, Ming; Zheng, Bo-Jian; Chan, Kwok-Hung; Yuen, Kwok-Yung

    2012-04-01

    Recently, we reported the discovery of three novel coronaviruses, bulbul coronavirus HKU11, thrush coronavirus HKU12, and munia coronavirus HKU13, which were identified as representatives of a novel genus, Deltacoronavirus, in the subfamily Coronavirinae. In this territory-wide molecular epidemiology study involving 3,137 mammals and 3,298 birds, we discovered seven additional novel deltacoronaviruses in pigs and birds, which we named porcine coronavirus HKU15, white-eye coronavirus HKU16, sparrow coronavirus HKU17, magpie robin coronavirus HKU18, night heron coronavirus HKU19, wigeon coronavirus HKU20, and common moorhen coronavirus HKU21. Complete genome sequencing and comparative genome analysis showed that the avian and mammalian deltacoronaviruses have similar genome characteristics and structures. They all have relatively small genomes (25.421 to 26.674 kb), the smallest among all coronaviruses. They all have a single papain-like protease domain in the nsp3 gene; an accessory gene, NS6 open reading frame (ORF), located between the M and N genes; and a variable number of accessory genes (up to four) downstream of the N gene. Moreover, they all have the same putative transcription regulatory sequence of ACACCA. Molecular clock analysis showed that the most recent common ancestor of all coronaviruses was estimated at approximately 8100 BC, and those of Alphacoronavirus, Betacoronavirus, Gammacoronavirus, and Deltacoronavirus were at approximately 2400 BC, 3300 BC, 2800 BC, and 3000 BC, respectively. From our studies, it appears that bats and birds, the warm blooded flying vertebrates, are ideal hosts for the coronavirus gene source, bats for Alphacoronavirus and Betacoronavirus and birds for Gammacoronavirus and Deltacoronavirus, to fuel coronavirus evolution and dissemination.

  3. Bioinformatics Tools for the Discovery of New Nonribosomal Peptides

    DEFF Research Database (Denmark)

    Leclère, Valérie; Weber, Tilmann; Jacques, Philippe

    2016-01-01

    This chapter helps in the use of bioinformatics tools relevant to the discovery of new nonribosomal peptides (NRPs) produced by microorganisms. The strategy described can be applied to draft or fully assembled genome sequences. It relies on the identification of the synthetase genes and the deciphering of the domain architecture of the nonribosomal peptide synthetases (NRPSs). In the next step, candidate peptides synthesized by these NRPSs are predicted in silico, considering the specificity of incorporated monomers together with their isomery. To assess their novelty, the two-dimensional structure of the peptides can be compared with the structural patterns of all known NRPs. The presented workflow leads to an efficient and rapid screening of genomic data generated by high throughput technologies. The exploration of such sequenced genomes may lead to the discovery of new drugs (i...

  4. A monograph proposing the use of canine mammary tumours as a model for the study of hereditary breast cancer susceptibility genes in humans.

    Science.gov (United States)

    Goebel, Katie; Merner, Nancy D

    2017-05-01

    Canines are excellent models for cancer studies due to their similar physiology and genomic sequence to humans, companion status and limited intra-breed heterogeneity. Because they are susceptible to mammary cancers, canines can serve as powerful genetic models of hereditary breast cancers. Variants within known human breast cancer susceptibility genes explain only a fraction of familial cases. Thus, further discovery is necessary, but such efforts have been thwarted by genetic heterogeneity. Reducing heterogeneity is key, and studying isolated human populations has helped in this endeavour. An alternative is to study dog pedigrees, since artificial selection has resulted in extreme homogeneity. Identifying the genetic predisposition to canine mammary tumours can translate to human discoveries, a strategy that is currently underutilized. To explore this potential, we reviewed published canine mammary tumour genetic studies and proposed benefits of next-generation sequencing of canine cohorts to facilitate moving beyond incremental advances.

  5. Down-Regulation of Gene Expression by RNA-Induced Gene Silencing

    Science.gov (United States)

    Travella, Silvia; Keller, Beat

    Down-regulation of endogenous genes via post-transcriptional gene silencing (PTGS) is a key to the characterization of gene function in plants. Many RNA-based silencing mechanisms such as post-transcriptional gene silencing, co-suppression, quelling, and RNA interference (RNAi) have been discovered among species of different kingdoms (plants, fungi, and animals). One of the most interesting discoveries was RNAi, a sequence-specific gene-silencing mechanism initiated by the introduction of double-stranded RNA (dsRNA), homologous in sequence to the silenced gene, which triggers degradation of mRNA. Infection of plants with modified viruses can also induce RNA silencing and is referred to as virus-induced gene silencing (VIGS). In contrast to insertional mutagenesis, these emerging new reverse genetic approaches represent a powerful tool for exploring gene function and for manipulating gene expression experimentally in cereal species such as barley and wheat. We examined how RNAi and VIGS have been used to assess gene function in barley and wheat, including molecular mechanisms involved in the process and available methodological elements, such as vectors, inoculation procedures, and analysis of silenced phenotypes.

  6. Gene discovery and transcript analyses in the corn smut pathogen Ustilago maydis: expressed sequence tag and genome sequence comparison

    Directory of Open Access Journals (Sweden)

    Saville Barry J

    2007-09-01

    Full Text Available Abstract Background Ustilago maydis is the basidiomycete fungus responsible for common smut of corn and is a model organism for the study of fungal phytopathogenesis. To aid in the annotation of the genome sequence of this organism, several expressed sequence tag (EST) libraries were generated from a variety of U. maydis cell types. In addition to utility in the context of gene identification and structure annotation, the ESTs were analyzed to identify differentially abundant transcripts and to detect evidence of alternative splicing and anti-sense transcription. Results Four cDNA libraries were constructed using RNA isolated from U. maydis diploid teliospores (U. maydis strains 518 × 521) and haploid cells of strain 521 grown under nutrient rich, carbon starved, and nitrogen starved conditions. Using the genome sequence as a scaffold, the 15,901 ESTs were assembled into 6,101 contiguous expressed sequences (contigs); among these, 5,482 corresponded to predicted genes in the MUMDB (MIPS Ustilago maydis database), while 619 aligned to regions of the genome not yet designated as genes in MUMDB. A comparison of EST abundance identified numerous genes that may be regulated in a cell type or starvation-specific manner. The transcriptional response to nitrogen starvation was assessed using RT-qPCR. The results of this suggest that there may be cross-talk between the nitrogen and carbon signalling pathways in U. maydis. Bioinformatic analysis identified numerous examples of alternative splicing and anti-sense transcription. While intron retention was the predominant form of alternative splicing in U. maydis, other varieties were also evident (e.g. exon skipping). Selected instances of both alternative splicing and anti-sense transcription were independently confirmed using RT-PCR. Conclusion Through this work: 1) substantial sequence information has been provided for U. maydis genome annotation; 2) new genes were identified through the discovery of 619

  7. Personal discovery in diabetes self-management: Discovering cause and effect using self-monitoring data.

    Science.gov (United States)

    Mamykina, Lena; Heitkemper, Elizabeth M; Smaldone, Arlene M; Kukafka, Rita; Cole-Lewis, Heather J; Davidson, Patricia G; Mynatt, Elizabeth D; Cassells, Andrea; Tobin, Jonathan N; Hripcsak, George

    2017-12-01

    To outline new design directions for informatics solutions that facilitate personal discovery with self-monitoring data. We investigate this question in the context of chronic disease self-management with the focus on type 2 diabetes. We conducted an observational qualitative study of discovery with personal data among adults attending a diabetes self-management education (DSME) program that utilized a discovery-based curriculum. The study included observations of class sessions, and interviews and focus groups with the educator and attendees of the program (n = 14). The main discovery in diabetes self-management revolved around discovering patterns of association between characteristics of individuals' activities and changes in their blood glucose levels that the participants referred to as "cause and effect". This discovery empowered individuals to actively engage in self-management and provided a desired flexibility in selection of personalized self-management strategies. We show that discovery of cause and effect involves four essential phases: (1) feature selection, (2) hypothesis generation, (3) feature evaluation, and (4) goal specification. Further, we identify opportunities to support discovery at each stage with informatics and data visualization solutions by providing assistance with: (1) active manipulation of collected data (e.g., grouping, filtering and side-by-side inspection), (2) hypotheses formulation (e.g., using natural language statements or constructing visual queries), (3) inference evaluation (e.g., through aggregation and visual comparison, and statistical analysis of associations), and (4) translation of discoveries into actionable goals (e.g., tailored selection from computable knowledge sources of effective diabetes self-management behaviors). The study suggests that discovery of cause and effect in diabetes can be a powerful approach to helping individuals to improve their self-management strategies, and that self-monitoring data can

  8. 14 CFR 406.143 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 14 Aeronautics and Space 4 2010-01-01 2010-01-01 false Discovery. 406.143 Section 406.143... Transportation Adjudications § 406.143 Discovery. (a) Initiation of discovery. Any party may initiate discovery... after a complaint has been filed. (b) Methods of discovery. The following methods of discovery are...

  9. The fragile x mental retardation syndrome 20 years after the FMR1 gene discovery: an expanding universe of knowledge.

    Science.gov (United States)

    Rousseau, François; Labelle, Yves; Bussières, Johanne; Lindsay, Carmen

    2011-08-01

    The fragile X mental retardation (FXMR) syndrome is one of the most frequent causes of mental retardation. Affected individuals display a wide range of additional characteristic features including behavioural and physical phenotypes, and the extent to which individuals are affected is highly variable. For these reasons, elucidation of the pathophysiology of this disease has been an important challenge to the scientific community. 1991 marks the year of the discovery of both the FMR1 gene mutations involved in this disease, and of their dynamic nature. Although a mouse model for the disease has been available for 16 years and extensive research has been performed on the FMR1 protein (FMRP), we still understand little about how the disease develops, and no treatment has yet been shown to be effective. In this review, we summarise current knowledge on FXMR with an emphasis on the technical challenges of molecular diagnostics, on its prevalence and dynamics among populations, and on the potential of screening for FMR1 mutations.

  10. The Fragile X Mental Retardation Syndrome 20 Years After the FMR1 Gene Discovery: an Expanding Universe of Knowledge

    Science.gov (United States)

    Rousseau, François; Labelle, Yves; Bussières, Johanne; Lindsay, Carmen

    2011-01-01

    The fragile X mental retardation (FXMR) syndrome is one of the most frequent causes of mental retardation. Affected individuals display a wide range of additional characteristic features including behavioural and physical phenotypes, and the extent to which individuals are affected is highly variable. For these reasons, elucidation of the pathophysiology of this disease has been an important challenge to the scientific community. 1991 marks the year of the discovery of both the FMR1 gene mutations involved in this disease, and of their dynamic nature. Although a mouse model for the disease has been available for 16 years and extensive research has been performed on the FMR1 protein (FMRP), we still understand little about how the disease develops, and no treatment has yet been shown to be effective. In this review, we summarise current knowledge on FXMR with an emphasis on the technical challenges of molecular diagnostics, on its prevalence and dynamics among populations, and on the potential of screening for FMR1 mutations. PMID:21912443

  11. Integrated physical map of bread wheat chromosome arm 7DS to facilitate gene cloning and comparative studies.

    Science.gov (United States)

    Tulpová, Zuzana; Luo, Ming-Cheng; Toegelová, Helena; Visendi, Paul; Hayashi, Satomi; Vojta, Petr; Paux, Etienne; Kilian, Andrzej; Abrouk, Michaël; Bartoš, Jan; Hajdúch, Marián; Batley, Jacqueline; Edwards, David; Doležel, Jaroslav; Šimková, Hana

    2018-03-08

    Bread wheat (Triticum aestivum L.) is a staple food for a significant part of the world's population. The growing demand on its production can be satisfied by improving yield and resistance to biotic and abiotic stress. Knowledge of the genome sequence would aid in discovering genes and QTLs underlying these traits and provide a basis for genomics-assisted breeding. Physical maps and BAC clones associated with them have been valuable resources from which to generate a reference genome of bread wheat and to assist map-based gene cloning. As a part of a joint effort coordinated by the International Wheat Genome Sequencing Consortium, we have constructed a BAC-based physical map of bread wheat chromosome arm 7DS consisting of 895 contigs and covering 94% of its estimated length. By anchoring BAC contigs to one radiation hybrid map and three high resolution genetic maps, we assigned 73% of the assembly to a distinct genomic position. This map integration, interconnecting a total of 1713 markers with ordered and sequenced BAC clones from a minimal tiling path, provides a tool to speed up gene cloning in wheat. The process of physical map assembly included the integration of the 7DS physical map with a whole-genome physical map of Aegilops tauschii and a 7DS Bionano genome map, which together enabled efficient scaffolding of physical-map contigs, even in the non-recombining region of the genetic centromere. Moreover, this approach facilitated a comparison of bread wheat and its ancestor at BAC-contig level and revealed a reconstructed region in the 7DS pericentromere. Copyright © 2018. Published by Elsevier B.V.

  12. Higgs Discovery

    DEFF Research Database (Denmark)

    Sannino, Francesco

    2013-01-01

    has been challenged by the discovery of a not-so-heavy Higgs-like state. I will therefore review the recent discovery \\cite{Foadi:2012bb} that the standard model top-induced radiative corrections naturally reduce the intrinsic non-perturbative mass of the composite Higgs state towards the desired...... via first principle lattice simulations with encouraging results. The new findings show that the recent naive claims made about new strong dynamics at the electroweak scale being disfavoured by the discovery of a not-so-heavy composite Higgs are unwarranted. I will then introduce the more speculative......I discuss the impact of the discovery of a Higgs-like state on composite dynamics starting by critically examining the reasons in favour of either an elementary or composite nature of this state. Accepting the standard model interpretation I re-address the standard model vacuum stability within...

  13. Bioluminescent bacteria: lux genes as environmental biosensors

    OpenAIRE

    Nunes-Halldorson,Vânia da Silva; Duran,Norma Letícia

    2003-01-01

    Bioluminescent bacteria are widespread in natural environments. Over the years, many researchers have been studying the physiology, biochemistry and genetic control of bacterial bioluminescence. These discoveries have revolutionized the area of Environmental Microbiology through the use of luminescent genes as biosensors for environmental studies. This paper will review the chronology of scientific discoveries on bacterial bioluminescence and the current applications of bioluminescence in env...

  14. Traditional Chinese Medicine-Based Network Pharmacology Could Lead to New Multicompound Drug Discovery

    Directory of Open Access Journals (Sweden)

    Jian Li

    2012-01-01

    Full Text Available Current strategies for drug discovery have reached a bottleneck where the paradigm is generally “one gene, one drug, one disease.” However, using holistic and systemic views, network pharmacology may be the next paradigm in drug discovery. Based on network pharmacology, a combinational drug with two or more compounds could offer beneficial synergistic effects for complex diseases. Interestingly, traditional Chinese medicine (TCM) has been practicing holistic views for over 3,000 years, and its distinguishing feature is using herbal formulas to treat diseases based on the unique pattern classification. Though TCM herbal formulas are acknowledged as a great source for drug discovery, no drug discovery strategies compatible with the multidimensional complexities of TCM herbal formulas have been developed. In this paper, we highlighted some novel paradigms in TCM-based network pharmacology and new drug discovery. A multiple compound drug can be discovered by merging herbal formula-based pharmacological networks with TCM pattern-based disease molecular networks. Herbal formulas would be a source for multiple compound drug candidates, and the TCM pattern in the disease would be an indication for a new drug.

  15. Discovery and replication of gene influences on brain structure using LASSO regression

    Directory of Open Access Journals (Sweden)

    Omid Kohannim

    2012-08-01

    Full Text Available We implemented LASSO (least absolute shrinkage and selection operator) regression to evaluate gene effects in genome-wide association studies (GWAS) of brain images, using an MRI-derived temporal lobe volume measure from 729 subjects scanned as part of the Alzheimer’s Disease Neuroimaging Initiative (ADNI). Sparse groups of SNPs in individual genes were selected by LASSO, which identifies efficient sets of variants influencing the data. These SNPs were considered jointly when assessing their association with neuroimaging measures. We discovered 22 genes that passed genome-wide significance for influencing temporal lobe volume. This was a substantially greater number of significant genes compared to those found with standard, univariate GWAS. These top genes are all expressed in the brain and include genes previously related to brain function or neuropsychiatric disorders such as MACROD2, SORCS2, GRIN2B, MAGI2, NPAS3, CLSTN2, GABRG3, NRXN3, PRKAG2, GAS7, RBFOX1, ADARB2, CHD4 and CDH13. The top genes we identified with this method also displayed significant and widespread post-hoc effects on voxelwise, tensor-based morphometry (TBM) maps of the temporal lobes. The most significantly associated gene was an autism susceptibility gene known as MACROD2. We were able to successfully replicate the effect of the MACROD2 gene in an independent cohort of 564 young, Australian healthy adult twins and siblings scanned with MRI (mean age: 23.8±2.2 SD years). In exploratory analyses, three selected SNPs in the MACROD2 gene were also significantly associated with performance intelligence quotient (PIQ). Our approach powerfully complements univariate techniques in detecting influences of genes on the living brain.
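
    The variable-selection behaviour that the abstract above attributes to LASSO can be illustrated with a minimal sketch. This is not the authors' pipeline: it is a generic cyclic coordinate-descent solver for the standard LASSO objective, written with the Python standard library only. The function names, the toy ±1 genotype coding, and the penalty value are assumptions chosen for illustration.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; values within the penalty become exactly 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_iter=200):
    """Minimise (1/2n)*||y - X b||^2 + penalty*||b||_1 by cyclic coordinate descent.

    X is a list of n rows with p features each; columns and y are assumed
    to be centred already (hypothetical preprocessing, no intercept is fitted).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant (all-zero after centring) column carries no signal
            # Covariance of column j with the residual that excludes column j's own fit.
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_beta = soft_threshold(rho, penalty) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                beta[j] = new_beta
    return beta


# Toy data (made up): three centred, mutually orthogonal "variant" columns.
X_toy = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
# The phenotype depends strongly on variant 0 and only weakly on variant 1.
y_toy = [2.1, 1.9, -1.9, -2.1]

if __name__ == "__main__":
    print(lasso_coordinate_descent(X_toy, y_toy, penalty=0.5))
```

    With this toy input only the first coefficient survives the penalty (shrunk from 2.0 to about 1.5), while the weak and absent effects are set to zero. That is the sense in which LASSO returns a sparse set of variants that can then be assessed jointly.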

  16. An integrative data analysis platform for gene set analysis and knowledge discovery in a data warehouse framework.

    Science.gov (United States)

    Chen, Yi-An; Tripathi, Lokesh P; Mizuguchi, Kenji

    2016-01-01

    Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org. © The Author(s) 2016. Published by Oxford University Press.

  17. Identification of novel type 1 diabetes candidate genes by integrating genome-wide association data, protein-protein interactions, and human pancreatic islet gene expression

    DEFF Research Database (Denmark)

    Bergholdt, Regine; Brorsson, Caroline; Palleja, Albert

    2012-01-01

    Genome-wide association studies (GWAS) have heralded a new era in susceptibility locus discovery in complex diseases. For type 1 diabetes, >40 susceptibility loci have been discovered. However, GWAS do not inevitably lead to identification of the gene or genes in a given locus associated with dis......-cells. Our results provide novel insight to the mechanisms behind type 1 diabetes pathogenesis and, thus, may provide the basis for the design of novel treatment strategies.......Genome-wide association studies (GWAS) have heralded a new era in susceptibility locus discovery in complex diseases. For type 1 diabetes, >40 susceptibility loci have been discovered. However, GWAS do not inevitably lead to identification of the gene or genes in a given locus associated...... with disease, and they do not typically inform the broader context in which the disease genes operate. Here, we integrated type 1 diabetes GWAS data with protein-protein interactions to construct biological networks of relevance for disease. A total of 17 networks were identified. To prioritize...

  18. Glycosyltransferase Gene Expression Profiles Classify Cancer Types and Propose Prognostic Subtypes

    Science.gov (United States)

    Ashkani, Jahanshah; Naidoo, Kevin J.

    2016-05-01

    Aberrant glycosylation in tumours stems from altered glycosyltransferase (GT) gene expression, but can the expression profiles of these signature genes be used to classify cancer types and lead to cancer subtype discovery? The differential structural changes to cellular glycan structures are predominantly regulated by the expression patterns of GT genes and are a hallmark of neoplastic cell metamorphoses. We found that the expression of 210 GT genes taken from 1893 cancer patient samples in The Cancer Genome Atlas (TCGA) microarray data is able to classify six cancers: breast, ovarian, glioblastoma, kidney, colon and lung. The GT gene expression profiles are used to develop cancer classifiers and propose subtypes. The subclassification of breast cancer solid tumour samples illustrates the discovery of subgroups from GT genes that match well against basal-like and HER2-enriched subtypes and correlate with clinical, mutation and survival data. This cancer type glycosyltransferase gene signature finding provides foundational evidence for the centrality of glycosylation in cancer.

  19. A hybrid computational method for the discovery of novel reproduction-related genes.

    Science.gov (United States)

    Chen, Lei; Chu, Chen; Kong, Xiangyin; Huang, Guohua; Huang, Tao; Cai, Yu-Dong

    2015-01-01

    Uncovering the molecular mechanisms underlying reproduction is of great importance to infertility treatment and to the generation of healthy offspring. In this study, we discovered novel reproduction-related genes with a hybrid computational method, integrating three different types of method, which offered new clues for further reproduction research. This method was first executed on a weighted graph, constructed based on known protein-protein interactions, to search the shortest paths connecting any two known reproduction-related genes. Genes occurring in these paths were deemed to have a special relationship with reproduction. These newly discovered genes were filtered with a randomization test. Then, the remaining genes were further selected according to their associations with known reproduction-related genes measured by protein-protein interaction score and alignment score obtained by BLAST. The in-depth analysis of the high confidence novel reproduction genes revealed hidden mechanisms of reproduction and provided guidelines for further experimental validations.
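
    The first step described above, searching a weighted protein-protein interaction graph for shortest paths between known genes and collecting the genes that lie on them, can be sketched as follows. This is an illustrative sketch, not the authors' code. The gene labels, the edge costs, and the convention that a lower cost means a more confident interaction are all assumptions, and the later filtering steps (randomization test, interaction and BLAST scores) are not shown.

```python
import heapq
from itertools import combinations


def undirected(edges):
    """Build an adjacency dict from (gene_a, gene_b, cost) triples."""
    graph = {}
    for a, b, cost in edges:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns the node list of one cheapest path, or None."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry for a node that is already settled
        done.add(node)
        if node == target:
            break
        for nbr, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def candidate_genes(graph, known_genes):
    """Count how often each non-seed gene sits inside a shortest path between two seeds."""
    counts = {}
    for a, b in combinations(sorted(known_genes), 2):
        path = shortest_path(graph, a, b)
        if path is None:
            continue
        for gene in path[1:-1]:  # interior nodes only
            if gene not in known_genes:
                counts[gene] = counts.get(gene, 0) + 1
    return counts


# Hypothetical network: A, B, C stand for known genes; X, Y, Z are unlabelled genes.
toy_graph = undirected([
    ("A", "X", 1), ("X", "B", 1),   # cheap route between A and B
    ("A", "Y", 5), ("Y", "B", 5),   # expensive detour that is never chosen
    ("B", "Z", 1), ("Z", "C", 1),
])

if __name__ == "__main__":
    print(candidate_genes(toy_graph, {"A", "B", "C"}))
```

    In this toy network the genes labelled X and Z each fall on two shortest paths between seed genes, while Y never does. A count of this kind is one plausible way to rank candidates before the randomization and similarity filters the abstract mentions.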

  20. The discovery of the periodic table as a case of simultaneous discovery.

    Science.gov (United States)

    Scerri, Eric

    2015-03-13

    The article examines the question of priority and simultaneous discovery in the context of the discovery of the periodic system. It is argued that rather than being anomalous, simultaneous discovery is the rule. Moreover, I argue that the discovery of the periodic system by at least six authors over a period of 7 years represents one of the best examples of a multiple discovery. This notion is supported by a new view of the evolutionary development of science through a mechanism that is dubbed Sci-Gaia by analogy with Lovelock's Gaia hypothesis. © 2015 The Author(s) Published by the Royal Society. All rights reserved.

  1. Beyond Discovery

    DEFF Research Database (Denmark)

    Korsgaard, Steffen; Sassmannshausen, Sean Patrick

    2017-01-01

    In this chapter we explore four alternatives to the dominant discovery view of entrepreneurship; the development view, the construction view, the evolutionary view, and the Neo-Austrian view. We outline the main critique points of the discovery presented in these four alternatives, as well...

  2. Gene expression profiling in cervical cancer: identification of novel markers for disease diagnosis and therapy.

    LENUS (Irish Health Repository)

    Martin, Cara M

    2012-02-01

    Cervical cancer, a potentially preventable disease, remains the second most common malignancy in women worldwide. Human papillomavirus is the single most important etiological agent in cervical cancer. HPV contributes to neoplastic progression through the action of two viral oncoproteins E6 and E7, which interfere with critical cell cycle pathways, p53, and retinoblastoma. However, evidence suggests that HPV infection alone is insufficient to induce malignant changes and other host genetic variations are important in the development of cervical cancer. Advances in molecular biology and high throughput gene expression profiling technologies have heralded a new era in biomarker discovery and identification of molecular targets related to carcinogenesis. These advancements have improved our understanding of carcinogenesis and will facilitate screening, early detection, management, and personalised targeted therapy. In this chapter, we have described the use of high density microarrays to assess gene expression profiles in cervical cancer. Using this approach we have identified a number of novel genes which are differentially expressed in cervical cancer, including several genes involved in cell cycle regulation. These include p16ink4a, MCM 3 and 5, CDC6, Geminin, Cyclins A-D, TOPO2A, CDCA1, and BIRC5. We have validated expression of mRNA using real-time PCR and protein by immunohistochemistry.

  3. Key drivers of biomedical innovation in cancer drug discovery

    OpenAIRE

    Huber, Margit A; Kraut, Norbert

    2014-01-01

    Discovery and translational research has led to the identification of a series of "cancer drivers": genes that, when mutated or otherwise misregulated, can drive malignancy. An increasing number of drugs that directly target such drivers have demonstrated activity in clinical trials and are shaping a new landscape for molecularly targeted cancer therapies. Such therapies rely on molecular and genetic diagnostic tests to detect the presence of a biomarker that predicts response. Here, we highli...

  4. Discovery of genomic intervals that underlie nematode responses to benzimidazoles.

    Science.gov (United States)

    Zamanian, Mostafa; Cook, Daniel E; Zdraljevic, Stefan; Brady, Shannon C; Lee, Daehan; Lee, Junho; Andersen, Erik C

    2018-03-01

    Parasitic nematodes impose a debilitating health and economic burden across much of the world. Nematode resistance to anthelmintic drugs threatens parasite control efforts in both human and veterinary medicine. Despite this threat, the genetic landscape of potential resistance mechanisms to these critical drugs remains largely unexplored. Here, we exploit natural variation in the model nematodes Caenorhabditis elegans and Caenorhabditis briggsae to discover quantitative trait loci (QTL) that control sensitivity to benzimidazoles widely used in human and animal medicine. High-throughput phenotyping of albendazole, fenbendazole, mebendazole, and thiabendazole responses in panels of recombinant lines led to the discovery of over 15 QTL in C. elegans and four QTL in C. briggsae associated with divergent responses to these anthelmintics. Many of these QTL are conserved across benzimidazole derivatives, but others show drug and dose specificity. We used near-isogenic lines to recapitulate and narrow the C. elegans albendazole QTL of largest effect and identified candidate variants correlated with the resistance phenotype. These QTL do not overlap with known benzimidazole target resistance genes from parasitic nematodes and present specific new leads for the discovery of novel mechanisms of nematode benzimidazole resistance. Analyses of orthologous genes reveal conservation of candidate benzimidazole resistance genes in medically important parasitic nematodes. These data provide a basis for extending these approaches to other anthelmintic drug classes and a pathway towards validating new markers for anthelmintic resistance that can be deployed to improve parasite disease control.

  5. "Eureka, Eureka!" Discoveries in Science

    Science.gov (United States)

    Agarwal, Pankaj

    2011-01-01

    Accidental discoveries have been of significant value in the progress of science. Although accidental discoveries are more common in pharmacology and chemistry, other branches of science have also benefited from such discoveries. While most discoveries are the result of persistent research, famous accidental discoveries provide a fascinating…

  6. Anthropogenic Habitats Facilitate Dispersal of an Early Successional Obligate: Implications for Restoration of an Endangered Ecosystem.

    Directory of Open Access Journals (Sweden)

    Katrina E Amaral

    Full Text Available Landscape modification and habitat fragmentation disrupt the connectivity of natural landscapes, with major consequences for biodiversity. Species that require patchily distributed habitats, such as those that specialize on early successional ecosystems, must disperse through a landscape matrix with unsuitable habitat types. We evaluated landscape effects on dispersal of an early successional obligate, the New England cottontail (Sylvilagus transitionalis. Using a landscape genetics approach, we identified barriers and facilitators of gene flow and connectivity corridors for a population of cottontails in the northeastern United States. We modeled dispersal in relation to landscape structure and composition and tested hypotheses about the influence of habitat fragmentation on gene flow. Anthropogenic and natural shrubland habitats facilitated gene flow, while the remainder of the matrix, particularly development and forest, impeded gene flow. The relative influence of matrix habitats differed between study areas in relation to a fragmentation gradient. Barrier features had higher explanatory power in the more fragmented site, while facilitating features were important in the less fragmented site. Landscape models that included a simultaneous barrier and facilitating effect of roads had higher explanatory power than models that considered either effect separately, supporting the hypothesis that roads act as both barriers and facilitators at all spatial scales. The inclusion of LiDAR-identified shrubland habitat improved the fit of our facilitator models. Corridor analyses using circuit and least cost path approaches revealed the importance of anthropogenic, linear features for restoring connectivity between the study areas. In fragmented landscapes, human-modified habitats may enhance functional connectivity by providing suitable dispersal conduits for early successional specialists.

  7. Serious limitations of the QTL/Microarray approach for QTL gene discovery

    Directory of Open Access Journals (Sweden)

    Warden Craig H

    2010-07-01

    Full Text Available Abstract Background It has been proposed that the use of gene expression microarrays in nonrecombinant parental or congenic strains can accelerate the process of isolating individual genes underlying quantitative trait loci (QTL). However, the effectiveness of this approach has not been assessed. Results Thirty-seven studies that have implemented the QTL/microarray approach in rodents were reviewed. About 30% of studies showed enrichment for QTL candidates, mostly in comparisons between congenic and background strains. Three studies led to the identification of an underlying QTL gene. To complement the literature results, a microarray experiment was performed using three mouse congenic strains isolating the effects of at least 25 biometric QTL. Results show that genes in the congenic donor regions were preferentially selected. However, within donor regions, the distribution of differentially expressed genes was homogeneous once gene density was accounted for. Genes within identical-by-descent (IBD) regions were less likely to be differentially expressed in chromosome 2, but not in chromosomes 11 and 17. Furthermore, expression of QTL regulated in cis (cis-eQTL) showed higher expression in the background genotype, which was partially explained by the presence of single nucleotide polymorphisms (SNP). Conclusions The literature shows limited successes from the QTL/microarray approach to identify QTL genes. Our own results from microarray profiling of three congenic strains revealed a strong tendency to select cis-eQTL over trans-eQTL. IBD regions had little effect on rate of differential expression, and we provide several reasons why IBD should not be used to discard eQTL candidates. In addition, mismatch probes produced false cis-eQTL that could not be completely removed with the current strains genotypes and low probe density microarrays. The reviewed studies did not account for lack of coverage from the platforms used and therefore removed genes

  8. 30 CFR 44.24 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Discovery. 44.24 Section 44.24 Mineral... Discovery. Parties shall be governed in their conduct of discovery by appropriate provisions of the Federal... discovery. Alternative periods of time for discovery may be prescribed by the presiding administrative law...

  9. 19 CFR 356.20 - Discovery.

    Science.gov (United States)

    2010-04-01

    ... 19 Customs Duties 3 2010-04-01 2010-04-01 false Discovery. 356.20 Section 356.20 Customs Duties... § 356.20 Discovery. (a) Voluntary discovery. All parties are encouraged to engage in voluntary discovery... sanctions proceeding. (b) Limitations on discovery. The administrative law judge shall place such limits...

  10. Chemical Discovery

    Science.gov (United States)

    Brown, Herbert C.

    1974-01-01

    The role of discovery in the advance of the science of chemistry and the factors that are currently operating to handicap that function are considered. Examples are drawn from the author's work with boranes. The thesis that exploratory research and discovery should be encouraged is stressed. (DT)

  11. From General Aberrant Alternative Splicing in Cancers and Its Therapeutic Application to the Discovery of an Oncogenic DMTF1 Isoform

    Directory of Open Access Journals (Sweden)

    Na Tian

    2017-03-01

    Full Text Available Alternative pre-mRNA splicing is a crucial process that allows the generation of diversified RNA and protein products from a multi-exon gene. In tumor cells, this mechanism can facilitate cancer development and progression through both creating oncogenic isoforms and reducing the expression of normal or controllable protein species. We recently demonstrated that an alternative cyclin D-binding myb-like transcription factor 1 (DMTF1) pre-mRNA splicing isoform, DMTF1β, is increasingly expressed in breast cancer and promotes mammary tumorigenesis in a transgenic mouse model. Aberrant pre-mRNA splicing is a typical event occurring for many cancer-related functional proteins. In this review, we introduce general aberrant pre-mRNA splicing in cancers and discuss its therapeutic application using our recent discovery of the oncogenic DMTF1 isoform as an example. We also summarize new insights in designing novel targeting strategies of cancer therapies based on the understanding of deregulated pre-mRNA splicing mechanisms.

  12. De novo transcriptome assembly facilitates characterisation of fast-evolving gene families, MHC class I in the bank vole (Myodes glareolus).

    Science.gov (United States)

    Migalska, M; Sebastian, A; Konczal, M; Kotlík, P; Radwan, J

    2017-04-01

    The major histocompatibility complex (MHC) plays a central role in the adaptive immune response and is the most polymorphic gene family in vertebrates. Although high-throughput sequencing has increasingly been used for genotyping families of co-amplifying MHC genes, its potential to facilitate early steps in the characterisation of MHC variation in nonmodel organisms has not been fully explored. In this study we evaluated the usefulness of de novo transcriptome assembly in characterisation of MHC sequence diversity. We found that although de novo transcriptome assembly of MHC I genes does not reconstruct sequences of individual alleles, it does allow the identification of conserved regions for PCR primer design. Using the newly designed primers, we characterised MHC I sequences in the bank vole. Phylogenetic analysis of the partial MHC I coding sequence (2-4 exons) of the bank vole revealed a lack of orthology to MHC I of other Cricetidae, consistent with the high gene turnover of this region. The diversity of expressed alleles was characterised using ultra-deep sequencing of the third exon that codes for the peptide-binding region of the MHC molecule. High allelic diversity was demonstrated, with 72 alleles found in 29 individuals. Interindividual variation in the number of expressed loci was found, with the number of alleles per individual ranging from 5 to 14. Strong signatures of positive selection were found for 8 amino acid sites, most of which are inferred to bind antigens in human MHC, indicating conservation of structure despite rapid sequence evolution.

  13. Integration of Proteomics, Bioinformatics, and Systems Biology in Traumatic Brain Injury Biomarker Discovery

    Science.gov (United States)

    Guingab-Cagmat, J.D.; Cagmat, E.B.; Hayes, R.L.; Anagli, J.

    2013-01-01

    Traumatic brain injury (TBI) is a major medical crisis without any FDA-approved pharmacological therapies that have been demonstrated to improve functional outcomes. It has been argued that discovery of disease-relevant biomarkers might help to guide successful clinical trials for TBI. Major advances in mass spectrometry (MS) have revolutionized the field of proteomic biomarker discovery and facilitated the identification of several candidate markers that are being further evaluated for their efficacy as TBI biomarkers. However, several hurdles have to be overcome even during the discovery phase, which is only the first step in the long process of biomarker development. The high-throughput nature of MS-based proteomic experiments generates a massive amount of mass spectral data, presenting great challenges in downstream interpretation. Currently, different bioinformatics platforms are available for functional analysis and data mining of MS-generated proteomic data. These tools provide a way to convert data sets to biologically interpretable results and functional outcomes. A strategy that has promise in advancing biomarker development involves the triad of proteomics, bioinformatics, and systems biology. In this review, a brief overview of how bioinformatics and systems biology tools analyze, transform, and interpret complex MS datasets into biologically relevant results is discussed. In addition, challenges and limitations of proteomics, bioinformatics, and systems biology in TBI biomarker discovery are presented. A brief survey of studies that utilized these three overlapping disciplines in TBI biomarker discovery is also presented. Finally, examples of TBI biomarkers and their applications are discussed. PMID:23750150

  14. Sub-inhibitory concentrations of heavy metals facilitate the horizontal transfer of plasmid-mediated antibiotic resistance genes in water environment.

    Science.gov (United States)

    Zhang, Ye; Gu, April Z; Cen, Tianyu; Li, Xiangyang; He, Miao; Li, Dan; Chen, Jianmin

    2018-06-01

    Although widespread antibiotic resistance has been mostly attributed to the selective pressure generated by overuse and misuse of antibiotics, recent growing evidence suggests that chemicals other than antibiotics, such as certain metals, can also select and stimulate antibiotic resistance via both co-resistance and cross-resistance mechanisms. For instance, the tetL, merE, and oprD genes confer resistance to both antibiotics and metals. However, the potential de novo resistance induced by heavy metals at environmentally relevant low concentrations (much below the minimum inhibitory concentrations [MICs], also referred to as sub-inhibitory) has hardly been explored. This study investigated and revealed that heavy metals, namely Cu(II), Ag(I), Cr(VI), and Zn(II), at environmentally relevant and sub-inhibitory concentrations, promoted conjugative transfer of antibiotic resistance genes (ARGs) between E. coli strains. The mechanisms of this phenomenon were further explored, which involved intracellular reactive oxygen species (ROS) formation, SOS response, increased cell membrane permeability, and altered expression of conjugation-relevant genes. These findings suggest that sub-inhibitory levels of heavy metals that are widely present in various environments contribute to the resistance phenomena via facilitating horizontal transfer of ARGs. This study provides evidence from multiple aspects implicating the ecological effect of low levels of heavy metals on antibiotic resistance dissemination and highlights the urgency of strengthening efficacious policy and technology to control metal pollutants in the environments. Copyright © 2018 Elsevier Ltd. All rights reserved.

  15. 24 CFR 180.500 - Discovery.

    Science.gov (United States)

    2010-04-01

    ... 24 Housing and Urban Development 1 2010-04-01 2010-04-01 false Discovery. 180.500 Section 180.500... OPPORTUNITY CONSOLIDATED HUD HEARING PROCEDURES FOR CIVIL RIGHTS MATTERS Discovery § 180.500 Discovery. (a) In general. This subpart governs discovery in aid of administrative proceedings under this part. Discovery in...

  16. 22 CFR 224.21 - Discovery.

    Science.gov (United States)

    2010-04-01

    ... 22 Foreign Relations 1 2010-04-01 2010-04-01 false Discovery. 224.21 Section 224.21 Foreign....21 Discovery. (a) The following types of discovery are authorized: (1) Requests for production of... parties, discovery is available only as ordered by the ALJ. The ALJ shall regulate the timing of discovery...

  17. Drug target ontology to classify and integrate drug discovery data.

    Science.gov (United States)

    Lin, Yu; Mehta, Saurabh; Küçük-McGinty, Hande; Turner, John Paul; Vidovic, Dusica; Forlin, Michele; Koleti, Amar; Nguyen, Dac-Trung; Jensen, Lars Juhl; Guha, Rajarshi; Mathias, Stephen L; Ursu, Oleg; Stathias, Vasileios; Duan, Jianbin; Nabizadeh, Nooshin; Chung, Caty; Mader, Christopher; Visser, Ubbo; Yang, Jeremy J; Bologa, Cristian G; Oprea, Tudor I; Schürer, Stephan C

    2017-11-09

    model for druggable targets including various related information such as protein, gene, protein domain, protein structure, binding site, small molecule drug, mechanism of action, protein tissue localization, disease association, and many other types of information. DTO will further facilitate the otherwise challenging integration and formal linking to biological assays, phenotypes, disease models, drug poly-pharmacology, binding kinetics and many other processes, functions and qualities that are at the core of drug discovery. The first version of DTO is publicly available via the website (http://drugtargetontology.org/), GitHub (http://github.com/DrugTargetOntology/DTO), and the NCBO BioPortal (http://bioportal.bioontology.org/ontologies/DTO). The long-term goal of DTO is to provide such an integrative framework and to populate the ontology with this information as a community resource.

  18. Fragment-based approaches to the discovery of kinase inhibitors.

    Science.gov (United States)

    Mortenson, Paul N; Berdini, Valerio; O'Reilly, Marc

    2014-01-01

    Protein kinases are one of the most important families of drug targets, and aberrant kinase activity has been linked to a large number of disease areas. Although eminently targetable using small molecules, kinases present a number of challenges as drug targets, not least obtaining selectivity across such a large and relatively closely related target family. Fragment-based drug discovery involves screening simple, low-molecular weight compounds to generate initial hits against a target. These hits are then optimized to more potent compounds via medicinal chemistry, usually facilitated by structural biology. Here, we will present a number of recent examples of fragment-based approaches to the discovery of kinase inhibitors, detailing the construction of fragment-screening libraries, the identification and validation of fragment hits, and their optimization into potent and selective lead compounds. The advantages of fragment-based methodologies will be discussed, along with some of the challenges associated with using this route. Finally, we will present a number of key lessons derived both from our own experience running fragment screens against kinases and from a large number of published studies.

  19. The Energy Industry Profile of ISO/DIS 19115-1: Facilitating Discovery and Evaluation of, and Access to Distributed Information Resources

    Science.gov (United States)

    Hills, S. J.; Richard, S. M.; Doniger, A.; Danko, D. M.; Derenthal, L.; Energistics Metadata Work Group

    2011-12-01

    established, capability-rich, open standard for geographic metadata, EIP v1 is expected to be widely acceptable within the community and readily sustainable over the long term. The EIP design, also per community requirements, will enable discovery, evaluation, and access to types of information resources considered important to the community, including structured and unstructured digital resources, and physical assets such as hardcopy documents and material samples. This presentation will briefly review the development of this initiative as well as the current and planned Work Group activities. More time will be spent providing an overview of the EIP v1, including the requirements it prescribes, design efforts made to enable automated metadata capture and processing, and the structure and content of its documentation, which was written to minimize ambiguity and facilitate implementation. The Work Group considers EIP v1 a solid initial design for interoperable metadata, and a first step toward the vision of the Initiative.

  20. 19 CFR 207.109 - Discovery.

    Science.gov (United States)

    2010-04-01

    ... 19 Customs Duties 3 2010-04-01 2010-04-01 false Discovery. 207.109 Section 207.109 Customs Duties... and Committee Proceedings § 207.109 Discovery. (a) Discovery methods. All parties may obtain discovery under such terms and limitations as the administrative law judge may order. Discovery may be by one or...

  1. 15 CFR 25.21 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 15 Commerce and Foreign Trade 1 2010-01-01 2010-01-01 false Discovery. 25.21 Section 25.21... Discovery. (a) The following types of discovery are authorized: (1) Requests for production of documents for..., discovery is available only as ordered by the ALJ. The ALJ shall regulate the timing of discovery. (d...

  2. Predicting Causal Relationships from Biological Data: Applying Automated Causal Discovery on Mass Cytometry Data of Human Immune Cells

    KAUST Repository

    Triantafillou, Sofia; Lagani, Vincenzo; Heinze-Deml, Christina; Schmidt, Angelika; Tegner, Jesper; Tsamardinos, Ioannis

    2017-01-01

    Learning the causal relationships that define a molecular system allows us to predict how the system will respond to different interventions. Distinguishing causality from mere association typically requires randomized experiments. Methods for automated causal discovery from limited experiments exist, but have so far rarely been tested in systems biology applications. In this work, we apply state-of-the-art causal discovery methods on a large collection of public mass cytometry data sets, measuring intra-cellular signaling proteins of the human immune system and their response to several perturbations. We show how different experimental conditions can be used to facilitate causal discovery, and apply two fundamental methods that produce context-specific causal predictions. Causal predictions were reproducible across independent data sets from two different studies, but often disagree with the KEGG pathway databases. Within this context, we discuss the caveats we need to overcome for automated causal discovery to become a part of the routine data analysis in systems biology.

  3. Predicting Causal Relationships from Biological Data: Applying Automated Causal Discovery on Mass Cytometry Data of Human Immune Cells

    KAUST Repository

    Triantafillou, Sofia

    2017-03-31

    Learning the causal relationships that define a molecular system allows us to predict how the system will respond to different interventions. Distinguishing causality from mere association typically requires randomized experiments. Methods for automated causal discovery from limited experiments exist, but have so far rarely been tested in systems biology applications. In this work, we apply state-of-the-art causal discovery methods on a large collection of public mass cytometry data sets, measuring intra-cellular signaling proteins of the human immune system and their response to several perturbations. We show how different experimental conditions can be used to facilitate causal discovery, and apply two fundamental methods that produce context-specific causal predictions. Causal predictions were reproducible across independent data sets from two different studies, but often disagree with the KEGG pathway databases. Within this context, we discuss the caveats we need to overcome for automated causal discovery to become a part of the routine data analysis in systems biology.

  4. Predicting Causal Relationships from Biological Data: Applying Automated Causal Discovery on Mass Cytometry Data of Human Immune Cells

    KAUST Repository

    Triantafillou, Sofia; Lagani, Vincenzo; Heinze-Deml, Christina; Schmidt, Angelika; Tegner, Jesper; Tsamardinos, Ioannis

    2017-01-01

    Learning the causal relationships that define a molecular system allows us to predict how the system will respond to different interventions. Distinguishing causality from mere association typically requires randomized experiments. Methods for automated causal discovery from limited experiments exist, but have so far rarely been tested in systems biology applications. In this work, we apply state-of-the-art causal discovery methods on a large collection of public mass cytometry data sets, measuring intra-cellular signaling proteins of the human immune system and their response to several perturbations. We show how different experimental conditions can be used to facilitate causal discovery, and apply two fundamental methods that produce context-specific causal predictions. Causal predictions were reproducible across independent data sets from two different studies, but often disagree with the KEGG pathway databases. Within this context, we discuss the caveats we need to overcome for automated causal discovery to become a part of the routine data analysis in systems biology.

  5. Predicting Causal Relationships from Biological Data: Applying Automated Causal Discovery on Mass Cytometry Data of Human Immune Cells

    KAUST Repository

    Triantafillou, Sofia

    2017-09-29

    Learning the causal relationships that define a molecular system allows us to predict how the system will respond to different interventions. Distinguishing causality from mere association typically requires randomized experiments. Methods for automated causal discovery from limited experiments exist, but have so far rarely been tested in systems biology applications. In this work, we apply state-of-the-art causal discovery methods on a large collection of public mass cytometry data sets, measuring intra-cellular signaling proteins of the human immune system and their response to several perturbations. We show how different experimental conditions can be used to facilitate causal discovery, and apply two fundamental methods that produce context-specific causal predictions. Causal predictions were reproducible across independent data sets from two different studies, but often disagree with the KEGG pathway databases. Within this context, we discuss the caveats we need to overcome for automated causal discovery to become a part of the routine data analysis in systems biology.

  6. CUAHSI-HIS: an Internet based system to facilitate public discovery, access, and exploration of different water science data sources

    Science.gov (United States)

    Arrigo, J. S.; Hooper, R. P.; Choi, Y.; Ames, D. P.; Kadlec, J.; Whiteaker, T.

    2011-12-01

    "Water is everywhere." This sentiment underscores the importance of instilling hydrologic and earth science literacy in educators, students, and the general public, but also presents challenges for water scientists and educators. Scientific data about water is collected and distributed by several different sources, from federal agencies to scientific investigators to citizen scientists. As competition for limited water resources increase, increasing access to and understanding of the wealth of information about the nation's and the world's water will be critical. The CUAHSI-HIS system is a web based system for sharing hydrologic data that can help address this need. HydroDesktop is a free, open source application for finding, getting, analyzing and using hydrologic data from the CUAHSI-HIS system. It works with HydroCatalog which indexes the data to find out what data exists and where it is, and then it retrieves the data from HydroServers where it is stored communicating using WaterOneFlow web services. Currently, there are over 65 services registered in HydroCatalog providing central discovery of water data from several federal and state agencies, university projects, and other sources. HydroDesktop provides a simplified GIS that allows users to incorporate spatial data, and simple analysis tools to facilitate graphing and visualization. HydroDesktop is designed to be useful for a number of different groups of users with a wide variety of needs and skill levels including university faculty, graduate and undergraduate students, K-12 students, engineering and scientific consultants, and others. This presentation will highlight some of the features of HydroDesktop and the CUAHSI-HIS system that make it particularly appropriate for use in educational and public outreach settings, and will present examples of educational use. 
    The incorporation of "real data," localization to an area of interest, and problem-based learning are all recognized as effective strategies for

  7. [Discovery of the target genes inhibited by formic acid in Candida shehatae].

    Science.gov (United States)

    Cai, Peng; Xiong, Xujie; Xu, Yong; Yong, Qiang; Zhu, Junjun; Shiyuan, Yu

    2014-01-04

    At the transcriptional level, the inhibitory effects of formic acid were investigated in Candida shehatae, a model yeast strain capable of fermenting xylose to ethanol, in order to identify the target genes regulated by formic acid and their transcript profiles. On the basis of the transcriptome data of C. shehatae metabolizing glucose and xylose, the genes responsible for ethanol fermentation were chosen as candidates by the combined method of yeast metabolic pathway analysis and manual gene BLAST search. These candidates were then quantified by RQ-PCR to find the genes regulated under gradient doses of formic acid. By quantitative analysis of 42 candidate genes, we finally identified 10 and 5 genes as markedly down-regulated and up-regulated targets of formic acid, respectively. With regard to gene transcripts regulated by formic acid in C. shehatae, the markedly down-regulated genes, in declining rank, were: xylitol dehydrogenase (XYL2), acetyl-CoA synthetase (ACS), ribose-5-phosphate isomerase (RKI), transaldolase (TAL), phosphogluconate dehydrogenase (GND1), transketolase (TKL), glucose-6-phosphate dehydrogenase (ZWF1), xylose reductase (XYL1), pyruvate dehydrogenase (PDH) and pyruvate decarboxylase (PDC); the up-regulated genes, in declining rank, were: fructose-bisphosphate aldolase (ALD), glucokinase (GLK), malate dehydrogenase (MDH), 6-phosphofructokinase (PFK) and alcohol dehydrogenase (ADH).

  8. 39 CFR 963.14 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 39 Postal Service 1 2010-07-01 2010-07-01 false Discovery. 963.14 Section 963.14 Postal Service... PANDERING ADVERTISEMENTS STATUTE, 39 U.S.C. 3008 § 963.14 Discovery. Discovery is to be conducted on a... such discovery as he or she deems reasonable and necessary. Discovery may include one or more of the...

  9. In silico discovery of transcription regulatory elements in Plasmodium falciparum

    Directory of Open Access Journals (Sweden)

    Le Roch Karine G

    2008-02-01

    Full Text Available Abstract Background With the sequence of the Plasmodium falciparum genome and several global mRNA and protein life cycle expression profiling projects now completed, elucidating the underlying networks of transcriptional control important for the progression of the parasite life cycle is highly pertinent to the development of new anti-malarials. To date, relatively little is known regarding the specific mechanisms the parasite employs to regulate gene expression at the mRNA level, with studies of the P. falciparum genome sequence having revealed few cis-regulatory elements and associated transcription factors. Although it is possible the parasite may evoke mechanisms of transcriptional control drastically different from those used by other eukaryotic organisms, the extreme AT-rich nature of P. falciparum intergenic regions (~90% AT) presents significant challenges to in silico cis-regulatory element discovery. Results We have developed an algorithm called Gene Enrichment Motif Searching (GEMS) that uses a hypergeometric-based scoring function and a position-weight matrix optimization routine to identify regulatory elements with high confidence in the nucleotide-biased and repeat sequence-rich P. falciparum genome. When applied to promoter regions of genes contained within 21 co-expression gene clusters generated from P. falciparum life cycle microarray data using the semi-supervised clustering algorithm Ontology-based Pattern Identification, GEMS identified 34 putative cis-regulatory elements associated with a variety of parasite processes including sexual development, cell invasion, antigenic variation and protein biosynthesis. Among these candidates were novel motifs, as well as many of the elements for which biological experimental evidence already exists in the Plasmodium literature. To provide evidence for the biological relevance of a cell invasion-related element predicted by GEMS, reporter gene and electrophoretic mobility shift assays
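
    The abstract above mentions a hypergeometric-based scoring function but does not give its form. As a purely illustrative sketch (not the GEMS implementation), a generic hypergeometric enrichment test asks how surprising it is to find a motif in `observed` of the `draws` promoters in one co-expression cluster when `successes` of all `population` promoters carry it. The function name and all parameter names are assumptions introduced for this example.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, successes, draws, observed):
    """Upper-tail probability P(X >= observed) for a hypergeometric variable.

    population: total promoters considered (hypothetical parameter name)
    successes:  promoters in the population that contain the motif
    draws:      promoters in the cluster being tested
    observed:   promoters in the cluster that contain the motif
    """
    if not 0 <= observed <= draws <= population:
        raise ValueError("need 0 <= observed <= draws <= population")
    if not 0 <= successes <= population:
        raise ValueError("need 0 <= successes <= population")
    upper = min(successes, draws)
    total = comb(population, draws)
    # Sum the probability mass for every count at least as extreme as observed.
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Toy numbers, not data from the study: 10 promoters, 5 with the motif,
# a cluster of 5 promoters, all 5 carrying the motif.
p = hypergeom_enrichment_pvalue(10, 5, 5, 5)  # 1/252
```

    A smaller value indicates stronger enrichment of the motif in the cluster; a real pipeline would also need to correct for the number of motifs and clusters tested.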

  10. Transferrin-facilitated lipofection gene delivery strategy: characterization of the transfection complexes and intracellular trafficking.

    Science.gov (United States)

    Joshee, Nirmal; Bastola, Dhundy R; Cheng, Pi-Wan

    2002-11-01

    We previously showed that mixing transferrin with a cationic liposome prior to the addition of DNA greatly enhanced the lipofection efficiency. Here, we report characterization of the transfection complexes in various formulations prepared with transferrin, lipofectin, and DNA (pCMVlacZ). DNA in all the formulations that contain lipofectin was resistant to DNase I treatment. Transfection experiments performed in Panc 1 cells showed that the standard formulation, which was prepared by adding DNA to a mixture of transferrin and lipofectin, yielded the highest transfection efficiency. There was no apparent difference in zeta potential among these formulations, but the most efficient formulation contained complexes with a mean diameter three to four times that of the liposome and of the complexes in other gene delivery formulations. Transmission electron microscopic examination of the standard transfection complexes formulated using gold-labeled transferrin showed extended circular DNA decorated with transferrin, as compared to extensively condensed DNA found in lipofectin-DNA complexes and heterogeneous structures in other formulations. By confocal microscopy, DNA and transferrin were found to colocalize at the perinuclear space and in the nucleus, suggesting cotransportation intracellularly, including nuclear transport. We propose that transferrin enhances the transfection efficiency of the standard lipofection formulation by preventing DNA condensation, and facilitating endocytosis and nuclear targeting.

  11. Novel approaches to develop community-built biological network models for potential drug discovery.

    Science.gov (United States)

    Talikka, Marja; Bukharov, Natalia; Hayes, William S; Hofmann-Apitius, Martin; Alexopoulos, Leonidas; Peitsch, Manuel C; Hoeng, Julia

    2017-08-01

    Hundreds of thousands of data points are now routinely generated in clinical trials by molecular profiling and NGS technologies. A true translation of this data into knowledge is not possible without analysis and interpretation in a well-defined biology context. Currently, there are many public and commercial pathway tools and network models that can facilitate such analysis. At the same time, the insights and knowledge that can be gained are highly dependent on the underlying biological content of these resources. Crowdsourcing can be employed to guarantee the accuracy and transparency of the biological content underlying the tools used to interpret rich molecular data. Areas covered: In this review, the authors describe crowdsourcing in drug discovery. The focal point is the efforts that have successfully used the crowdsourcing approach to verify and augment pathway tools and biological network models. Technologies that enable the building of biological networks with the community are also described. Expert opinion: A crowd of experts can be leveraged for the entire development process of biological network models, from ontologies to the evaluation of their mechanistic completeness. The ultimate goal is to facilitate biomarker discovery and personalized medicine by mechanistically explaining patients' differences with respect to disease prevention, diagnosis, and therapy outcome.

  12. High-throughput platform assay technology for the discovery of pre-microRNA-selective small molecule probes.

    Science.gov (United States)

    Lorenz, Daniel A; Song, James M; Garner, Amanda L

    2015-01-21

    MicroRNAs (miRNA) play critical roles in human development and disease. As such, the targeting of miRNAs is considered attractive as a novel therapeutic strategy. A major bottleneck toward this goal, however, has been the identification of small molecule probes that are specific for select RNAs and methods that will facilitate such discovery efforts. Using pre-microRNAs as proof-of-concept, herein we report a conceptually new and innovative approach for assaying RNA-small molecule interactions. Through this platform assay technology, which we term catalytic enzyme-linked click chemistry assay or cat-ELCCA, we have designed a method that can be implemented in high throughput, is virtually free of false readouts, and is general for all nucleic acids. Through cat-ELCCA, we envision the discovery of selective small molecule ligands for disease-relevant miRNAs to promote the field of RNA-targeted drug discovery and further our understanding of the role of miRNAs in cellular biology.

  13. The web server of IBM's Bioinformatics and Pattern Discovery group: 2004 update.

    Science.gov (United States)

    Huynh, Tien; Rigoutsos, Isidore

    2004-07-01

    In this report, we provide an update on the services and content which are available on the web server of IBM's Bioinformatics and Pattern Discovery group. The server, which is operational around the clock, provides access to a large number of methods that have been developed and published by the group's members. There is an increasing number of problems that these tools can help tackle; these problems range from the discovery of patterns in streams of events and the computation of multiple sequence alignments, to the discovery of genes in nucleic acid sequences, the identification--directly from sequence--of structural deviations from alpha-helicity and the annotation of amino acid sequences for antimicrobial activity. Additionally, annotations for more than 130 archaeal, bacterial, eukaryotic and viral genomes are now available on-line and can be searched interactively. The tools and code bundles continue to be accessible from http://cbcsrv.watson.ibm.com/Tspd.html whereas the genomics annotations are available at http://cbcsrv.watson.ibm.com/Annotations/.

  14. SNP-PHAGE – High throughput SNP discovery pipeline

    Directory of Open Access Journals (Sweden)

    Cregan Perry B

    2006-10-01

    Full Text Available Abstract Background Single nucleotide polymorphisms (SNPs) as defined here are single base sequence changes or short insertion/deletions between or within individuals of a given species. As a result of their abundance and the availability of high throughput analysis technologies, SNP markers have begun to replace other traditional markers such as restriction fragment length polymorphisms (RFLPs), amplified fragment length polymorphisms (AFLPs) and simple sequence repeats (SSRs) or microsatellite markers for fine mapping and association studies in several species. For SNP discovery from chromatogram data, several bioinformatics programs have to be combined to generate an analysis pipeline. Results have to be stored in a relational database to facilitate interrogation through queries or to generate data for further analyses such as determination of linkage disequilibrium and identification of common haplotypes. Although these tasks are routinely performed by several groups, an integrated open source SNP discovery pipeline that can be easily adapted by new groups interested in SNP marker development is currently unavailable. Results We developed SNP-PHAGE (SNP discovery Pipeline with additional features for identification of common haplotypes within a sequence tagged site (Haplotype Analysis) and GenBank (-dbSNP) submissions). This tool was applied for analyzing sequence traces from diverse soybean genotypes to discover over 10,000 SNPs. This package was developed on UNIX/Linux platform, written in Perl and uses a MySQL database. Scripts to generate a user-friendly web interface are also provided with common queries for preliminary data analysis. A machine learning tool developed by this group for increasing the efficiency of SNP discovery is integrated as a part of this package as an optional feature. The SNP-PHAGE package is being made available open source at http://bfgl.anri.barc.usda.gov/ML/snp-phage/. Conclusion SNP-PHAGE provides a bioinformatics

  15. Gene discovery for the bark beetle-vectored fungal tree pathogen Grosmannia clavigera

    Directory of Open Access Journals (Sweden)

    Robertson Gordon

    2010-10-01

    Full Text Available Abstract Background Grosmannia clavigera is a bark beetle-vectored fungal pathogen of pines that causes wood discoloration and may kill trees by disrupting nutrient and water transport. Trees respond to attacks from beetles and associated fungi by releasing terpenoid and phenolic defense compounds. It is unclear which genes are important for G. clavigera's ability to overcome antifungal pine terpenoids and phenolics. Results We constructed seven cDNA libraries from eight G. clavigera isolates grown under various culture conditions, and Sanger sequenced the 5' and 3' ends of 25,000 cDNA clones, resulting in 44,288 high quality ESTs. The assembled dataset of unique transcripts (unigenes) consists of 6,265 contigs and 2,459 singletons that mapped to 6,467 locations on the G. clavigera reference genome, representing ~70% of the predicted G. clavigera genes. Although only 54% of the unigenes matched characterized proteins at the NCBI database, this dataset extensively covers major metabolic pathways, cellular processes, and genes necessary for response to environmental stimuli and genetic information processing. Furthermore, we identified genes expressed in spores prior to germination, and genes involved in response to treatment with lodgepole pine phloem extract (LPPE). Conclusions We provide a comprehensively annotated EST dataset for G. clavigera that represents a rich resource for gene characterization in this and other ophiostomatoid fungi. Genes expressed in response to LPPE treatment are indicative of fungal oxidative stress response. We identified two clusters of potentially functionally related genes responsive to LPPE treatment. Furthermore, we report a simple method for identifying contig misassemblies in de novo assembled EST collections caused by gene overlap on the genome.

  16. Dissecting the Contributions of Cooperating Gene Mutations to Cancer Phenotypes and Drug Responses with Patient-Derived iPSCs

    Directory of Open Access Journals (Sweden)

    Chan-Jung Chang

    2018-05-01

    Full Text Available Summary: Connecting specific cancer genotypes with phenotypes and drug responses constitutes the central premise of precision oncology but is hindered by the genetic complexity and heterogeneity of primary cancer cells. Here, we use patient-derived induced pluripotent stem cells (iPSCs) and CRISPR/Cas9 genome editing to dissect the individual contributions of two recurrent genetic lesions, the splicing factor SRSF2 P95L mutation and the chromosome 7q deletion, to the development of myeloid malignancy. Using a comprehensive panel of isogenic iPSCs (with none, one, or both genetic lesions), we characterize their relative phenotypic contributions and identify drug sensitivities specific to each one through a candidate drug approach and an unbiased large-scale small-molecule screen. To facilitate drug testing and discovery, we also derive SRSF2-mutant and isogenic normal expandable hematopoietic progenitor cells. We thus describe here an approach to dissect the individual effects of two cooperating mutations to clinically relevant features of malignant diseases. Papapetrou and colleagues develop a comprehensive panel of isogenic iPSC lines with SRSF2 P95L mutation and chr7q deletion. They use these cells to identify cellular phenotypes contributed by each genetic lesion and therapeutic vulnerabilities specific to each one and develop expandable hematopoietic progenitor cell lines to facilitate drug discovery. Keywords: induced pluripotent stem cells, myelodysplastic syndrome, CRISPR/Cas9, gene editing, mutational cooperation, splicing factor mutations, spliceosomal mutations, SRSF2, chr7q deletion

  17. Emerging techniques for the discovery and validation of therapeutic targets for skeletal diseases.

    Science.gov (United States)

    Cho, Christine H; Nuttall, Mark E

    2002-12-01

    Advances in genomics and proteomics have revolutionised the drug discovery process and target validation. Identification of novel therapeutic targets for chronic skeletal diseases is an extremely challenging process based on the difficulty of obtaining high-quality human diseased versus normal tissue samples. The quality of tissue and genomic information obtained from the sample is critical to identifying disease-related genes. Using a genomics-based approach, novel genes or genes with similar homology to existing genes can be identified from cDNA libraries generated from normal versus diseased tissue. High-quality cDNA libraries are prepared from uncontaminated homogeneous cell populations harvested from tissue sections of interest. Localised gene expression analysis and confirmation are obtained through in situ hybridisation or immunohistochemical studies. Cells overexpressing the recombinant protein are subsequently designed for primary cell-based high-throughput assays that are capable of screening large compound banks for potential hits. Afterwards, secondary functional assays are used to test promising compounds. The same overexpressing cells are used in the secondary assay to test protein activity and functionality as well as screen for small-molecule agonists or antagonists. Once a hit is generated, a structure-activity relationship of the compound is optimised for better oral bioavailability and pharmacokinetics allowing the compound to progress into development. Parallel efforts from proteomics, as well as genetics/transgenics, bioinformatics and combinatorial chemistry, and improvements in high-throughput automation technologies, allow the drug discovery process to meet the demands of the medicinal market. This review discusses and illustrates how different approaches are incorporated into the discovery and validation of novel targets and, consequently, the development of potentially therapeutic agents in the areas of osteoporosis and osteoarthritis

  18. Concept Formation in Scientific Knowledge Discovery from a Constructivist View

    Science.gov (United States)

    Peng, Wei; Gero, John S.

    The central goal of scientific knowledge discovery is to learn cause-effect relationships among natural phenomena presented as variables and the consequences of their interactions. Scientific knowledge is normally expressed as scientific taxonomies and qualitative and quantitative laws [1]. This type of knowledge represents intrinsic regularities of the observed phenomena that can be used to explain and predict behaviors of the phenomena. It is a generalization that is abstracted and externalized from a set of contexts and applicable to a broader scope. Scientific knowledge is a type of third-person knowledge, i.e., knowledge that is independent of a specific enquirer. Artificial intelligence approaches, particularly data mining algorithms that are used to identify meaningful patterns from large data sets, are approaches that aim to facilitate the knowledge discovery process [2]. A broad spectrum of algorithms has been developed in addressing classification, associative learning, and clustering problems. However, their linkages to people who use them have not been adequately explored. Issues in relation to supporting the interpretation of the patterns, the application of prior knowledge to the data mining process and addressing user interactions remain challenges for building knowledge discovery tools [3]. As a consequence, scientists rely on their experience to formulate problems, evaluate hypotheses, reason about untraceable factors and derive new problems. This type of knowledge, which they have developed during their career, is called "first-person" knowledge. The formation of scientific knowledge (third-person knowledge) is highly influenced by the enquirer's first-person knowledge construct, which is a result of his or her interactions with the environment. There have been attempts to craft automatic knowledge discovery tools but these systems are limited in their capabilities to handle the dynamics of personal experience. There are now trends in developing

  19. Cardiac-Specific Gene Expression Facilitated by an Enhanced Myosin Light Chain Promoter

    Directory of Open Access Journals (Sweden)

    Wolfgang Boecker

    2004-04-01

    Full Text Available Background: Adenoviral gene transfer has been shown to be effective in cardiac myocytes in vitro and in vivo. A major limitation of myocardial gene therapy is the extracardiac transgene expression. Methods: To minimize extracardiac gene expression, we have constructed a tissue-specific promoter for cardiac gene transfer, namely, the 250-bp fragment of the myosin light chain-2v (MLC-2v) gene, which is known to be expressed in a tissue-specific manner in ventricular myocardium, followed by a luciferase (luc) reporter gene (Ad.4 × MLC250.Luc). Rat cardiomyocytes, liver and kidney cells were infected with Ad.4 × MLC.Luc or control vectors. For in vivo testing, Ad.4 × MLC250.Luc was injected into the myocardium or in the liver of rats. Kinetics of promoter activity were monitored over 8 days using a cooled CCD camera. Results: In vitro: By infecting hepatic versus cardiomyocyte cells, we found that the promoter specificity ratio (luc activity in cardiomyocytes per liver cells) was 20.4 versus 0.9 (Ad.4 × MLC250.Luc vs. Ad.CMV). In vivo: Ad.4 × MLC250.Luc significantly reduced luc activity in liver (38.4-fold), lung (16.1-fold), and kidney (21.8-fold) versus Ad.CMV (p = .01), whereas activity in the heart was only 3.8-fold decreased. The gene expression rate of cardiomyocytes versus hepatocytes was 7:1 (Ad.4 × MLC.Luc) versus 1:1.4 (Ad.CMV.Luc). Discussion: This new vector may be useful to validate therapeutic approaches in animal disease models and offers the perspective for selective expression of therapeutic genes in the diseased heart.
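
    The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is not part of the study; it assumes that "selectivity gain" means the heart-to-liver expression ratio of the MLC vector divided by that of the CMV control, and all variable and function names are illustrative.

```python
# Figures quoted in the abstract above; names are illustrative only.
in_vitro_ratio_mlc = 20.4    # cardiomyocyte/liver luc activity, MLC vector
in_vitro_ratio_cmv = 0.9     # same ratio for the Ad.CMV control

heart_fold_reduction = 3.8   # in vivo drop in heart activity vs. Ad.CMV
liver_fold_reduction = 38.4  # in vivo drop in liver activity vs. Ad.CMV


def selectivity_gain(ratio_new, ratio_ref):
    """How many times more heart-selective the new vector is (assumed definition)."""
    return ratio_new / ratio_ref


# In vitro: 20.4 versus 0.9.
in_vitro_gain = selectivity_gain(in_vitro_ratio_mlc, in_vitro_ratio_cmv)

# In vivo, from the reported expression rates 7:1 versus 1:1.4.
in_vivo_gain_from_ratios = selectivity_gain(7 / 1, 1 / 1.4)

# In vivo, from the fold reductions: liver fell 38.4-fold, heart only 3.8-fold.
in_vivo_gain_from_folds = liver_fold_reduction / heart_fold_reduction

print(f"in vitro gain:           {in_vitro_gain:.1f}x")
print(f"in vivo gain (ratios):   {in_vivo_gain_from_ratios:.1f}x")
print(f"in vivo gain (folds):    {in_vivo_gain_from_folds:.1f}x")
```

    Under this assumed definition, the two in vivo estimates come out close to each other (roughly tenfold), which suggests the reported ratios and fold reductions are mutually consistent.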

  20. A role for physicians in ethnopharmacology and drug discovery.

    Science.gov (United States)

    Raza, Mohsin

    2006-04-06

    Ethnopharmacology investigations classically involved traditional healers, botanists, anthropologists, chemists and pharmacologists. The role of some groups of researchers, but not of physicians, has been highlighted and well defined in ethnopharmacological investigations. Historical data show that the discovery of several important modern drugs of herbal origin owes to the medical knowledge and clinical expertise of physicians. Current trends indicate a negligible role of physicians in ethnopharmacological studies. The rising cost of modern drug development is attributed to the lack of a classical ethnopharmacological approach. Physicians can play multiple roles in ethnopharmacological studies to facilitate drug discovery as well as to rescue authentic traditional knowledge of the use of medicinal plants. These include: (1) Ethnopharmacological field work, which involves interviewing healers, interpreting traditional terminologies into their modern counterparts, examining patients consuming herbal remedies and identifying the disease for which a herbal remedy is used. (2) Interpretation of signs and symptoms mentioned in ancient texts and suggesting proper use of old traditional remedies in the light of modern medicine. (3) Clinical studies on herbs and their interaction with modern medicines. (4) Advising pharmacologists to carry out laboratory studies on herbs observed during field studies. (5) Work in collaboration with local healers to strengthen the traditional system of medicine in a community. In conclusion, physicians' involvement in ethnopharmacological studies will lead to more reliable information on traditional use of medicinal plants both from field and ancient texts, more focused and cheaper natural product based drug discovery, as well as bridge the gap between traditional and modern medicine.

  1. Large-scale discovery of promoter motifs in Drosophila melanogaster.

    Directory of Open Access Journals (Sweden)

    Thomas A Down

    2007-01-01

    Full Text Available A key step in understanding gene regulation is to identify the repertoire of transcription factor binding motifs (TFBMs) that form the building blocks of promoters and other regulatory elements. Identifying these experimentally is very laborious, and the number of TFBMs discovered remains relatively small, especially when compared with the hundreds of transcription factor genes predicted in metazoan genomes. We have used a recently developed statistical motif discovery approach, NestedMICA, to detect candidate TFBMs from a large set of Drosophila melanogaster promoter regions. Of the 120 motifs inferred in our initial analysis, 25 were statistically significant matches to previously reported motifs, while 87 appeared to be novel. Analysis of sequence conservation and motif positioning suggested that the great majority of these discovered motifs are predictive of functional elements in the genome. Many motifs showed associations with specific patterns of gene expression in the D. melanogaster embryo, and we were able to obtain confident annotation of expression patterns for 25 of our motifs, including eight of the novel motifs. The motifs are available through Tiffin, a new database of DNA sequence motifs. We have discovered many new motifs that are overrepresented in D. melanogaster promoter regions, and offer several independent lines of evidence that these are novel TFBMs. Our motif dictionary provides a solid foundation for further investigation of regulatory elements in Drosophila, and demonstrates techniques that should be applicable in other species. We suggest that further improvements in computational motif discovery should narrow the gap between the set of known motifs and the total number of transcription factors in metazoan genomes.

  2. Pharmacogenetics in type 2 diabetes: precision medicine or discovery tool?

    Science.gov (United States)

    Florez, Jose C

    2017-05-01

    In recent years, technological and analytical advances have led to an explosion in the discovery of genetic loci associated with type 2 diabetes. However, their ability to improve prediction of disease outcomes beyond standard clinical risk factors has been limited. On the other hand, genetic effects on drug response may be stronger than those commonly seen for disease incidence. Pharmacogenetic findings may aid in identifying new drug targets, elucidate pathophysiology, unravel disease heterogeneity, help prioritise specific genes in regions of genetic association, and contribute to personalised or precision treatment. In diabetes, precedent for the successful application of pharmacogenetic concepts exists in its monogenic subtypes, such as MODY or neonatal diabetes. Whether similar insights will emerge for the much more common entity of type 2 diabetes remains to be seen. As genetic approaches advance, the progressive deployment of candidate gene, large-scale genotyping and genome-wide association studies has begun to produce suggestive results that may transform clinical practice. However, many barriers to the translation of diabetes pharmacogenetic discoveries to the clinic still remain. This perspective offers a contemporary overview of the field with a focus on sulfonylureas and metformin, identifies the major uses of pharmacogenetics, and highlights potential limitations and future directions.

  3. Strategies for Discovery of Small Molecule Radiation Protectors and Radiation Mitigators

    Directory of Open Access Journals (Sweden)

    Joel S Greenberger

    2012-01-01

    Full Text Available Mitochondrial targeted radiation damage protectors (delivered prior to irradiation) and mitigators (delivered after irradiation, but before the appearance of symptoms associated with radiation syndrome) have been a recent focus in drug discovery for (1) normal tissue radiation protection during fractionated radiotherapy, and (2) radiation terrorism counter measures. Several categories of such molecules have been discovered: nitroxide-linked hybrid molecules, including GS-nitroxide, GS-nitric oxide synthase inhibitors, p53/mdm2/mdm4 inhibitors, and pharmaceutical agents including inhibitors of the phosphoinositide-3-kinase pathway and the anti-seizure medicine, carbamazepine. Evaluation of potential new irradiation dose modifying molecules to protect normal tissue includes: clonogenic radiation survival curves; assays for apoptosis and DNA repair; and irradiation-induced depletion of antioxidant stores. Studies of organ specific radioprotection and in total body irradiation-induced hematopoietic syndrome in the mouse model for protection/mitigation facilitate rational means by which to move candidate small molecule drugs along the drug discovery pipeline into clinical development.

  4. Strategies for Discovery of Small Molecule Radiation Protectors and Radiation Mitigators

    Energy Technology Data Exchange (ETDEWEB)

    Greenberger, Joel S.; Clump, David [Radiation Oncology Department, University of Pittsburgh Cancer Institute, Pittsburgh, PA (United States); Kagan, Valerian [Environmental and Occupational Health Department, University of Pittsburgh, Pittsburgh, PA (United States); Bayir, Hülya [Critical Care Medicine Department, University of Pittsburgh Medical Center, Pittsburgh, PA (United States); Lazo, John S. [Pharmacology Department, University of Virginia, Charlottesville, VA (United States); Wipf, Peter [Department of Chemistry, Accelerated Chemical Discovery Center, University of Pittsburgh, Pittsburgh, PA (United States); Li, Song; Gao, Xiang [Pharmaceutical Science Department, University of Pittsburgh, Pittsburgh, PA (United States); Epperly, Michael W., E-mail: greenbergerjs@upmc.edu [Radiation Oncology Department, University of Pittsburgh Cancer Institute, Pittsburgh, PA (United States)

    2012-01-13

    Mitochondrial targeted radiation damage protectors (delivered prior to irradiation) and mitigators (delivered after irradiation, but before the appearance of symptoms associated with radiation syndrome) have been a recent focus in drug discovery for (1) normal tissue radiation protection during fractionated radiotherapy, and (2) radiation terrorism counter measures. Several categories of such molecules have been discovered: nitroxide-linked hybrid molecules, including GS-nitroxide, GS-nitric oxide synthase inhibitors, p53/mdm2/mdm4 inhibitors, and pharmaceutical agents including inhibitors of the phosphoinositide-3-kinase pathway and the anti-seizure medicine, carbamazepine. Evaluation of potential new radiation dose modifying molecules to protect normal tissue includes: clonogenic radiation survival curves, assays for apoptosis and DNA repair, and irradiation-induced depletion of antioxidant stores. Studies of organ specific radioprotection and in total body irradiation-induced hematopoietic syndrome in the mouse model for protection/mitigation facilitate rational means by which to move candidate small molecule drugs along the drug discovery pipeline into clinical development.

  5. Preparative Scale Resolution of Enantiomers Enables Accelerated Drug Discovery and Development

    Directory of Open Access Journals (Sweden)

    Hanna Leek

    2017-01-01

    Full Text Available The provision of pure enantiomers is of increasing importance not only for the pharmaceutical industry but also for agro-chemistry and biotechnology. In drug discovery and development, the enantiomers of a chiral drug exhibit unique chemical and pharmacological behaviors in a chiral environment, such as the human body, in which the stereochemistry of the chiral drugs determines their pharmacokinetic, pharmacodynamic and toxicological properties. We present a number of challenging case studies of up-to-kilogram separations of racemic or enriched isomer mixtures using preparative liquid chromatography and supercritical fluid chromatography to generate individual enantiomers that have enabled the development of new candidate drugs within AstraZeneca. The combination of chromatography and racemization as well as strategies on when to apply preparative chiral chromatography of enantiomers in a multi-step synthesis of a drug compound can further facilitate accelerated drug discovery and the early clinical evaluation of the drug candidates.

  6. Network-Guided Key Gene Discovery for a Given Cellular Process

    DEFF Research Database (Denmark)

    He, Feng Q; Ollert, Markus

    2018-01-01

    Identification of key genes for a given physiological or pathological process is an essential but still very challenging task for the entire biomedical research community. Statistics-based approaches, such as genome-wide association study (GWAS)- or quantitative trait locus (QTL)-related analysis...... have already made enormous contributions to identifying key genes associated with a given disease or phenotype, the success of which is however very much dependent on a huge number of samples. Recent advances in network biology, especially network inference directly from genome-scale data...

  7. In silico method for modelling metabolism and gene product expression at genome scale

    Energy Technology Data Exchange (ETDEWEB)

    Lerman, Joshua A.; Hyduke, Daniel R.; Latif, Haythem; Portnoy, Vasiliy A.; Lewis, Nathan E.; Orth, Jeffrey D.; Rutledge, Alexandra C.; Smith, Richard D.; Adkins, Joshua N.; Zengler, Karsten; Palsson, Bernard O.

    2012-07-03

    Transcription and translation use raw materials and energy generated metabolically to create the macromolecular machinery responsible for all cellular functions, including metabolism. A biochemically accurate model of molecular biology and metabolism will facilitate comprehensive and quantitative computations of an organism's molecular constitution as a function of genetic and environmental parameters. Here we formulate a model of metabolism and macromolecular expression. Prototyping it using the simple microorganism Thermotoga maritima, we show our model accurately simulates variations in cellular composition and gene expression. Moreover, through in silico comparative transcriptomics, the model allows the discovery of new regulons and the improvement of genome and transcription unit annotations. Our method presents a framework for investigating molecular biology and cellular physiology in silico and may allow quantitative interpretation of multi-omics data sets in the context of an integrated biochemical description of an organism.

  8. First discovery of two polyketide synthase genes for mitorubrinic acid and mitorubrinol yellow pigment biosynthesis and implications in virulence of Penicillium marneffei.

    Directory of Open Access Journals (Sweden)

    Patrick C Y Woo

    Full Text Available BACKGROUND: The genome of P. marneffei, the most important thermal dimorphic fungus causing respiratory, skin and systemic mycosis in China and Southeast Asia, possesses 23 polyketide synthase (PKS) genes and 2 polyketide synthase nonribosomal peptide synthase hybrid (PKS-NRPS) genes, which is of high diversity compared to other thermal dimorphic pathogenic fungi. We hypothesized that the yellow pigment in the mold form of P. marneffei could also be synthesized by one or more PKS genes. METHODOLOGY/PRINCIPAL FINDINGS: All 23 PKS and 2 PKS-NRPS genes of P. marneffei were systematically knocked down. A loss of the yellow pigment was observed in the mold form of the pks11 knockdown, pks12 knockdown and pks11pks12 double knockdown mutants. Sequence analysis showed that PKS11 and PKS12 are fungal non-reducing PKSs. Ultra high performance liquid chromatography-photodiode array detector/electrospray ionization-quadrupole time of flight-mass spectrometry (MS) and MS/MS analysis of the culture filtrates of wild type P. marneffei and the pks11 knockdown, pks12 knockdown and pks11pks12 double knockdown mutants showed that the yellow pigment is composed of mitorubrinic acid and mitorubrinol. The survival of mice challenged with the pks11 knockdown, pks12 knockdown and pks11pks12 double knockdown mutants was significantly better than those challenged with wild type P. marneffei (P<0.05). There was also a statistically significant decrease in survival of pks11 knockdown, pks12 knockdown and pks11pks12 double knockdown mutants compared to wild type P. marneffei in both J774 and THP1 macrophages (P<0.05). CONCLUSIONS/SIGNIFICANCE: The yellow pigment of the mold form of P. marneffei is composed of mitorubrinol and mitorubrinic acid. This represents the first discovery of PKS genes responsible for mitorubrinol and mitorubrinic acid biosynthesis. pks12 and pks11 are probably responsible for sequential use in the biosynthesis of mitorubrinol and mitorubrinic acid

  9. 19 CFR 354.10 - Discovery.

    Science.gov (United States)

    2010-04-01

    ... 19 Customs Duties 3 2010-04-01 2010-04-01 false Discovery. 354.10 Section 354.10 Customs Duties... ANTIDUMPING OR COUNTERVAILING DUTY ADMINISTRATIVE PROTECTIVE ORDER § 354.10 Discovery. (a) Voluntary discovery. All parties are encouraged to engage in voluntary discovery procedures regarding any matter, not...

  10. 36 CFR 1150.63 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 36 Parks, Forests, and Public Property 3 2010-07-01 2010-07-01 false Discovery. 1150.63 Section... PRACTICE AND PROCEDURES FOR COMPLIANCE HEARINGS Prehearing Conferences and Discovery § 1150.63 Discovery. (a) Parties are encouraged to engage in voluntary discovery procedures. For good cause shown under...

  11. 37 CFR 11.52 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Discovery. 11.52 Section 11... Disciplinary Proceedings; Jurisdiction, Sanctions, Investigations, and Proceedings § 11.52 Discovery. Discovery... establishes that discovery is reasonable and relevant, the hearing officer, under such conditions as he or she...

  12. Comprehensive Analysis of MILE Gene Expression Data Set Advances Discovery of Leukaemia Type and Subtype Biomarkers.

    Science.gov (United States)

    Labaj, Wojciech; Papiez, Anna; Polanski, Andrzej; Polanska, Joanna

    2017-03-01

    Large collections of data in studies on cancer such as leukaemia provoke the necessity of applying tailored analysis algorithms to ensure supreme information extraction. In this work, a custom-fit pipeline is demonstrated for thorough investigation of the voluminous MILE gene expression data set. Three analyses are accomplished, each for gaining a deeper understanding of the processes underlying leukaemia types and subtypes. First, the main disease groups are tested for differential expression against the healthy control as in a standard case-control study. Here, the basic knowledge on molecular mechanisms is confirmed quantitatively and by literature references. Second, pairwise comparison testing is performed for juxtaposing the main leukaemia types among each other. In this case by means of the Dice coefficient similarity measure the general relations are pointed out. Moreover, lists of candidate main leukaemia group biomarkers are proposed. Finally, with this approach being successful, the third analysis provides insight into all of the studied subtypes, followed by the emergence of four leukaemia subtype biomarkers. In addition, the class enhanced DEG signature obtained on the basis of novel pipeline processing leads to significantly better classification power of multi-class data classifiers. The developed methodology consisting of batch effect adjustment, adaptive noise and feature filtration coupled with adequate statistical testing and biomarker definition proves to be an effective approach towards knowledge discovery in high-throughput molecular biology experiments.
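
    The abstract names the Dice coefficient as the similarity measure used in the pairwise comparison, but does not say exactly what it was applied to. As a minimal sketch, assuming it compares two sets of differentially expressed genes (DEGs), the measure is 2|A ∩ B| / (|A| + |B|). The gene identifiers and the function name below are hypothetical placeholders, not data from the study.

```python
def dice_coefficient(a, b):
    """Dice similarity of two collections, treated as sets: 2|A & B| / (|A| + |B|)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        # Convention chosen here: two empty sets are treated as identical.
        return 1.0
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))


# Hypothetical DEG lists for two leukaemia types (placeholder identifiers).
degs_type_1 = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
degs_type_2 = {"GENE_B", "GENE_C", "GENE_E"}

similarity = dice_coefficient(degs_type_1, degs_type_2)
print(f"Dice similarity: {similarity:.3f}")
```

    A value of 1 means the two lists are identical and 0 means they share no genes; here two shared genes out of seven list entries give 4/7.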

  13. Providing data science support for systems pharmacology and its implications to drug discovery.

    Science.gov (United States)

    Hart, Thomas; Xie, Lei

    2016-01-01

    The conventional one-drug-one-target-one-disease drug discovery process has been less successful in tracking multi-genic, multi-faceted complex diseases. Systems pharmacology has emerged as a new discipline to tackle the current challenges in drug discovery. The goal of systems pharmacology is to transform huge, heterogeneous, and dynamic biological and clinical data into interpretable and actionable mechanistic models for decision making in drug discovery and patient treatment. Thus, big data technology and data science will play an essential role in systems pharmacology. This paper critically reviews the impact of three fundamental concepts of data science on systems pharmacology: similarity inference, overfitting avoidance, and disentangling causality from correlation. The authors then discuss recent advances and future directions in applying the three concepts of data science to drug discovery, with a focus on proteome-wide context-specific quantitative drug target deconvolution and personalized adverse drug reaction prediction. Data science will facilitate reducing the complexity of systems pharmacology modeling, detecting hidden correlations between complex data sets, and distinguishing causation from correlation. The power of data science can only be fully realized when integrated with mechanism-based multi-scale modeling that explicitly takes into account the hierarchical organization of biological systems from nucleic acid to proteins, to molecular interaction networks, to cells, to tissues, to patients, and to populations.

  14. 14 CFR 16.213 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 14 Aeronautics and Space 1 2010-01-01 2010-01-01 false Discovery. 16.213 Section 16.213... PRACTICE FOR FEDERALLY-ASSISTED AIRPORT ENFORCEMENT PROCEEDINGS Hearings § 16.213 Discovery. (a) Discovery... discovery permitted by this section if a party shows that— (1) The information requested is cumulative or...

  15. 28 CFR 76.21 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 28 Judicial Administration 2 2010-07-01 2010-07-01 false Discovery. 76.21 Section 76.21 Judicial... POSSESSION OF CERTAIN CONTROLLED SUBSTANCES § 76.21 Discovery. (a) Scope. Discovery under this part covers... as a general guide for discovery practices in proceedings before the Judge. However, unless otherwise...

  16. Facilitation as a teaching strategy : experiences of facilitators

    Directory of Open Access Journals (Sweden)

    E Lekalakala-Mokgele

    2006-09-01

    Full Text Available Changes in nursing education involve the move from traditional teaching approaches that are teacher-centred to facilitation, a student-centred approach. The student-centred approach is based on a philosophy of teaching and learning that puts the learner on centre-stage. The aim of this study was to identify the challenges of facilitators of learning using facilitation as a teaching method and recommend strategies for their (facilitators') development and support. A qualitative, explorative and contextual design was used. Four (4) universities in South Africa which utilize facilitation as a teaching/learning process were identified and the facilitators were selected to be the sample of the study. The main question posed during in-depth group interviews was: How do you experience facilitation as a teaching/learning method? Facilitators indicated different experiences and emotions when they first had to facilitate learning. All of them indicated that it was difficult to facilitate at the beginning as they were trained to lecture and that no format for facilitation was available. They experienced frustrations and anxieties as a result. The lack of knowledge of facilitation instilled fear in them. However, they indicated that facilitation had many benefits for them and for the students. Amongst the ones mentioned were personal and professional growth. Challenges mentioned were the fear that they waste time and that they do not cover the content. It is therefore important that facilitation be included in the training of nurse educators.

  17. Discovery of novel secondary metabolites in Aspergillus aculeatus

    DEFF Research Database (Denmark)

    Petersen, Lene Maj; Holm, Dorte Koefoed; Gotfredsen, Charlotte Held

    2012-01-01

    , whereby several novel secondary metabolites have been discovered. A. aculeatus has recently been genome-sequenced; however, no genetic approaches have so far been described to facilitate genetic engineering. We here present a system for non-integrated (AMA1-based) gene expression in A. aculeatus based...... on the USER™ cloning technique. The AMA1-based gene expression has successfully been applied to express genes in A. aculeatus and by this approach the function of a PKS gene has been established. Furthermore the technique was used to activate a silent cluster by expression of a transcription factor, leading...... of the industrially important black Aspergillus Aspergillus aculeatus by UHPLC-DAD-HRMS has identified several SMs already known from this organism. However, several compounds could not be unambiguously dereplicated wherefore some have been selected, purified and structure elucidated by 1D and 2D NMR spectroscopy...

  18. UPnP-Based Discovery and Management of Hypervisors and Virtual Machines

    Directory of Open Access Journals (Sweden)

    Sławomir Zieliński

    2011-01-01

    Full Text Available The paper introduces a Universal Plug and Play based discovery and management toolkit that facilitates collaboration between cloud infrastructure providers and users. The presented tools construct a unified hierarchy of devices and their management-related services that represents the current deployment of users’ (virtual) infrastructures in the provider’s (physical) infrastructure as well as the management interfaces of respective devices. The hierarchy can be used to enhance the capabilities of the provider’s infrastructure management system. To maintain user independence, the set of management operations exposed by a particular device is always defined by the device owner (either the provider or user).

  19. 40 CFR 27.21 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 40 Protection of Environment 1 2010-07-01 2010-07-01 false Discovery. 27.21 Section 27.21... Discovery. (a) The following types of discovery are authorized: (1) Requests for production of documents for..., discovery is available only as ordered by the presiding officer. The presiding officer shall regulate the...

  20. 37 CFR 41.150 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Discovery. 41.150 Section 41... COMMERCE PRACTICE BEFORE THE BOARD OF PATENT APPEALS AND INTERFERENCES Contested Cases § 41.150 Discovery. (a) Limited discovery. A party is not entitled to discovery except as authorized in this subpart. The...

  1. 14 CFR 13.220 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 14 Aeronautics and Space 1 2010-01-01 2010-01-01 false Discovery. 13.220 Section 13.220... INVESTIGATIVE AND ENFORCEMENT PROCEDURES Rules of Practice in FAA Civil Penalty Actions § 13.220 Discovery. (a) Initiation of discovery. Any party may initiate discovery described in this section, without the consent or...

  2. 49 CFR 604.38 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 49 Transportation 7 2010-10-01 2010-10-01 false Discovery. 604.38 Section 604.38 Transportation... TRANSPORTATION CHARTER SERVICE Hearings. § 604.38 Discovery. (a) Permissible forms of discovery shall be within the discretion of the PO. (b) The PO shall limit the frequency and extent of discovery permitted by...

  3. 15 CFR 719.10 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 15 Commerce and Foreign Trade 2 2010-01-01 2010-01-01 false Discovery. 719.10 Section 719.10... Discovery. (a) General. The parties are encouraged to engage in voluntary discovery regarding any matter... the Federal Rules of Civil Procedure relating to discovery apply to the extent consistent with this...

  4. 24 CFR 26.18 - Discovery.

    Science.gov (United States)

    2010-04-01

    ... 24 Housing and Urban Development 1 2010-04-01 2010-04-01 false Discovery. 26.18 Section 26.18... PROCEDURES Hearings Before Hearing Officers Discovery § 26.18 Discovery. (a) General. The parties are encouraged to engage in voluntary discovery procedures, which may commence at any time after an answer has...

  5. 42 CFR 426.532 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 42 Public Health 3 2010-10-01 2010-10-01 false Discovery. 426.532 Section 426.532 Public Health... § 426.532 Discovery. (a) General rule. If the Board orders discovery, the Board must establish a reasonable timeframe for discovery. (b) Protective order—(1) Request for a protective order. Any party...

  6. 49 CFR 1503.633 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 49 Transportation 9 2010-10-01 2010-10-01 false Discovery. 1503.633 Section 1503.633... Rules of Practice in TSA Civil Penalty Actions § 1503.633 Discovery. (a) Initiation of discovery. Any party may initiate discovery described in this section, without the consent or approval of the ALJ, at...

  7. 14 CFR 1264.120 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 14 Aeronautics and Space 5 2010-01-01 2010-01-01 false Discovery. 1264.120 Section 1264.120... PENALTIES ACT OF 1986 § 1264.120 Discovery. (a) The following types of discovery are authorized: (1..., discovery is available only as ordered by the presiding officer. The presiding officer shall regulate the...

  8. 22 CFR 128.6 - Discovery.

    Science.gov (United States)

    2010-04-01

    ... 22 Foreign Relations 1 2010-04-01 2010-04-01 false Discovery. 128.6 Section 128.6 Foreign... Discovery. (a) Discovery by the respondent. The respondent, through the Administrative Law Judge, may... discovery if the interests of national security or foreign policy so require, or if necessary to comply with...

  9. 24 CFR 26.42 - Discovery.

    Science.gov (United States)

    2010-04-01

    ... 24 Housing and Urban Development 1 2010-04-01 2010-04-01 false Discovery. 26.42 Section 26.42... PROCEDURES Hearings Pursuant to the Administrative Procedure Act Discovery § 26.42 Discovery. (a) General. The parties are encouraged to engage in voluntary discovery procedures, which may commence at any time...

  10. 49 CFR 386.37 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 49 Transportation 5 2010-10-01 2010-10-01 false Discovery. 386.37 Section 386.37 Transportation... and Hearings § 386.37 Discovery. (a) Parties may obtain discovery by one or more of the following...; and requests for admission. (b) Discovery may not commence until the matter is pending before the...

  11. 29 CFR 1955.32 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 29 Labor 9 2010-07-01 2010-07-01 false Discovery. 1955.32 Section 1955.32 Labor Regulations...) PROCEDURES FOR WITHDRAWAL OF APPROVAL OF STATE PLANS Preliminary Conference and Discovery § 1955.32 Discovery... allow discovery by any other appropriate procedure, such as by interrogatories upon a party or request...

  12. An integration of genome-wide association study and gene expression profiling to prioritize the discovery of novel susceptibility Loci for osteoporosis-related traits.

    Directory of Open Access Journals (Sweden)

    Yi-Hsiang Hsu

    2010-06-01

    Full Text Available Osteoporosis is a complex disorder and commonly leads to fractures in elderly persons. Genome-wide association studies (GWAS) have become an unbiased approach to identify variations in the genome that potentially affect health. However, the genetic variants identified so far only explain a small proportion of the heritability for complex traits. Due to the modest genetic effect size and inadequate power, true association signals may not be revealed based on a stringent genome-wide significance threshold. Here, we take advantage of SNP and transcript arrays and integrate GWAS and expression signature profiling relevant to the skeletal system in cellular and animal models to prioritize the discovery of novel candidate genes for osteoporosis-related traits, including bone mineral density (BMD) at the lumbar spine (LS) and femoral neck (FN), as well as geometric indices of the hip (femoral neck-shaft angle, NSA; femoral neck length, NL; and narrow-neck width, NW). A two-stage meta-analysis of GWAS from 7,633 Caucasian women and 3,657 men revealed three novel loci associated with osteoporosis-related traits, including chromosome 1p13.2 (RAP1A, p = 3.6 × 10^-8), 2q11.2 (TBC1D8), and 18q11.2 (OSBPL1A), and confirmed a previously reported region near the TNFRSF11B/OPG gene. We also prioritized 16 suggestive genome-wide significant candidate genes based on their potential involvement in skeletal metabolism. Among them, 3 candidate genes were associated with BMD in women. Notably, 2 out of these 3 genes (GPR177, p = 2.6 × 10^-13; SOX6, p = 6.4 × 10^-10) associated with BMD in women have been successfully replicated in a large-scale meta-analysis of BMD, but none of the non-prioritized candidates (associated with BMD) did. Our results support the concept of our prioritization strategy. In the absence of direct biological support for identified genes, we highlighted the efficiency of subsequent functional characterization using publicly available expression profiling relevant

  13. Improving Students' Understanding in the Konsep Dasar IPA (Basic Concept of Science) Course Using the CBSA Learning Model with a Discovery Approach

    Directory of Open Access Journals (Sweden)

    Subuh Anggoro

    2010-03-01

    Full Text Available This research is aimed at finding out whether students’ active learning with a discovery approach could improve the students’ understanding of a basic science subject. The research used a survey method. The population of this research was students who took the subject of Konsep Dasar IPA (the Basic Concept of Science), out of which one class was taken as the sample. The data were then analyzed with a descriptive technique with the help of graphs. The result showed that 72% of students did not have sufficient understanding of the topic of energy and its transformation when it was delivered without the help of learning aids and discussion. After learning aids were provided and discussion was facilitated by the lecturer, all the subjects of this research reported improved understanding. The discovery method with experiment and discussion is crucial in helping students understand the basic concept of science. Key words: understanding, basic concept of science learning, students’ active learning, discovery technique.

  14. Advancing Drug Discovery through Enhanced Free Energy Calculations.

    Science.gov (United States)

    Abel, Robert; Wang, Lingle; Harder, Edward D; Berne, B J; Friesner, Richard A

    2017-07-18

    , is potentially transformative in enabling hard-to-drug targets to be attacked, and in facilitating the development of superior compounds, in various dimensions, for a wide range of targets. More effective integration of FEP+ calculations into the drug discovery process will ensure that the results are deployed in an optimal fashion for yielding the best possible compounds entering the clinic; this is where the greatest payoff is in the exploitation of computer-driven design capabilities. A key conclusion from the work described is the surprisingly robust and accurate results that are attainable within the conventional classical simulation, fixed-charge paradigm. No doubt there are individual cases that would benefit from a more sophisticated energy model or dynamical treatment, and properties other than protein-ligand binding energies may be more sensitive to these approximations. We conclude that an inflection point in the ability of MD simulations to impact drug discovery has now been attained, due to the confluence of hardware and software development along with the formulation of "good enough" theoretical methods and models.

  15. 42 CFR 426.432 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 42 Public Health 3 2010-10-01 2010-10-01 false Discovery. 426.432 Section 426.432 Public Health... § 426.432 Discovery. (a) General rule. If the ALJ orders discovery, the ALJ must establish a reasonable timeframe for discovery. (b) Protective order—(1) Request for a protective order. Any party receiving a...

  16. 10 CFR 13.21 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 10 Energy 1 2010-01-01 2010-01-01 false Discovery. 13.21 Section 13.21 Energy NUCLEAR REGULATORY COMMISSION PROGRAM FRAUD CIVIL REMEDIES § 13.21 Discovery. (a) The following types of discovery are...) Unless mutually agreed to by the parties, discovery is available only as ordered by the ALJ. The ALJ...

  17. 49 CFR 1121.2 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 49 Transportation 8 2010-10-01 2010-10-01 false Discovery. 1121.2 Section 1121.2 Transportation... TRANSPORTATION RULES OF PRACTICE RAIL EXEMPTION PROCEDURES § 1121.2 Discovery. Discovery shall follow the procedures set forth at 49 CFR part 1114, subpart B. Discovery may begin upon the filing of the petition for...

  18. 38 CFR 42.21 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 38 Pensions, Bonuses, and Veterans' Relief 2 2010-07-01 2010-07-01 false Discovery. 42.21 Section... IMPLEMENTING THE PROGRAM FRAUD CIVIL REMEDIES ACT § 42.21 Discovery. (a) The following types of discovery are... creation of a document. (c) Unless mutually agreed to by the parties, discovery is available only as...

  19. 22 CFR 521.21 - Discovery.

    Science.gov (United States)

    2010-04-01

    ... 22 Foreign Relations 2 2010-04-01 2010-04-01 true Discovery. 521.21 Section 521.21 Foreign... Discovery. (a) The following types of discovery are authorized: (1) Requests for production of documents for... interpreted to require the creation of a document. (c) Unless mutually agreed to by the parties, discovery is...

  20. 31 CFR 10.71 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 31 Money and Finance: Treasury 1 2010-07-01 2010-07-01 false Discovery. 10.71 Section 10.71 Money... SERVICE Rules Applicable to Disciplinary Proceedings § 10.71 Discovery. (a) In general. Discovery may be... relevance, materiality and reasonableness of the requested discovery and subject to the requirements of § 10...

  1. 39 CFR 955.15 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 39 Postal Service 1 2010-07-01 2010-07-01 false Discovery. 955.15 Section 955.15 Postal Service... APPEALS § 955.15 Discovery. (a) The parties are encouraged to engage in voluntary discovery procedures. In connection with any deposition or other discovery procedure, the Board may issue any order which justice...

  2. 43 CFR 35.21 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 43 Public Lands: Interior 1 2010-10-01 2010-10-01 false Discovery. 35.21 Section 35.21 Public... AND STATEMENTS § 35.21 Discovery. (a) The following types of discovery are authorized: (1) Requests...) Unless mutually agreed to by the parties, discovery is available only as ordered by the ALJ. The ALJ...

  3. 15 CFR 766.9 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 15 Commerce and Foreign Trade 2 2010-01-01 2010-01-01 false Discovery. 766.9 Section 766.9... PROCEEDINGS § 766.9 Discovery. (a) General. The parties are encouraged to engage in voluntary discovery... provisions of the Federal Rules of Civil Procedure relating to discovery apply to the extent consistent with...

  4. An agent-based peer-to-peer architecture for semantic discovery of manufacturing services across virtual enterprises

    Science.gov (United States)

    Zhang, Wenyu; Zhang, Shuai; Cai, Ming; Jian, Wu

    2015-04-01

    With the development of the virtual enterprise (VE) paradigm, the usage of service-oriented architecture (SOA) is increasingly being considered for facilitating the integration and utilisation of distributed manufacturing resources. However, due to the heterogeneous nature among VEs, the dynamic nature of a VE and the autonomous nature of each VE member, the lack of both a sophisticated coordination mechanism in the popular centralised infrastructure and semantic expressivity in the existing SOA standards makes the current centralised, syntactic service discovery method undesirable. This motivates the proposed agent-based peer-to-peer (P2P) architecture for semantic discovery of manufacturing services across VEs. Multi-agent technology provides autonomous and flexible problem-solving capabilities in dynamic and adaptive VE environments. Peer-to-peer overlay provides highly scalable coupling across decentralised VEs, each of which acts as a peer composed of multiple agents dealing with manufacturing services. The proposed architecture utilises a novel, efficient, two-stage search strategy - semantic peer discovery and semantic service discovery - to handle the complex searches of manufacturing services across VEs through fast peer filtering. The operation and experimental evaluation of the prototype system are presented to validate the implementation of the proposed approach.

  5. Get Involved in Planetary Discoveries through New Worlds, New Discoveries

    Science.gov (United States)

    Shupla, Christine; Shipp, S. S.; Halligan, E.; Dalton, H.; Boonstra, D.; Buxner, S.; SMD Planetary Forum, NASA

    2013-01-01

    "New Worlds, New Discoveries" is a synthesis of NASA’s 50-year exploration history which provides an integrated picture of our new understanding of our solar system. As NASA spacecraft head to and arrive at key locations in our solar system, "New Worlds, New Discoveries" provides an integrated picture of our new understanding of the solar system to educators and the general public! The site combines the amazing discoveries of past NASA planetary missions with the most recent findings of ongoing missions, and connects them to the related planetary science topics. "New Worlds, New Discoveries," which includes the "Year of the Solar System" and the ongoing celebration of the "50 Years of Exploration," includes 20 topics that share thematic solar system educational resources and activities, tied to the national science standards. This online site and ongoing event offers numerous opportunities for the science community - including researchers and education and public outreach professionals - to raise awareness, build excitement, and make connections with educators, students, and the public about planetary science. Visitors to the site will find valuable hands-on science activities, resources and educational materials, as well as the latest news, to engage audiences in planetary science topics and their related mission discoveries. The topics are tied to the big questions of planetary science: how did the Sun’s family of planets and bodies originate and how have they evolved? How did life begin and evolve on Earth, and has it evolved elsewhere in our solar system? Scientists and educators are encouraged to get involved either directly or by sharing "New Worlds, New Discoveries" and its resources with educators, by conducting presentations and events, sharing their resources and events to add to the site, and adding their own public events to the site’s event calendar! Visit to find quality resources and ideas. Connect with educators, students and the public to

  6. 13 CFR 134.213 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 13 Business Credit and Assistance 1 2010-01-01 2010-01-01 false Discovery. 134.213 Section 134.213... OFFICE OF HEARINGS AND APPEALS Rules of Practice for Most Cases § 134.213 Discovery. (a) Motion. A party may obtain discovery only upon motion, and for good cause shown. (b) Forms. The forms of discovery...

  7. 31 CFR 16.21 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 31 Money and Finance: Treasury 1 2010-07-01 2010-07-01 false Discovery. 16.21 Section 16.21 Money... FRAUD CIVIL REMEDIES ACT OF 1986 § 16.21 Discovery. (a) The following types of discovery are authorized... to require the creation of a document. (c) Unless mutually agreed to by the parties, discovery is...

  8. Discovery and annotation of small proteins using genomics, proteomics and computational approaches

    Energy Technology Data Exchange (ETDEWEB)

    Yang, Xiaohan; Tschaplinski, Timothy J.; Hurst, Gregory B.; Jawdy, Sara; Abraham, Paul E.; Lankford, Patricia K.; Adams, Rachel M.; Shah, Manesh B.; Hettich, Robert L.; Lindquist, Erika; Kalluri, Udaya C.; Gunter, Lee E.; Pennacchio, Christa; Tuskan, Gerald A.

    2011-03-02

    Small proteins (10-200 amino acids [aa] in length) encoded by short open reading frames (sORF) play important regulatory roles in various biological processes, including tumor progression, stress response, flowering, and hormone signaling. However, ab initio discovery of small proteins has been relatively overlooked. Recent advances in deep transcriptome sequencing make it possible to efficiently identify sORFs at the genome level. In this study, we obtained 2.6 million expressed sequence tag (EST) reads from Populus deltoides leaf transcriptome and reconstructed full-length transcripts from the EST sequences. We identified an initial set of 12,852 sORFs encoding proteins of 10-200 aa in length. Three computational approaches were then used to enrich for bona fide protein-coding sORFs from the initial sORF set: (1) coding-potential prediction, (2) evolutionary conservation between P. deltoides and other plant species, and (3) gene family clustering within P. deltoides. As a result, a high-confidence sORF candidate set containing 1469 genes was obtained. Analysis of the protein domains, non-protein-coding RNA motifs, sequence length distribution, and protein mass spectrometry data supported this high-confidence sORF set. In the high-confidence sORF candidate set, known protein domains were identified in 1282 genes (higher-confidence sORF candidate set), out of which 611 genes, designated as highest-confidence candidate sORF set, were supported by proteomics data. Of the 611 highest-confidence candidate sORF genes, 56 were new to the current Populus genome annotation. This study not only demonstrates that there are potential sORF candidates to be annotated in sequenced genomes, but also presents an efficient strategy for discovery of sORFs in species with no genome annotation yet available.
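
    The first step described in this abstract, collecting ORFs that encode proteins within a fixed length window, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `find_sorfs`, the forward-strand-only scan, and the ATG-to-stop definition of an ORF are all assumptions made for illustration; only the 10 to 200 aa bounds come from the record.

```python
# Illustrative sketch only; not the method used in the study.
# Length bounds (in amino acids) are taken from the abstract.
MIN_AA, MAX_AA = 10, 200
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_sorfs(transcript, min_aa=MIN_AA, max_aa=MAX_AA):
    """Return (start, end, n_aa) for forward-strand ORFs within the bounds.

    An ORF here is assumed to run from the first ATG in a reading frame
    to the next in-frame stop codon; `end` is exclusive and includes the stop.
    """
    seq = transcript.upper()
    hits = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3  # codons before the stop, ATG included
                if min_aa <= n_aa <= max_aa:
                    hits.append((start, i + 3, n_aa))
                start = None  # close the ORF and keep scanning this frame
    return hits
```

    Calling `find_sorfs` on each reconstructed transcript would yield a candidate list that later filters (coding potential, conservation, gene family clustering) could narrow; those later steps are not sketched here.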

  9. Improved detection of common variants associated with schizophrenia and bipolar disorder using pleiotropy-informed conditional false discovery rate

    DEFF Research Database (Denmark)

    Andreassen, Ole A; Thompson, Wesley K; Schork, Andrew J

    2013-01-01

    are currently lacking. Here, we use a genetic pleiotropy-informed conditional false discovery rate (FDR) method on GWAS summary statistics data to identify new loci associated with schizophrenia (SCZ) and bipolar disorders (BD), two highly heritable disorders with significant missing heritability...... associated with both SCZ and BD (conjunction FDR). Together, these findings show the feasibility of genetic pleiotropy-informed methods to improve gene discovery in SCZ and BD and indicate overlapping genetic mechanisms between these two disorders....

  10. Invasion and persistence of a selfish gene in the Cnidaria.

    Directory of Open Access Journals (Sweden)

    Matthew R Goddard

    Full Text Available BACKGROUND: Homing endonuclease genes (HEGs) are superfluous, but are capable of invading populations that mix alleles by biasing their inheritance patterns through gene conversion. One model suggests that their long-term persistence is achieved through recurrent invasion. This circumvents evolutionary degeneration, but requires reasonable rates of transfer between species to maintain purifying selection. Although HEGs are found in a variety of microbes, we found the previous discovery of this type of selfish genetic element in the mitochondria of a sea anemone surprising. METHODS/PRINCIPAL FINDINGS: We surveyed 29 species of Cnidaria for the presence of the COXI HEG. Statistical analyses provided evidence for HEG invasion. We also found that 96 individuals of Metridium senile, from five different locations in the UK, had identical HEG sequences. This lack of sequence divergence illustrates the stable nature of Anthozoan mitochondria. Our data suggests this HEG conforms to the recurrent invasion model of evolution. CONCLUSIONS: Ordinarily such low rates of HEG transfer would likely be insufficient to enable major invasion. However, the slow rate of Anthozoan mitochondrial change lengthens greatly the time to HEG degeneration: this significantly extends the periodicity of the HEG life-cycle. We suggest that a combination of very low substitution rates and rare transfers facilitated metazoan HEG invasion.

  11. Invasion and persistence of a selfish gene in the Cnidaria.

    Science.gov (United States)

    Goddard, Matthew R; Leigh, Jessica; Roger, Andrew J; Pemberton, Andrew J

    2006-12-20

    Homing endonuclease genes (HEGs) are superfluous, but are capable of invading populations that mix alleles by biasing their inheritance patterns through gene conversion. One model suggests that their long-term persistence is achieved through recurrent invasion. This circumvents evolutionary degeneration, but requires reasonable rates of transfer between species to maintain purifying selection. Although HEGs are found in a variety of microbes, we found the previous discovery of this type of selfish genetic element in the mitochondria of a sea anemone surprising. We surveyed 29 species of Cnidaria for the presence of the COXI HEG. Statistical analyses provided evidence for HEG invasion. We also found that 96 individuals of Metridium senile, from five different locations in the UK, had identical HEG sequences. This lack of sequence divergence illustrates the stable nature of Anthozoan mitochondria. Our data suggests this HEG conforms to the recurrent invasion model of evolution. Ordinarily such low rates of HEG transfer would likely be insufficient to enable major invasion. However, the slow rate of Anthozoan mitochondrial change lengthens greatly the time to HEG degeneration: this significantly extends the periodicity of the HEG life-cycle. We suggest that a combination of very low substitution rates and rare transfers facilitated metazoan HEG invasion.

  12. Unlocking the treasure trove: from genes to schizophrenia biology.

    Science.gov (United States)

    McCarthy, Shane E; McCombie, W Richard; Corvin, Aiden

    2014-05-01

    Significant progress is being made in defining the genetic etiology of schizophrenia. As the list of implicated genes grows, parallel developments in gene editing technology provide new methods to investigate gene function in model systems. The confluence of these two research fields--gene discovery and functional biology--may offer novel insights into schizophrenia etiology. We review recent advances in these fields, consider the likely obstacles to progress, and consider strategies as to how these can be overcome.

  13. Discovery and Characterization of Two Novel Salt-Tolerance Genes in Puccinellia tenuiflora

    Directory of Open Access Journals (Sweden)

    Ying Li

    2014-09-01

    Full Text Available Puccinellia tenuiflora is a monocotyledonous halophyte that is able to survive in extreme saline soil environments at an alkaline pH range of 9–10. In this study, we transformed full-length cDNAs of P. tenuiflora into Saccharomyces cerevisiae by using the full-length cDNA over-expressing gene-hunting system to identify novel salt-tolerance genes. In all, 32 yeast clones overexpressing P. tenuiflora cDNA were obtained by screening under NaCl stress conditions; of these, 31 clones showed stronger tolerance to NaCl and were amplified using polymerase chain reaction (PCR) and sequenced. Four novel genes encoding proteins with unknown function were identified; these genes had no homology with genes from higher plants. Of the four isolated genes, two that encoded proteins with two transmembrane domains showed the strongest resistance to 1.3 M NaCl. RT-PCR and northern blot analysis of P. tenuiflora cultured cells confirmed the endogenous NaCl-induced expression of the two proteins. Both of the proteins conferred better tolerance in yeasts to high salt, alkaline and osmotic conditions, some heavy metals and H2O2 stress. Thus, we inferred that the two novel proteins might alleviate oxidative and other stresses in P. tenuiflora.

  14. The avian-origin PB1 gene segment facilitated replication and transmissibility of the H3N2/1968 pandemic influenza virus.

    Science.gov (United States)

    Wendel, Isabel; Rubbenstroth, Dennis; Doedt, Jennifer; Kochs, Georg; Wilhelm, Jochen; Staeheli, Peter; Klenk, Hans-Dieter; Matrosovich, Mikhail

    2015-04-01

    The H2N2/1957 and H3N2/1968 pandemic influenza viruses emerged via the exchange of genomic RNA segments between human and avian viruses. The avian hemagglutinin (HA) allowed the hybrid viruses to escape preexisting immunity in the human population. Both pandemic viruses further received the PB1 gene segment from the avian parent (Y. Kawaoka, S. Krauss, and R. G. Webster, J Virol 63:4603-4608, 1989), but the biological significance of this observation was not understood. To assess whether the avian-origin PB1 segment provided pandemic viruses with some selective advantage, either on its own or via cooperation with the homologous HA segment, we modeled by reverse genetics the reassortment event that led to the emergence of the H3N2/1968 pandemic virus. Using seasonal H2N2 virus A/California/1/66 (Cal) as a surrogate precursor human virus and pandemic virus A/Hong Kong/1/68 (H3N2) (HK) as a source of avian-derived PB1 and HA gene segments, we generated four reassortant recombinant viruses and compared pairs of viruses which differed solely by the origin of PB1. Replacement of the PB1 segment of Cal by PB1 of HK facilitated viral polymerase activity, replication efficiency in human cells, and contact transmission in guinea pigs. A combination of PB1 and HA segments of HK did not enhance replicative fitness of the reassortant virus compared with the single-gene PB1 reassortant. Our data suggest that the avian PB1 segment of the 1968 pandemic virus served to enhance viral growth and transmissibility, likely by enhancing activity of the viral polymerase complex. Despite the high impact of influenza pandemics on human health, some mechanisms underlying the emergence of pandemic influenza viruses still are poorly understood. Thus, it was unclear why both H2N2/1957 and H3N2/1968 reassortant pandemic viruses contained, in addition to the avian HA, the PB1 gene segment of the avian parent. Here, we addressed this long-standing question by modeling the emergence of the H3N2

  15. Single-Feature Polymorphism Discovery in the Transcriptome of Tetraploid Alfalfa

    Directory of Open Access Journals (Sweden)

    S. Samuel Yang

    2009-11-01

    Full Text Available Advances in alfalfa [ (L. subsp. ] breeding, molecular genetics, and genomics have been slow because this crop is an allogamous autotetraploid (2n = 4x = 32) with complex polysomic inheritance and few genomic resources. Increasing cellulose and decreasing lignin in alfalfa stem cell walls would improve this crop as a cellulosic ethanol feedstock. We conducted genome-wide analysis of single-feature polymorphisms (SFPs) of two alfalfa genotypes (252, 1283) that differ in stem cell wall lignin and cellulose concentrations. SFP analysis was conducted using the GeneChip (Affymetrix, Santa Clara, CA) as a cross-species platform. Analysis of GeneChip expression data files of alfalfa stem internodes of genotypes 252 and 1283 at two growth stages (elongating, post-elongation) revealed 10,890 SFPs in 8230 probe sets. Validation analysis by polymerase chain reaction (PCR)-sequencing of a random sample of SFPs indicated a 17% false discovery rate. Functional classification and over-representation analysis showed that genes involved in photosynthesis, stress response and cell wall biosynthesis were highly enriched among SFP-harboring genes. The GeneChip is a suitable cross-species platform for detecting SFPs in tetraploid alfalfa.

  16. Decades of Discovery

    Science.gov (United States)

    2011-06-01

    For the past two-and-a-half decades, the Office of Science at the U.S. Department of Energy has been at the forefront of scientific discovery. Over 100 important discoveries supported by the Office of Science are represented in this document.

  17. NCI Program for Natural Product Discovery: A Publicly-Accessible Library of Natural Product Fractions for High-Throughput Screening.

    Science.gov (United States)

    Thornburg, Christopher C; Britt, John R; Evans, Jason R; Akee, Rhone K; Whitt, James A; Trinh, Spencer K; Harris, Matthew J; Thompson, Jerell R; Ewing, Teresa L; Shipley, Suzanne M; Grothaus, Paul G; Newman, David J; Schneider, Joel P; Grkovic, Tanja; O'Keefe, Barry R

    2018-06-13

    The US National Cancer Institute's (NCI) Natural Product Repository is one of the world's largest, most diverse collections of natural products containing over 230,000 unique extracts derived from plant, marine, and microbial organisms that have been collected from biodiverse regions throughout the world. Importantly, this national resource is available to the research community for the screening of extracts and the isolation of bioactive natural products. However, despite the success of natural products in drug discovery, compatibility issues that make extracts challenging for liquid handling systems, extended timelines that complicate natural product-based drug discovery efforts and the presence of pan-assay interfering compounds have reduced enthusiasm for the high-throughput screening (HTS) of crude natural product extract libraries in targeted assay systems. To address these limitations, the NCI Program for Natural Product Discovery (NPNPD), a newly launched, national program to advance natural product discovery technologies and facilitate the discovery of structurally defined, validated lead molecules ready for translation will create a prefractionated library from over 125,000 natural product extracts with the aim of producing a publicly-accessible, HTS-amenable library of >1,000,000 fractions. This library, representing perhaps the largest accumulation of natural-product based fractions in the world, will be made available free of charge in 384-well plates for screening against all disease states in an effort to reinvigorate natural product-based drug discovery.

  18. Identifying potential maternal genes of Bombyx mori using digital gene expression profiling

    Science.gov (United States)

    Xu, Pingzhen

    2018-01-01

    Maternal genes present in mature oocytes play a crucial role in the early development of silkworm. Although maternal genes have been widely studied in many other species, there has been limited research in Bombyx mori. High-throughput next generation sequencing provides a practical method for gene discovery on a genome-wide level. Herein, a transcriptome study was used to identify maternal-related genes from silkworm eggs. Unfertilized eggs from five different stages of early development were used to track changes in gene expression. The expressed genes showed different patterns over time. Seventy-six maternal genes were annotated according to homology analysis with Drosophila melanogaster. More than half of the differentially expressed maternal genes fell into four expression patterns, while the expression patterns showed a downward trend over time. The functional annotation of these maternal genes was mainly related to transcription factor activity, growth factor activity, nucleic acid binding, RNA binding, ATP binding, and ion binding. Additionally, twenty-two gene clusters including maternal genes were identified from 18 scaffolds. Altogether, we plotted a profile for the maternal genes of Bombyx mori using a digital gene expression profiling method. This will provide the basis for maternal-specific signature research and improve the understanding of the early development of silkworm. PMID:29462160

  19. Discovery and the atom

    International Nuclear Information System (INIS)

    1989-01-01

    "Discovery and the Atom" tells the story of the founding of nuclear physics. This programme looks at nuclear physics up to the discovery of the neutron in 1932. Animation explains the science of the classic experiments, such as the scattering of alpha particles by Rutherford and the discovery of the nucleus. Archive film shows the people: Lord Rutherford, James Chadwick, Marie Curie. (author)

  20. Cross-organism learning method to discover new gene functionalities.

    Science.gov (United States)

    Domeniconi, Giacomo; Masseroli, Marco; Moro, Gianluca; Pinoli, Pietro

    2016-04-01

    Knowledge of gene and protein functions is paramount for the understanding of physiological and pathological biological processes, as well as in the development of new drugs and therapies. Analyses for biomedical knowledge discovery greatly benefit from the availability of gene and protein functional feature descriptions expressed through controlled terminologies and ontologies, i.e., of gene and protein biomedical controlled annotations. In recent years, several databases of such annotations have become available; yet, these valuable annotations are incomplete, include errors, and only some of them represent highly reliable human-curated information. Computational techniques able to reliably predict new gene or protein annotations with an associated likelihood value are thus paramount. Here, we propose a novel cross-organism learning approach to reliably predict new functionalities for the genes of an organism based on the known controlled annotations of the genes of another, evolutionarily related and better studied, organism. We leverage a new representation of the annotation discovery problem and a random perturbation of the available controlled annotations to allow the application of supervised algorithms to predict with good accuracy unknown gene annotations. Taking advantage of the numerous gene annotations available for a well-studied organism, our cross-organism learning method creates and trains better prediction models, which can then be applied to predict new gene annotations of a target organism. We tested and compared our method with the equivalent single-organism approach on different gene annotation datasets of five evolutionarily related organisms (Homo sapiens, Mus musculus, Bos taurus, Gallus gallus and Dictyostelium discoideum). Results show both the usefulness of the perturbation method of available annotations for better prediction model training and a great improvement of the cross-organism models with respect to the single-organism ones

  1. EXONSAMPLER: a computer program for genome-wide and candidate gene exon sampling for targeted next-generation sequencing.

    Science.gov (United States)

    Cosart, Ted; Beja-Pereira, Albano; Luikart, Gordon

    2014-11-01

    The computer program EXONSAMPLER automates the sampling of thousands of exon sequences from publicly available reference genome sequences and gene annotation databases. It was designed to provide exon sequences for the efficient, next-generation gene sequencing method called exon capture. The exon sequences can be sampled by a list of gene name abbreviations (e.g. IFNG, TLR1), or by sampling exons from genes spaced evenly across chromosomes. It provides a list of genomic coordinates (a bed file), as well as a set of sequences in fasta format. User-adjustable parameters for collecting exon sequences include a minimum and maximum acceptable exon length, maximum number of exonic base pairs (bp) to sample per gene, and maximum total bp for the entire collection. It allows for partial sampling of very large exons. It can preferentially sample upstream (5 prime) exons, downstream (3 prime) exons, both external exons, or all internal exons. It is written in the Python programming language using its free libraries. We describe the use of EXONSAMPLER to collect exon sequences from the domestic cow (Bos taurus) genome for the design of an exon-capture microarray to sequence exons from related species, including the zebu cow and wild bison. We collected ~10% of the exome (~3 million bp), including 155 candidate genes, and ~16,000 exons evenly spaced genomewide. We prioritized the collection of 5 prime exons to facilitate discovery and genotyping of SNPs near upstream gene regulatory DNA sequences, which control gene expression and are often under natural selection. © 2014 John Wiley & Sons Ltd.
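    The user-adjustable limits described in this abstract (minimum and maximum exon length, a per-gene base-pair cap, a total base-pair cap, and partial sampling of very large exons) can be illustrated with a minimal sketch. This is not EXONSAMPLER's code: the `Exon` type, the function names, the default values, the greedy input-order pass, and the choice to keep the first `max_len` bases of an oversized exon are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    chrom: str
    start: int  # 0-based start, BED convention
    end: int    # exclusive end, BED convention
    gene: str


def sample_exons(exons, min_len=100, max_len=400,
                 max_bp_per_gene=600, max_total_bp=2000):
    """Greedy pass over exons in input order (hypothetical strategy)."""
    selected = []
    bp_per_gene = {}
    total_bp = 0
    for exon in exons:
        length = exon.end - exon.start
        if length < min_len:
            continue  # too short to be useful
        if length > max_len:
            # Partial sampling: keep only the first max_len bases (assumption).
            exon = Exon(exon.chrom, exon.start, exon.start + max_len, exon.gene)
            length = max_len
        used = bp_per_gene.get(exon.gene, 0)
        if used + length > max_bp_per_gene:
            continue  # per-gene budget exhausted
        if total_bp + length > max_total_bp:
            continue  # collection-wide budget exhausted
        selected.append(exon)
        bp_per_gene[exon.gene] = used + length
        total_bp += length
    return selected


def to_bed(exons):
    """Render the selection as tab-separated BED lines (chrom, start, end, name)."""
    return "\n".join(f"{e.chrom}\t{e.start}\t{e.end}\t{e.gene}" for e in exons)


if __name__ == "__main__":
    # Made-up coordinates; only the gene symbols come from the abstract.
    demo = [
        Exon("chr1", 0, 50, "IFNG"),      # shorter than min_len, skipped
        Exon("chr1", 100, 300, "IFNG"),   # kept whole
        Exon("chr1", 400, 1400, "IFNG"),  # truncated to max_len
        Exon("chr2", 0, 300, "TLR1"),     # kept whole
    ]
    print(to_bed(sample_exons(demo)))
```

    A real tool would also need strand-aware handling when preferring 5 prime or 3 prime exons, which this sketch leaves out.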

  2. Gene Module Identification from Microarray Data Using Nonnegative Independent Component Analysis

    Directory of Open Access Journals (Sweden)

    Ting Gong

    2007-01-01

    Genes mostly interact with each other to form transcriptional modules for performing single or multiple functions. It is important to unravel such transcriptional modules and to determine how disturbances in them may lead to disease. Here, we propose a non-negative independent component analysis (nICA) approach for transcriptional module discovery. The nICA method utilizes the non-negativity constraint to enforce the independence of biological processes within the participating genes. In this way, nICA decomposes the observed gene expression into positive independent components, which fits better to the reality of the corresponding putative biological processes. In conjunction with nICA modeling, the visual statistical data analyzer (VISDA) is applied to group genes into modules in latent variable space. We demonstrate the usefulness of the approach through the identification of composite modules from yeast data and the discovery of pathway modules in muscle regeneration.

  3. Reanalysis of RNA-sequencing data reveals several additional fusion genes with multiple isoforms.

    Science.gov (United States)

    Kangaspeska, Sara; Hultsch, Susanne; Edgren, Henrik; Nicorici, Daniel; Murumägi, Astrid; Kallioniemi, Olli

    2012-01-01

    RNA-sequencing and tailored bioinformatic methodologies have paved the way for identification of expressed fusion genes from the chaotic genomes of solid tumors. We have recently successfully exploited RNA-sequencing for the discovery of 24 novel fusion genes in breast cancer. Here, we demonstrate the importance of continuous optimization of the bioinformatic methodology for this purpose, and report the discovery and experimental validation of 13 additional fusion genes from the same samples. Integration of copy number profiling with the RNA-sequencing results revealed that the majority of the gene fusions were promoter-donating events that occurred at copy number transition points or involved high-level DNA-amplifications. Sequencing of genomic fusion break points confirmed that DNA-level rearrangements underlie selected fusion transcripts. Furthermore, a significant portion (>60%) of the fusion genes were alternatively spliced. This illustrates the importance of reanalyzing sequencing data as gene definitions change and bioinformatic methods improve, and highlights the previously unforeseen isoform diversity among fusion transcripts.

  4. Reanalysis of RNA-sequencing data reveals several additional fusion genes with multiple isoforms.

    Directory of Open Access Journals (Sweden)

    Sara Kangaspeska

    RNA-sequencing and tailored bioinformatic methodologies have paved the way for identification of expressed fusion genes from the chaotic genomes of solid tumors. We have recently successfully exploited RNA-sequencing for the discovery of 24 novel fusion genes in breast cancer. Here, we demonstrate the importance of continuous optimization of the bioinformatic methodology for this purpose, and report the discovery and experimental validation of 13 additional fusion genes from the same samples. Integration of copy number profiling with the RNA-sequencing results revealed that the majority of the gene fusions were promoter-donating events that occurred at copy number transition points or involved high-level DNA-amplifications. Sequencing of genomic fusion break points confirmed that DNA-level rearrangements underlie selected fusion transcripts. Furthermore, a significant portion (>60%) of the fusion genes were alternatively spliced. This illustrates the importance of reanalyzing sequencing data as gene definitions change and bioinformatic methods improve, and highlights the previously unforeseen isoform diversity among fusion transcripts.

  5. 29 CFR 2700.56 - Discovery; general.

    Science.gov (United States)

    2010-07-01

    ...(c) or 111 of the Act has been filed. 30 U.S.C. 815(c) and 821. (e) Completion of discovery... 29 Labor 9 2010-07-01 2010-07-01 false Discovery; general. 2700.56 Section 2700.56 Labor... Hearings § 2700.56 Discovery; general. (a) Discovery methods. Parties may obtain discovery by one or more...

  6. Discovery and Reuse of Open Datasets: An Exploratory Study

    Directory of Open Access Journals (Sweden)

    Sara

    2016-07-01

    Objective: This article analyzes twenty cited or downloaded datasets and the repositories that house them, in order to produce insights that can be used by academic libraries to encourage discovery and reuse of research data in institutional repositories. Methods: Using Thomson Reuters’ Data Citation Index and repository download statistics, we identified twenty cited/downloaded datasets. We documented the characteristics of the cited/downloaded datasets and their corresponding repositories in a self-designed rubric. The rubric includes six major categories: basic information; funding agency and journal information; linking and sharing; factors to encourage reuse; repository characteristics; and data description. Results: Our small-scale study suggests that cited/downloaded datasets generally comply with basic recommendations for facilitating reuse: data are documented well; formatted for use with a variety of software; and shared in established, open access repositories. Three significant factors also appear to contribute to dataset discovery: publishing in discipline-specific repositories; indexing in more than one location on the web; and using persistent identifiers. The cited/downloaded datasets in our analysis came from a few specific disciplines, and tended to be funded by agencies with data publication mandates. Conclusions: The results of this exploratory research provide insights that can inform academic librarians as they work to encourage discovery and reuse of institutional datasets. Our analysis also suggests areas in which academic librarians can target open data advocacy in their communities in order to begin to build open data success stories that will fuel future advocacy efforts.

  7. Promoter sequence of 3-phosphoglycerate kinase gene 1 of lactic acid-producing fungus rhizopus oryzae and a method of expressing a gene of interest in fungal species

    Energy Technology Data Exchange (ETDEWEB)

    Gao, Johnway [Richland, WA; Skeen, Rodney S [Pendleton, OR

    2002-10-15

    The present invention provides the promoter clone discovery of phosphoglycerate kinase gene 1 of a lactic acid-producing filamentous fungal strain, Rhizopus oryzae. The isolated promoter can constitutively regulate gene expression under various carbohydrate conditions. In addition, the present invention also provides a design of an integration vector for the transformation of a foreign gene in Rhizopus oryzae.

  8. Promoter sequence of 3-phosphoglycerate kinase gene 2 of lactic acid-producing fungus rhizopus oryzae and a method of expressing a gene of interest in fungal species

    Energy Technology Data Exchange (ETDEWEB)

    Gao, Johnway [Richland, WA; Skeen, Rodney S [Pendleton, OR

    2003-03-04

    The present invention provides the promoter clone discovery of phosphoglycerate kinase gene 2 of a lactic acid-producing filamentous fungal strain, Rhizopus oryzae. The isolated promoter can constitutively regulate gene expression under various carbohydrate conditions. In addition, the present invention also provides a design of an integration vector for the transformation of a foreign gene in Rhizopus oryzae.

  9. Experiences in fragment-based drug discovery.

    Science.gov (United States)

    Murray, Christopher W; Verdonk, Marcel L; Rees, David C

    2012-05-01

    Fragment-based drug discovery (FBDD) has become established in both industry and academia as an alternative approach to high-throughput screening for the generation of chemical leads for drug targets. In FBDD, specialised detection methods are used to identify small chemical compounds (fragments) that bind to the drug target, and structural biology is usually employed to establish their binding mode and to facilitate their optimisation. In this article, we present three recent and successful case histories in FBDD. We then re-examine the key concepts and challenges of FBDD with particular emphasis on recent literature and our own experience from a substantial number of FBDD applications. Our opinion is that careful application of FBDD is living up to its promise of delivering high quality leads with good physical properties and that in future many drug molecules will be derived from fragment-based approaches. Copyright © 2012 Elsevier Ltd. All rights reserved.

  10. History, Discovery, and Classification of lncRNAs.

    Science.gov (United States)

    Jarroux, Julien; Morillon, Antonin; Pinskaya, Marina

    2017-01-01

    The RNA World Hypothesis suggests that prebiotic life revolved around RNA instead of DNA and proteins. Although modern cells have changed significantly in 4 billion years, RNA has maintained its central role in cell biology. Since the discovery of DNA at the end of the nineteenth century, RNA has been extensively studied. Many discoveries such as housekeeping RNAs (rRNA, tRNA, etc.) supported the messenger RNA model that is the pillar of the central dogma of molecular biology, which was first devised in the late 1950s. Thirty years later, the first regulatory non-coding RNAs (ncRNAs) were initially identified in bacteria and then in most eukaryotic organisms. A few long ncRNAs (lncRNAs) such as H19 and Xist were characterized in the pre-genomic era but remained exceptions until the early 2000s. Indeed, when the sequence of the human genome was published in 2001, studies showed that only about 1.2% encodes proteins, the rest being deemed "non-coding." It was later shown that the genome is pervasively transcribed into many ncRNAs, but their functionality remained controversial. Since then, regulatory lncRNAs have been characterized in many species and were shown to be involved in processes such as development and pathologies, revealing a new layer of regulation in eukaryotic cells. This newly found focus on lncRNAs, together with the advent of high-throughput sequencing, was accompanied by the rapid discovery of many novel transcripts which were further characterized and classified according to specific transcript traits. In this review, we will discuss the many discoveries that led to the study of lncRNAs, from Friedrich Miescher's "nuclein" in 1869 to the elucidation of the human genome and transcriptome in the early 2000s. We will then focus on the biological relevance during lncRNA evolution and describe their basic features as genes and transcripts. Finally, we will present a non-exhaustive catalogue of lncRNA classes, thus illustrating the vast complexity of

  11. Bioluminescent bacteria: lux genes as environmental biosensors

    Directory of Open Access Journals (Sweden)

    Nunes-Halldorson Vânia da Silva

    2003-01-01

    Bioluminescent bacteria are widespread in natural environments. Over the years, many researchers have been studying the physiology, biochemistry and genetic control of bacterial bioluminescence. These discoveries have revolutionized the area of Environmental Microbiology through the use of luminescent genes as biosensors for environmental studies. This paper will review the chronology of scientific discoveries on bacterial bioluminescence and the current applications of bioluminescence in environmental studies, with special emphasis on the Microtox toxicity bioassay. Also, the general ecological significance of bioluminescence will be addressed.

  12. Microbial genome mining for accelerated natural products discovery: is a renaissance in the making?

    Science.gov (United States)

    Bachmann, Brian O; Van Lanen, Steven G; Baltz, Richard H

    2014-02-01

    Microbial genome mining is a rapidly developing approach to discover new and novel secondary metabolites for drug discovery. Many advances have been made in the past decade to facilitate genome mining, and these are reviewed in this Special Issue of the Journal of Industrial Microbiology and Biotechnology. In this Introductory Review, we discuss the concept of genome mining and why it is important for the revitalization of natural product discovery; what microbes show the most promise for focused genome mining; how microbial genomes can be mined; how genome mining can be leveraged with other technologies; how progress on genome mining can be accelerated; and who should fund future progress in this promising field. We direct interested readers to more focused reviews on the individual topics in this Special Issue for more detailed summaries on the current state-of-the-art.

  13. Hot or not? Discovery and characterization of a thermostable alditol oxidase from Acidothermus cellulolyticus 11B

    NARCIS (Netherlands)

    Winter, Remko T.; Heuts, Dominic P. H. M.; Rijpkema, Egon M. A.; van Bloois, Edwin; Wijma, Hein J.; Fraaije, Marco W.

    We describe the discovery, isolation and characterization of a highly thermostable alditol oxidase from Acidothermus cellulolyticus 11B. This protein was identified by searching the genomes of known thermophiles for enzymes homologous to Streptomyces coelicolor A3(2) alditol oxidase (AldO). A gene

  14. Bioluminescent bacteria: lux genes as environmental biosensors [Bactérias bioluminescentes: os genes lux como biosensores ambientais]

    OpenAIRE

    Nunes-Halldorson, Vânia da Silva; Duran, Norma Letícia

    2003-01-01

    Bioluminescent bacteria are widespread in natural environments. Over the years, many researchers have been studying the physiology, biochemistry and genetic control of bacterial bioluminescence. These discoveries have revolutionized the area of Environmental Microbiology through the use of luminescent genes as biosensors for environmental studies. This paper will review the chronology of scientific discoveries on bacterial bioluminescence and the current applications of bioluminescence in env...

  15. Bioinformatics and phylogenetic analysis of human Tp73 gene ...

    African Journals Online (AJOL)

    The Tp73 gene encoding p73 protein belongs to the Tp53 gene family. It functions in the initiation of cell-cycle arrest or apoptosis and is also involved in regulating a series of pathways, including those in breast cancer, neuroblastoma and colorectal cancer. New discoveries about the control and function of p73 are still in progress ...

  16. A Tale of Two Discoveries: Comparing the Usability of Summon and EBSCO Discovery Service

    Science.gov (United States)

    Foster, Anita K.; MacDonald, Jean B.

    2013-01-01

    Web-scale discovery systems are gaining momentum among academic libraries as libraries seek a means to provide their users with a one-stop searching experience. Illinois State University's Milner Library found itself in the unique position of having access to two distinct discovery products, EBSCO Discovery Service and Serials Solutions' Summon.…

  17. Systematic identification of latent disease-gene associations from PubMed articles.

    Science.gov (United States)

    Zhang, Yuji; Shen, Feichen; Mojarad, Majid Rastegar; Li, Dingcheng; Liu, Sijia; Tao, Cui; Yu, Yue; Liu, Hongfang

    2018-01-01

    Recent scientific advances have accumulated a tremendous amount of biomedical knowledge providing novel insights into the relationship between molecular and cellular processes and diseases. Literature mining is one of the commonly used methods to retrieve and extract information from scientific publications for understanding these associations. However, because the data volume is large and the associations are complicated and noisy, interpreting such association data for semantic knowledge discovery is challenging. In this study, we describe an integrative computational framework aiming to expedite the discovery of latent disease mechanisms by dissecting 146,245 disease-gene associations from over 25 million PubMed-indexed articles. We take advantage of both Latent Dirichlet Allocation (LDA) modeling and network-based analysis for their capabilities of detecting latent associations and reducing noise in large-volume data, respectively. Our results demonstrate that (1) the LDA-based modeling is able to group similar diseases into disease topics; (2) the disease-specific association networks follow the scale-free network property; (3) certain subnetwork patterns were enriched in the disease-specific association networks; and (4) genes were enriched in topic-specific biological processes. Our approach offers promising opportunities for latent disease-gene knowledge discovery in biomedical research.

  18. Discovery of the leinamycin family of natural products by mining actinobacterial genomes.

    Science.gov (United States)

    Pan, Guohui; Xu, Zhengren; Guo, Zhikai; Hindra; Ma, Ming; Yang, Dong; Zhou, Hao; Gansemans, Yannick; Zhu, Xiangcheng; Huang, Yong; Zhao, Li-Xing; Jiang, Yi; Cheng, Jinhua; Van Nieuwerburgh, Filip; Suh, Joo-Won; Duan, Yanwen; Shen, Ben

    2017-12-26

    Nature's ability to generate diverse natural products from simple building blocks has inspired combinatorial biosynthesis. The knowledge-based approach to combinatorial biosynthesis has allowed the production of designer analogs by rational metabolic pathway engineering. While successful, structural alterations are limited, with designer analogs often produced in compromised titers. The discovery-based approach to combinatorial biosynthesis complements the knowledge-based approach by exploring the vast combinatorial biosynthesis repertoire found in Nature. Here we showcase the discovery-based approach to combinatorial biosynthesis by targeting the domain of unknown function and cysteine lyase domain (DUF-SH) didomain, specific for sulfur incorporation from the leinamycin (LNM) biosynthetic machinery, to discover the LNM family of natural products. By mining bacterial genomes from public databases and the actinomycetes strain collection at The Scripps Research Institute, we discovered 49 potential producers that could be grouped into 18 distinct clades based on phylogenetic analysis of the DUF-SH didomains. Further analysis of the representative genomes from each of the clades identified 28 lnm-type gene clusters. Structural diversities encoded by the LNM-type biosynthetic machineries were predicted based on bioinformatics and confirmed by in vitro characterization of selected adenylation proteins and isolation and structural elucidation of the guangnanmycins and weishanmycins. These findings demonstrate the power of the discovery-based approach to combinatorial biosynthesis for natural product discovery and structural diversity and highlight Nature's rich biosynthetic repertoire. Comparative analysis of the LNM-type biosynthetic machineries provides outstanding opportunities to dissect Nature's biosynthetic strategies and apply these findings to combinatorial biosynthesis for natural product discovery and structural diversity.

  19. Gene set analysis for interpreting genetic studies

    DEFF Research Database (Denmark)

    Pers, Tune H

    2016-01-01

    Interpretation of genome-wide association study (GWAS) results is lagging behind the discovery of new genetic associations. Consequently, there is an urgent need for data-driven methods for interpreting genetic association studies. Gene set analysis (GSA) can identify aetiologic pathways...

  20. 29 CFR 2200.208 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 29 Labor 9 2010-07-01 2010-07-01 false Discovery. 2200.208 Section 2200.208 Labor Regulations Relating to Labor (Continued) OCCUPATIONAL SAFETY AND HEALTH REVIEW COMMISSION RULES OF PROCEDURE Simplified Proceedings § 2200.208 Discovery. Discovery, including requests for admissions, will only be...

  1. 47 CFR 65.105 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 47 Telecommunication 3 2010-10-01 2010-10-01 false Discovery. 65.105 Section 65.105... OF RETURN PRESCRIPTION PROCEDURES AND METHODOLOGIES Procedures § 65.105 Discovery. (a) Participants... evidence. (c) Discovery requests pursuant to § 65.105(b), including written interrogatories, shall be filed...

  2. 49 CFR 209.313 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 49 Transportation 4 2010-10-01 2010-10-01 false Discovery. 209.313 Section 209.313 Transportation... TRANSPORTATION RAILROAD SAFETY ENFORCEMENT PROCEDURES Disqualification Procedures § 209.313 Discovery. (a... parties. Discovery is designed to enable a party to obtain relevant information needed for preparation of...

  3. Metagenomics as a Tool for Enzyme Discovery: Hydrolytic Enzymes from Marine-Related Metagenomes.

    Science.gov (United States)

    Popovic, Ana; Tchigvintsev, Anatoly; Tran, Hai; Chernikova, Tatyana N; Golyshina, Olga V; Yakimov, Michail M; Golyshin, Peter N; Yakunin, Alexander F

    2015-01-01

    This chapter discusses metagenomics and its application for enzyme discovery, with a focus on hydrolytic enzymes from marine metagenomic libraries. With less than one percent of culturable microorganisms in the environment, metagenomics, or the collective study of community genetics, has opened up a rich pool of uncharacterized metabolic pathways, enzymes, and adaptations. This great untapped pool of genes provides the particularly exciting potential to mine for new biochemical activities or novel enzymes with activities tailored to peculiar sets of environmental conditions. Metagenomes also represent a huge reservoir of novel enzymes for applications in biocatalysis, biofuels, and bioremediation. Here we present the results of enzyme discovery for four enzyme activities, of particular industrial or environmental interest, including esterase/lipase, glycosyl hydrolase, protease and dehalogenase.

  4. De Novo Discovery of Structured ncRNA Motifs in Genomic Sequences

    DEFF Research Database (Denmark)

    Ruzzo, Walter L; Gorodkin, Jan

    2014-01-01

    De novo discovery of "motifs" capturing the commonalities among related noncoding structured RNAs is among the most difficult problems in computational biology. This chapter outlines the challenges presented by this problem, together with some approaches towards solving them, with an emphasis on an approach based on the CMfinder program as a case study. Applications to genomic screens for novel de novo structured ncRNAs, including structured RNA elements in untranslated portions of protein-coding genes, are presented.

  5. Quantifying the Ease of Scientific Discovery.

    Science.gov (United States)

    Arbesman, Samuel

    2011-02-01

    It has long been known that scientific output proceeds on an exponential increase, or more properly, a logistic growth curve. The interplay between effort and discovery is clear, and the nature of the functional form has been thought to be due to many changes in the scientific process over time. Here I show a quantitative method for examining the ease of scientific progress, another necessary component in understanding scientific discovery. Using examples from three different scientific disciplines - mammalian species, chemical elements, and minor planets - I find the ease of discovery to conform to an exponential decay. In addition, I show how the pace of scientific discovery can be best understood as the outcome of both scientific output and ease of discovery. A quantitative study of the ease of scientific discovery in the aggregate, such as done here, has the potential to provide a great deal of insight into both the nature of future discoveries and the technical processes behind discoveries in science.
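    The claim that ease of discovery conforms to an exponential decay can be made concrete with a small fitting sketch. The abstract does not describe the fitting procedure, so the approach below (ordinary least squares on the logarithm of the values) is an assumption, as are the function name and the synthetic data, which do not come from the paper.

```python
import math


def fit_exponential_decay(times, values):
    """Fit values ~ a * exp(-k * t) by least squares on log(values).

    Returns (a, k). All values must be strictly positive.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two (time, value) pairs")
    if any(v <= 0 for v in values):
        raise ValueError("values must be positive to take logarithms")
    logs = [math.log(v) for v in values]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("times must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -slope


if __name__ == "__main__":
    # Synthetic, noise-free series generated from a = 5.0, k = 0.3.
    ts = [0, 1, 2, 3, 4]
    ys = [5.0 * math.exp(-0.3 * t) for t in ts]
    a, k = fit_exponential_decay(ts, ys)
    print(f"a = {a:.3f}, k = {k:.3f}")
```

    Fitting in log space weights relative rather than absolute errors, which may or may not match the method used in the study.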

  6. Coexpression landscape in ATTED-II: usage of gene list and gene network for various types of pathways.

    Science.gov (United States)

    Obayashi, Takeshi; Kinoshita, Kengo

    2010-05-01

    Gene coexpression analyses are a powerful method to predict the function of genes and/or to identify genes that are functionally related to query genes. The basic idea of gene coexpression analyses is that genes with similar functions should have similar expression patterns under many different conditions. This approach is now widely used by many experimental researchers, especially in the field of plant biology. In this review, we will summarize recent successful examples obtained by using our gene coexpression database, ATTED-II. Specifically, the examples will describe the identification of new genes, such as the subunits of a complex protein, the enzymes in a metabolic pathway and transporters. In addition, we will discuss the discovery of a new intercellular signaling factor and new regulatory relationships between transcription factors and their target genes. In ATTED-II, we provide two basic views of gene coexpression, a gene list view and a gene network view, which can be used as a guide-gene approach and a narrow-down approach, respectively. In addition, we will discuss the coexpression effectiveness for various types of gene sets.
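
    The basic idea described above (genes with similar functions show similar expression patterns across conditions) can be sketched as a ranked gene list for one guide gene. This is an illustrative assumption, not ATTED-II's method: the abstract does not state which similarity measure the database uses, so plain Pearson correlation stands in here, and the gene names and expression values are invented.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (undefined for constant profiles)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def rank_coexpressed(guide, expression):
    """Return (gene, correlation) pairs for all other genes, most similar first."""
    scores = [
        (gene, pearson(expression[guide], profile))
        for gene, profile in expression.items()
        if gene != guide
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)

# Hypothetical expression levels across five conditions.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],  # tracks geneA closely
    "geneC": [5.0, 4.0, 3.0, 2.0, 1.0],  # mirror image of geneA
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],  # only loosely related
}

ranking = rank_coexpressed("geneA", expression)
for gene, r in ranking:
    print(f"{gene}: r = {r:+.3f}")
```

    Applying a correlation threshold to all gene pairs, rather than ranking against a single guide gene, would give the edges of a network-style view.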

  7. KBERG: KnowledgeBase for Estrogen Responsive Genes

    DEFF Research Database (Denmark)

    Tang, Suisheng; Zhang, Zhuo; Tan, Sin Lam

    2007-01-01

    Estrogen has a profound impact on human physiology affecting transcription of numerous genes. To decipher functional characteristics of estrogen responsive genes, we developed KnowledgeBase for Estrogen Responsive Genes (KBERG). Genes in KBERG were derived from Estrogen Responsive Gene Database (ERGDB) and were analyzed from multiple aspects. We explored the possible transcription regulation mechanism by capturing highly conserved promoter motifs across orthologous genes, using promoter regions that cover the range of [-1200, +500] relative to the transcription start sites. The motif detection is based on ab initio discovery of common cis-elements from the orthologous gene cluster from human, mouse and rat, thus reflecting a degree of promoter sequence preservation during evolution. The identified motifs are linked to transcription factor binding sites based on the TRANSFAC database. In addition...

  8. 15 CFR 280.210 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 15 Commerce and Foreign Trade 1 2010-01-01 2010-01-01 false Discovery. 280.210 Section 280.210... STANDARDS AND TECHNOLOGY, DEPARTMENT OF COMMERCE ACCREDITATION AND ASSESSMENT PROGRAMS FASTENER QUALITY Enforcement § 280.210 Discovery. (a) General. The parties are encouraged to engage in voluntary discovery...

  9. A Comprehensive Classification and Evolutionary Analysis of Plant Homeobox Genes

    OpenAIRE

    Mukherjee, Krishanu; Brocchieri, Luciano; Bürglin, Thomas R.

    2009-01-01

    The full complement of homeobox transcription factor sequences, including genes and pseudogenes, was determined from the analysis of 10 complete genomes from flowering plants, moss, Selaginella, unicellular green algae, and red algae. Our exhaustive genome-wide searches resulted in the discovery in each class of a greater number of homeobox genes than previously reported. All homeobox genes can be unambiguously classified by sequence evolutionary analysis into 14 distinct classes also charact...

  10. Multiplex cDNA quantification method that facilitates the standardization of gene expression data

    Science.gov (United States)

    Gotoh, Osamu; Murakami, Yasufumi; Suyama, Akira

    2011-01-01

    Microarray-based gene expression measurement is one of the major methods for transcriptome analysis. However, current microarray data are substantially affected by microarray platforms and RNA references because the microarray method can provide merely the relative amounts of gene expression levels. Therefore, valid comparisons of the microarray data require standardized platforms, internal and/or external controls and complicated normalizations. These requirements impose limitations on the extensive comparison of gene expression data. Here, we report an effective approach to removing the unfavorable limitations by measuring the absolute amounts of gene expression levels on common DNA microarrays. We have developed a multiplex cDNA quantification method called GEP-DEAN (Gene expression profiling by DCN-encoding-based analysis). The method was validated by using chemically synthesized DNA strands of known quantities and cDNA samples prepared from mouse liver, demonstrating that the absolute amounts of cDNA strands were successfully measured with a sensitivity of 18 zmol in a highly multiplexed manner in 7 h. PMID:21415008

  11. A glycogene mutation map for discovery of diseases of glycosylation

    DEFF Research Database (Denmark)

    Hansen, Lars; Lind-Thomsen, Allan; Joshi, Hiren J

    2015-01-01

    homologous families. However, Genome-Wide-Association Studies (GWAS) have identified such isoenzyme genes as candidates for different diseases, but validation is not straightforward without biomarkers. Large-scale whole exome sequencing (WES) provides access to mutations in e.g. glycosyltransferase genes in populations, which can be used to predict and/or analyze functional deleterious mutations. Here, we constructed a draft of a Functional Mutational Map of glycogenes, GlyMAP, from WES of a rather homogenous population of 2,000 Danes. We catalogued all missense mutations and used prediction algorithms, manual inspection, and in case of CAZy family GT27 experimental analysis of mutations to map deleterious mutations. GlyMAP provides a first global view of the genetic stability of the glycogenome and should serve as a tool for discovery of novel CDGs.

  12. Genome engineering for microbial natural product discovery.

    Science.gov (United States)

    Choi, Si-Sun; Katsuyama, Yohei; Bai, Linquan; Deng, Zixin; Ohnishi, Yasuo; Kim, Eung-Soo

    2018-03-03

    The discovery and development of microbial natural products (MNPs) have played pivotal roles in the fields of human medicine and its related biotechnology sectors over the past several decades. The post-genomic era has witnessed the development of microbial genome mining approaches to isolate previously unsuspected MNP biosynthetic gene clusters (BGCs) hidden in the genome, followed by various BGC awakening techniques to visualize compound production. Additional microbial genome engineering techniques have allowed higher MNP production titers, which could complement a traditional culture-based MNP chasing approach. Here, we describe recent developments in the MNP research paradigm, including microbial genome mining, NP BGC activation, and NP overproducing cell factory design.

  13. 10 CFR 1013.21 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 10 Energy 4 2010-01-01 2010-01-01 false Discovery. 1013.21 Section 1013.21 Energy DEPARTMENT OF ENERGY (GENERAL PROVISIONS) PROGRAM FRAUD CIVIL REMEDIES AND PROCEDURES § 1013.21 Discovery. (a) The following types of discovery are authorized: (1) Requests for production of documents for inspection and...

  14. 37 CFR 2.120 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Discovery. 2.120 Section 2... COMMERCE RULES OF PRACTICE IN TRADEMARK CASES Procedure in Inter Partes Proceedings § 2.120 Discovery. (a... to disclosure and discovery shall apply in opposition, cancellation, interference and concurrent use...

  15. 46 CFR 550.502 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 46 Shipping 9 2010-10-01 2010-10-01 false Discovery. 550.502 Section 550.502 Shipping FEDERAL... Proceedings § 550.502 Discovery. The Commission may authorize a party to a proceeding to use depositions, written interrogatories, and discovery procedures that, to the extent practicable, are in conformity with...

  16. 15 CFR 785.8 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 15 Commerce and Foreign Trade 2 2010-01-01 2010-01-01 false Discovery. 785.8 Section 785.8... INDUSTRY AND SECURITY, DEPARTMENT OF COMMERCE ADDITIONAL PROTOCOL REGULATIONS ENFORCEMENT § 785.8 Discovery. (a) General. The parties are encouraged to engage in voluntary discovery regarding any matter, not...

  17. 22 CFR 35.21 - Discovery.

    Science.gov (United States)

    2010-04-01

    ... 22 Foreign Relations 1 2010-04-01 2010-04-01 false Discovery. 35.21 Section 35.21 Foreign Relations DEPARTMENT OF STATE CLAIMS AND STOLEN PROPERTY PROGRAM FRAUD CIVIL REMEDIES § 35.21 Discovery. (a) The following types of discovery are authorized: (1) Requests for production of documents for...

  18. 45 CFR 96.65 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 45 Public Welfare 1 2010-10-01 2010-10-01 false Discovery. 96.65 Section 96.65 Public Welfare DEPARTMENT OF HEALTH AND HUMAN SERVICES GENERAL ADMINISTRATION BLOCK GRANTS Hearing Procedure § 96.65 Discovery. The use of interrogatories, depositions, and other forms of discovery shall not be allowed. ...

  19. 49 CFR 31.21 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 49 Transportation 1 2010-10-01 2010-10-01 false Discovery. 31.21 Section 31.21 Transportation Office of the Secretary of Transportation PROGRAM FRAUD CIVIL REMEDIES § 31.21 Discovery. (a) The following types of discovery are authorized: (1) Requests for production of documents for inspection and...

  20. 43 CFR 4.1130 - Discovery methods.

    Science.gov (United States)

    2010-10-01

    ... 43 Public Lands: Interior 1 2010-10-01 2010-10-01 false Discovery methods. 4.1130 Section 4.1130... Special Rules Applicable to Surface Coal Mining Hearings and Appeals Discovery § 4.1130 Discovery methods. Parties may obtain discovery by one or more of the following methods— (a) Depositions upon oral...

  1. NASA's GeneLab Phase II: Federated Search and Data Discovery

    Science.gov (United States)

    Berrios, Daniel C.; Costes, Sylvain V.; Tran, Peter B.

    2017-01-01

    GeneLab is currently being developed by NASA to accelerate 'open science' biomedical research in support of the human exploration of space and the improvement of life on earth. Phase I of the four-phase GeneLab Data Systems (GLDS) project emphasized capabilities for submission, curation, search, and retrieval of genomics, transcriptomics and proteomics ('omics') data from biomedical research of space environments. The focus of development of the GLDS for Phase II has been federated data search for and retrieval of these kinds of data across other open-access systems, so that users are able to conduct biological meta-investigations using data from a variety of sources. Such meta-investigations are key to corroborating findings from many kinds of assays and translating them into systems biology knowledge and, eventually, therapeutics.

  2. NASAs GeneLab Phase II: Federated Search and Data Discovery

    Science.gov (United States)

    Berrios, Daniel C.; Costes, Sylvain; Tran, Peter

    2017-01-01

    GeneLab is currently being developed by NASA to accelerate open science biomedical research in support of the human exploration of space and the improvement of life on earth. Phase I of the four-phase GeneLab Data Systems (GLDS) project emphasized capabilities for submission, curation, search, and retrieval of genomics, transcriptomics and proteomics (omics) data from biomedical research of space environments. The focus of development of the GLDS for Phase II has been federated data search for and retrieval of these kinds of data across other open-access systems, so that users are able to conduct biological meta-investigations using data from a variety of sources. Such meta-investigations are key to corroborating findings from many kinds of assays and translating them into systems biology knowledge and, eventually, therapeutics.

  3. 6 CFR 13.21 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 6 Domestic Security 1 2010-01-01 2010-01-01 false Discovery. 13.21 Section 13.21 Domestic Security DEPARTMENT OF HOMELAND SECURITY, OFFICE OF THE SECRETARY PROGRAM FRAUD CIVIL REMEDIES § 13.21 Discovery. (a) In general. (1) The following types of discovery are authorized: (i) Requests for production of...

  4. 45 CFR 99.23 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 45 Public Welfare 1 2010-10-01 2010-10-01 false Discovery. 99.23 Section 99.23 Public Welfare... DEVELOPMENT FUND Hearing Procedures § 99.23 Discovery. The Department, the Lead Agency, and any individuals or groups recognized as parties shall have the right to conduct discovery (including depositions) against...

  5. 20 CFR 355.21 - Discovery.

    Science.gov (United States)

    2010-04-01

    ... 20 Employees' Benefits 1 2010-04-01 2010-04-01 false Discovery. 355.21 Section 355.21 Employees... UNDER THE PROGRAM FRAUD CIVIL REMEDIES ACT OF 1986 § 355.21 Discovery. (a) The following types of discovery are authorized: (1) Requests for production of documents for inspection and copying; (2) Requests...

  6. 10 CFR 2.1018 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 10 Energy 1 2010-01-01 2010-01-01 false Discovery. 2.1018 Section 2.1018 Energy NUCLEAR REGULATORY... Geologic Repository § 2.1018 Discovery. (a)(1) Parties, potential parties, and interested governmental participants in the high-level waste licensing proceeding may obtain discovery by one or more of the following...

  7. 28 CFR 71.21 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 28 Judicial Administration 2 2010-07-01 2010-07-01 false Discovery. 71.21 Section 71.21 Judicial... REMEDIES ACT OF 1986 Implementation for Actions Initiated by the Department of Justice § 71.21 Discovery. (a) The following types of discovery are authorized: (1) Requests for production of documents for...

  8. 13 CFR 134.310 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 13 Business Credit and Assistance 1 2010-01-01 2010-01-01 false Discovery. 134.310 Section 134.310 Business Credit and Assistance SMALL BUSINESS ADMINISTRATION RULES OF PROCEDURE GOVERNING CASES BEFORE THE... Designations § 134.310 Discovery. Discovery will not be permitted in appeals from size determinations or NAICS...

  9. 34 CFR 33.21 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 34 Education 1 2010-07-01 2010-07-01 false Discovery. 33.21 Section 33.21 Education Office of the Secretary, Department of Education PROGRAM FRAUD CIVIL REMEDIES ACT § 33.21 Discovery. (a) The following types of discovery are authorized: (1) Requests for production of documents for inspection and copying...

  10. 28 CFR 18.7 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 28 Judicial Administration 1 2010-07-01 2010-07-01 false Discovery. 18.7 Section 18.7 Judicial Administration DEPARTMENT OF JUSTICE OFFICE OF JUSTICE PROGRAMS HEARING AND APPEAL PROCEDURES § 18.7 Discovery.... Such order may be entered upon a showing that the deposition is necessary for discovery purposes, and...

  11. 7 CFR 1.322 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 1 2010-01-01 2010-01-01 false Discovery. 1.322 Section 1.322 Agriculture Office of... Under the Program Fraud Civil Remedies Act of 1986 § 1.322 Discovery. (a) The following types of discovery are authorized: (1) Requests for production, inspection and photocopying of documents; (2...

  12. 45 CFR 1386.103 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 45 Public Welfare 4 2010-10-01 2010-10-01 false Discovery. 1386.103 Section 1386.103 Public... Hearing Procedures § 1386.103 Discovery. The Department and any party named in the Notice issued pursuant to § 1386.90 has the right to conduct discovery (including depositions) against opposing parties as...

  13. 45 CFR 79.21 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 45 Public Welfare 1 2010-10-01 2010-10-01 false Discovery. 79.21 Section 79.21 Public Welfare DEPARTMENT OF HEALTH AND HUMAN SERVICES GENERAL ADMINISTRATION PROGRAM FRAUD CIVIL REMEDIES § 79.21 Discovery. (a) The following types of discovery are authorized: (1) Requests for production of documents for...

  14. 12 CFR 308.520 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 12 Banks and Banking 4 2010-01-01 2010-01-01 false Discovery. 308.520 Section 308.520 Banks and... PROCEDURE Program Fraud Civil Remedies and Procedures § 308.520 Discovery. (a) The following types of discovery are authorized: (1) Requests for production of documents for inspection and copying; (2) Requests...

  15. 47 CFR 1.729 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 47 Telecommunication 1 2010-10-01 2010-10-01 false Discovery. 1.729 Section 1.729..., and Reports Involving Common Carriers Formal Complaints § 1.729 Discovery. (a) Subject to paragraph (i... seek discovery of any non-privileged matter that is relevant to the material facts in dispute in the...

  16. 7 CFR 283.12 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 4 2010-01-01 2010-01-01 false Discovery. 283.12 Section 283.12 Agriculture... of $50,000 or More § 283.12 Discovery. (a) Dispositions—(1) Motion for taking deposition. Only upon a... exist if the information sought appears reasonably calculated to lead to the discovery of admissible...

  17. 42 CFR 1005.7 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 42 Public Health 5 2010-10-01 2010-10-01 false Discovery. 1005.7 Section 1005.7 Public Health... OF EXCLUSIONS, CIVIL MONEY PENALTIES AND ASSESSMENTS § 1005.7 Discovery. (a) A party may make a... and any forms of discovery, other than those permitted under paragraph (a) of this section, are not...

  18. 29 CFR 1603.210 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 29 Labor 4 2010-07-01 2010-07-01 false Discovery. 1603.210 Section 1603.210 Labor Regulations... GOVERNMENT EMPLOYEE RIGHTS ACT OF 1991 Hearings § 1603.210 Discovery. (a) Unless otherwise ordered by the administrative law judge, discovery may begin as soon as the complaint has been transmitted to the administrative...

  19. 45 CFR 150.435 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 45 Public Welfare 1 2010-10-01 2010-10-01 false Discovery. 150.435 Section 150.435 Public Welfare... AND INDIVIDUAL INSURANCE MARKETS Administrative Hearings § 150.435 Discovery. (a) The parties must identify any need for discovery from the opposing party as soon as possible, but no later than the time for...

  20. 34 CFR 81.16 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 34 Education 1 2010-07-01 2010-07-01 false Discovery. 81.16 Section 81.16 Education Office of the... Discovery. (a) The parties to a case are encouraged to exchange relevant documents and information voluntarily. (b) The ALJ, at a party's request, may order compulsory discovery described in paragraph (c) of...

  1. 29 CFR 1905.25 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 29 Labor 5 2010-07-01 2010-07-01 false Discovery. 1905.25 Section 1905.25 Labor Regulations... OCCUPATIONAL SAFETY AND HEALTH ACT OF 1970 Hearings § 1905.25 Discovery. (a) Depositions. (1) For reasons of... discovery. Whenever appropriate to a just disposition of any issue in a hearing, the presiding hearing...

  2. 12 CFR 1780.26 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 12 Banks and Banking 7 2010-01-01 2010-01-01 false Discovery. 1780.26 Section 1780.26 Banks and... OF PRACTICE AND PROCEDURE RULES OF PRACTICE AND PROCEDURE Prehearing Proceedings § 1780.26 Discovery. (a) Limits on discovery. Subject to the limitations set out in paragraphs (b), (d), and (e) of this...

  3. 45 CFR 160.516 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 45 Public Welfare 1 2010-10-01 2010-10-01 false Discovery. 160.516 Section 160.516 Public Welfare... ADMINISTRATIVE REQUIREMENTS Procedures for Hearings § 160.516 Discovery. (a) A party may make a request to... forms of discovery, other than those permitted under paragraph (a) of this section, are not authorized...

  4. 42 CFR 430.86 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 42 Public Health 4 2010-10-01 2010-10-01 false Discovery. 430.86 Section 430.86 Public Health... Plans and Practice to Federal Requirements § 430.86 Discovery. CMS and any party named in the notice issued under § 430.70 has the right to conduct discovery (including depositions) against opposing parties...

  5. Evolution of melanopsin photoreceptors: discovery and characterization of a new melanopsin in nonmammalian vertebrates.

    Directory of Open Access Journals (Sweden)

    James Bellingham

    2006-07-01

    In mammals, the melanopsin gene (Opn4) encodes a sensory photopigment that underpins newly discovered inner retinal photoreceptors. Since its first discovery in Xenopus laevis and subsequent description in humans and mice, melanopsin genes have been described in all vertebrate classes. Until now, all of these sequences have been considered representatives of a single orthologous gene (albeit with duplications in the teleost fish). Here, we describe the discovery and functional characterisation of a new melanopsin gene in fish, bird, and amphibian genomes, demonstrating that, in fact, the vertebrates have evolved two quite separate melanopsins. On the basis of sequence similarity, chromosomal localisation, and phylogeny, we identify our new melanopsins as the true orthologs of the melanopsin gene previously described in mammals and term this grouping Opn4m. By contrast, the previously published melanopsin genes in nonmammalian vertebrates represent a separate branch of the melanopsin family which we term Opn4x. RT-PCR analysis in chicken, zebrafish, and Xenopus identifies expression of both Opn4m and Opn4x genes in tissues known to be photosensitive (eye, brain, and skin). In the day-14 chicken eye, Opn4m mRNA is found in a subset of cells in the outer nuclear, inner nuclear, and ganglion cell layers, the vast majority of which also express Opn4x. Importantly, we show that a representative of the new melanopsins (chicken Opn4m) encodes a photosensory pigment capable of activating G protein signalling cascades in a light- and retinaldehyde-dependent manner under heterologous expression in Neuro-2a cells. A comprehensive in silico analysis of vertebrate genomes indicates that while most vertebrate species have both Opn4m and Opn4x genes, the latter is absent from eutherian and, possibly, marsupial mammals, lost in the course of their evolution as a result of chromosomal reorganisation. Thus, our findings show for the first time that nonmammalian

  6. Discovery Driven Growth

    DEFF Research Database (Denmark)

    Bukh, Per Nikolaj

    2009-01-01

    Review of Discovery Driven Growth: A Breakthrough Process to Reduce Risk and Seize Opportunity, by Rita G. McGrath & Ian C. MacMillan, Boston: Harvard Business Press. Publication date: 14 August...

  7. FARO server: Meta-analysis of gene expression by matching gene expression signatures to a compendium of public gene expression data

    DEFF Research Database (Denmark)

    Manijak, Mieszko P.; Nielsen, Henrik Bjørn

    2011-01-01

    circumvented by instead matching gene expression signatures to signatures of other experiments. FINDINGS: To facilitate this we present the Functional Association Response by Overlap (FARO) server, which matches input signatures to a compendium of 242 gene expression signatures, extracted from more than 1700 Arabidopsis microarray experiments. CONCLUSIONS: Hereby we present a publicly available tool for robust characterization of Arabidopsis gene expression experiments which can point to similar experimental factors in other experiments. The server is available at http://www.cbs.dtu.dk/services/faro/.
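
    Matching an input signature against a compendium by overlap, as described above, can be sketched as follows. The abstract does not describe how the FARO server scores or tests overlaps, so ranking by the raw count of shared genes is an assumption made for illustration; rank_by_overlap, the signature names and the gene identifiers are all hypothetical.

```python
def rank_by_overlap(query, compendium):
    """Rank compendium signatures by how many genes they share with the query."""
    query_genes = set(query)
    hits = []
    for name, signature in compendium.items():
        shared = query_genes & set(signature)
        hits.append((name, len(shared), sorted(shared)))
    # Largest overlap first; ties keep the compendium's insertion order.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

# Toy compendium: each signature is the set of genes responding to one factor.
compendium = {
    "cold_stress": {"g1", "g2", "g3", "g4"},
    "drought": {"g3", "g4", "g5"},
    "pathogen": {"g6", "g7"},
}
query = ["g2", "g3", "g4", "g8"]

hits = rank_by_overlap(query, compendium)
for name, size, shared in hits:
    print(f"{name}: {size} shared genes {shared}")
```

    A real tool would presumably also account for signature size and the direction of regulation, for example with a significance test on the overlap; the raw count is the simplest stand-in.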

  8. 42 CFR 405.1037 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 42 Public Health 2 2010-10-01 2010-10-01 false Discovery. 405.1037 Section 405.1037 Public Health... Appeals Under Original Medicare (Part A and Part B) Alj Hearings § 405.1037 Discovery. (a) General rules. (1) Discovery is permissible only when CMS or its contractor elects to participate in an ALJ hearing...

  9. 20 CFR 498.207 - Discovery.

    Science.gov (United States)

    2010-04-01

    ... 20 Employees' Benefits 2 2010-04-01 2010-04-01 false Discovery. 498.207 Section 498.207 Employees... § 498.207 Discovery. (a) For the purpose of inspection and copying, a party may make a request to...) Any form of discovery other than that permitted under paragraph (a) of this section, such as requests...

  10. 42 CFR 93.512 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 42 Public Health 1 2010-10-01 2010-10-01 false Discovery. 93.512 Section 93.512 Public Health... Process § 93.512 Discovery. (a) Request to provide documents. A party may only request another party to...) Responses to a discovery request. Within 30 days of receiving a request for the production of documents, a...

  11. 42 CFR 3.516 - Discovery.

    Science.gov (United States)

    2010-10-01

    ... 42 Public Health 1 2010-10-01 2010-10-01 false Discovery. 3.516 Section 3.516 Public Health PUBLIC... AND PATIENT SAFETY WORK PRODUCT Enforcement Program § 3.516 Discovery. (a) A party may make a request... and any forms of discovery, other than those permitted under paragraph (a) of this section, are not...

  12. 29 CFR 22.21 - Discovery.

    Science.gov (United States)

    2010-07-01

    ... 29 Labor 1 2010-07-01 2010-07-01 true Discovery. 22.21 Section 22.21 Labor Office of the Secretary of Labor PROGRAM FRAUD CIVIL REMEDIES ACT OF 1986 § 22.21 Discovery. (a) The following types of discovery are authorized: (1) Requests for production of documents for inspection and copying; (2) Requests...

  13. Molecular mechanisms of D-cycloserine in facilitating fear extinction: insights from RNAseq.

    Science.gov (United States)

    Malan-Müller, Stefanie; Fairbairn, Lorren; Daniels, Willie M U; Dashti, Mahjoubeh Jalali Sefid; Oakeley, Edward J; Altorfer, Marc; Kidd, Martin; Seedat, Soraya; Gamieldien, Junaid; Hemmings, Sîan Megan Joanna

    2016-02-01

    D-cycloserine (DCS) has been shown to be effective in facilitating fear extinction in animal and human studies; however, the precise mechanisms whereby the co-administration of DCS and behavioural fear extinction reduces fear are still unclear. This study investigated the molecular mechanisms of intrahippocampally administered D-cycloserine in facilitating fear extinction in a contextual fear conditioning animal model. Male Sprague Dawley rats (n = 120) were grouped into four experimental groups (n = 30) based on fear conditioning and intrahippocampal administration of either DCS or saline. The light/dark avoidance test was used to differentiate maladapted (MA) (anxious) from well-adapted (WA) (not anxious) subgroups. RNA extracted from the left dorsal hippocampus was used for RNA sequencing, and gene expression data were compared between six fear-conditioned + saline MA (FEAR + SALINE MA) and six fear-conditioned + DCS WA (FEAR + DCS WA) animals. Of the 424 significantly downregulated and 25 significantly upregulated genes identified in the FEAR + DCS WA group compared to the FEAR + SALINE MA group, 121 downregulated and nine upregulated genes were predicted to be relevant to fear conditioning and anxiety and stress-related disorders. The majority of downregulated genes transcribed immune, proinflammatory and oxidative stress systems molecules. These molecules mediate neuroinflammation and cause neuronal damage. DCS also regulated genes involved in learning and memory processes, and genes associated with anxiety, stress-related disorders and co-occurring diseases (e.g., cardiovascular diseases, digestive system diseases and nervous system diseases). Identifying the molecular underpinnings of DCS-mediated fear extinction brings us closer to understanding the process of fear extinction.

  14. Computational methods in drug discovery

    Directory of Open Access Journals (Sweden)

    Sumudu P. Leelananda

    2016-12-01

    The process for drug discovery and development is challenging, time consuming and expensive. Computer-aided drug discovery (CADD) tools can act as a virtual shortcut, assisting in the expedition of this long process and potentially reducing the cost of research and development. Today CADD has become an effective and indispensable tool in therapeutic development. The human genome project has made available a substantial amount of sequence data that can be used in various drug discovery projects. Additionally, increasing knowledge of biological structures, as well as increasing computer power have made it possible to use computational methods effectively in various phases of the drug discovery and development pipeline. The importance of in silico tools is greater than ever before and has advanced pharmaceutical research. Here we present an overview of computational methods used in different facets of drug discovery and highlight some of the recent successes. In this review, both structure-based and ligand-based drug discovery methods are discussed. Advances in virtual high-throughput screening, protein structure prediction methods, protein–ligand docking, pharmacophore modeling and QSAR techniques are reviewed.

  15. MIPHENO: Data normalization for high throughput metabolic analysis.

    Science.gov (United States)

    High throughput methodologies such as microarrays, mass spectrometry and plate-based small molecule screens are increasingly used to facilitate discoveries from gene function to drug candidate identification. These large-scale experiments are typically carried out over the course...

  16. Analysis of cassava (Manihot esculenta) ESTs: A tool for the discovery of genes

    International Nuclear Information System (INIS)

    Zapata, Andres; Neme, Rafik; Sanabria, Carolina; Lopez, Camilo

    2011-01-01

    Cassava (Manihot esculenta) is the main source of calories for more than 1,000 million people around the world and has been consolidated as the fourth most important crop after rice, corn and wheat. Cassava is considered tolerant to abiotic and biotic stress conditions; nevertheless, these characteristics are mainly present in non-commercial varieties. Genetic breeding strategies represent an alternative for introducing the desirable characteristics into commercial varieties. A fundamental step in accelerating the genetic breeding process in cassava is the identification of genes associated with these characteristics. One rapid strategy for identifying genes is to use a large collection of expressed sequence tags (ESTs). In this study, a complete analysis of cassava ESTs was carried out. The cassava ESTs comprise 80,459 sequences, which were assembled into a set of 29,231 unique genes (unigenes), comprising 10,945 contigs and 18,286 singletons. These 29,231 unigenes represent about 80% of the genes in the cassava genome. Between 5% and 10% of the cassava unigenes do not show similarity to any sequence present in the NCBI database and could be considered cassava-specific genes. A functional category was assigned to a group of sequences of the unigene set (29%) following the Gene Ontology vocabulary. The molecular function component was the best represented, with 43% of the sequences, followed by the biological process component (38%) and finally the cellular component (19%). In the cassava EST collection, 3,709 microsatellites were identified, which could be used as molecular markers. This study represents an important contribution to the knowledge of the functional genomic structure of cassava and constitutes an important tool for the identification of genes associated with agricultural characteristics of interest that could be employed in cassava breeding programs.
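
    The assembly figures quoted above are internally consistent, which a few lines of arithmetic confirm. All input numbers come from the abstract; the derived quantities (average ESTs per unigene and the approximate count of annotated unigenes) are back-of-the-envelope estimates, and the 29% figure is itself rounded in the source, so the annotated count is approximate.

```python
# Figures taken from the abstract.
ests = 80_459
contigs = 10_945
singletons = 18_286
reported_unigenes = 29_231

# Contigs plus singletons should account for every unigene.
unigenes = contigs + singletons
print("unigenes match report:", unigenes == reported_unigenes)

# Derived estimate: average number of ESTs collapsed into each unigene.
ests_per_unigene = ests / unigenes
print(f"average ESTs per unigene: {ests_per_unigene:.2f}")

# Gene Ontology shares among annotated sequences, in percent.
go_shares = {
    "molecular function": 43,
    "biological process": 38,
    "cellular component": 19,
}
print("GO shares sum to:", sum(go_shares.values()))

# Rough count of unigenes that received a functional category (29% is rounded).
annotated_estimate = round(0.29 * unigenes)
print("approximately annotated unigenes:", annotated_estimate)
```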

  17. Discovery and characterization of two new stem rust resistance genes in Aegilops sharonensis.

    Science.gov (United States)

    Yu, Guotai; Champouret, Nicolas; Steuernagel, Burkhard; Olivera, Pablo D; Simmons, Jamie; Williams, Cole; Johnson, Ryan; Moscou, Matthew J; Hernández-Pinzón, Inmaculada; Green, Phon; Sela, Hanan; Millet, Eitan; Jones, Jonathan D G; Ward, Eric R; Steffenson, Brian J; Wulff, Brande B H

    2017-06-01

    We identified two novel wheat stem rust resistance genes, Sr-1644-1Sh and Sr-1644-5Sh, in Aegilops sharonensis that are effective against widely virulent African races of the wheat stem rust pathogen. Stem rust is one of the most important diseases of wheat in the world. When single stem rust resistance (Sr) genes are deployed in wheat, they are often rapidly overcome by the pathogen. To this end, we initiated a search for novel sources of resistance in diverse wheat relatives and identified the wild goatgrass species Aegilops sharonensis (Sharon goatgrass) as a rich reservoir of resistance to wheat stem rust. The objectives of this study were to discover and map novel Sr genes in Ae. sharonensis and to explore the possibility of identifying new Sr genes by genome-wide association study (GWAS). We developed two biparental populations between resistant and susceptible accessions of Ae. sharonensis and performed QTL and linkage analysis. In an F6 recombinant inbred line and an F2 population, two genes were identified that mapped to the short arm of chromosome 1Ssh, designated as Sr-1644-1Sh, and the long arm of chromosome 5Ssh, designated as Sr-1644-5Sh. The gene Sr-1644-1Sh confers a high level of resistance to race TTKSK (a member of the Ug99 race group), while the gene Sr-1644-5Sh conditions strong resistance to TRTTF, another widely virulent race found in Yemen. Additionally, GWAS was conducted on 125 diverse Ae. sharonensis accessions for stem rust resistance. The gene Sr-1644-1Sh was detected by GWAS, while Sr-1644-5Sh was not detected, indicating that the effectiveness of GWAS might be affected by marker density, population structure, low allele frequency and other factors.

  18. 10 CFR 205.198 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 10 Energy 3 2010-01-01 2010-01-01 false Discovery. 205.198 Section 205.198 Energy DEPARTMENT OF... of Proposed Disallowance, and Order of Disallowance § 205.198 Discovery. (a) If a person intends to file a Motion for Discovery, he must file it at the same time that he files his Statement of Objections...

  19. 12 CFR 908.46 - Discovery.

    Science.gov (United States)

    2010-01-01

    ... 12 Banks and Banking 7 2010-01-01 2010-01-01 false Discovery. 908.46 Section 908.46 Banks and... PRACTICE AND PROCEDURE IN HEARINGS ON THE RECORD Pre-Hearing Proceedings § 908.46 Discovery. (a) Limits on discovery. Subject to the limitations set out in paragraphs (b), (d), and (e) of this section, any party to...

  20. 21 CFR 17.23 - Discovery.

    Science.gov (United States)

    2010-04-01

    ... 21 Food and Drugs 1 2010-04-01 2010-04-01 false Discovery. 17.23 Section 17.23 Food and Drugs FOOD... HEARINGS § 17.23 Discovery. (a) No later than 60 days prior to the hearing, unless otherwise ordered by the..., depositions, and any forms of discovery, other than those permitted under paragraphs (a) and (e) of this...

  1. Cloning of partial cry1Ac gene from an indigenous isolate of Bacillus ...

    African Journals Online (AJOL)

    The discoveries of novel cry genes of Bacillus thuringiensis (Bt) with higher toxicity are important for the development of new products. The cry1 family genes are more toxic to the lepidopteran insects according to the previous reports. In the present study, nine indigenous isolates of Bt were used for screening of cry1 genes ...

  2. Virtual Hubs for facilitating access to Open Data

    Science.gov (United States)

    Mazzetti, Paolo; Latre, Miguel Á.; Ernst, Julia; Brumana, Raffaella; Brauman, Stefan; Nativi, Stefano

    2015-04-01

    The ENERGIC-OD (European NEtwork for Redistributing Geospatial Information to user Communities - Open Data) project, funded by the European Union under the Competitiveness and Innovation framework Programme (CIP), started in October 2014. In response to the EU call, the general objective of the project is to "facilitate the use of open (freely available) geographic data from different sources for the creation of innovative applications and services through the creation of Virtual Hubs". In ENERGIC-OD, Virtual Hubs are conceived as information systems supporting the full life cycle of Open Data: publishing, discovery and access. They facilitate the use of Open Data by lowering and possibly removing the main barriers which hamper geo-information (GI) usage by end-users and application developers. Heterogeneity of data and data services is recognized as one of the major barriers to Open Data (re-)use. It forces end-users and developers to spend a lot of effort in accessing different infrastructures and harmonizing datasets. Such heterogeneity cannot be completely removed through the adoption of standard specifications for service interfaces, metadata and data models, since different infrastructures adopt different standards to answer specific challenges and to address specific use-cases. Thus, beyond a certain extent, heterogeneity is irreducible, especially in interdisciplinary contexts. ENERGIC-OD Virtual Hubs address heterogeneity by adopting a mediation and brokering approach: specific components (brokers) are dedicated to harmonizing service interfaces, metadata and data models, enabling seamless discovery of and access to heterogeneous infrastructures and datasets. As an innovation project, ENERGIC-OD will integrate several existing technologies to implement Virtual Hubs as single points of access to geospatial datasets provided by new or existing platforms and infrastructures, including INSPIRE-compliant systems and Copernicus services. ENERGIC-OD will deploy a

  3. Diversifying Sunflower Germplasm by Integration and Mapping of a Novel Male Fertility Restoration Gene

    Science.gov (United States)

    Liu, Zhao; Wang, Dexing; Feng, Jiuhuan; Seiler, Gerald J.; Cai, Xiwen; Jan, Chao-Chien

    2013-01-01

    The combination of a single cytoplasmic male-sterile (CMS) PET-1 and the corresponding fertility restoration (Rf) gene Rf1 is used for commercial hybrid sunflower (Helianthus annuus L., 2n = 34) seed production worldwide. A new CMS line 514A was recently developed with H. tuberosus cytoplasm. However, 33 maintainers and restorers for CMS PET-1 and 20 additional tester lines failed to restore the fertility of CMS 514A. Here, we report the discovery, characterization, and molecular mapping of a novel Rf gene for CMS 514A derived from an amphiploid (Amp H. angustifolius/P 21, 2n = 68). Progeny analysis of the male-fertile (MF) plants (2n = 35) suggested that this gene, designated Rf6, was located on a single alien chromosome. Genomic in situ hybridization (GISH) indicated that Rf6 was on a chromosome with a small segment translocation on the long arm in the MF progenies (2n = 34). Rf6 was mapped to linkage group (LG) 3 of the sunflower SSR map. Eight markers were identified to be linked to this gene, covering a distance of 10.8 cM. Two markers, ORS13 and ORS1114, were only 1.6 cM away from the gene. Severe segregation distortions were observed for both the fertility trait and the linked marker loci, suggesting the possibility of a low frequency of recombination or gamete selection in this region. This study discovered a new CMS/Rf gene system derived from wild species and provided significant insight into the genetic basis of this system. This will diversify the germplasm for sunflower breeding and facilitate understanding of the interaction between the cytoplasm and nuclear genes. PMID:23307903

  4. Direct interactions of OCA-B and TFII-I regulate immunoglobulin heavy-chain gene transcription by facilitating enhancer-promoter communication.

    Science.gov (United States)

    Ren, Xiaodi; Siegel, Rachael; Kim, Unkyu; Roeder, Robert G

    2011-05-06

    B cell-specific coactivator OCA-B, together with Oct-1/2, binds to octamer sites in promoters and enhancers to activate transcription of immunoglobulin (Ig) genes, although the mechanisms underlying their roles in enhancer-promoter communication are unknown. Here, we demonstrate a direct interaction of OCA-B with transcription factor TFII-I, which binds to DICE elements in Igh promoters, that affects transcription at two levels. First, OCA-B relieves HDAC3-mediated Igh promoter repression by competing with HDAC3 for binding to promoter-bound TFII-I. Second, and most importantly, Igh 3' enhancer-bound OCA-B and promoter-bound TFII-I mediate promoter-enhancer interactions, in both cis and trans, that are important for Igh transcription. These and other results reveal an important function for OCA-B in Igh 3' enhancer function in vivo and strongly favor an enhancer mechanism involving looping and facilitated factor recruitment rather than a tracking mechanism. Copyright © 2011 Elsevier Inc. All rights reserved.

  5. You've gotta be lucky: Coverage and the elusive gene-gene interaction.

    Science.gov (United States)

    Reimherr, Matthew; Nicolae, Dan L

    2011-01-01

    Genome-wide association studies (GWAS) have led to a large number of single-SNP association findings, but there has been, so far, no investigation resulting in the discovery of a replicable gene-gene interaction. In this paper, we examine some of the possible explanations for the lack of findings, and argue that coverage of causal variation not only has a large effect on the loss in power, but that the effect is larger than in the single-SNP analyses. We show that the product of linkage disequilibrium measures, r², between causal and tested SNPs offers a good approximation to the loss in efficiency as defined by the ratio of sample sizes that lead to similar power. We also demonstrate that, in addition to the huge search space, the loss in power due to coverage when using commercially available platforms makes the search for gene-gene interactions daunting. © 2010 The Authors Annals of Human Genetics © 2010 Blackwell Publishing Ltd/University College London.
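
    The approximation described in this abstract can be sketched numerically. The code below is a minimal illustration, assuming that the efficiency of testing a tagged SNP pair relative to the causal pair is the product of the two r² values and that the required sample size scales with the inverse of that efficiency. The function names and example values are hypothetical and are not taken from the paper.

```python
import math


def interaction_efficiency(r2_a: float, r2_b: float) -> float:
    """Approximate efficiency of testing two tag SNPs in place of the causal pair.

    r2_a and r2_b are the squared correlations (linkage disequilibrium r^2)
    between each causal SNP and the SNP actually tested.
    """
    for r2 in (r2_a, r2_b):
        if not 0.0 < r2 <= 1.0:
            raise ValueError("each r^2 must lie in the interval (0, 1]")
    return r2_a * r2_b


def required_sample_size(n_causal: int, r2_a: float, r2_b: float) -> int:
    """Sample size needed with tag SNPs to match the power of n_causal samples
    genotyped at the causal SNPs, under the product-of-r^2 approximation."""
    return math.ceil(n_causal / interaction_efficiency(r2_a, r2_b))


if __name__ == "__main__":
    # Single-SNP analogue: one perfectly tagged SNP (r^2 = 1) and one tag at 0.5.
    print(required_sample_size(1000, 1.0, 0.5))
    # Interaction test: both causal SNPs tagged at r^2 = 0.5.
    print(required_sample_size(1000, 0.5, 0.5))
```

    Because the two r² values multiply, imperfect coverage at both loci compounds, which is consistent with the abstract's point that coverage costs more power for interactions than for single-SNP tests.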

  6. Gene Discovery and Functional Analyses in the Model Plant Arabidopsis

    DEFF Research Database (Denmark)

    Feng, Cai-ping; Mundy, J.

    2006-01-01

    The present mini-review describes newer methods and strategies, including transposon and T-DNA insertions, TILLING, Deleteagene, and RNA interference, to functionally analyze genes of interest in the model plant Arabidopsis. The relative advantages and disadvantages of the systems are also discus...

  7. The circumstances of minor planet discovery

    International Nuclear Information System (INIS)

    Pilcher, F.

    1989-01-01

    The circumstances of discoveries of minor planets are presented in tabular form. Complete data are given for planets 2125-4044, together with notes pertaining to these planets. Information in the table includes the permanent number; the official name; for planets 330 and forward, the table includes the provisional designation attached to the discovery apparition and the year, month, the day of discovery, and the discovery place

  8. Amplification of TLO Mediator Subunit Genes Facilitate Filamentous Growth in Candida Spp.

    Science.gov (United States)

    Liu, Zhongle; Moran, Gary P.; Myers, Lawrence C.

    2016-01-01

    Filamentous growth is a hallmark of C. albicans pathogenicity compared to less-virulent ascomycetes. A multitude of transcription factors regulate filamentous growth in response to specific environmental cues. Our work, however, suggests the evolutionary history of C. albicans that resulted in its filamentous growth plasticity may be tied to a change in the general transcription machinery rather than transcription factors and their specific targets. A key genomic difference between C. albicans and its less-virulent relatives, including its closest relative C. dubliniensis, is the unique expansion of the TLO (TeLOmere-associated) gene family in C. albicans. Individual Tlo proteins are fungal-specific subunits of Mediator, a large multi-subunit eukaryotic transcriptional co-activator complex. This amplification results in a large pool of ‘free,’ non-Mediator associated, Tlo protein present in C. albicans, but not in C. dubliniensis or other ascomycetes with attenuated virulence. We show that engineering a large ‘free’ pool of the C. dubliniensis Tlo2 (CdTlo2) protein in C. dubliniensis, through overexpression, results in a number of filamentation phenotypes typically associated only with C. albicans. The amplitude of these phenotypes is proportional to the amount of overexpressed CdTlo2 protein. Overexpression of other C. dubliniensis and C. albicans Tlo proteins do result in these phenotypes. Tlo proteins and their orthologs contain a Mediator interaction domain, and a potent transcriptional activation domain. Nuclear localization of the CdTlo2 activation domain, facilitated naturally by the Tlo Mediator binding domain or artificially through an appended nuclear localization signal, is sufficient for the CdTlo2 overexpression phenotypes. A C. albicans med3 null mutant causes multiple defects including the inability to localize Tlo proteins to the nucleus and reduced virulence in a murine systemic infection model. Our data supports a model in which the

  9. Amplification of TLO Mediator Subunit Genes Facilitate Filamentous Growth in Candida Spp.

    Directory of Open Access Journals (Sweden)

    Zhongle Liu

    2016-10-01

    Filamentous growth is a hallmark of C. albicans pathogenicity compared to less-virulent ascomycetes. A multitude of transcription factors regulate filamentous growth in response to specific environmental cues. Our work, however, suggests the evolutionary history of C. albicans that resulted in its filamentous growth plasticity may be tied to a change in the general transcription machinery rather than transcription factors and their specific targets. A key genomic difference between C. albicans and its less-virulent relatives, including its closest relative C. dubliniensis, is the unique expansion of the TLO (TeLOmere-associated) gene family in C. albicans. Individual Tlo proteins are fungal-specific subunits of Mediator, a large multi-subunit eukaryotic transcriptional co-activator complex. This amplification results in a large pool of 'free,' non-Mediator associated, Tlo protein present in C. albicans, but not in C. dubliniensis or other ascomycetes with attenuated virulence. We show that engineering a large 'free' pool of the C. dubliniensis Tlo2 (CdTlo2) protein in C. dubliniensis, through overexpression, results in a number of filamentation phenotypes typically associated only with C. albicans. The amplitude of these phenotypes is proportional to the amount of overexpressed CdTlo2 protein. Overexpression of other C. dubliniensis and C. albicans Tlo proteins do result in these phenotypes. Tlo proteins and their orthologs contain a Mediator interaction domain, and a potent transcriptional activation domain. Nuclear localization of the CdTlo2 activation domain, facilitated naturally by the Tlo Mediator binding domain or artificially through an appended nuclear localization signal, is sufficient for the CdTlo2 overexpression phenotypes. A C. albicans med3 null mutant causes multiple defects including the inability to localize Tlo proteins to the nucleus and reduced virulence in a murine systemic infection model. Our data supports a model in which

  10. Validation of reference genes for quantifying changes in gene expression in virus-infected tobacco.

    Science.gov (United States)

    Baek, Eseul; Yoon, Ju-Yeon; Palukaitis, Peter

    2017-10-01

    To facilitate quantification of gene expression changes in virus-infected tobacco plants, eight housekeeping genes were evaluated for their stability of expression during infection by one of three systemically-infecting viruses (cucumber mosaic virus, potato virus X, potato virus Y) or a hypersensitive-response-inducing virus (tobacco mosaic virus; TMV) limited to the inoculated leaf. Five reference-gene validation programs were used to establish the order of the most stable genes for the systemically-infecting viruses as ribosomal protein L25 > β-Tubulin > Actin, and the least stable genes Ubiquitin-conjugating enzyme (UCE) ... genes were EF1α > Cysteine protease > Actin, and the least stable genes were GAPDH ... genes, three defense responsive genes were examined to compare their relative changes in gene expression caused by each virus. Copyright © 2017 Elsevier Inc. All rights reserved.

  11. New genes as drivers of phenotypic evolution

    Science.gov (United States)

    Chen, Sidi; Krinsky, Benjamin H.; Long, Manyuan

    2014-01-01

    During the course of evolution, genomes acquire novel genetic elements as sources of functional and phenotypic diversity, including new genes that originated in recent evolution. In the past few years, substantial progress has been made in understanding the evolution and phenotypic effects of new genes. In particular, an emerging picture is that new genes, despite being present in the genomes of only a subset of species, can rapidly evolve indispensable roles in fundamental biological processes, including development, reproduction, brain function and behaviour. The molecular underpinnings of how new genes can develop these roles are starting to be characterized. These recent discoveries yield fresh insights into our broad understanding of biological diversity at refined resolution. PMID:23949544

  12. Functionally enigmatic genes: a case study of the brain ignorome.

    Directory of Open Access Journals (Sweden)

    Ashutosh K Pandey

    What proportion of genes with intense and selective expression in specific tissues, cells, or systems are still almost completely uncharacterized with respect to biological function? In what ways do these functionally enigmatic genes differ from well-studied genes? To address these two questions, we devised a computational approach that defines so-called ignoromes. As proof of principle, we extracted and analyzed a large subset of genes with intense and selective expression in brain. We find that publications associated with this set are highly skewed--the top 5% of genes absorb 70% of the relevant literature. In contrast, approximately 20% of genes have essentially no neuroscience literature. Analysis of the ignorome over the past decade demonstrates that it is stubbornly persistent, and the rapid expansion of the neuroscience literature has not had the expected effect on numbers of these genes. Surprisingly, ignorome genes do not differ from well-studied genes in terms of connectivity in coexpression networks. Nor do they differ with respect to numbers of orthologs, paralogs, or protein domains. The major distinguishing characteristic between these sets of genes is date of discovery, early discovery being associated with greater research momentum--a genomic bandwagon effect. Finally we ask to what extent massive genomic, imaging, and phenotype data sets can be used to provide high-throughput functional annotation for an entire ignorome. In a majority of cases we have been able to extract and add significant information for these neglected genes. In several cases--ELMOD1, TMEM88B, and DZANK1--we have exploited sequence polymorphisms, large phenome data sets, and reverse genetic methods to evaluate the function of ignorome genes.

  13. Functionally enigmatic genes: a case study of the brain ignorome.

    Science.gov (United States)

    Pandey, Ashutosh K; Lu, Lu; Wang, Xusheng; Homayouni, Ramin; Williams, Robert W

    2014-01-01

    What proportion of genes with intense and selective expression in specific tissues, cells, or systems are still almost completely uncharacterized with respect to biological function? In what ways do these functionally enigmatic genes differ from well-studied genes? To address these two questions, we devised a computational approach that defines so-called ignoromes. As proof of principle, we extracted and analyzed a large subset of genes with intense and selective expression in brain. We find that publications associated with this set are highly skewed--the top 5% of genes absorb 70% of the relevant literature. In contrast, approximately 20% of genes have essentially no neuroscience literature. Analysis of the ignorome over the past decade demonstrates that it is stubbornly persistent, and the rapid expansion of the neuroscience literature has not had the expected effect on numbers of these genes. Surprisingly, ignorome genes do not differ from well-studied genes in terms of connectivity in coexpression networks. Nor do they differ with respect to numbers of orthologs, paralogs, or protein domains. The major distinguishing characteristic between these sets of genes is date of discovery, early discovery being associated with greater research momentum--a genomic bandwagon effect. Finally we ask to what extent massive genomic, imaging, and phenotype data sets can be used to provide high-throughput functional annotation for an entire ignorome. In a majority of cases we have been able to extract and add significant information for these neglected genes. In several cases--ELMOD1, TMEM88B, and DZANK1--we have exploited sequence polymorphisms, large phenome data sets, and reverse genetic methods to evaluate the function of ignorome genes.
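
    The literature skew described in this abstract (a small fraction of genes absorbing most publications, and a sizeable fraction having none) can be measured with a short calculation. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the toy publication counts are constructed by hand to mirror the quoted 5% / 70% / 20% figures. They are not the study's data.

```python
def top_share(pub_counts, top_fraction=0.05):
    """Fraction of all publications attributed to the most-studied genes.

    pub_counts: iterable of per-gene publication counts.
    top_fraction: share of genes (by rank) treated as the "top" group.
    """
    counts = sorted(pub_counts, reverse=True)
    total = sum(counts)
    if total == 0:
        raise ValueError("no publications in the data set")
    k = max(1, round(len(counts) * top_fraction))
    return sum(counts[:k]) / total


def unstudied_fraction(pub_counts):
    """Fraction of genes with no associated publications."""
    counts = list(pub_counts)
    return sum(1 for c in counts if c == 0) / len(counts)


# Toy data (constructed for illustration only): 100 genes,
# 5 heavily studied, 75 lightly studied, 20 with no literature.
toy_counts = [140] * 5 + [4] * 75 + [0] * 20

if __name__ == "__main__":
    print(f"top 5% share of literature: {top_share(toy_counts):.0%}")
    print(f"genes with no literature:   {unstudied_fraction(toy_counts):.0%}")
```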

  14. Toxins and drug discovery.

    Science.gov (United States)

    Harvey, Alan L

    2014-12-15

    Components from venoms have stimulated many drug discovery projects, with some notable successes. These are briefly reviewed, from captopril to ziconotide. However, there have been many more disappointments on the road from toxin discovery to approval of a new medicine. Drug discovery and development is an inherently risky business, and the main causes of failure during development programmes are outlined in order to highlight steps that might be taken to increase the chances of success with toxin-based drug discovery. These include having a clear focus on unmet therapeutic needs, concentrating on targets that are well-validated in terms of their relevance to the disease in question, making use of phenotypic screening rather than molecular-based assays, and working with development partners with the resources required for the long and expensive development process. Copyright © 2014 The Author. Published by Elsevier Ltd.. All rights reserved.

  15. Regeneration of multiple shoots from transgenic potato events facilitates the recovery of phenotypically normal lines: assessing a cry9Aa2 gene conferring insect resistance

    Directory of Open Access Journals (Sweden)

    Jacobs Jeanne ME

    2011-10-01

    Background The recovery of high performing transgenic lines in clonal crops is limited by the occurrence of somaclonal variation during the tissue culture phase of transformation. This is usually circumvented by developing large populations of transgenic lines, each derived from the first shoot to regenerate from each transformation event. This study investigates a new strategy of assessing multiple shoots independently regenerated from different transformed cell colonies of potato (Solanum tuberosum L.). Results A modified cry9Aa2 gene, under the transcriptional control of the CaMV 35S promoter, was transformed into four potato cultivars by Agrobacterium-mediated gene transfer, using an nptII gene conferring kanamycin resistance as a selectable marker gene. Following gene transfer, 291 transgenic lines were grown in greenhouse experiments to assess somaclonal variation and resistance to potato tuber moth (PTM), Phthorimaea operculella (Zeller). Independently regenerated lines were recovered from many transformed cell colonies and Southern analysis confirmed whether they were derived from the same transformed cell. Multiple lines regenerated from the same transformed cell exhibited a similar response to PTM, but frequently exhibited a markedly different spectrum of somaclonal variation. Conclusions A new strategy for the genetic improvement of clonal crops involves the regeneration and evaluation of multiple shoots from each transformation event to facilitate the recovery of phenotypically normal transgenic lines. Most importantly, regenerated lines exhibiting the phenotypic appearance most similar to the parental cultivar are not necessarily derived from the first shoot regenerated from a transformed cell colony, but can frequently be a later regeneration event.

  16. mHealth Visual Discovery Dashboard.

    Science.gov (United States)

    Fang, Dezhi; Hohman, Fred; Polack, Peter; Sarker, Hillol; Kahng, Minsuk; Sharmin, Moushumi; al'Absi, Mustafa; Chau, Duen Horng

    2017-09-01

    We present Discovery Dashboard, a visual analytics system for exploring large volumes of time series data from mobile medical field studies. Discovery Dashboard offers interactive exploration tools and a data mining motif discovery algorithm to help researchers formulate hypotheses, discover trends and patterns, and ultimately gain a deeper understanding of their data. Discovery Dashboard emphasizes user freedom and flexibility during the data exploration process and enables researchers to do things previously challenging or impossible to do - in the web-browser and in real time. We demonstrate our system visualizing data from a mobile sensor study conducted at the University of Minnesota that included 52 participants who were trying to quit smoking.

  17. Comparative Oncogenomics for Peripheral Nerve Sheath Cancer Gene Discovery

    Science.gov (United States)

    2015-06-01

    and MPNSTs by determining whether these same genes are mutated in human tumors. ... sheath tumour (MPNST). In: Louis DNO, H.; Wiestler, O.D.; Cavenee, W.K., editor. WHO Classification of Tumours of the Central Nervous System. Lyon: IARC ... Location Sex Major or Micro WHO Grade H6 DRG Male Major IV H9 Trigeminal ganglion Female Major III H17 Trigeminal ganglion Male Major II H19 Sciatic

  18. Identification of Candidate B-Lymphoma Genes by Cross-Species Gene Expression Profiling

    Science.gov (United States)

    Tompkins, Van S.; Han, Seong-Su; Olivier, Alicia; Syrbu, Sergei; Bair, Thomas; Button, Anna; Jacobus, Laura; Wang, Zebin; Lifton, Samuel; Raychaudhuri, Pradip; Morse, Herbert C.; Weiner, George; Link, Brian; Smith, Brian J.; Janz, Siegfried

    2013-01-01

    Comparative genome-wide expression profiling of malignant tumor counterparts across the human-mouse species barrier has a successful track record as a gene discovery tool in liver, breast, lung, prostate and other cancers, but has been largely neglected in studies on neoplasms of mature B-lymphocytes such as diffuse large B cell lymphoma (DLBCL) and Burkitt lymphoma (BL). We used global gene expression profiles of DLBCL-like tumors that arose spontaneously in Myc-transgenic C57BL/6 mice as a phylogenetically conserved filter for analyzing the human DLBCL transcriptome. The human and mouse lymphomas were found to have 60 concordantly deregulated genes in common, including 8 genes that Cox hazard regression analysis associated with overall survival in a published landmark dataset of DLBCL. Genetic network analysis of the 60 genes followed by biological validation studies indicate FOXM1 as a candidate DLBCL and BL gene, supporting a number of studies contending that FOXM1 is a therapeutic target in mature B cell tumors. Our findings demonstrate the value of the “mouse filter” for genomic studies of human B-lineage neoplasms for which a vast knowledge base already exists. PMID:24130802

  19. Power in GWAS: lifting the curse of the clinical cut-off

    NARCIS (Netherlands)

    van der Sluis, S.; Posthuma, D.; Nivard, M.G.; Verhage, M.; Dolan, C.V.

    2013-01-01

    Although genome-wide association studies (GWAS), in general, facilitated important discovery of new biological knowledge about diseases, identified variants for psychiatric disorders explain little variation, and insight into the role of genes in highly heritable psychiatric traits remains

  20. Service discovery at home

    NARCIS (Netherlands)

    Sundramoorthy, V.; Scholten, Johan; Jansen, P.G.; Hartel, Pieter H.

    2003-01-01

    Service discovery is a fairly new field that emerged with the advent of ubiquitous computing and has proved essential to building intelligent networks, since it enables automated discovery and remote control between devices. This paper provides an overview and comparison of several

  1. The Greatest Mathematical Discovery?

    Energy Technology Data Exchange (ETDEWEB)

    Bailey, David H.; Borwein, Jonathan M.

    2010-05-12

    What mathematical discovery more than 1500 years ago: (1) Is one of the greatest, if not the greatest, single discovery in the field of mathematics? (2) Involved three subtle ideas that eluded the greatest minds of antiquity, even geniuses such as Archimedes? (3) Was fiercely resisted in Europe for hundreds of years after its discovery? (4) Even today, in historical treatments of mathematics, is often dismissed with scant mention, or else is ascribed to the wrong source? Answer: Our modern system of positional decimal notation with zero, together with the basic arithmetic computational schemes, which were discovered in India about 500 CE.
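
    The role of zero as a placeholder in positional notation, and the digit-by-digit arithmetic it makes possible, can be shown in a few lines. The sketch below is only an illustration with hypothetical function names. It uses Horner's rule to evaluate a digit sequence and the schoolbook carry method for addition.

```python
def from_digits(digits, base=10):
    """Value of a digit sequence (most significant digit first), by Horner's rule."""
    value = 0
    for d in digits:
        if not 0 <= d < base:
            raise ValueError(f"digit {d} is out of range for base {base}")
        value = value * base + d
    return value


def to_digits(value, base=10):
    """Digit sequence (most significant first) of a non-negative integer."""
    if value < 0:
        raise ValueError("only non-negative integers are supported")
    if value == 0:
        return [0]
    digits = []
    while value:
        value, d = divmod(value, base)
        digits.append(d)
    return digits[::-1]


def add_digits(a, b, base=10):
    """Schoolbook addition on digit sequences, propagating carries right to left."""
    a, b = a[::-1], b[::-1]  # work from the least significant digit
    result, carry = [], 0
    for i in range(max(len(a), len(b))):
        column = carry + (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
        carry, digit = divmod(column, base)
        result.append(digit)
    if carry:
        result.append(carry)
    return result[::-1]


if __name__ == "__main__":
    # The zero holds the tens place: the digits 2, 0, 5 and 2, 5 differ in value.
    print(from_digits([2, 0, 5]), from_digits([2, 5]))
    print(add_digits([9, 9], [1]))
```

    Without a symbol for zero, the sequences 2, 0, 5 and 2, 5 could not be told apart, which is why the placeholder is central to the notation.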

  2. A History of the Discovery of Random X Chromosome Inactivation in the Human Female and its Significance

    Directory of Open Access Journals (Sweden)

    Sophia Balderman

    2011-07-01

    Genetic determinants of sex in placental mammals developed by the evolution of primordial autosomes into the male and female sex chromosomes. The Y chromosome determines maleness by the action of the gene SRY, which encodes a protein that initiates a sequence of events prompting the embryonic gonads to develop into testes. The X chromosome in the absence of a Y chromosome results in a female by permitting the conversion of the embryonic gonads into ovaries. We trace the historical progress that resulted in the discovery that one X chromosome in the female is randomly inactivated in early embryogenesis, accomplishing approximate equivalency of X chromosome gene dosage in both sexes. This event results in half of the somatic cells in a tissue containing proteins encoded by the genes of the maternal X chromosome and half having proteins encoded by the genes of the paternal X chromosome, on average, accounting for the phenotype of a female heterozygote with an X chromosome mutation. The hypothesis of X chromosome inactivation as a random event early in embryogenesis was first described as a result of studies of variegated coat color in female mice. Similar results were found in women using the X chromosome-linked gene, glucose-6-phosphate dehydrogenase, studied in red cells. The random inactivation of the X chromosome-bearing genes for isoenzyme types A and B of glucose-6-phosphate dehydrogenase was used to establish the clonal origin of neoplasms in informative women with leiomyomas. Behind these discoveries are the stories of the men and women scientists whose research enlightened these aspects of X chromosome function and their implication for medicine.

  3. A magnetic bead-based ligand binding assay to facilitate human kynurenine 3-monooxygenase drug discovery.

    Science.gov (United States)

    Wilson, Kris; Mole, Damian J; Homer, Natalie Z M; Iredale, John P; Auer, Manfred; Webster, Scott P

    2015-02-01

    Human kynurenine 3-monooxygenase (KMO) is emerging as an important drug target enzyme in a number of inflammatory and neurodegenerative disease states. Recombinant protein production of KMO, and therefore discovery of KMO ligands, is challenging due to a large membrane targeting domain at the C-terminus of the enzyme that causes stability, solubility, and purification difficulties. The purpose of our investigation was to develop a suitable screening method for targeting human KMO and other similarly challenging drug targets. Here, we report the development of a magnetic bead-based binding assay using mass spectrometry detection for human KMO protein. The assay incorporates isolation of FLAG-tagged KMO enzyme on protein A magnetic beads. The protein-bound beads are incubated with potential binding compounds before specific cleavage of the protein-compound complexes from the beads. Mass spectrometry analysis is used to identify the compounds that demonstrate specific binding affinity for the target protein. The technique was validated using known inhibitors of KMO. This assay is a robust alternative to traditional ligand-binding assays for challenging protein targets, and it overcomes specific difficulties associated with isolating human KMO. © 2014 Society for Laboratory Automation and Screening.

  4. Open Drug Discovery Toolkit (ODDT): a new open-source player in the drug discovery field.

    Science.gov (United States)

    Wójcikowski, Maciej; Zielenkiewicz, Piotr; Siedlecki, Pawel

    2015-01-01

    There has been huge progress in the open cheminformatics field in both methods and software development. Unfortunately, there has been little effort to unite those methods and software into one package. We here describe the Open Drug Discovery Toolkit (ODDT), which aims to fulfill the need for comprehensive and open source drug discovery software. The Open Drug Discovery Toolkit was developed as a free and open source tool for both computer-aided drug discovery (CADD) developers and researchers. ODDT reimplements many state-of-the-art methods, such as machine learning scoring functions (RF-Score and NNScore), and wraps other external software to ease the process of developing CADD pipelines. ODDT is an out-of-the-box solution designed to be easily customizable and extensible. Therefore, users are strongly encouraged to extend it and develop new methods. We here present three use cases for ODDT in common tasks in computer-aided drug discovery. The Open Drug Discovery Toolkit is released under a permissive 3-clause BSD license for both academic and industrial use. ODDT's source code, additional examples and documentation are available on GitHub (https://github.com/oddt/oddt).

  5. Discovery and early development of AVI-7537 and AVI-7288 for the treatment of Ebola virus and Marburg virus infections.

    Science.gov (United States)

    Iversen, Patrick L; Warren, Travis K; Wells, Jay B; Garza, Nicole L; Mourich, Dan V; Welch, Lisa S; Panchal, Rekha G; Bavari, Sina

    2012-11-06

    There are no currently approved treatments for filovirus infections. In this study we report the discovery process which led to the development of antisense Phosphorodiamidate Morpholino Oligomers (PMOs) AVI-6002 (composed of AVI-7357 and AVI-7539) and AVI-6003 (composed of AVI-7287 and AVI-7288) targeting Ebola virus and Marburg virus, respectively. The discovery process involved identification of optimal transcript binding sites for PMO-based RNA therapeutics, followed by screening for effective viral gene targets in mouse and guinea pig models utilizing adapted viral isolates. A succession of chemical modifications was tested, beginning with simple PMOs, transitioning to cell-penetrating peptide-conjugated PMOs (PPMO), and ending with PMOplus, which contains a limited number of positively charged linkages in the PMO structure. The initial lead compounds were combinations of two agents targeting separate genes. In the final analysis, a single agent was selected for treatment of each virus: AVI-7537 targeting the VP24 gene of Ebola virus and AVI-7288 targeting NP of Marburg virus. Both are now progressing into late stage clinical development as the optimal therapeutic candidates.

  6. Discovery, characterization and expression of a novel zebrafish gene, znfr, important for notochord formation.

    Science.gov (United States)

    Xu, Yan; Zou, Peng; Liu, Yao; Deng, Fengjiao

    2010-06-01

    Genes specifically expressed in the notochord may be crucial for proper notochord development. Using the digital differential display program offered by the National Center for Biotechnology Information, we identified a novel EST sequence from a zebrafish ovary library (No. XM_701450). The full-length cDNA of this transcript was cloned by performing 3' and 5'-RACE and was further confirmed by PCR and sequencing. The resulting 614 bp gene was found to encode a novel 94 amino acid protein that did not share significant homology with any other known protein. Characterization of the genomic sequence revealed that the gene spanned 4.9 kb and was composed of four exons and three introns. RT-PCR gene expression analysis revealed that our gene of interest was expressed in ovary, kidney, brain, mature oocytes and during the early stages of embryogenesis. During embryonic development, znfr mRNA was found to be expressed in the embryonic shield, chordamesoderm and the vacuolated notochord cells by in situ hybridization. Based on this information, we hypothesize that this novel gene is an important maternal factor required for zebrafish notochord formation during early embryonic development. We have thus named this gene znfr (zebrafish notochord formation related).

  7. The in silico drug discovery toolbox: applications in lead discovery and optimization.

    Science.gov (United States)

    Bruno, Agostino; Costantino, Gabriele; Sartori, Luca; Radi, Marco

    2017-11-06

    Discovery and development of a new drug is a long and expensive journey that takes around 15 years from the initial idea to approval and marketing of a new medication. Although R&D expenditures have been constantly increasing in the last few years, the number of new drugs introduced into the market has been steadily declining. This is mainly due to preclinical and clinical safety issues, which still account for about 40% of drug discontinuations. From this point of view, it is clear that if we want to increase the drug-discovery success rate and reduce the costs associated with development of a new drug, a comprehensive evaluation/prediction of potential safety issues should be conducted as early as possible in the drug discovery phase. In the present review, we analyse the early steps of the drug-discovery pipeline, describing the sequence of steps from disease selection to lead optimization and focusing on the most common in silico tools used to assess attrition risks and build a mitigation plan. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  8. Deletion of a regulatory gene within the cpk gene cluster reveals novel antibacterial activity in Streptomyces coelicolor A3(2)

    NARCIS (Netherlands)

    Gottelt, Marco; Kol, Stefan; Gomez-Escribano, Juan Pablo; Bibb, Mervyn; Takano, Eriko

    Genome sequencing of Streptomyces coelicolor A3(2) revealed an uncharacterized type I polyketide synthase gene cluster (cpk). Here we describe the discovery of a novel antibacterial activity (abCPK) and a yellow-pigmented secondary metabolite (yCPK) after deleting a presumed pathway-specific

  9. Discovery Strategies of Bioactive Compounds Synthesized by Nonribosomal Peptide Synthetases and Type-I Polyketide Synthases Derived from Marine Microbiomes

    Science.gov (United States)

    Amoutzias, Grigoris D.; Chaliotis, Anargyros; Mossialos, Dimitris

    2016-01-01

    Considering that 70% of our planet’s surface is covered by oceans, it is likely that undiscovered biodiversity is still enormous. A large portion of marine biodiversity consists of microbiomes. They are very attractive targets of bioprospecting because they are able to produce a vast repertoire of secondary metabolites in order to adapt in diverse environments. In many cases secondary metabolites of pharmaceutical and biotechnological interest such as nonribosomal peptides (NRPs) and polyketides (PKs) are synthesized by multimodular enzymes named nonribosomal peptide synthetases (NRPSes) and type-I polyketide synthases (PKSes-I), respectively. Novel findings regarding the mechanisms underlying NRPS and PKS evolution demonstrate how microorganisms could leverage their metabolic potential. Moreover, these findings could facilitate synthetic biology approaches leading to novel bioactive compounds. Ongoing advances in bioinformatics and next-generation sequencing (NGS) technologies are driving the discovery of NRPs and PKs derived from marine microbiomes mainly through two strategies: genome-mining and metagenomics. Microbial genomes are now sequenced at an unprecedented rate and this vast quantity of biological information can be analyzed through genome mining in order to identify gene clusters encoding NRPSes and PKSes of interest. On the other hand, metagenomics is a fast-growing research field which directly studies microbial genomes and their products present in marine environments using culture-independent approaches. The aim of this review is to examine recent developments regarding discovery strategies of bioactive compounds synthesized by NRPS and type-I PKS derived from marine microbiomes and to highlight the vast diversity of NRPSes and PKSes present in marine environments by giving examples of recently discovered bioactive compounds. PMID:27092515

  10. Discovery Strategies of Bioactive Compounds Synthesized by Nonribosomal Peptide Synthetases and Type-I Polyketide Synthases Derived from Marine Microbiomes

    Directory of Open Access Journals (Sweden)

    Grigoris D. Amoutzias

    2016-04-01

    Considering that 70% of our planet’s surface is covered by oceans, it is likely that undiscovered biodiversity is still enormous. A large portion of marine biodiversity consists of microbiomes. They are very attractive targets of bioprospecting because they are able to produce a vast repertoire of secondary metabolites in order to adapt in diverse environments. In many cases secondary metabolites of pharmaceutical and biotechnological interest such as nonribosomal peptides (NRPs) and polyketides (PKs) are synthesized by multimodular enzymes named nonribosomal peptide synthetases (NRPSes) and type-I polyketide synthases (PKSes-I), respectively. Novel findings regarding the mechanisms underlying NRPS and PKS evolution demonstrate how microorganisms could leverage their metabolic potential. Moreover, these findings could facilitate synthetic biology approaches leading to novel bioactive compounds. Ongoing advances in bioinformatics and next-generation sequencing (NGS) technologies are driving the discovery of NRPs and PKs derived from marine microbiomes mainly through two strategies: genome-mining and metagenomics. Microbial genomes are now sequenced at an unprecedented rate and this vast quantity of biological information can be analyzed through genome mining in order to identify gene clusters encoding NRPSes and PKSes of interest. On the other hand, metagenomics is a fast-growing research field which directly studies microbial genomes and their products present in marine environments using culture-independent approaches. The aim of this review is to examine recent developments regarding discovery strategies of bioactive compounds synthesized by NRPS and type-I PKS derived from marine microbiomes and to highlight the vast diversity of NRPSes and PKSes present in marine environments by giving examples of recently discovered bioactive compounds.

  11. Academic Drug Discovery Centres

    DEFF Research Database (Denmark)

    Kirkegaard, Henriette Schultz; Valentin, Finn

    2014-01-01

    Academic drug discovery centres (ADDCs) are seen as one of the solutions to fill the innovation gap in early drug discovery, which has proven challenging for previous organisational models. Prior studies of ADDCs have identified the need to analyse them from the angle of their economic...

  12. Service Discovery At Home

    NARCIS (Netherlands)

    Sundramoorthy, V.; Scholten, Johan; Jansen, P.G.; Hartel, Pieter H.

    Service discovery is a fairly new field that kicked off with the advent of ubiquitous computing and has been found essential in the making of intelligent networks by implementing automated discovery and remote control between devices. This paper provides an overview and comparison of several prominent

  13. Discovery and characterization of novel vascular and hematopoietic genes downstream of etsrp in zebrafish.

    Directory of Open Access Journals (Sweden)

    Gustavo A Gomez

    The transcription factor Etsrp is required for vasculogenesis and primitive myelopoiesis in zebrafish. When ectopically expressed, etsrp is sufficient to induce the expression of many vascular and myeloid genes in zebrafish. The mammalian homolog of etsrp, ER71/Etv2, is also essential for vascular and hematopoietic development. To identify genes downstream of etsrp, gain-of-function experiments were performed for etsrp in zebrafish embryos followed by transcription profile analysis by microarray. Subsequent in vivo expression studies resulted in the identification of fourteen genes with blood and/or vascular expression, six of these being completely novel. Regulation of these genes by etsrp was confirmed by ectopic induction in etsrp overexpressing embryos and decreased expression in etsrp deficient embryos. Additional functional analysis of two newly discovered genes, hapln1b and sh3gl3, demonstrates their importance in embryonic vascular development. The results described here identify a group of genes downstream of etsrp likely to be critical for vascular and/or myeloid development.

  14. Recent discoveries in the molecular pathogenesis of the inherited bone marrow failure syndrome Fanconi anemia.

    Science.gov (United States)

    Mamrak, Nicholas E; Shimamura, Akiko; Howlett, Niall G

    2017-05-01

    Fanconi anemia (FA) is a rare autosomal and X-linked genetic disease characterized by congenital abnormalities, progressive bone marrow failure (BMF), and increased cancer risk during early adulthood. The median lifespan for FA patients is approximately 33 years. The proteins encoded by the FA genes function together in the FA-BRCA pathway to repair DNA damage and to maintain genome stability. Within the past two years, five new FA genes have been identified (RAD51/FANCR, BRCA1/FANCS, UBE2T/FANCT, XRCC2/FANCU, and REV7/FANCV), bringing the total number of disease-causing genes to 21. This review summarizes the discovery of these new FA genes and describes how these proteins integrate into the FA-BRCA pathway to maintain genome stability and critically prevent early-onset BMF and cancer. Copyright © 2016 Elsevier Ltd. All rights reserved.

  15. Detection of biosurfactants in Bacillus species: genes and products identification.

    Science.gov (United States)

    Płaza, G; Chojniak, J; Rudnicka, K; Paraszkiewicz, K; Bernat, P

    2015-10-01

    To screen environmental Bacillus strains for detection of genes encoding the enzymes involved in biosurfactant synthesis and to evaluate their products, e.g. surfactin, iturin and fengycin. The taxonomic identification of Bacillus strains isolated from the environment was performed with the Microgene ID Bacillus panel and the GEN III Biolog system. A polymerase chain reaction (PCR) strategy for screening of genes in Bacillus strains was set up. A liquid chromatography-mass spectrometry (LC-MS/MS) method was used for the identification of lipopeptides (LPs). All studied strains exhibited the presence of the srfAA gene and produced surfactin mostly as four homologues (C13 to C16). Moreover, in 2 strains (KP7, T'-1) simultaneous co-production of 3 biosurfactants (surfactin, iturin and fengycin) was observed. Additionally, it was found that the isolate identified as Bacillus subtilis ssp. subtilis (KP7), besides co-producing LPs, synthesizes surfactin with an efficiency much higher than the other studied strains (40.2 mg l⁻¹) and with the yield ranging from 0.8 to 8.3 mg l⁻¹. We showed that the combined methodology based on PCR and LC-MS/MS is an optimal tool for the detection of genes encoding enzymes involved in biosurfactant synthesis as well as their products, e.g. surfactin, iturin and fengycin. This approach improves the screening and identification of environmental Bacillus strains co-producing biosurfactants, stimulating and facilitating the development of this area of science. The findings of this work will help to improve screening of biosurfactant producers. Discovery of novel biosurfactants and of the ability to co-produce biosurfactants has shed light on their new application fields and on the understanding of their interactions and properties. © 2015 The Society for Applied Microbiology.

  16. Synthetic biology of antimicrobial discovery.

    Science.gov (United States)

    Zakeri, Bijan; Lu, Timothy K

    2013-07-19

    Antibiotic discovery has a storied history. From the discovery of penicillin by Sir Alexander Fleming to the relentless quest for antibiotics by Selman Waksman, the stories have become like folklore used to inspire future generations of scientists. However, recent discovery pipelines have run dry at a time when multidrug-resistant pathogens are on the rise. Nature has proven to be a valuable reservoir of antimicrobial agents, which are primarily produced by modularized biochemical pathways. Such modularization is well suited to remodeling by an interdisciplinary approach that spans science and engineering. Herein, we discuss the biological engineering of small molecules, peptides, and non-traditional antimicrobials and provide an overview of the growing applicability of synthetic biology to antimicrobials discovery.

  17. β-thalassemias: paradigmatic diseases for scientific discoveries and development of innovative therapies.

    Science.gov (United States)

    Rivella, Stefano

    2015-04-01

    β-thalassemias are monogenic disorders characterized by defective synthesis of the β-globin chain, one of the major components of adult hemoglobin. A large number of mutations in the β-globin gene or its regulatory elements have been associated with β-thalassemias. Due to the complexity of the regulation of the β-globin gene and the role of red cells in many physiological processes, patients can manifest a large spectrum of phenotypes, and clinical requirements vary from patient to patient. It is important to consider the major differences in the light of potential novel therapeutics. This review summarizes the main discoveries and mechanisms associated with the synthesis of β-globin and abnormal erythropoiesis, as well as current and novel therapies. Copyright© Ferrata Storti Foundation.

  18. Discovery of rare variants via sequencing: implications for the design of complex trait association studies.

    Directory of Open Access Journals (Sweden)

    Bingshan Li

    2009-05-01

    There is strong evidence that rare variants are involved in complex disease etiology. The first step in implicating rare variants in disease etiology is their identification through sequencing in both randomly ascertained samples (e.g., the 1,000 Genomes Project) and samples ascertained according to disease status. We investigated to what extent rare variants will be observed across the genome and in candidate genes in randomly ascertained samples, the magnitude of variant enrichment in diseased individuals, and biases that can occur due to how variants are discovered. Although sequencing cases can enrich for causal variants, when a gene or genes are not involved in disease etiology, limiting variant discovery to cases can lead to association studies with dramatically inflated false positive rates.
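
    The bias described in the last sentence can be illustrated with a small simulation. The sketch below is not the authors' method: the sample sizes, allele frequency, number of sites and the simple burden statistic are all assumptions chosen for illustration. Cases and controls are drawn from the same population (no true association), and the false positive rate is compared when variant sites are discovered in the pooled sample versus in cases only.

```python
import math
import random


def allele_count(rng, n_chrom, maf):
    """Number of minor alleles seen among n_chrom chromosomes at one site."""
    return sum(rng.random() < maf for _ in range(n_chrom))


def burden_rejects(case_total, control_total, z_crit=1.96):
    """Two-sided normal-approximation test of equal allele counts in two
    equally sized groups: z = (A - B) / sqrt(A + B)."""
    total = case_total + control_total
    if total == 0:
        return False
    z = (case_total - control_total) / math.sqrt(total)
    return abs(z) > z_crit


def false_positive_rates(n_reps=300, n_sites=30, n_chrom=200, maf=0.005, seed=1):
    """Simulate a gene with no disease effect and return the rejection rate
    under pooled discovery and under case-only discovery."""
    rng = random.Random(seed)
    rejects_pooled = 0
    rejects_case_only = 0
    for _ in range(n_reps):
        cases = [allele_count(rng, n_chrom, maf) for _ in range(n_sites)]
        controls = [allele_count(rng, n_chrom, maf) for _ in range(n_sites)]
        # Pooled discovery: every site seen in either group is tested.
        rejects_pooled += burden_rejects(sum(cases), sum(controls))
        # Case-only discovery: sites never seen in cases are not carried
        # forward, so control alleles at those sites are silently dropped.
        kept_controls = sum(b for a, b in zip(cases, controls) if a > 0)
        rejects_case_only += burden_rejects(sum(cases), kept_controls)
    return rejects_pooled / n_reps, rejects_case_only / n_reps


pooled_fpr, case_only_fpr = false_positive_rates()
print(f"pooled discovery:    {pooled_fpr:.2f}")
print(f"case-only discovery: {case_only_fpr:.2f}")
```

    In this toy setting the pooled rate should stay near the nominal 5% level, while the case-only rate is far higher, because sites that happen to appear only in controls never enter the test.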

  19. A procedure to improve the information flow in the assessment of discoveries of oil and gas resources in the Brazilian context

    Energy Technology Data Exchange (ETDEWEB)

    Rosa, Henrique; Suslick, Saul B.; Sousa, Sergio H.G. de [Universidade Estadual de Campinas, SP (Brazil). Inst. of Geosciences; Castro, Jonas Q. [ANP - Brazilian National Petroleum Agency, Rio de Janeiro, RJ (Brazil)

    2004-07-01

    This paper is focused on the elaboration of a standardization model for the existing flow of information between the Petroleum National Agency (ANP) and the concessionaire companies in the event of the discovery of any potentially commercial hydrocarbon resources inside their concession areas. The method proposed by Rosa (2003) included the analysis of a small sample of Oil and Gas Discovery Assessment Plans (PADs), elaborated by companies that operate in exploratory blocks in Brazil, under the regulatory context introduced by the Petroleum Law (Law 9478, August 6th, 1997). The analysis of these documents made it possible to identify and target the problems originating from the lack of standardization. The results obtained facilitated the development of a model that supports the creation of Oil and Gas Discovery Assessment Plans. The standardization procedures suggested provide considerable advantages while speeding up several technical and regulatory steps. A software tool called 'ePADs' was developed to consolidate the automation of the several steps in the model for the standardization of the Oil and Gas Discovery Assessment Plans. A preliminary version has been tested with several different types of discoveries, indicating good performance and compliance with all regulatory aspects and operational requirements. (author)

  20. Discovery of new enzymes and metabolic pathways using structure and genome context

    Science.gov (United States)

    Zhao, Suwen; Kumar, Ritesh; Sakai, Ayano; Vetting, Matthew W.; Wood, B. McKay; Brown, Shoshana; Bonanno, Jeffery B.; Hillerich, Brandan S.; Seidel, Ronald D.; Babbitt, Patricia C.; Almo, Steven C.; Sweedler, Jonathan V.; Gerlt, John A.; Cronan, John E.; Jacobson, Matthew P.

    2014-01-01

    Assigning valid functions to proteins identified in genome projects is challenging, with over-prediction and database annotation errors major concerns. We, and others, are developing computation-guided strategies for functional discovery using “metabolite docking” to experimentally derived or homology-based three-dimensional structures. Bacterial metabolic pathways often are encoded by “genome neighborhoods” (gene clusters and/or operons), which can provide important clues for functional assignment. We recently demonstrated the synergy of docking and pathway context by “predicting” the intermediates in the glycolytic pathway in E. coli. Metabolite docking to multiple binding proteins/enzymes in the same pathway increases the reliability of in silico predictions of substrate specificities because the pathway intermediates are structurally similar. We report that structure-guided approaches for predicting the substrate specificities of several enzymes encoded by a bacterial gene cluster allowed i) the correct prediction of the in vitro activity of a structurally characterized enzyme of unknown function (PDB 2PMQ), 2-epimerization of trans-4-hydroxy-L-proline betaine (tHyp-B) and cis-4-hydroxy-D-proline betaine (cHyp-B), and ii) the correct identification of the catabolic pathway in which Hyp-B 2-epimerase participates. The substrate-liganded pose predicted by virtual library screening (docking) was confirmed experimentally. The enzymatic activities in the predicted pathway were confirmed by in vitro assays and genetic analyses; the intermediates were identified by metabolomics; and repression of the genes encoding the pathway by high salt was established by transcriptomics, confirming the osmolyte role of tHyp-B. This study establishes the utility of structure-guided functional predictions to enable the discovery of new metabolic pathways. PMID:24056934