WorldWideScience

Sample records for based gene discovery

  1. Canonical correlation analysis for gene-based pleiotropy discovery.

    Directory of Open Access Journals (Sweden)

    Jose A Seoane

    2014-10-01

    Full Text Available. Genome-wide association studies have identified a wealth of genetic variants involved in complex traits and multifactorial diseases. There is now considerable interest in testing variants for association with multiple phenotypes (pleiotropy) and in testing multiple variants for association with a single phenotype (gene-based association tests). Such approaches can increase statistical power by combining evidence for association over multiple phenotypes or genetic variants respectively. Canonical Correlation Analysis (CCA) measures the correlation between two sets of multidimensional variables, and thus offers the potential to combine these two approaches. To apply CCA, we must restrict the number of attributes relative to the number of samples. Hence we consider modules of genetic variation that can comprise a gene, a pathway or another biologically relevant grouping, and/or a set of phenotypes. In order to do this, we use an attribute selection strategy based on a binary genetic algorithm. Applied to a UK-based prospective cohort study of 4286 women (the British Women's Heart and Health Study), we find improved statistical power in the detection of previously reported genetic associations, and identify a number of novel pleiotropic associations between genetic variants and phenotypes. New discoveries include gene-based association of NSF with triglyceride levels and several genes (ACSM3, ERI2, IL18RAP, IL23RAP and NRG1) with left ventricular hypertrophy phenotypes. In multiple-phenotype analyses we find association of NRG1 with left ventricular hypertrophy phenotypes, fibrinogen and urea and pleiotropic relationships of F7 and F10 with Factor VII, Factor IX and cholesterol levels.

  2. Gene-disease relationship discovery based on model-driven data integration and database view definition.

    Science.gov (United States)

    Yilmaz, S; Jonveaux, P; Bicep, C; Pierron, L; Smaïl-Tabbone, M; Devignes, M D

    2009-01-15

    Computational methods are widely used to discover gene-disease relationships hidden in vast masses of available genomic and post-genomic data. In most current methods, a similarity measure is calculated between gene annotations and known disease genes or disease descriptions. However, more explicit gene-disease relationships are required for better insights into the molecular bases of diseases, especially for complex multi-gene diseases. Explicit relationships between genes and diseases are formulated as candidate gene definitions that may include intermediary genes, e.g. orthologous or interacting genes. These definitions guide data modelling in our database approach for gene-disease relationship discovery and are expressed as views which ultimately lead to the retrieval of documented sets of candidate genes. A system called ACGR (Approach for Candidate Gene Retrieval) has been implemented and tested with three case studies including a rare orphan gene disease.

  3. Gene-based SNP discovery and genetic mapping in pea.

    Science.gov (United States)

    Sindhu, Anoop; Ramsay, Larissa; Sanderson, Lacey-Anne; Stonehouse, Robert; Li, Rong; Condie, Janet; Shunmugam, Arun S K; Liu, Yong; Jha, Ambuj B; Diapari, Marwan; Burstin, Judith; Aubert, Gregoire; Tar'an, Bunyamin; Bett, Kirstin E; Warkentin, Thomas D; Sharpe, Andrew G

    2014-10-01

    Gene-based SNPs were identified and mapped in pea using five recombinant inbred line populations segregating for traits of agronomic importance. Pea (Pisum sativum L.) is one of the world's oldest domesticated crops and has been a model system in plant biology and genetics since the work of Gregor Mendel. Pea is the second most widely grown pulse crop in the world following common bean. The importance of pea as a food crop is growing due to its combination of moderate protein concentration, slowly digestible starch, high dietary fiber concentration, and its richness in micronutrients; however, pea has lagged behind other major crops in harnessing recent advances in molecular biology, genomics and bioinformatics, partly due to its large genome size with a large proportion of repetitive sequence, and to the relatively limited investment in research in this crop globally. The objective of this research was the development of a genome-wide transcriptome-based pea single-nucleotide polymorphism (SNP) marker platform using next-generation sequencing technology. A total of 1,536 polymorphic SNP loci selected from over 20,000 non-redundant SNPs identified using deep transcriptome sequencing of eight diverse Pisum accessions were used for genotyping in five RIL populations using an Illumina GoldenGate assay. The first high-density pea SNP map defining all seven linkage groups was generated by integrating with previously published anchor markers. Syntenic relationships of this map with the model legume Medicago truncatula and lentil (Lens culinaris Medik.) maps were established. The genic SNP map establishes a foundation for future molecular breeding efforts by enabling both the identification and tracking of introgression of genomic regions harbouring QTLs related to agronomic and seed quality traits.

  4. Gene set-based module discovery in the breast cancer transcriptome

    Directory of Open Access Journals (Sweden)

    Zhang Michael Q

    2009-02-01

    Full Text Available. Background: Although microarray-based studies have revealed a global view of gene expression in cancer cells, we still have little knowledge about the regulatory mechanisms underlying the transcriptome. Several computational methods applied to yeast data have recently succeeded in identifying expression modules, which are defined as co-expressed gene sets under common regulatory mechanisms. However, such module discovery methods have not been applied to cancer transcriptome data. Results: In order to decode oncogenic regulatory programs in cancer cells, we developed a novel module discovery method termed EEM by extending a previously reported module discovery method, and applied it to breast cancer expression data. Starting from seed gene sets prepared based on cis-regulatory elements, ChIP-chip data, and gene locus information, EEM identified 10 principal expression modules in breast cancer based on their expression coherence. Moreover, EEM depicted their activity profiles, which predict regulatory programs in each subtype of breast tumors. For example, our analysis revealed that the expression module regulated by the Polycomb repressive complex 2 (PRC2) is downregulated in triple negative breast cancers, suggesting similarity of transcriptional programs between stem cells and aggressive breast cancer cells. We also found that the activity of the PRC2 expression module is negatively correlated with the expression of EZH2, a component of PRC2 which belongs to the E2F expression module. E2F-driven EZH2 overexpression may be responsible for the repression of the PRC2 expression modules in triple negative tumors. Furthermore, our network analysis predicts regulatory circuits in breast cancer cells. Conclusion: These results demonstrate that the gene set-based module discovery approach is a powerful tool to decode regulatory programs in cancer cells.

  5. Literature-Based Discovery of IFN-γ and Vaccine-Mediated Gene Interaction Networks

    Directory of Open Access Journals (Sweden)

    Arzucan Özgür

    2010-01-01

    Full Text Available. Interferon-gamma (IFN-γ) regulates various immune responses that are often critical for vaccine-induced protection. In order to annotate the IFN-γ-related gene interaction network from a large amount of IFN-γ research reported in the literature, a literature-based discovery approach was applied with a combination of natural language processing (NLP) and network centrality analysis. The interaction network of human IFN-γ (gene symbol: IFNG) and its vaccine-specific subnetwork were automatically extracted using abstracts from all articles in PubMed. Four network centrality metrics were further calculated to rank the genes in the constructed networks. The resulting generic IFNG network contains 1060 genes and 26313 interactions among these genes. The vaccine-specific subnetwork contains 102 genes and 154 interactions. Fifty-six genes such as TNF, NFKB1, IL2, IL6, and MAPK8 were ranked among the top 25 by at least one of the centrality methods in one or both networks. Gene enrichment analysis indicated that these genes were classified in various immune mechanisms such as response to extracellular stimulus, lymphocyte activation, and regulation of apoptosis. Literature evidence was manually curated for the IFN-γ relatedness of 56 genes and vaccine development relatedness for 52 genes. This study also generated many new hypotheses worth further experimental studies.

  6. A genomics based discovery of secondary metabolite biosynthetic gene clusters in Aspergillus ustus.

    Directory of Open Access Journals (Sweden)

    Borui Pi

    Full Text Available. Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, and little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were identified for the first time in Aspergillus that were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but far fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in the clinic.

  7. Rice Genomics: Gene discovery

    Indian Academy of Sciences (India)

    There is a need for discovering candidate genes (a lot of them all over the genome, indeed) and the unlimited allelic variation that can productively take over rice metabolism when cellular water content falls below threshold levels.

  8. ConGEMs: Condensed Gene Co-Expression Module Discovery Through Rule-Based Clustering and Its Application to Carcinogenesis

    Directory of Open Access Journals (Sweden)

    Saurav Mallik

    2017-12-01

    Full Text Available. For transcriptomic analysis, there are numerous microarray-based genomic data, especially those generated for cancer research. The typical analysis measures the difference between a cancer sample-group and a matched control group for each transcript or gene. Association rule mining is used to discover interesting item sets through rule-based methodology. Thus, it has advantages for finding causal relationships between the transcripts. In this work, we introduce two new rule-based similarity measures (weighted rank-based Jaccard and cosine measures) and then propose a novel computational framework to detect condensed gene co-expression modules (ConGEMs) through the association rule-based learning system and the weighted similarity scores. In practice, the list of evolved condensed markers that consists of both singular and complex markers in nature depends on the corresponding condensed gene sets in either antecedent or consequent of the rules of the resultant modules. In our evaluation, these markers could be supported by literature evidence, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway and Gene Ontology annotations. Specifically, we preliminarily identified differentially expressed genes using an empirical Bayes test. A recently developed algorithm, RANWAR, was then utilized to determine the association rules from these genes. Based on that, we computed the integrated similarity scores of these rule-based similarity measures between each rule-pair, and the resultant scores were used for clustering to identify the co-expressed rule-modules. We applied our method to a gene expression dataset for lung squamous cell carcinoma and a genome methylation dataset for uterine cervical carcinogenesis. Our proposed module discovery method produced better results than the traditional gene-module discovery measures. In summary, our proposed rule-based method is useful for exploring biomarker modules from transcriptomic data.
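    The rule-pair similarity step described above can be illustrated with the plain, unweighted Jaccard and cosine measures on gene sets. This is only a simplified sketch: the abstract's measures are weighted and rank-based, and their exact definitions are not given here, so the functions and the gene labels (`G1` to `G4`) below are hypothetical stand-ins.

```python
from math import sqrt


def jaccard(a, b):
    """Unweighted Jaccard similarity between two collections of gene labels."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(a, b):
    """Set-based cosine similarity: shared items over the geometric mean of the set sizes."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


# Hypothetical gene sets taken from two association rules (labels are placeholders).
rule_1 = {"G1", "G2", "G3"}
rule_2 = {"G2", "G3", "G4"}

jaccard_score = jaccard(rule_1, rule_2)  # 2 shared genes out of 4 distinct genes
cosine_score = cosine(rule_1, rule_2)    # 2 shared genes over sqrt(3 * 3)
```

    Scores like these, computed for every rule pair, would form the similarity matrix that a clustering step consumes.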

  9. SSHscreen and SSHdb, generic software for microarray based gene discovery: application to the stress response in cowpea.

    Science.gov (United States)

    Coetzer, Nanette; Gazendam, Inge; Oelofse, Dean; Berger, Dave K

    2010-04-01

    Suppression subtractive hybridization is a popular technique for gene discovery from non-model organisms without an annotated genome sequence, such as cowpea (Vigna unguiculata (L.) Walp). We aimed to use this method to enrich for genes expressed during drought stress in a drought tolerant cowpea line. However, current methods were inefficient in screening libraries and management of the sequence data, and thus there was a need to develop software tools to facilitate the process. Forward and reverse cDNA libraries enriched for cowpea drought response genes were screened on microarrays, and the R software package SSHscreen 2.0.1 was developed (i) to normalize the data effectively using spike-in control spot normalization, and (ii) to select clones for sequencing based on the calculation of enrichment ratios with associated statistics. Enrichment ratio 3 values for each clone showed that 62% of the forward library and 34% of the reverse library clones were significantly differentially expressed by drought stress (adjusted p value […] 88% of the clones in both libraries were derived from rare transcripts in the original tester samples, thus supporting the notion that suppression subtractive hybridization enriches for rare transcripts. A set of 118 clones were chosen for sequencing, and drought-induced cowpea genes were identified, the most interesting encoding a late embryogenesis abundant Lea5 protein, a glutathione S-transferase, a thaumatin, a universal stress protein, and a wound induced protein. A lipid transfer protein and several components of photosynthesis were down-regulated by the drought stress. Reverse transcriptase quantitative PCR confirmed the enrichment ratio values for the selected cowpea genes. SSHdb, a web-accessible database, was developed to manage the clone sequences and combine the SSHscreen data with sequence annotations derived from BLAST and Blast2GO. The self-BLAST function within SSHdb grouped redundant clones together and illustrated that the…

  10. Independent Gene Discovery and Testing

    Science.gov (United States)

    Palsule, Vrushalee; Coric, Dijana; Delancy, Russell; Dunham, Heather; Melancon, Caleb; Thompson, Dennis; Toms, Jamie; White, Ashley; Shultz, Jeffry

    2010-01-01

    A clear understanding of basic gene structure is critical when teaching molecular genetics, the central dogma and the biological sciences. We sought to create a gene-based teaching project to improve students' understanding of gene structure and to integrate this into a research project that can be implemented by instructors at the secondary level…

  11. SSHscreen and SSHdb, generic software for microarray based gene discovery: application to the stress response in cowpea

    Directory of Open Access Journals (Sweden)

    Oelofse Dean

    2010-04-01

    Full Text Available. Background: Suppression subtractive hybridization is a popular technique for gene discovery from non-model organisms without an annotated genome sequence, such as cowpea (Vigna unguiculata (L.) Walp). We aimed to use this method to enrich for genes expressed during drought stress in a drought tolerant cowpea line. However, current methods were inefficient in screening libraries and management of the sequence data, and thus there was a need to develop software tools to facilitate the process. Results: Forward and reverse cDNA libraries enriched for cowpea drought response genes were screened on microarrays, and the R software package SSHscreen 2.0.1 was developed (i) to normalize the data effectively using spike-in control spot normalization, and (ii) to select clones for sequencing based on the calculation of enrichment ratios with associated statistics. Enrichment ratio 3 values for each clone showed that 62% of the forward library and 34% of the reverse library clones were significantly differentially expressed by drought stress (adjusted p value […] 88% of the clones in both libraries were derived from rare transcripts in the original tester samples, thus supporting the notion that suppression subtractive hybridization enriches for rare transcripts. A set of 118 clones were chosen for sequencing, and drought-induced cowpea genes were identified, the most interesting encoding a late embryogenesis abundant Lea5 protein, a glutathione S-transferase, a thaumatin, a universal stress protein, and a wound induced protein. A lipid transfer protein and several components of photosynthesis were down-regulated by the drought stress. Reverse transcriptase quantitative PCR confirmed the enrichment ratio values for the selected cowpea genes. SSHdb, a web-accessible database, was developed to manage the clone sequences and combine the SSHscreen data with sequence annotations derived from BLAST and Blast2GO. The self-BLAST function within SSHdb grouped…

  12. SSHscreen and SSHdb, generic software for microarray based gene discovery: application to the stress response in cowpea

    Science.gov (United States)

    2010-01-01

    Background: Suppression subtractive hybridization is a popular technique for gene discovery from non-model organisms without an annotated genome sequence, such as cowpea (Vigna unguiculata (L.) Walp). We aimed to use this method to enrich for genes expressed during drought stress in a drought tolerant cowpea line. However, current methods were inefficient in screening libraries and management of the sequence data, and thus there was a need to develop software tools to facilitate the process. Results: Forward and reverse cDNA libraries enriched for cowpea drought response genes were screened on microarrays, and the R software package SSHscreen 2.0.1 was developed (i) to normalize the data effectively using spike-in control spot normalization, and (ii) to select clones for sequencing based on the calculation of enrichment ratios with associated statistics. Enrichment ratio 3 values for each clone showed that 62% of the forward library and 34% of the reverse library clones were significantly differentially expressed by drought stress (adjusted p value […] 88% of the clones in both libraries were derived from rare transcripts in the original tester samples, thus supporting the notion that suppression subtractive hybridization enriches for rare transcripts. A set of 118 clones were chosen for sequencing, and drought-induced cowpea genes were identified, the most interesting encoding a late embryogenesis abundant Lea5 protein, a glutathione S-transferase, a thaumatin, a universal stress protein, and a wound induced protein. A lipid transfer protein and several components of photosynthesis were down-regulated by the drought stress. Reverse transcriptase quantitative PCR confirmed the enrichment ratio values for the selected cowpea genes. SSHdb, a web-accessible database, was developed to manage the clone sequences and combine the SSHscreen data with sequence annotations derived from BLAST and Blast2GO. The self-BLAST function within SSHdb grouped redundant clones together and…

  13. A new essential protein discovery method based on the integration of protein-protein interaction and gene expression data

    Directory of Open Access Journals (Sweden)

    Li Min

    2012-03-01

    Full Text Available. Background: Identification of essential proteins is always a challenging task since it requires experimental approaches that are time-consuming and laborious. With the advances in high throughput technologies, a large number of protein-protein interactions are available, which have produced unprecedented opportunities for detecting proteins' essentialities from the network level. There have been a series of computational approaches proposed for predicting essential proteins based on network topologies. However, the network topology-based centrality measures are very sensitive to the robustness of the network. Therefore, a new robust essential protein discovery method would be of great value. Results: In this paper, we propose a new centrality measure, named PeC, based on the integration of protein-protein interaction and gene expression data. The performance of PeC is validated based on the protein-protein interaction network of Saccharomyces cerevisiae. The experimental results show that the predicted precision of PeC clearly exceeds that of the other fifteen previously proposed centrality measures: Degree Centrality (DC), Betweenness Centrality (BC), Closeness Centrality (CC), Subgraph Centrality (SC), Eigenvector Centrality (EC), Information Centrality (IC), Bottle Neck (BN), Density of Maximum Neighborhood Component (DMNC), Local Average Connectivity-based method (LAC), Sum of ECC (SoECC), Range-Limited Centrality (RL), L-index (LI), Leader Rank (LR), Normalized α-Centrality (NC), and Moduland-Centrality (MC). In particular, the improvement of PeC over the classic centrality measures (BC, CC, SC, EC, and BN) is more than 50% when predicting no more than 500 proteins. Conclusions: We demonstrate that the integration of protein-protein interaction network and gene expression data can help improve the precision of predicting essential proteins. The new centrality measure, PeC, is an effective essential protein discovery method.
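    To make the idea of combining interaction topology with expression data concrete, here is a small sketch of one way such a score could be built. The abstract does not give the PeC formula, so everything below is an assumption for illustration: each edge is weighted by an edge clustering coefficient (shared neighbours divided by the smaller degree minus one) times the Pearson correlation of the two expression profiles, and a protein's score is the sum over its edges. The network, protein labels and expression values are toy data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(vx * vy)


def edge_clustering(adj, u, v):
    """Shared neighbours of u and v, normalised by the smaller degree minus one."""
    shared = len(adj[u] & adj[v])
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    return shared / denom if denom > 0 else 0.0


def coexpression_weighted_centrality(adj, expr):
    """Hypothetical score: sum over neighbours of edge clustering times co-expression."""
    return {
        u: sum(edge_clustering(adj, u, v) * pearson(expr[u], expr[v]) for v in adj[u])
        for u in adj
    }


# Toy interaction network (undirected adjacency sets) and toy expression profiles.
adj = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
expr = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "C": [1.0, 3.0, 2.0, 5.0],
    "D": [4.0, 3.0, 2.0, 1.0],
}

scores = coexpression_weighted_centrality(adj, expr)
ranking = sorted(scores, key=scores.get, reverse=True)
```

    In this toy network the pendant protein `D` scores zero because its only edge sits in no triangle, which shows how a score of this kind favours proteins embedded in densely connected, co-expressed neighbourhoods.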

  14. Genomics-Based Discovery of Plant Genes for Synthetic Biology of Terpenoid Fragrances: A Case Study in Sandalwood oil Biosynthesis.

    Science.gov (United States)

    Celedon, J M; Bohlmann, J

    2016-01-01

    Terpenoid fragrances are powerful mediators of ecological interactions in nature and have a long history of traditional and modern industrial applications. Plants produce a great diversity of fragrant terpenoid metabolites, which make them a superb source of biosynthetic genes and enzymes. Advances in fragrance gene discovery have enabled new approaches in synthetic biology of high-value speciality molecules toward applications in the fragrance and flavor, food and beverage, cosmetics, and other industries. Rapid developments in transcriptome and genome sequencing of nonmodel plant species have accelerated the discovery of fragrance biosynthetic pathways. In parallel, advances in metabolic engineering of microbial and plant systems have established platforms for synthetic biology applications of some of the thousands of plant genes that underlie fragrance diversity. While many fragrance molecules (eg, simple monoterpenes) are abundant in readily renewable plant materials, some highly valuable fragrant terpenoids (eg, santalols, ambroxides) are rare in nature and interesting targets for synthetic biology. As a representative example for genomics/transcriptomics enabled gene and enzyme discovery, we describe a strategy used successfully for elucidation of a complete fragrance biosynthetic pathway in sandalwood (Santalum album) and its reconstruction in yeast (Saccharomyces cerevisiae). We address questions related to the discovery of specific genes within large gene families and recovery of rare gene transcripts that are selectively expressed in recalcitrant tissues. To substantiate the validity of the approaches, we describe the combination of methods used in the gene and enzyme discovery of a cytochrome P450 in the fragrant heartwood of tropical sandalwood, responsible for the fragrance defining, final step in the biosynthesis of (Z)-santalols. © 2016 Elsevier Inc. All rights reserved.

  15. Gene discovery in Triatoma infestans

    Directory of Open Access Journals (Sweden)

    de Burgos Nelia

    2011-03-01

    Full Text Available. Background: Triatoma infestans is the most relevant vector of Chagas disease in the southern cone of South America. Since its genome has not yet been studied, sequencing of Expressed Sequence Tags (ESTs) is one of the most powerful tools for efficiently identifying large numbers of expressed genes in this insect vector. Results: In this work, we generated 826 ESTs, resulting in an increase of 47% in the number of ESTs available for T. infestans. These ESTs were assembled in 471 unique sequences, 151 of which represent 136 new genes for the Reduviidae family. Conclusions: Among the putative new genes for the Reduviidae family, we identified and described an interesting subset of genes involved in development and reproduction, which constitute potential targets for insecticide development.

  16. IMG-ABC: A Knowledge Base To Fuel Discovery of Biosynthetic Gene Clusters and Novel Secondary Metabolites.

    Science.gov (United States)

    Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Szeto, Ernest; Huang, Jinghua; Reddy, T B K; Cimermančič, Peter; Fischbach, Michael A; Ivanova, Natalia N; Markowitz, Victor M; Kyrpides, Nikos C; Pati, Amrita

    2015-07-14

    In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of "big" genomic data for discovering small molecules. IMG-ABC relies on IMG's comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC's focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG's extensive genomic/metagenomic data and analysis tool kits. As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG-ABC will continue to

  17. GWATCH: a web platform for automated gene association discovery analysis

    Science.gov (United States)

    2014-01-01

    Background: As genome-wide sequence analyses for complex human disease determinants are expanding, it is increasingly necessary to develop strategies to promote discovery and validation of potential disease-gene associations. Findings: Here we present a dynamic web-based platform, GWATCH, that automates and facilitates four steps in genetic epidemiological discovery: 1) Rapid gene association search and discovery analysis of large genome-wide datasets; 2) Expanded visual display of gene associations for genome-wide variants (SNPs, indels, CNVs), including Manhattan plots, 2D and 3D snapshots of any gene region, and a dynamic genome browser illustrating gene association chromosomal regions; 3) Real-time validation/replication of candidate or putative genes suggested from other sources, limiting Bonferroni genome-wide association study (GWAS) penalties; 4) Open data release and sharing by eliminating privacy constraints (The National Human Genome Research Institute (NHGRI) Institutional Review Board (IRB), informed consent, The Health Insurance Portability and Accountability Act (HIPAA) of 1996 etc.) on unabridged results, which allows for open access comparative and meta-analysis. Conclusions: GWATCH is suitable for both GWAS and whole genome sequence association datasets. We illustrate the utility of GWATCH with three large genome-wide association studies for HIV-AIDS resistance genes screened in large multicenter cohorts; however, association datasets from any study can be uploaded and analyzed by GWATCH. PMID:25374661
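    Step 3 above refers to limiting Bonferroni penalties by testing only a short list of candidate genes. The arithmetic behind that remark can be sketched as follows; the family-wise alpha of 0.05 and both test counts (one million variants versus 20 candidates) are illustrative assumptions, not figures from the abstract, and the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Assumed numbers, for illustration only.
genome_wide = bonferroni_threshold(0.05, 1_000_000)  # every variant tested
candidates = bonferroni_threshold(0.05, 20)          # only pre-specified candidates tested

# How many times less stringent the candidate-only threshold is.
relief = candidates / genome_wide
```

    Under these assumptions a candidate-only analysis needs a p value below 0.0025 rather than roughly 5e-8, which is why replication of a pre-specified list carries a much smaller multiple-testing penalty than a full scan.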

  18. Biomarker Gene Signature Discovery Integrating Network Knowledge

    Directory of Open Access Journals (Sweden)

    Holger Fröhlich

    2012-02-01

    Full Text Available. Discovery of prognostic and diagnostic biomarker gene signatures for diseases, such as cancer, is seen as a major step towards better personalized medicine. During the last decade various methods, mainly from the machine learning or statistical domain, have been proposed for that purpose. However, one important obstacle to making gene signatures a standard tool in clinical diagnosis is the typically low reproducibility of these signatures combined with the difficulty of achieving a clear biological interpretation. For that reason, in recent years there has been a growing interest in approaches that try to integrate information from molecular interaction networks. Here we review the current state of research in this field by giving an overview of the approaches proposed so far.

  19. The Genetics of Obsessive-Compulsive Disorder and Tourette Syndrome: An Epidemiological and Pathway-Based Approach for Gene Discovery

    Science.gov (United States)

    Grados, Marco A.

    2010-01-01

    Objective: To provide a contemporary perspective on genetic discovery methods applied to obsessive-compulsive disorder (OCD) and Tourette syndrome (TS). Method: A review of research trends in genetics research in OCD and TS is conducted, with emphasis on novel approaches. Results: Genome-wide association studies (GWAS) are now in progress in OCD…

  20. A PKS I gene-based screening approach for the discovery of a new polyketide from Penicillium citrinum Salicorn 46.

    Science.gov (United States)

    Wang, Xiaomin; Wang, Hui; Liu, Tianxing; Xin, Zhihong

    2014-06-01

    Salicorn 46, an endophytic fungus isolated from Salicornia herbacea Torr., was identified as Penicillium citrinum based on its internal transcribed spacer and ribosomal large-subunit DNA sequences using a type I polyketide synthase (PKS I) gene screening approach. A new polyketide, penicitriketo (1), and seven known compounds, including ergone (2), (3β,5α,8α,22E)-5,8-epidioxyergosta-6,9,22-trien-3-ol (3), (3β,5α,8α,22E)-5,8-epidioxyergosta-6,22-dien-3-ol (4), stigmasta-7,22-diene-3β,5α,6α-triol (5), 3β,5α-dihydroxy-(22E,24R)-ergosta-7,22-dien-6β-yl oleate (6), N b-acetyltryptamine (7), and 2-(1-oxo-2-hydroxyethyl) furan (8), were isolated from the culture of Salicorn 46, and their chemical structures were elucidated by spectroscopic analysis. Antioxidant experiments revealed that compound 1 possessed moderate DPPH radical scavenging activity with an IC50 value of 85.33 ± 1.61 μM. Antimicrobial assays revealed that compound 2 exhibited broad-spectrum antimicrobial activity against Candida albicans, Clostridium perfringens, Mycobacterium smegmatis, and Mycobacterium phlei with minimal inhibitory concentration (MIC) values of 25.5, 25.5, 18.5, and 51.0 μM, respectively. Compound 3 displayed potent antimicrobial activities against C. perfringens and Micrococcus tetragenus with a MIC value of 23.5 μM. Compounds 5 and 6 showed high levels of selectivity toward Bacillus subtilis and M. phlei with MIC values of 22.5 and 14.4 μM, respectively. The results of this study highlight the use of PCR-based techniques for the screening of new polyketides from endophytic fungi containing PKS I genes.

  1. Maximizing biomarker discovery by minimizing gene signatures

    Directory of Open Access Journals (Sweden)

    Chang Chang

    2011-12-01

    Full Text Available Abstract. Background: The use of gene signatures can potentially be of considerable value in the field of clinical diagnosis. However, gene signatures defined with different methods can vary considerably even when applied to the same disease and the same endpoint. Previous studies have shown that the correct selection of subsets of genes from microarray data is key for the accurate classification of disease phenotypes, and a number of methods have been proposed for the purpose. However, these methods refine the subsets by considering each feature only individually, and they do not confirm the association between the genes identified in each gene signature and the phenotype of the disease. We proposed a new method termed Minimize Feature's Size (MFS), based on multiple-level similarity analyses and on the association between genes and disease for breast cancer endpoints, by comparing classifier models generated from the second phase of MicroArray Quality Control (MAQC-II), with the aim of developing effective meta-analysis strategies to transform the MAQC-II signatures into a robust and reliable set of biomarkers for clinical applications. Results: We analyzed the similarity of the multiple gene signatures within an endpoint and between the two endpoints of breast cancer at the probe and gene levels. The results indicate that disease-related genes can be preferentially selected as components of gene signatures, and that the gene signatures for the two endpoints could be interchangeable. The minimized signatures were built at the probe level by using MFS for each endpoint. By applying the approach, we generated a much smaller gene signature with predictive power similar to that of the MAQC-II gene signatures. Conclusions: Our results indicate that gene signatures of both large and small sizes could perform equally well in clinical applications.
Besides, consistency and biological significances can be detected among different gene signatures, reflecting the
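
    The abstract above compares gene signatures at the probe and gene levels but does not say which similarity measure was used. As a minimal sketch, assuming a plain Jaccard index over sets of identifiers (an assumption, not the MFS method itself), the pairwise overlap between signatures could be computed as follows. The signature names and gene symbols are hypothetical placeholders, not MAQC-II results.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two signatures given as iterables of gene or probe IDs."""
    a, b = set(a), set(b)
    union = a | b
    # Two empty signatures are treated as having zero similarity.
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(signatures):
    """Return {(name1, name2): jaccard} for every pair of named signatures."""
    return {
        (n1, n2): jaccard(signatures[n1], signatures[n2])
        for n1, n2 in combinations(sorted(signatures), 2)
    }


# Hypothetical toy signatures (placeholders only).
signatures = {
    "sig_A": ["ESR1", "GATA3", "FOXA1", "KRT5"],
    "sig_B": ["ESR1", "GATA3", "MKI67"],
    "sig_C": ["KRT5", "MKI67"],
}
similarity = pairwise_similarity(signatures)
```

    The same function works at either level: pass probe IDs for a probe-level comparison or gene symbols for a gene-level one.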

  2. Microarray Assisted Gene Discovery in Ulcerative Colitis

    DEFF Research Database (Denmark)

    Brusgaard, Klaus

    ), and microarray-based expression studies. In IBD, the increased production of chemoattractants from the inflamed microenvironment results in recruitment of activated CD4+ T lymphocytes, which results in tissue damage. Where Th1 cell-derived cytokines have been reported to be essential mediators in CD with high (IFN...... on the activation of different downstream pathways. Thus it seems that different genetic backgrounds can lead to similar clinical manifestations and also determine the susceptibility to IBD. In previous microarray-based expression studies on UC, the main aim has been to point to new candidate genes...... based on analysis of the main up- or down-regulated genes in the dataset. The majority of the studies are hampered by the relatively small number of genes analysed on the particular array. In this study the main aim has been to point to clusters of genes involved in biochemical pathways...

  3. In silico prioritisation of candidate genes for prokaryotic gene function discovery: an application of phylogenetic profiles.

    Science.gov (United States)

    Lin, Frank P Y; Coiera, Enrico; Lan, Ruiting; Sintchenko, Vitali

    2009-03-17

    In silico candidate gene prioritisation (CGP) aids the discovery of gene functions by ranking genes according to an objective relevance score. While several CGP methods have been described for identifying human disease genes, corresponding methods for prokaryotic gene function discovery are lacking. Here we present two prokaryotic CGP methods, based on phylogenetic profiles, to assist with this task. Using gene occurrence patterns in sample genomes, we developed two CGP methods (statistical and inductive CGP) to assist with the discovery of bacterial gene functions. Statistical CGP exploits the differences in gene frequency against phenotypic groups, while inductive CGP applies supervised machine learning to identify gene occurrence patterns across genomes. Three rediscovery experiments were designed to evaluate the CGP frameworks. The first experiment attempted to rediscover peptidoglycan genes with 417 published genome sequences. Both CGP methods achieved best areas under receiver operating characteristic curve (AUC) of 0.911 in Escherichia coli K-12 (EC-K12) and 0.978 in Streptococcus agalactiae 2603 (SA-2603) genomes, with an average improvement in precision of >3.2-fold and a maximum of >27-fold using statistical CGP. A median AUC of >0.95 could still be achieved with as few as 10 genome examples in each group of genome examples in the rediscovery of the peptidoglycan metabolism genes. In the second experiment, a maximum of 109-fold improvement in precision was achieved in the rediscovery of anaerobic fermentation genes in EC-K12. The last experiment attempted to rediscover genes from 31 metabolic pathways in SA-2603, where 14 pathways achieved AUC >0.9 and 28 pathways achieved AUC >0.8 with the best inductive CGP algorithms. Our results demonstrate that the two CGP methods can assist with the study of functionally uncategorised genomic regions and discovery of bacterial gene-function relationships.
Our rediscovery experiments also provide a set of standard tasks
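
    The abstract describes statistical CGP only as exploiting differences in gene frequency between phenotypic groups. The sketch below illustrates that general idea with a simple frequency-difference score over phylogenetic profiles. The scoring formula, function names, and toy data are all assumptions made for illustration and may differ from the statistics used by the authors.

```python
def frequency_difference_scores(profiles, genomes, positive_genomes):
    """Score each gene by its presence frequency in phenotype-positive genomes
    minus its presence frequency in the remaining genomes.

    profiles: dict mapping gene -> collection of genomes in which it occurs.
    """
    positive = set(positive_genomes)
    negative = set(genomes) - positive
    if not positive or not negative:
        raise ValueError("both phenotype groups must be non-empty")
    scores = {}
    for gene, present_in in profiles.items():
        present_in = set(present_in)
        f_pos = len(present_in & positive) / len(positive)
        f_neg = len(present_in & negative) / len(negative)
        scores[gene] = f_pos - f_neg
    return scores


def rank_genes(scores):
    """Rank genes from highest to lowest score, breaking ties by name."""
    return sorted(scores, key=lambda g: (-scores[g], g))


# Hypothetical toy data: six genomes, three of which show the phenotype.
genomes = ["g1", "g2", "g3", "g4", "g5", "g6"]
positive_genomes = ["g1", "g2", "g3"]
profiles = {
    "geneA": {"g1", "g2", "g3"},                    # only in positive genomes
    "geneB": {"g1", "g2", "g3", "g4", "g5", "g6"},  # ubiquitous
    "geneC": {"g1", "g4", "g5"},                    # mostly in negative genomes
    "geneD": {"g1", "g2", "g4"},                    # enriched in positive genomes
}
scores = frequency_difference_scores(profiles, genomes, positive_genomes)
ranking = rank_genes(scores)
```

    A gene found only in phenotype-positive genomes ranks first, while a ubiquitous gene scores zero because its occurrence carries no information about the phenotype.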

  4. In silico prioritisation of candidate genes for prokaryotic gene function discovery: an application of phylogenetic profiles

    Directory of Open Access Journals (Sweden)

    Lan Ruiting

    2009-03-01

    Full Text Available Abstract. Background: In silico candidate gene prioritisation (CGP) aids the discovery of gene functions by ranking genes according to an objective relevance score. While several CGP methods have been described for identifying human disease genes, corresponding methods for prokaryotic gene function discovery are lacking. Here we present two prokaryotic CGP methods, based on phylogenetic profiles, to assist with this task. Results: Using gene occurrence patterns in sample genomes, we developed two CGP methods (statistical and inductive CGP) to assist with the discovery of bacterial gene functions. Statistical CGP exploits the differences in gene frequency against phenotypic groups, while inductive CGP applies supervised machine learning to identify gene occurrence pattern across genomes. Three rediscovery experiments were designed to evaluate the CGP frameworks. The first experiment attempted to rediscover peptidoglycan genes with 417 published genome sequences. Both CGP methods achieved best areas under receiver operating characteristic curve (AUC) of 0.911 in Escherichia coli K-12 (EC-K12) and 0.978 Streptococcus agalactiae 2603 (SA-2603) genomes, with an average improvement in precision of >3.2-fold and a maximum of >27-fold using statistical CGP. A median AUC of >0.95 could still be achieved with as few as 10 genome examples in each group of genome examples in the rediscovery of the peptidoglycan metabolism genes. In the second experiment, a maximum of 109-fold improvement in precision was achieved in the rediscovery of anaerobic fermentation genes in EC-K12. The last experiment attempted to rediscover genes from 31 metabolic pathways in SA-2603, where 14 pathways achieved AUC >0.9 and 28 pathways achieved AUC >0.8 with the best inductive CGP algorithms. Conclusion: Our results demonstrate that the two CGP methods can assist with the study of functionally uncategorised genomic regions and discovery of bacterial gene-function relationships. Our

  5. Developing integrated crop knowledge networks to advance candidate gene discovery.

    Science.gov (United States)

    Hassani-Pak, Keywan; Castellote, Martin; Esch, Maria; Hindle, Matthew; Lysenko, Artem; Taubert, Jan; Rawlings, Christopher

    2016-12-01

    The chances of raising crop productivity to enhance global food security would be greatly improved if we had a complete understanding of all the biological mechanisms that underpinned traits such as crop yield, disease resistance or nutrient and water use efficiency. With more crop genomes emerging all the time, we are nearer having the basic information, at the gene-level, to begin assembling crop gene catalogues and using data from other plant species to understand how the genes function and how their interactions govern crop development and physiology. Unfortunately, the task of creating such a complete knowledge base of gene functions, interaction networks and trait biology is technically challenging because the relevant data are dispersed in myriad databases in a variety of data formats with variable quality and coverage. In this paper we present a general approach for building genome-scale knowledge networks that provide a unified representation of heterogeneous but interconnected datasets to enable effective knowledge mining and gene discovery. We describe the datasets and outline the methods, workflows and tools that we have developed for creating and visualising these networks for the major crop species, wheat and barley. We present the global characteristics of such knowledge networks and with an example linking a seed size phenotype to a barley WRKY transcription factor orthologous to TTG2 from Arabidopsis, we illustrate the value of integrated data in biological knowledge discovery. The software we have developed (www.ondex.org) and the knowledge resources (http://knetminer.rothamsted.ac.uk) we have created are all open-source and provide a first step towards systematic and evidence-based gene discovery in order to facilitate crop improvement.

  6. Gene-based therapies of neuromuscular disorders: an update and the pivotal role of patient organizations in their discovery and implementation.

    Science.gov (United States)

    Braun, Serge

    2013-01-01

    This review updates the state-of-the-art accomplishments of the multifaceted gene-based therapies, which include DNA or RNA as either therapeutic tools or targets for the treatment of neuromuscular diseases. It also provides insights into the key role that patient organizations have played in research and development; in particular, by addressing bottlenecks and generating boundary conditions that have contributed to scientific breakthroughs, and the effectiveness of innovation processes. Several gene therapy methods have reached the clinical stage and are now addressing both specific and classical issues related to this novel technology. Not ready yet for clinical application, genome editing is in its infancy. More rapidly progressing, RNA-based therapeutics, and especially exon skipping, exon inclusion and stop codon readthrough strategies, are about to move to the market. Most importantly, patients were at the forefront of this discovery process, from basic knowledge to innovation and translational research in a rapidly growing field of unmet medical needs. In recent years, Duchenne muscular dystrophy was the fertile ground for new therapeutic concepts that have been extended to other neuromuscular disorders, such as spinal muscular atrophy, myotonic dystrophies or facioscapulohumeral dystrophy. In line with their longstanding policy, patient organizations will keep working in a proactive manner to bring together all stakeholders with a view to working out truly therapeutic solutions over a long-term perspective. Copyright © 2013 John Wiley & Sons, Ltd.

  7. Crowdsourcing the nodulation gene network discovery environment.

    Science.gov (United States)

    Li, Yupeng; Jackson, Scott A

    2016-05-26

    The Legumes (Fabaceae) are an economically and ecologically important group of plant species with the conspicuous capacity for symbiotic nitrogen fixation in root nodules, specialized plant organs containing symbiotic microbes. With the aim of understanding the underlying molecular mechanisms leading to nodulation, many efforts are underway to identify nodulation-related genes and determine how these genes interact with each other. In order to accurately and efficiently reconstruct the nodulation gene network, a crowdsourcing platform, CrowdNodNet, was created. The platform implements the jQuery and vis.js JavaScript libraries, so that users are able to interactively visualize and edit the gene network, and easily access the information about the network, e.g. gene lists, gene interactions and gene functional annotations. In addition, all the gene information is written on MediaWiki pages, enabling users to edit and contribute to the network curation. Utilizing the continuously updated, collaboratively written, and community-reviewed Wikipedia model, the platform could, in a short time, become a comprehensive knowledge base of nodulation-related pathways. The platform could also be used for other biological processes, and thus has great potential for integrating and advancing our understanding of the functional genomics and systems biology of any process for any species. The platform is available at http://crowd.bioops.info/, and the source code can be openly accessed at https://github.com/bioops/crowdnodnet under MIT License.

  8. Species-independent MicroRNA Gene Discovery

    KAUST Repository

    Kamanu, Timothy K.

    2012-12-01

    MicroRNA (miRNA) are a class of small endogenous non-coding RNA that are mainly negative transcriptional and post-transcriptional regulators in both plants and animals. Recent studies have shown that miRNA are involved in different types of cancer and other incurable diseases such as autism and Alzheimer’s. Functional miRNAs are excised from hairpin-like sequences that are known as miRNA genes. There are about 21,000 known miRNA genes, most of which have been determined using experimental methods. miRNA genes are classified into different groups (miRNA families). This study reports about 19,000 unknown miRNA genes in nine species whereby approximately 15,300 predictions were computationally validated to contain at least one experimentally verified functional miRNA product. The predictions are based on a novel computational strategy which relies on miRNA family groupings and exploits the physics and geometry of miRNA genes to unveil the hidden palindromic signals and symmetries in miRNA gene sequences. Unlike conventional computational miRNA gene discovery methods, the algorithm developed here is species-independent: it allows prediction at higher accuracy and resolution from arbitrary RNA/DNA sequences in any species and thus enables examination of repeat-prone genomic regions which are thought to be non-informative or ’junk’ sequences. The information non-redundancy of uni-directional RNA sequences compared to information redundancy of bi-directional DNA is demonstrated, a fact that is overlooked by most pattern discovery algorithms. A novel method for computing upstream and downstream miRNA gene boundaries based on mathematical/statistical functions is suggested, as well as cutoffs for annotation of miRNA genes in different miRNA families. Another tool is proposed to allow hypotheses generation and visualization of data matrices, intra- and inter-species chromosomal distribution of miRNA genes or miRNA families. Our results indicate that: miRNA and mi
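
    The abstract refers to hairpin-like miRNA genes and to palindromic signals in their sequences without giving the algorithm. As a loose illustration only, the sketch below measures how well the two arms of a candidate hairpin pair with each other, which is one simple form of palindromic (reverse-complement) symmetry. The fixed loop length, the inclusion of G-U wobble pairs, and the function name are assumptions and are not taken from the thesis.

```python
# Watson-Crick pairs plus G-U wobble pairs, as commonly allowed in RNA stems.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def arm_pairing_fraction(seq, loop_len=4):
    """Fraction of positions at which the 5' arm pairs with the reversed 3' arm.

    The sequence is split into a 5' arm, a central loop of about `loop_len`
    bases, and a 3' arm of the same length as the 5' arm. DNA input is
    converted to RNA by replacing T with U.
    """
    seq = seq.upper().replace("T", "U")
    arm = (len(seq) - loop_len) // 2
    if arm <= 0:
        raise ValueError("sequence too short for the requested loop length")
    five_prime = seq[:arm]
    three_prime_reversed = seq[len(seq) - arm:][::-1]
    paired = sum((a, b) in PAIRS for a, b in zip(five_prime, three_prime_reversed))
    return paired / arm


perfect = arm_pairing_fraction("GGGAAAUUUUCCC")   # arms GGGA / reversed CCCU
unpaired = arm_pairing_fraction("AAAAGGGGAAAA")   # arms AAAA / reversed AAAA
```

    A value near 1.0 indicates arms that could fold back on each other into a stem, and a value near 0.0 indicates no such symmetry. Real hairpin prediction also accounts for bulges, variable loop sizes, and folding energy, none of which this sketch attempts.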

  9. Function-driven discovery of disease genes in zebrafish using an integrated genomics big data resource.

    Science.gov (United States)

    Shim, Hongseok; Kim, Ji Hyun; Kim, Chan Yeong; Hwang, Sohyun; Kim, Hyojin; Yang, Sunmo; Lee, Ji Eun; Lee, Insuk

    2016-11-16

    Whole exome sequencing (WES) accelerates disease gene discovery using rare genetic variants, but further statistical and functional evidence is required to avoid false-discovery. To complement variant-driven disease gene discovery, here we present function-driven disease gene discovery in zebrafish (Danio rerio), a promising human disease model owing to its high anatomical and genomic similarity to humans. To facilitate zebrafish-based function-driven disease gene discovery, we developed a genome-scale co-functional network of zebrafish genes, DanioNet (www.inetbio.org/danionet), which was constructed by Bayesian integration of genomics big data. Rigorous statistical assessment confirmed the high prediction capacity of DanioNet for a wide variety of human diseases. To demonstrate the feasibility of the function-driven disease gene discovery using DanioNet, we predicted genes for ciliopathies and performed experimental validation for eight candidate genes. We also validated the existence of heterozygous rare variants in the candidate genes of individuals with ciliopathies yet not in controls derived from the UK10K consortium, suggesting that these variants are potentially involved in enhancing the risk of ciliopathies. These results showed that an integrated genomics big data for a model animal of diseases can expand our opportunity for harnessing WES data in disease gene discovery. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. Automated discovery of functional generality of human gene expression programs.

    Directory of Open Access Journals (Sweden)

    Georg K Gerber

    2007-08-01

    Full Text Available An important research problem in computational biology is the identification of expression programs, sets of co-expressed genes orchestrating normal or pathological processes, and the characterization of the functional breadth of these programs. The use of human expression data compendia for discovery of such programs presents several challenges including cellular inhomogeneity within samples, genetic and environmental variation across samples, uncertainty in the numbers of programs and sample populations, and temporal behavior. We developed GeneProgram, a new unsupervised computational framework based on Hierarchical Dirichlet Processes that addresses each of the above challenges. GeneProgram uses expression data to simultaneously organize tissues into groups and genes into overlapping programs with consistent temporal behavior, to produce maps of expression programs, which are sorted by generality scores that exploit the automatically learned groupings. Using synthetic and real gene expression data, we showed that GeneProgram outperformed several popular expression analysis methods. We applied GeneProgram to a compendium of 62 short time-series gene expression datasets exploring the responses of human cells to infectious agents and immune-modulating molecules. GeneProgram produced a map of 104 expression programs, a substantial number of which were significantly enriched for genes involved in key signaling pathways and/or bound by NF-kappaB transcription factors in genome-wide experiments. Further, GeneProgram discovered expression programs that appear to implicate surprising signaling pathways or receptor types in the response to infection, including Wnt signaling and neurotransmitter receptors. We believe the discovered map of expression programs involved in the response to infection will be useful for guiding future biological experiments; genes from programs with low generality scores might serve as new drug targets that exhibit minimal

  11. Discovery of cancer common and specific driver gene sets

    Science.gov (United States)

    2017-01-01

    Abstract Cancer is known as a disease mainly caused by gene alterations. Discovery of mutated driver pathways or gene sets is becoming an important step to understand molecular mechanisms of carcinogenesis. However, systematically investigating commonalities and specificities of driver gene sets among multiple cancer types is still a great challenge, but this investigation will undoubtedly benefit deciphering cancers and will be helpful for personalized therapy and precision medicine in cancer treatment. In this study, we propose two optimization models to de novo discover common driver gene sets among multiple cancer types (ComMDP) and specific driver gene sets of one certain or multiple cancer types to other cancers (SpeMDP), respectively. We first apply ComMDP and SpeMDP to simulated data to validate their efficiency. Then, we further apply these methods to 12 cancer types from The Cancer Genome Atlas (TCGA) and obtain several biologically meaningful driver pathways. As examples, we construct a common cancer pathway model for BRCA and OV, infer a complex driver pathway model for BRCA carcinogenesis based on common driver gene sets of BRCA with eight cancer types, and investigate specific driver pathways of the liquid cancer lymphoblastic acute myeloid leukemia (LAML) versus other solid cancer types. In these processes more candidate cancer genes are also found. PMID:28168295

  12. Genome-enabled Discovery of Carbon Sequestration Genes

    Energy Technology Data Exchange (ETDEWEB)

    Tuskan, Gerald A [ORNL; Tschaplinski, Timothy J [ORNL; Kalluri, Udaya C [ORNL; Yin, Tongming [ORNL; Yang, Xiaohan [ORNL; Zhang, Xinye [ORNL; Engle, Nancy L [ORNL; Ranjan, Priya [ORNL; Basu, Manojit M [ORNL; Gunter, Lee E [ORNL; Jawdy, Sara [ORNL; Martin, Madhavi Z [ORNL; Campbell, Alina S [ORNL; DiFazio, Stephen P [ORNL; Davis, John M [University of Florida; Hinchee, Maud [ORNL; Pinnacchio, Christa [U.S. Department of Energy, Joint Genome Institute; Meilan, R [Purdue University; Busov, V. [Michigan Technological University; Strauss, S [Oregon State University

    2009-01-01

    The fate of carbon below ground is likely to be a major factor determining the success of carbon sequestration strategies involving plants. Despite their importance, molecular processes controlling belowground C allocation and partitioning are poorly understood. This project is leveraging the Populus trichocarpa genome sequence to discover genes important to C sequestration in plants and soils. The focus is on the identification of genes that provide key control points for the flow and chemical transformations of carbon in roots, concentrating on genes that control the synthesis of chemical forms of carbon that result in slower turnover rates of soil organic matter (i.e., increased recalcitrance). We propose to enhance carbon allocation and partitioning to roots by 1) modifying the auxin signaling pathway, and the invertase family, which controls sucrose metabolism, and by 2) increasing root proliferation through transgenesis with genes known to control fine root proliferation (e.g., ANT), 3) increasing the production of recalcitrant C metabolites by identifying genes controlling secondary C metabolism by a major mQTL-based gene discovery effort, and 4) increasing aboveground productivity by enhancing drought tolerance to achieve maximum C sequestration. This broad, integrated approach is aimed at ultimately enhancing root biomass as well as root detritus longevity, providing the best prospects for significant enhancement of belowground C sequestration.

  13. Prostate Cancer Gene Discovery Using ROMA

    National Research Council Canada - National Science Library

    Isaacs, William B

    2007-01-01

    We hypothesized that a subset of men who develop prostate cancer (PCa) do so as a result of an inherited chromosomal deletion or amplification affecting the function of one or more critical prostate cancer susceptibility genes...

  14. SNP marker discovery in koala TLR genes.

    Directory of Open Access Journals (Sweden)

    Jian Cui

    Full Text Available Toll-like receptors (TLRs) play a crucial role in the early defence against invading pathogens, yet our understanding of TLRs in marsupial immunity is limited. Here, we describe the characterisation of nine TLRs from a koala immune tissue transcriptome and one TLR from a draft sequence of the koala genome and the subsequent development of an assay to study genetic diversity in these genes. We surveyed genetic diversity in 20 koalas from New South Wales, Australia and showed that one gene, TLR10 is monomorphic, while the other nine TLR genes have between two and 12 alleles. 40 SNPs (16 non-synonymous) were identified across the ten TLR genes. These markers provide a springboard to future studies on innate immunity in the koala, a species under threat from two major infectious diseases.

  15. SNP marker discovery in koala TLR genes.

    Science.gov (United States)

    Cui, Jian; Frankham, Greta J; Johnson, Rebecca N; Polkinghorne, Adam; Timms, Peter; O'Meally, Denis; Cheng, Yuanyuan; Belov, Katherine

    2015-01-01

    Toll-like receptors (TLRs) play a crucial role in the early defence against invading pathogens, yet our understanding of TLRs in marsupial immunity is limited. Here, we describe the characterisation of nine TLRs from a koala immune tissue transcriptome and one TLR from a draft sequence of the koala genome and the subsequent development of an assay to study genetic diversity in these genes. We surveyed genetic diversity in 20 koalas from New South Wales, Australia and showed that one gene, TLR10 is monomorphic, while the other nine TLR genes have between two and 12 alleles. 40 SNPs (16 non-synonymous) were identified across the ten TLR genes. These markers provide a springboard to future studies on innate immunity in the koala, a species under threat from two major infectious diseases.

  16. Fusion genes and their discovery using high throughput sequencing.

    Science.gov (United States)

    Annala, M J; Parker, B C; Zhang, W; Nykter, M

    2013-11-01

    Fusion genes are hybrid genes that combine parts of two or more original genes. They can form as a result of chromosomal rearrangements or abnormal transcription, and have been shown to act as drivers of malignant transformation and progression in many human cancers. The biological significance of fusion genes together with their specificity to cancer cells has made them into excellent targets for molecular therapy. Fusion genes are also used as diagnostic and prognostic markers to confirm cancer diagnosis and monitor response to molecular therapies. High-throughput sequencing has enabled the systematic discovery of fusion genes in a wide variety of cancer types. In this review, we describe the history of fusion genes in cancer and the ways in which fusion genes form and affect cellular function. We also describe computational methodologies for detecting fusion genes from high-throughput sequencing experiments, and the most common sources of error that lead to false discovery of fusion genes. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  17. Barbara McClintock and the Discovery of Jumping Genes

    Indian Academy of Sciences (India)

    GENERAL | ARTICLE. Barbara McClintock and the Discovery of Jumping Genes. Vidyanand Nanjundiah works in the Developmental Biology and Genetics Laboratory at the Indian Institute of Science. After a Master's degree in physics he took up biology. He is interested in evolutionary biology and pattern formation during.

  18. Functional Principles of Registry-based Service Discovery

    NARCIS (Netherlands)

    Sundramoorthy, V.; Tan, C.; Hartel, Pieter H.; den Hartog, Jeremy; Scholten, Johan

    As Service Discovery Protocols (SDP) are becoming increasingly important for ubiquitous computing, they must behave according to predefined principles. We present the functional Principles of Service Discovery for robust, registry-based service discovery. A methodology to guarantee adherence to

  19. Novel venom gene discovery in the platypus.

    Science.gov (United States)

    Whittington, Camilla M; Papenfuss, Anthony T; Locke, Devin P; Mardis, Elaine R; Wilson, Richard K; Abubucker, Sahar; Mitreva, Makedonka; Wong, Emily S W; Hsu, Arthur L; Kuchel, Philip W; Belov, Katherine; Warren, Wesley C

    2010-01-01

    To date, few peptides in the complex mixture of platypus venom have been identified and sequenced, in part due to the limited amounts of platypus venom available to study. We have constructed and sequenced a cDNA library from an active platypus venom gland to identify the remaining components. We identified 83 novel putative platypus venom genes from 13 toxin families, which are homologous to known toxins from a wide range of vertebrates (fish, reptiles, insectivores) and invertebrates (spiders, sea anemones, starfish). A number of these are expressed in tissues other than the venom gland, and at least three of these families (those with homology to toxins from distant invertebrates) may play non-toxin roles. Thus, further functional testing is required to confirm venom activity. However, the presence of similar putative toxins in such widely divergent species provides further evidence for the hypothesis that there are certain protein families that are selected preferentially during evolution to become venom peptides. We have also used homology with known proteins to speculate on the contributions of each venom component to the symptoms of platypus envenomation. This study represents a step towards fully characterizing the first mammal venom transcriptome. We have found similarities between putative platypus toxins and those of a number of unrelated species, providing insight into the evolution of mammalian venom.

  20. INTEGRATE: gene fusion discovery using whole genome and transcriptome data.

    Science.gov (United States)

    Zhang, Jin; White, Nicole M; Schmidt, Heather K; Fulton, Robert S; Tomlinson, Chad; Warren, Wesley C; Wilson, Richard K; Maher, Christopher A

    2016-01-01

    While next-generation sequencing (NGS) has become the primary technology for discovering gene fusions, we are still faced with the challenge of ensuring that causative mutations are not missed while minimizing false positives. Currently, there are many computational tools that predict structural variations (SV) and gene fusions using whole genome (WGS) and transcriptome sequencing (RNA-seq) data separately. However, as both WGS and RNA-seq have their limitations when used independently, we hypothesize that the orthogonal validation from integrating both data could generate a sensitive and specific approach for detecting high-confidence gene fusion predictions. Fortunately, decreasing NGS costs have resulted in a growing quantity of patients with both data available. Therefore, we developed a gene fusion discovery tool, INTEGRATE, that leverages both RNA-seq and WGS data to reconstruct gene fusion junctions and genomic breakpoints by split-read mapping. To evaluate INTEGRATE, we compared it with eight additional gene fusion discovery tools using the well-characterized breast cell line HCC1395 and peripheral blood lymphocytes derived from the same patient (HCC1395BL). The predictions subsequently underwent a targeted validation leading to the discovery of 131 novel fusions in addition to the seven previously reported fusions. Overall, INTEGRATE only missed six out of the 138 validated fusions and had the highest accuracy of the nine tools evaluated. Additionally, we applied INTEGRATE to 62 breast cancer patients from The Cancer Genome Atlas (TCGA) and found multiple recurrent gene fusions including a subset involving estrogen receptor. Taken together, INTEGRATE is a highly sensitive and accurate tool that is freely available for academic use. © 2016 Zhang et al.; Published by Cold Spring Harbor Laboratory Press.
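
    The counts reported in the abstract above can be combined into a single detection rate. The short calculation below uses only the figures stated there (131 novel fusions, seven previously reported fusions, six missed by INTEGRATE). Treating the 138 validated fusions as the reference set is an assumption of this sketch, so the result is a recall relative to that set rather than an absolute sensitivity.

```python
# Figures taken from the abstract.
novel_fusions = 131
previously_reported = 7
missed_by_integrate = 6

validated_total = novel_fusions + previously_reported   # 138 validated fusions
detected = validated_total - missed_by_integrate        # fusions INTEGRATE found
recall = detected / validated_total

print(f"{detected}/{validated_total} = {recall:.1%}")   # prints "132/138 = 95.7%"
```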

  1. Technology development for gene discovery and full-length sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Marcelo Bento Soares

    2004-07-19

    In previous years, with support from the U.S. Department of Energy, we developed methods for construction of normalized and subtracted cDNA libraries, and constructed hundreds of high-quality libraries for production of Expressed Sequence Tags (ESTs). Our clones were made widely available to the scientific community through the IMAGE Consortium, and millions of ESTs were produced from our libraries either by collaborators or by our own sequencing laboratory at the University of Iowa. During this grant period, we focused on (1) the development of a method for preferential cloning of tissue-specific and/or rare transcripts, (2) its utilization to expedite EST-based gene discovery for the NIH Mouse Brain Molecular Anatomy Project, (3) further development and optimization of a method for construction of full-length-enriched cDNA libraries, and (4) modification of a plasmid vector to maximize efficiency of full-length cDNA sequencing by the transposon-mediated approach. It is noteworthy that the technology developed for preferential cloning of rare mRNAs enabled identification of over 2,000 mouse transcripts differentially expressed in the hippocampus. In addition, the method that we optimized for construction of full-length-enriched cDNA libraries was successfully utilized for the production of approximately fifty libraries from the developing mouse nervous system, from which over 2,500 full-ORF-containing cDNAs have been identified and accurately sequenced in their entirety either by our group or by the NIH-Mammalian Gene Collection Program Sequencing Team.

  2. Mass Spectrometry-Based Biomarker Discovery.

    Science.gov (United States)

    Zhou, Weidong; Petricoin, Emanuel F; Longo, Caterina

    2017-01-01

    The discovery of candidate biomarkers within the entire proteome is one of the most important and challenging goals in proteomic research. Mass spectrometry-based proteomics is a modern and promising technology for semiquantitative and qualitative assessment of proteins, enabling protein sequencing and identification with exquisite accuracy and sensitivity. For mass spectrometry analysis, protein extractions from tissues or body fluids and subsequent protein fractionation represent an important and unavoidable step in the workflow for biomarker discovery. Following extraction of proteins, the protein mixture must be digested, reduced, alkylated, and cleaned up prior to mass spectrometry. The aim of our chapter is to provide comprehensible and practical lab procedures for sample digestion, protein fractionation, and subsequent mass spectrometry analysis.

  3. Discovery of error-tolerant biclusters from noisy gene expression data.

    Science.gov (United States)

    Gupta, Rohit; Rao, Navneet; Kumar, Vipin

    2011-11-24

    An important analysis performed on microarray gene-expression data is to discover biclusters, which denote groups of genes that are coherently expressed for a subset of conditions. Various biclustering algorithms have been proposed to find different types of biclusters from these real-valued gene-expression data sets. However, these algorithms suffer from several limitations, such as inability to explicitly handle errors/noise in the data; difficulty in discovering small biclusters due to their top-down approach; and inability of some of the approaches to find overlapping biclusters, which is crucial as many genes participate in multiple biological processes. Association pattern mining also produces biclusters as its result and can naturally address some of these limitations. However, traditional association mining only finds exact biclusters, which limits its applicability in real-life data sets where the biclusters may be fragmented due to random noise/errors. Moreover, as it only works with binary or boolean attributes, its application to gene-expression data requires transforming real-valued attributes to binary attributes, which often results in loss of information. Many past approaches have tried to address the issue of noise and handling real-valued attributes independently but there is no systematic approach that addresses both of these issues together. In this paper, we first propose a novel error-tolerant biclustering model, 'ET-bicluster', and then propose a bottom-up heuristic-based mining algorithm to sequentially discover error-tolerant biclusters directly from real-valued gene-expression data. The efficacy of our proposed approach is illustrated by comparing it with a recent approach RAP in the context of two biological problems: discovery of functional modules and discovery of biomarkers. For the first problem, two real-valued S. cerevisiae microarray gene-expression data sets are used to demonstrate that the biclusters obtained from ET

  4. Genome Enabled Discovery of Carbon Sequestration Genes in Poplar

    Energy Technology Data Exchange (ETDEWEB)

    Filichkin, Sergei; Etherington, Elizabeth; Ma, Caiping; Strauss, Steve

    2007-02-22

    The goals of the S.H. Strauss laboratory portion of 'Genome-enabled discovery of carbon sequestration genes in poplar' are (1) to explore the functions of candidate genes using Populus transformation by inserting genes provided by Oak Ridge National Laboratory (ORNL) and the University of Florida (UF) into poplar; (2) to expand the poplar transformation toolkit by developing transformation methods for important genotypes; and (3) to allow induced expression, and efficient gene suppression, in roots and other tissues. As part of the transformation improvement effort, OSU developed transformation protocols for the Populus trichocarpa 'Nisqually-1' clone and an early flowering P. alba clone, 6K10. Complete descriptions of the transformation systems were published (Ma et al. 2004, Meilan et al. 2004). Twenty-one 'Nisqually-1' and 622 6K10 transgenic plants were generated. To identify root-predominant promoters, a set of three promoters was tested for their tissue-specific expression patterns in poplar and in Arabidopsis as a model system. A novel gene, ET304, was identified by analyzing a collection of poplar enhancer trap lines generated at OSU (Filichkin et al. 2006a, 2006b). Other promoters include the pGgMT1 root-predominant promoter from Casuarina glauca and the pAtPIN2 promoter from the Arabidopsis root-specific PIN2 gene. OSU tested two induction systems, alcohol- and estrogen-inducible, in multiple poplar transgenics. Ethanol proved to be the more efficient when tested in tissue culture and greenhouse conditions. Two estrogen-inducible systems were evaluated in transgenic Populus, neither of which functioned reliably in tissue culture conditions. GATEWAY-compatible plant binary vectors were designed to compare the silencing efficiency of homologous (direct) RNAi vs. heterologous (transitive) RNAi inverted repeats. A set of genes was targeted for post-transcriptional silencing in the model Arabidopsis system; these include the floral

  5. Knowledge Discovery in Literature Data Bases

    Science.gov (United States)

    Albrecht, Rudolf; Merkl, Dieter

    The concept of knowledge discovery as defined through "establishing previously unknown and unsuspected relations of features in a data base" is, cum grano salis, relatively easy to implement for data bases containing numerical data. Increasingly we find at our disposal data bases containing scientific literature. Computer assisted detection of unknown relations of features in such data bases would be extremely valuable and would lead to new scientific insights. However, the current representation of scientific knowledge in such data bases is not conducive to computer processing. Any correlation of features still has to be done by the human reader, a process which is plagued by ineffectiveness and incompleteness. On the other hand we note that considerable progress is being made in an area where reading all available material is totally prohibitive: the World Wide Web. Robots and Web crawlers mine the Web continuously and construct data bases which allow the identification of pages of interest in near real time. An obvious step is to categorize and classify the documents in the text data base. This can be used to identify papers worth reading, or which are of unexpected cross-relevance. We show the results of first experiments using unsupervised classification based on neural networks.

  6. Graph-Based Methods for Discovery Browsing with Semantic Predications

    DEFF Research Database (Denmark)

    Wilkowski, Bartlomiej; Fiszman, Marcelo; Miller, Christopher M

    2011-01-01

    We present an extension to literature-based discovery that goes beyond making discoveries to a principled way of navigating through selected aspects of some biomedical domain. The method is a type of "discovery browsing" that guides the user through the research literature on a specified phenomen...

  7. The Matchmaker Exchange: a platform for rare disease gene discovery.

    Science.gov (United States)

    Philippakis, Anthony A; Azzariti, Danielle R; Beltran, Sergi; Brookes, Anthony J; Brownstein, Catherine A; Brudno, Michael; Brunner, Han G; Buske, Orion J; Carey, Knox; Doll, Cassie; Dumitriu, Sergiu; Dyke, Stephanie O M; den Dunnen, Johan T; Firth, Helen V; Gibbs, Richard A; Girdea, Marta; Gonzalez, Michael; Haendel, Melissa A; Hamosh, Ada; Holm, Ingrid A; Huang, Lijia; Hurles, Matthew E; Hutton, Ben; Krier, Joel B; Misyura, Andriy; Mungall, Christopher J; Paschall, Justin; Paten, Benedict; Robinson, Peter N; Schiettecatte, François; Sobreira, Nara L; Swaminathan, Ganesh J; Taschner, Peter E; Terry, Sharon F; Washington, Nicole L; Züchner, Stephan; Boycott, Kym M; Rehm, Heidi L

    2015-10-01

    There are few better examples of the need for data sharing than in the rare disease community, where patients, physicians, and researchers must search for "the needle in a haystack" to uncover rare, novel causes of disease within the genome. Impeding the pace of discovery has been the existence of many small siloed datasets within individual research or clinical laboratory databases and/or disease-specific organizations, hoping for serendipitous occasions when two distant investigators happen to learn they have a rare phenotype in common and can "match" these cases to build evidence for causality. However, serendipity has never proven to be a reliable or scalable approach in science. As such, the Matchmaker Exchange (MME) was launched to provide a robust and systematic approach to rare disease gene discovery through the creation of a federated network connecting databases of genotypes and rare phenotypes using a common application programming interface (API). The core building blocks of the MME have been defined and assembled. Three MME services have now been connected through the API and are available for community use. Additional databases that support internal matching are anticipated to join the MME network as it continues to grow. © 2015 WILEY PERIODICALS, INC.

  8. Amyotrophic lateral sclerosis: an emerging era of collaborative gene discovery.

    Directory of Open Access Journals (Sweden)

    Katrina Gwinn

    2007-12-01

    Amyotrophic lateral sclerosis (ALS) is the most common form of motor neuron disease (MND). It is currently incurable and treatment is largely limited to supportive care. Family history is associated with an increased risk of ALS, and many Mendelian causes have been discovered. However, most forms of the disease are not obviously familial. Recent advances in human genetics have enabled genome-wide analyses of single nucleotide polymorphisms (SNPs) that make it possible to study complex genetic contributions to human disease. Genome-wide SNP analyses require a large sample size and thus depend upon collaborative efforts to collect and manage the biological samples and corresponding data. Public availability of biological samples (such as DNA), phenotypic and genotypic data further enhances research endeavors. Here we discuss a large collaboration among academic investigators, government, and non-government organizations which has created a public repository of human DNA, immortalized cell lines, and clinical data to further gene discovery in ALS. This resource currently maintains samples and associated phenotypic data from 2332 MND subjects and 4692 controls. This resource should facilitate genetic discoveries which we anticipate will ultimately provide a better understanding of the biological mechanisms of neurodegeneration in ALS.

  9. Gene signature critical to cancer phenotype as a paradigm for anticancer drug discovery.

    Science.gov (United States)

    Sampson, E R; McMurray, H R; Hassane, D C; Newman, L; Salzman, P; Jordan, C T; Land, H

    2013-08-15

    Malignant cell transformation commonly results in the deregulation of thousands of cellular genes, an observation that suggests a complex biological process and an inherently challenging scenario for the development of effective cancer interventions. To better define the genes/pathways essential to regulating the malignant phenotype, we recently described a novel strategy based on the cooperative nature of carcinogenesis that focuses on genes synergistically deregulated in response to cooperating oncogenic mutations. These so-called 'cooperation response genes' (CRGs) are highly enriched for genes critical for the cancer phenotype, thereby suggesting their causal role in the malignant state. Here, we show that CRGs have an essential role in drug-mediated anticancer activity and that anticancer agents can be identified through their ability to antagonize the CRG expression profile. These findings provide proof-of-concept for the use of the CRG signature as a novel means of drug discovery with relevance to underlying anticancer drug mechanisms.

  10. Differential gene expression analysis in ageing muscle and drug discovery perspectives.

    Science.gov (United States)

    Melouane, Aicha; Ghanemi, Abdelaziz; Aubé, Simon; Yoshioka, Mayumi; St-Amand, Jonny

    2018-01-01

    Identifying therapeutic target genes represents the key step in functional genomics-based therapies. Within this context, disease heterogeneity, exogenous factors and the complexity of genomic structure and function represent important challenges. Functional genomics aims to overcome such obstacles by identifying gene functions and thereby highlighting disease-causing genes as therapeutic targets. Genomic technologies promise to reshape the research on ageing muscle, exercise response and drug discovery. Herein, we describe the functional genomics strategies, mainly differential gene expression methods: microarray, serial analysis of gene expression (SAGE), massively parallel signature sequence (MPSS), RNA sequencing (RNA-seq), representational difference analysis (RDA), and suppression subtractive hybridization (SSH). Furthermore, we review these illustrative approaches that have been used to discover new therapeutic targets for some complex diseases along with the application of these tools to study the modulation of the skeletal muscle transcriptome. Copyright © 2017 Elsevier B.V. All rights reserved.

  11. GEM-TREND: a web tool for gene expression data mining toward relevant network discovery.

    Science.gov (United States)

    Feng, Chunlai; Araki, Michihiro; Kunimoto, Ryo; Tamon, Akiko; Makiguchi, Hiroki; Niijima, Satoshi; Tsujimoto, Gozoh; Okuno, Yasushi

    2009-09-03

    DNA microarray technology provides us with a first step toward the goal of uncovering gene functions on a genomic scale. In recent years, vast amounts of gene expression data have been collected, much of which are available in public databases, such as the Gene Expression Omnibus (GEO). To date, most researchers have been manually retrieving data from databases through web browsers using accession numbers (IDs) or keywords, but gene-expression patterns are not considered when retrieving such data. The Connectivity Map was recently introduced to compare gene expression data by introducing gene-expression signatures (represented by a set of genes with up- or down-regulated labels according to their biological states) and is available as a web tool for detecting similar gene-expression signatures from a limited data set (approximately 7,000 expression profiles representing 1,309 compounds). In order to support researchers to utilize the public gene expression data more effectively, we developed a web tool for finding similar gene expression data and generating its co-expression networks from a publicly available database. GEM-TREND, a web tool for searching gene expression data, allows users to search data from GEO using gene-expression signatures or gene expression ratio data as a query and retrieve gene expression data by comparing gene-expression pattern between the query and GEO gene expression data. The comparison methods are based on the nonparametric, rank-based pattern matching approach of Lamb et al. (Science 2006) with the additional calculation of statistical significance. The web tool was tested using gene expression ratio data randomly extracted from the GEO and with in-house microarray data, respectively. The results validated the ability of GEM-TREND to retrieve gene expression entries biologically related to a query from GEO. For further analysis, a network visualization interface is also provided, whereby genes and gene annotations are dynamically

  12. Alternative Polyadenylation Patterns for Novel Gene Discovery and Classification in Cancer

    Directory of Open Access Journals (Sweden)

    Oguzhan Begik

    2017-07-01

    Certain aspects of diagnosis, prognosis, and treatment of cancer patients are still important challenges to be addressed. Therefore, we propose a pipeline to uncover patterns of alternative polyadenylation (APA), a hidden complexity in cancer transcriptomes, to further accelerate efforts to discover novel cancer genes and pathways. Here, we analyzed expression data for 1045 cancer patients and found a significant shift in usage of poly(A) signals in common tumor types (breast, colon, lung, prostate, gastric, and ovarian) compared to normal tissues. Using machine-learning techniques, we further defined specific subsets of APA events to efficiently classify cancer types. Furthermore, APA patterns were associated with altered protein levels in patients, revealed by antibody-based profiling data, suggesting functional significance. Overall, our study offers a computational approach for use of APA in novel gene discovery and classification in common tumor types, with important implications in basic research, biomarker discovery, and precision medicine approaches.

  13. Discovery-based strategies for studying platelet function.

    Science.gov (United States)

    Flaumenhaft, R; Dilks, J R

    2008-04-01

    The platelet is an anucleate cell, complicating efforts to study platelet function by traditional genetic means. Discovery-based strategies have led to the identification of pharmacological agents capable of targeting specific proteins critical for platelet activation. This review will address the evolution of discovery-based strategies to identify probes that are at once useful reagents for studying platelet activation and effective therapeutics.

  14. Gene expression, single nucleotide variant and fusion transcript discovery in archival material from breast tumors.

    Directory of Open Access Journals (Sweden)

    Nadine Norton

    Advantages of RNA-Seq over array-based platforms are quantitative gene expression and discovery of expressed single nucleotide variants (eSNVs) and fusion transcripts from a single platform, but the sensitivity for each of these characteristics is unknown. We measured gene expression in a set of manually degraded RNAs and nine pairs of matched fresh-frozen and FFPE RNA isolated from breast tumor, with the hybridization-based NanoString nCounter (226-gene panel) and with whole-transcriptome RNA-Seq using RiboZeroGold ScriptSeq V2 library preparation kits. We performed correlation analyses of gene expression between samples and across platforms. We then specifically assessed whole-transcriptome expression of lincRNA and discovery of eSNVs and fusion transcripts in the FFPE RNA-Seq data. For gene expression in the manually degraded samples, we observed Pearson correlations of >0.94 and >0.80 with NanoString and ScriptSeq protocols, respectively. Gene expression data for matched fresh-frozen and FFPE samples yielded mean Pearson correlations of 0.874 and 0.783 for NanoString (226 genes) and ScriptSeq whole-transcriptome protocols, respectively (p < 2 × 10^-16). Specifically for lincRNAs, we observed superb Pearson correlation (0.988) between matched fresh-frozen and FFPE pairs. FFPE samples across NanoString and RNA-Seq platforms gave a mean Pearson correlation of 0.838. In FFPE libraries, we detected 53.4% of high-confidence SNVs and 24% of high-confidence fusion transcripts. Sensitivity of fusion transcript detection was not overcome by an increase in depth of sequencing up to 3-fold (increase from ~56 to ~159 million reads). Both NanoString and ScriptSeq RNA-Seq technologies yield reliable gene expression data for degraded and FFPE material. The high degree of correlation between NanoString and RNA-Seq platforms suggests discovery-based whole-transcriptome studies from FFPE material will produce reliable expression data. The RiboZeroGold ScriptSeq protocol

  15. Gene discovery for the carcinogenic human liver fluke, Opisthorchis viverrini

    Directory of Open Access Journals (Sweden)

    Gasser Robin B

    2007-06-01

    Background: Cholangiocarcinoma (CCA), cancer of the bile ducts, is associated with chronic infection with the liver fluke, Opisthorchis viverrini. Despite being the only eukaryote that is designated as a 'class I carcinogen' by the International Agency for Research on Cancer, little is known about its genome. Results: Approximately 5,000 randomly selected cDNAs from the adult stage of O. viverrini were characterized and accounted for 1,932 contigs, representing ~14% of the entire transcriptome, and, presently, the largest sequence dataset for any species of liver fluke. Twenty percent of contigs were assigned GO classifications. Abundantly represented protein families included those involved in physiological functions that are essential to parasitism, such as anaerobic respiration, reproduction, detoxification, surface maintenance and feeding. GO assignments were well conserved in relation to other parasitic flukes; however, some categories were over-represented in O. viverrini, such as structural and motor proteins. An assessment of evolutionary relationships showed that O. viverrini was more similar to other parasitic (Clonorchis sinensis and Schistosoma japonicum) than to free-living (Schmidtea mediterranea) flatworms, and 105 sequences had close homologues in both parasitic species but not in S. mediterranea. A total of 164 O. viverrini contigs contained ORFs with signal sequences, many of which were platyhelminth-specific. Examples of convergent evolution between host and parasite secreted/membrane proteins were identified, as were homologues of vaccine antigens from other helminths. Finally, ORFs representing secreted proteins with known roles in tumorigenesis were identified, and these might play roles in the pathogenesis of O. viverrini-induced CCA. Conclusion: This gene discovery effort for O. viverrini should expedite molecular studies of cholangiocarcinogenesis and accelerate research focused on developing new interventions

  16. Abiotic stress tolerance: from gene discovery in model organisms to crop improvement.

    Science.gov (United States)

    Bressan, Ray; Bohnert, Hans; Zhu, Jian-Kang

    2009-01-01

    Productive and sustainable agriculture necessitates growing plants in sub-optimal environments with less input of precious resources such as fresh water. For a better understanding and rapid improvement of abiotic stress tolerance, it is important to link physiological and biochemical work to molecular studies in genetically tractable model organisms. With the use of several technologies for the discovery of stress tolerance genes and their appropriate alleles, transgenic approaches to improving stress tolerance in crops remarkably parallel breeding principles with a greatly expanded germplasm base and will succeed eventually.

  17. Transcriptomic analysis and discovery of genes in the response of Arachis hypogaea to drought stress.

    Science.gov (United States)

    Zhao, Xiaobo; Li, Chunjuan; Wan, Shubo; Zhang, Tingting; Yan, Caixia; Shan, Shihua

    2018-04-01

    The peanut (Arachis hypogaea) is an important crop species that is threatened by drought stress. The peanut genome sequence, which was officially released in 2016, may help explain the molecular mechanisms that underlie drought tolerance in this species. We report here gene expression profiling of A. hypogaea to gain a global view of its drought resistance. Using whole-transcriptome sequencing, we analysed differential gene expression in response to drought stress in the drought-resistant peanut cultivar J11. Pooled samples obtained at 6, 12, 18, 24, and 48 h were compared with control samples at 0 h. In total, 51,554 genes were found, including 49,289 known genes and 2265 unknown genes. We identified 224 differentially expressed transcription factors, 296,335 SNPs and 28,391 InDELs. In addition, we detected significant differences in the gene expression profiles of the treatment and control groups. After comparing the two groups, 4648 genes were identified. An in-depth analysis of the data revealed that a large number of genes were associated with drought stress, including transcription factors and genes involved in photosynthesis-antenna proteins, carbon metabolism and the citrate cycle. The results of this study provide insights into the diverse mechanisms that underlie the successful establishment of drought resistance in the peanut, thereby facilitating the identification of important genes in the peanut related to drought management. Transcriptome analysis based on RNA-Seq is a powerful approach for gene discovery and molecular marker development for this species.
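
    The abstract compares pooled drought-treated samples with a 0 h control but does not say how differential expression was scored. As a generic illustration only, the sketch below computes a log2 fold change with a pseudocount, a common first step in such comparisons. The function name, threshold, gene labels and counts are all hypothetical toy values, not data or methods from the study.

```python
import math

def log2_fold_change(treated, control, pseudocount=1.0):
    """Log2 ratio of treated to control expression.

    The pseudocount avoids division by zero for unexpressed genes.
    """
    return math.log2((treated + pseudocount) / (control + pseudocount))

# Toy expression values as (control at 0 h, pooled drought-treated).
toy_counts = {
    "gene_a": (63, 7),
    "gene_b": (31, 31),
    "gene_c": (15, 127),
}

THRESHOLD = 1.0  # assumed cutoff: at least a two-fold change

calls = {}
for gene, (control, treated) in toy_counts.items():
    lfc = log2_fold_change(treated, control)
    if lfc >= THRESHOLD:
        calls[gene] = "up"
    elif lfc <= -THRESHOLD:
        calls[gene] = "down"
    else:
        calls[gene] = "unchanged"
    print(f"{gene}: log2FC = {lfc:+.2f} ({calls[gene]})")
```

    A real analysis would also normalize for sequencing depth and test for statistical significance across replicates; a fold-change cutoff alone does not establish differential expression.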

  18. Resource Discovery in Activity-Based Sensor Networks

    DEFF Research Database (Denmark)

    Bucur, Doina; Bardram, Jakob

    This paper proposes a service discovery protocol for sensor networks that is specifically tailored for use in human-centered pervasive environments. It uses the high-level concept of computational activities (as logical bundles of data and resources) to give sensors in Activity-Based Sensor Networks...... (ABSNs) knowledge about their usage even at the network layer. ABSN redesigns classical network-level service discovery protocols to include and use this logical structuring of the network for a more practically applicable service discovery scheme. Noting that in practical settings activity-based sensor...

  19. Application of genefishing discovery system on differential gene ...

    African Journals Online (AJOL)

    GREGORY

    2010-08-30

    Aug 30, 2010 ... this discovery system for a prokaryotic system by modifying the eukaryotic protocol using the poly (A)- ... eukaryotic system mainly in humans, screening of ... RNA isolation. Total RNA extraction from the bacterial cells was performed at room temperature using RNeasy® Mini Kit (Qiagen). Initially, the cells.

  20. The Matchmaker Exchange: a platform for rare disease gene discovery

    NARCIS (Netherlands)

    Philippakis, A.A.; Azzariti, D.R.; Beltran, S.; Brookes, A.J.; Brownstein, C.A.; Brudno, M.; Brunner, H.G.; Buske, O.J.; Carey, K.; Doll, C.; Dumitriu, S.; Dyke, S.O.M.; Dunnen, J.T. den; Firth, H.V.; Gibbs, R.A.; Girdea, M.; Gonzalez, M.; Haendel, M.A.; Hamosh, A.; Holm, I.A.; Huang, L.; Hurles, M.E.; Hutton, B.; Krier, J.B.; Misyura, A.; Mungall, C.J.; Paschall, J.; Paten, B.; Robinson, P.N.; Schiettecatte, F.; Sobreira, N.L.; Swaminathan, G.J.; Taschner, P.E.M.; Terry, S.F.; Washington, N.L.; Zuchner, S.; Boycott, K.M.; Rehm, H.L.

    2015-01-01

    There are few better examples of the need for data sharing than in the rare disease community, where patients, physicians, and researchers must search for "the needle in a haystack" to uncover rare, novel causes of disease within the genome. Impeding the pace of discovery has been the existence of

  1. Gene discovery using next-generation pyrosequencing to develop ESTs for Phalaenopsis orchids

    Science.gov (United States)

    2011-01-01

    Background Orchids are one of the most diversified angiosperms, but few genomic resources are available for these non-model plants. In addition to the ecological significance, Phalaenopsis has been considered as an economically important floriculture industry worldwide. We aimed to use massively parallel 454 pyrosequencing for a global characterization of the Phalaenopsis transcriptome. Results To maximize sequence diversity, we pooled RNA from 10 samples of different tissues, various developmental stages, and biotic- or abiotic-stressed plants. We obtained 206,960 expressed sequence tags (ESTs) with an average read length of 228 bp. These reads were assembled into 8,233 contigs and 34,630 singletons. The unigenes were searched against the NCBI non-redundant (NR) protein database. Based on sequence similarity with known proteins, these analyses identified 22,234 different genes (E-value cutoff, e-7). Assembled sequences were annotated with Gene Ontology, Gene Family and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Among these annotations, over 780 unigenes encoding putative transcription factors were identified. Conclusion Pyrosequencing was effective in identifying a large set of unigenes from Phalaenopsis. The informative EST dataset we developed constitutes a much-needed resource for discovery of genes involved in various biological processes in Phalaenopsis and other orchid species. These transcribed sequences will narrow the gap between study of model organisms with many genomic resources and species that are important for ecological and evolutionary studies. PMID:21749684
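
    The assembly figures in this abstract can be combined with simple arithmetic. The sketch below assumes the usual convention that the unigene set is the union of contigs and singletons, which the abstract implies but does not state, and it treats the 228 bp average read length as exact, so the total is only approximate. Variable names are illustrative.

```python
# Figures quoted in the abstract.
est_reads = 206_960          # expressed sequence tags obtained
avg_read_length_bp = 228     # reported average read length (rounded)
contigs = 8_233
singletons = 34_630

# Assumed convention: unigenes = contigs + singletons.
unigenes = contigs + singletons

# Approximate sequence yield, since the average read length is rounded.
approx_total_bp = est_reads * avg_read_length_bp

print(f"unigenes (contigs + singletons): {unigenes:,}")
print(f"approximate sequence yield: {approx_total_bp / 1e6:.1f} Mb")
```

    Under these assumptions the run yields about 47 Mb of sequence, assembled into roughly 43 thousand unigenes.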

  2. Computational method for discovery of estrogen responsive genes

    DEFF Research Database (Denmark)

    Tang, Suisheng; Tan, Sin Lam; Ramadoss, Suresh Kumar

    2004-01-01

    of human genes are functionally well characterized. It is still unclear how many and which human genes respond to estrogen treatment. We propose a simple, economic, yet effective computational method to predict a subclass of estrogen responsive genes. Our method relies on the similarity of ERE frames...

  3. Discovery of Putative Herbicide Resistance Genes and Its Regulatory Network in Chickpea Using Transcriptome Sequencing

    Directory of Open Access Journals (Sweden)

    Mir A. Iquebal

    2017-06-01

    Background: Chickpea (Cicer arietinum L.) contributes 75% of total pulse production. Being cheaper than animal protein makes it important in the dietary requirements of developing countries. Weeds not only compete with chickpea, resulting in drastic yield reduction, but also create the problem of harboring fungi, bacterial diseases and insect pests. The chemical approach of new herbicide discovery is constrained by limited lead molecule options, statutory regulations and environmental clearance. Through the genetic approach, transgenic herbicide-tolerant crops have given successful results but have led to serious concern over ecological safety; thus a non-transgenic approach like marker-assisted selection is desirable. Since large variability in the tolerance limit of herbicide already exists in chickpea varieties, the genes offering herbicide tolerance can be introgressed in variety improvement programmes. Transcriptome studies can discover such key genes associated with herbicide tolerance in chickpea. Results: This is the first transcriptomic study of chickpea, or of any legume crop, using two herbicide-susceptible and tolerant genotypes exposed to imidazoline (Imazethapyr). Approximately 90 million paired-end reads generated from four samples were processed and assembled into 30,803 contigs using reference-based assembly. We report 6,310 differentially expressed genes (DEGs), of which 3,037 were regulated by 980 miRNAs, 1,528 transcription factors associated with 897 DEGs, 47 Hub proteins, 3,540 putative Simple Sequence Repeat-Functional Domain Marker (SSR-FDM), 13,778 genic Single Nucleotide Polymorphism (SNP) putative markers and 1,174 Indels. Twenty randomly selected DEGs were validated using qPCR. Pathway analysis suggested that the xenobiotic degradation related gene, glutathione S-transferase (GST), was only up-regulated in the presence of herbicide. Down-regulation of DNA replication genes and up-regulation of abscisic acid pathway genes were observed. Study further reveals

  4. Traditional Chinese Medicine-Based Network Pharmacology Could Lead to New Multicompound Drug Discovery

    Directory of Open Access Journals (Sweden)

    Jian Li

    2012-01-01

    Current strategies for drug discovery have reached a bottleneck where the paradigm is generally “one gene, one drug, one disease.” However, using holistic and systemic views, network pharmacology may be the next paradigm in drug discovery. Based on network pharmacology, a combinational drug with two or more compounds could offer beneficial synergistic effects for complex diseases. Interestingly, traditional Chinese medicine (TCM) has been practicing holistic views for over 3,000 years, and its distinguishing feature is using herbal formulas to treat diseases based on the unique pattern classification. Though TCM herbal formulas are acknowledged as a great source for drug discovery, no drug discovery strategies compatible with the multidimensional complexities of TCM herbal formulas have been developed. In this paper, we highlight some novel paradigms in TCM-based network pharmacology and new drug discovery. A multiple compound drug can be discovered by merging herbal formula-based pharmacological networks with TCM pattern-based disease molecular networks. Herbal formulas would be a source for multiple compound drug candidates, and the TCM pattern in the disease would be an indication for a new drug.

  5. SECURE SERVICE DISCOVERY BASED ON PROBE PACKET MECHANISM FOR MANETS

    Directory of Open Access Journals (Sweden)

    S. Pariselvam

    2015-03-01

    Full Text Available In MANETs, the service discovery process is always considered crucial, since such networks do not possess a centralized infrastructure for communication. Moreover, different services available through the network necessitate varying categories. Hence, a need arises for devising a secure probe-based service discovery mechanism to reduce the complexity in providing the services to the network users. In this paper, we propose a Secure Service Discovery Based on Probe Packet Mechanism (SSDPPM) for identifying the DoS attack in MANETs, which presents a new approach for estimating the level of trust present in each routing path of a mobile ad hoc network by using probe packets. Probing-based service discovery mechanisms mainly identify a mobile node’s genuineness using a test packet called a probe, which travels the entire network in order to compute the degree of trust maintained between the mobile nodes and its attributed impact on network performance. The performance of SSDPPM is investigated through a wide range of network-related parameters such as packet delivery, throughput, control overhead and total overhead using the ns-2.26 network simulator. On average, SSDPPM improves network performance by 23% and 19% in terms of packet delivery ratio and throughput, respectively, compared with the existing service discovery mechanisms available in the literature.

  6. SNP discovery in candidate adaptive genes using exon capture in a free-ranging alpine ungulate

    Science.gov (United States)

    Roffler, Gretchen H.; Amish, Stephen J.; Smith, Seth; Cosart, Ted F.; Kardos, Marty; Schwartz, Michael K.; Luikart, Gordon

    2016-01-01

    Identification of genes underlying genomic signatures of natural selection is key to understanding adaptation to local conditions. We used targeted resequencing to identify SNP markers in 5321 candidate adaptive genes associated with known immunological, metabolic and growth functions in ovids and other ungulates. We selectively targeted 8161 exons in protein-coding and nearby 5′ and 3′ untranslated regions of chosen candidate genes. Targeted sequences were taken from bighorn sheep (Ovis canadensis) exon capture data and directly from the domestic sheep genome (Ovis aries v. 3; oviAri3). The bighorn sheep sequences used in the Dall's sheep (Ovis dalli dalli) exon capture aligned to 2350 genes on the oviAri3 genome with an average of 2 exons each. We developed a microfluidic qPCR-based SNP chip to genotype 476 Dall's sheep from locations across their range and test for patterns of selection. Using multiple corroborating approaches (lositan and bayescan), we detected 28 SNP loci potentially under selection. We additionally identified candidate loci significantly associated with latitude, longitude, precipitation and temperature, suggesting local environmental adaptation. The three methods demonstrated consistent support for natural selection on nine genes with immune and disease-regulating functions (e.g. Ovar-DRA, APC, BATF2, MAGEB18), cell regulation signalling pathways (e.g. KRIT1, PI3K, ORRC3), and respiratory health (CYSLTR1). Characterizing adaptive allele distributions from novel genetic techniques will facilitate investigation of the influence of environmental variation on local adaptation of a northern alpine ungulate throughout its range. This research demonstrated the utility of exon capture for gene-targeted SNP discovery and subsequent SNP chip genotyping using low-quality samples in a nonmodel species.

  7. Riboswitches: discovery of drugs that target bacterial gene-regulatory RNAs

    Science.gov (United States)

    Deigan, Katherine E.; Ferré-D’Amaré, Adrian R.

    2011-01-01

    Conspectus Riboswitches, which were discovered in the first years of the XXI century, are gene-regulatory mRNA domains that respond to the intracellular concentration of a variety of metabolites and second messengers. They control essential genes in many pathogenic bacteria, and represent a new class of biomolecular target for the development of antibiotics and chemical-biological tools. Five mechanisms of gene regulation are known for riboswitches. Most bacterial riboswitches modulate transcription termination or translation initiation in response to ligand binding. All known examples of eukaryotic riboswitches and some bacterial riboswitches control gene expression by alternative splicing. The glmS riboswitch, widespread in Gram-positive bacteria, is a catalytic RNA activated by ligand binding. Its self-cleavage destabilizes the mRNA of which it is part. Finally, one example of trans-acting riboswitch is known. Three-dimensional (3D) structures have been determined of representatives of thirteen structurally distinct riboswitch classes, providing atomic-level insight into their mechanisms of ligand recognition. While cellular and viral RNAs in general have attracted interest as potential drug targets, riboswitches show special promise due to the diversity and sophistication of small molecule recognition strategies on display in their ligand binding pockets. Moreover, uniquely among known structured RNA domains, riboswitches evolved to recognize small molecule ligands. Structural and biochemical advances in the study of riboswitches provide an impetus for the development of methods for the discovery of novel riboswitch activators and inhibitors. Recent rational drug design efforts focused on select riboswitch classes have yielded a small number of candidate antibiotic compounds, including one active in a mouse model of Staphylococcus aureus infection. The development of high-throughput methods suitable for riboswitch-specific drug discovery is ongoing. A fragment-based

  8. Comparative Oncogenomics for Peripheral Nerve Sheath Cancer Gene Discovery

    Science.gov (United States)

    2015-06-01

    and growth factor receptors potentially upstream of some of these signaling cascades (the growth hormone receptor gene Ghr, Il17a, Inhbe) were...Loss of p16 (INK4A) expression is associated with allelic imbalance /loss of heterozygosity of chromosome 9p21 in microdissected malignant peripheral...cell receptor genes) Antigen recognition Ghr (growth hormone receptor) Growth hormone receptor Myc (myelocytomatosis oncogene) Nuclear phosphoprotein

  9. GENOME-ENABLED DISCOVERY OF CARBON SEQUESTRATION GENES IN POPLAR

    Energy Technology Data Exchange (ETDEWEB)

    DAVIS J M

    2007-10-11

    Plants utilize carbon by partitioning the reduced carbon obtained through photosynthesis into different compartments and into different chemistries within a cell and subsequently allocating such carbon to sink tissues throughout the plant. Since the phytohormones auxin and cytokinin are known to influence sink strength in tissues such as roots (Skoog & Miller 1957, Nordstrom et al. 2004), we hypothesized that altering the expression of genes that regulate auxin-mediated (e.g., AUX/IAA or ARF transcription factors) or cytokinin-mediated (e.g., RR transcription factors) control of root growth and development would impact carbon allocation and partitioning belowground (Fig. 1 - Renewal Proposal). Specifically, the ARF, AUX/IAA and RR transcription factor gene families mediate the effects of the growth regulators auxin and cytokinin on cell expansion, cell division and differentiation into root primordia. Invertases (IVR), whose transcript abundance is enhanced by both auxin and cytokinin, are critical components of carbon movement and therefore of carbon allocation. Thus, we initiated comparative genomic studies to identify the AUX/IAA, ARF, RR and IVR gene families in the Populus genome that could impact carbon allocation and partitioning. Bioinformatics searches using Arabidopsis gene sequences as queries identified regions with high degrees of sequence similarities in the Populus genome. These Populus sequences formed the basis of our transgenic experiments. Transgenic modification of gene expression involving members of these gene families was hypothesized to have profound effects on carbon allocation and partitioning.

  10. Using concepts in literature-based discovery : Simulating Swanson's Raynaud-fish oil and migraine-magnesium discoveries

    NARCIS (Netherlands)

    Weeber, M; Klein, Henny; de Jong-van den Berg, LTW; Vos, R

    Literature-based discovery has resulted in new knowledge. In the biomedical context, Don R. Swanson has generated several literature-based hypotheses that have been corroborated experimentally and clinically. In this paper, we propose a two-step model of the discovery process in which hypotheses are

  11. A comparative review of estimates of the proportion unchanged genes and the false discovery rate

    Directory of Open Access Journals (Sweden)

    Broberg Per

    2005-08-01

    Full Text Available Abstract Background In the analysis of microarray data one generally produces a vector of p-values that for each gene give the likelihood of obtaining equally strong evidence of change by pure chance. The distribution of these p-values is a mixture of two components corresponding to the changed genes and the unchanged ones. The focus of this article is how to estimate the proportion unchanged and the false discovery rate (FDR) and how to make inferences based on these concepts. Six published methods for estimating the proportion of unchanged genes are reviewed, two alternatives are presented, and all are tested on both simulated and real data. All estimates but one make do without any parametric assumptions concerning the distributions of the p-values. Furthermore, the estimation and use of the FDR and the closely related q-value is illustrated with examples. Five published estimates of the FDR and one new are presented and tested. Implementations in R code are available. Results A simulation model based on the distribution of real microarray data plus two real data sets were used to assess the methods. The proposed alternative methods for estimating the proportion unchanged fared very well, and gave evidence of low bias and very low variance. Different methods perform well depending upon whether there are few or many regulated genes. Furthermore, the methods for estimating FDR showed a varying performance, and were sometimes misleading. The new method had a very low error. Conclusion The concept of the q-value or false discovery rate is useful in practical research, despite some theoretical and practical shortcomings. However, it seems possible to challenge the performance of the published methods, and there is likely scope for further developing the estimates of the FDR. The new methods provide the scientist with more options to choose a suitable method for any particular experiment. The article advocates the use of the conjoint information
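
    The abstract above does not say which of the reviewed estimators is preferred, and the authors' own implementations are in R. As a rough illustration only, the Python sketch below uses one widely known threshold-based estimate of the proportion of unchanged genes (the share of p-values above a cutoff `lam`, assumed here to be 0.5, rescaled by `1 - lam`) together with Benjamini-Hochberg style q-values scaled by that proportion. The function names and the toy p-values are hypothetical and are not taken from the article.

```python
def estimate_pi0(pvalues, lam=0.5):
    """Threshold-based estimate of the proportion of unchanged genes.

    P-values of unchanged genes are roughly uniform, so the fraction of
    p-values above `lam`, divided by (1 - lam), approximates that proportion.
    """
    m = len(pvalues)
    above = sum(1 for p in pvalues if p > lam)
    return min(1.0, above / ((1.0 - lam) * m))


def q_values(pvalues, pi0=1.0):
    """Benjamini-Hochberg style q-values, optionally scaled by pi0."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is monotone in rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * pvalues[i] * m / rank)
        q[i] = running_min
    return q


if __name__ == "__main__":
    toy_pvalues = [0.001, 0.01, 0.02, 0.6, 0.9]  # hypothetical values
    pi0 = estimate_pi0(toy_pvalues)
    print("estimated proportion unchanged:", pi0)
    print("q-values:", q_values(toy_pvalues, pi0))
```

    With `pi0=1.0` the second function reduces to the ordinary Benjamini-Hochberg adjustment; passing an estimated `pi0` below 1 makes the q-values less conservative, which is the connection between the two quantities discussed in the abstract.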

  12. Gene discovery in the horned beetle Onthophagus taurus

    Directory of Open Access Journals (Sweden)

    Yang Youngik

    2010-12-01

    Full Text Available Abstract Background Horned beetles, in particular in the genus Onthophagus, are important models for studies on sexual selection, biological radiations, the origin of novel traits, developmental plasticity, biocontrol, conservation, and forensic biology. Despite their growing prominence as models for studying both basic and applied questions in biology, little genomic or transcriptomic data are available for this genus. We used massively parallel pyrosequencing (Roche 454-FLX platform) to produce a comprehensive EST dataset for the horned beetle Onthophagus taurus. To maximize sequence diversity, we pooled RNA extracted from a normalized library encompassing diverse developmental stages and both sexes. Results We used 454 pyrosequencing to sequence ESTs from all post-embryonic stages of O. taurus. Approximately 1.36 million reads assembled into 50,080 non-redundant sequences encompassing a total of 26.5 Mbp. The non-redundant sequences match over half of the genes in Tribolium castaneum, the most closely related species with a sequenced genome. Analyses of Gene Ontology annotations and biochemical pathways indicate that the O. taurus sequences reflect a wide and representative sampling of biological functions and biochemical processes. An analysis of sequence polymorphisms revealed that SNP frequency was negatively related to overall expression level and the number of tissue types in which a given gene is expressed. The most variable genes were enriched for a limited number of GO annotations whereas the least variable genes were enriched for a wide range of GO terms directly related to fitness. Conclusions This study provides the first large-scale EST database for horned beetles, a much-needed resource for advancing the study of these organisms. Furthermore, we identified instances of gene duplications and alternative splicing, useful for future study of gene regulation, and a large number of SNP markers that could be used in population

  13. Discovery of Cationic Polymers for Non-viral Gene Delivery using Combinatorial Approaches

    Science.gov (United States)

    Barua, Sutapa; Ramos, James; Potta, Thrimoorthy; Taylor, David; Huang, Huang-Chiao; Montanez, Gabriela; Rege, Kaushal

    2015-01-01

    Gene therapy is an attractive treatment option for diseases of genetic origin, including several cancers and cardiovascular diseases. While viruses are effective vectors for delivering exogenous genes to cells, concerns related to insertional mutagenesis, immunogenicity, lack of tropism, decay and high production costs necessitate the discovery of non-viral methods. Significant efforts have been focused on cationic polymers as non-viral alternatives for gene delivery. Recent studies have employed combinatorial syntheses and parallel screening methods for enhancing the efficacy of gene delivery, biocompatibility of the delivery vehicle, and overcoming cellular level barriers as they relate to polymer-mediated transgene uptake, transport, transcription, and expression. This review summarizes and discusses recent advances in combinatorial syntheses and parallel screening of cationic polymer libraries for the discovery of efficient and safe gene delivery systems. PMID:21843141

  14. Phylogeny based discovery of regulatory elements

    Directory of Open Access Journals (Sweden)

    Cohen Barak A

    2006-05-01

    Full Text Available Abstract Background Algorithms that locate evolutionarily conserved sequences have become powerful tools for finding functional DNA elements, including transcription factor binding sites; however, most methods do not take advantage of an explicit model for the constrained evolution of functional DNA sequences. Results We developed a probabilistic framework that combines an HKY85 model, which assigns probabilities to different base substitutions between species, and weight matrix models of transcription factor binding sites, which describe the probabilities of observing particular nucleotides at specific positions in the binding site. The method incorporates the phylogenies of the species under consideration and takes into account the position-specific variation of transcription factor binding sites. Using our framework we assessed the suitability of alignments of genomic sequences from commonly used species as substrates for comparative genomic approaches to regulatory motif finding. We then applied this technique to Saccharomyces cerevisiae and related species by examining all possible six base pair DNA sequences (hexamers) and identifying sequences that are conserved in a significant number of promoters. By combining similar conserved hexamers we reconstructed known cis-regulatory motifs and made predictions of previously unidentified motifs. We tested one prediction experimentally, finding it to be a regulatory element involved in the transcriptional response to glucose. Conclusion The experimental validation of a regulatory element prediction missed by other large-scale motif finding studies demonstrates that our approach is a useful addition to the current suite of tools for finding regulatory motifs.
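
    The hexamer scan described above can be pictured with a deliberately simplified Python sketch. It does not implement the HKY85-based probabilistic model of the paper: as a stand-in, a hexamer counts as conserved only when it is identical in every aligned species at the same alignment columns, and the script then tallies the number of promoters in which each hexamer is conserved. The function names and toy alignments are assumptions made for illustration.

```python
from collections import Counter


def conserved_kmers(alignment, k=6):
    """Return k-mers that are identical in all aligned sequences.

    `alignment` is a list of equal-length strings, one per species;
    windows containing gap characters ('-') are skipped.
    """
    reference = alignment[0]
    found = set()
    for start in range(len(reference) - k + 1):
        window = reference[start:start + k]
        if "-" in window:
            continue
        if all(seq[start:start + k] == window for seq in alignment[1:]):
            found.add(window)
    return found


def promoters_per_kmer(promoter_alignments, k=6):
    """Count, for each k-mer, the promoters in which it is conserved."""
    counts = Counter()
    for alignment in promoter_alignments:
        counts.update(conserved_kmers(alignment, k))
    return counts


if __name__ == "__main__":
    # Two hypothetical promoters, each aligned across two species.
    toy = [
        ["ACGTGCAAT", "ACGTGCTAT"],
        ["TTACGTGC", "GTACGTGC"],
    ]
    for kmer, n in promoters_per_kmer(toy).most_common():
        print(kmer, n)
```

    A significance test on these counts, and the substitution model that decides how much weight a match between closely related species deserves, are the parts the published framework adds and this sketch omits.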

  15. Discovery of group I introns in the nuclear small subunit ribosomal RNA genes of Acanthamoeba.

    Science.gov (United States)

    Gast, R J; Fuerst, P A; Byers, T J

    1994-01-01

    The discovery of group I introns in small subunit nuclear rDNA (nsrDNA) is becoming more common as the effort to generate phylogenies based upon nsrDNA sequences grows. In this paper we describe the discovery of the first two group I introns in the nsrDNA from the genus Acanthamoeba. The introns are in different locations in the genes, and have no significant primary sequence similarity to each other. They are identified as group I introns by the conserved P, Q, R and S sequences (1), and the ability to fit the sequences to a consensus secondary structure model for the group I introns (1, 2). Both introns are absent from the mature srRNA. A BLAST search (3) of nucleic acid sequences present in GenBank and EMBL revealed that the A. griffini intron was most similar to the nsrDNA group I intron of the green alga Dunaliella parva. A similar search found that the A. lenticulata intron was not similar to any of the other reported group I introns. PMID:8127708

  16. A combination of gene expression ranking and co-expression network analysis increases discovery rate in large-scale mutant screens for novel Arabidopsis thaliana abiotic stress genes.

    Science.gov (United States)

    Ransbotyn, Vanessa; Yeger-Lotem, Esti; Basha, Omer; Acuna, Tania; Verduyn, Christoph; Gordon, Michal; Chalifa-Caspi, Vered; Hannah, Matthew A; Barak, Simon

    2015-05-01

    As challenges to food security increase, the demand for lead genes for improving crop production is growing. However, genetic screens of plant mutants typically yield very low frequencies of desired phenotypes. Here, we present a powerful computational approach for selecting candidate genes for screening insertion mutants. We combined ranking of Arabidopsis thaliana regulatory genes according to their expression in response to multiple abiotic stresses (Multiple Stress [MST] score), with stress-responsive RNA co-expression network analysis to select candidate multiple stress regulatory (MSTR) genes. Screening of 62 T-DNA insertion mutants defective in candidate MSTR genes, for abiotic stress germination phenotypes yielded a remarkable hit rate of up to 62%; this gene discovery rate is 48-fold greater than that of other large-scale insertional mutant screens. Moreover, the MST score of these genes could be used to prioritize them for screening. To evaluate the contribution of the co-expression analysis, we screened 64 additional mutant lines of MST-scored genes that did not appear in the RNA co-expression network. The screening of these MST-scored genes yielded a gene discovery rate of 36%, which is much higher than that of classic mutant screens but not as high as when picking candidate genes from the co-expression network. The MSTR co-expression network that we created, AraSTressRegNet is publicly available at http://netbio.bgu.ac.il/arnet. This systems biology-based screening approach combining gene ranking and network analysis could be generally applicable to enhancing identification of genes regulating additional processes in plants and other organisms provided that suitable transcriptome data are available. © 2014 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.
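
    The selection strategy in this abstract (rank genes by a stress-response score, then keep those that also appear in the co-expression network) can be sketched as follows. This is a minimal illustration that assumes each gene already has an MST score and that the network is available as a plain set of gene identifiers; the gene names, scores and function name are hypothetical and do not come from the study.

```python
def select_candidates(mst_scores, network_genes, top_n):
    """Rank genes by MST score and keep the top ones found in the network.

    mst_scores: dict mapping gene identifier -> score (higher = stronger
    response across stresses); network_genes: set of identifiers present
    in the co-expression network.
    """
    ranked = sorted(mst_scores, key=mst_scores.get, reverse=True)
    return [gene for gene in ranked if gene in network_genes][:top_n]


if __name__ == "__main__":
    scores = {"geneA": 9.1, "geneB": 7.4, "geneC": 6.8, "geneD": 2.0}
    network = {"geneA", "geneC", "geneD"}
    print(select_candidates(scores, network, top_n=2))
```

    Dropping the network filter corresponds to the comparison screen of MST-scored genes outside the network, for which the abstract reports a lower discovery rate (36% versus up to 62%).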

  17. Gene Discovery and Functional Analyses in the Model Plant Arabidopsis

    DEFF Research Database (Denmark)

    Feng, Cai-ping; Mundy, J.

    2006-01-01

    The present mini-review describes newer methods and strategies, including transposon and T-DNA insertions, TILLING, Deleteagene, and RNA interference, to functionally analyze genes of interest in the model plant Arabidopsis. The relative advantages and disadvantages of the systems are also...

  18. Discovery of Novel Gene Elements Associated with Prostate Cancer Progression

    Science.gov (United States)

    2014-12-01

    buffer [Tris-buffered saline, 0.1% Tween (TBS-T), 5% nonfat dry milk] and incubated at 4 °C with the appropriate antibody. Following incubation, the...prostate carcinoma during hormonal therapy identifies androgen-responsive genes and mechanisms of therapy resistance. Am. J. Pathol. 164, 217–227...proteins. Proteins were transferred onto PVDF membrane and blocked for 90 min in blocking buffer (5% milk in a solution of 0.1% Tween-20 in Tris

  19. Improving functional modules discovery by enriching interaction networks with gene profiles

    KAUST Repository

    Salem, Saeed

    2013-05-01

    Recent advances in proteomic and transcriptomic technologies have resulted in the accumulation of vast amounts of high-throughput data that span multiple biological processes and characteristics in different organisms. Much of the data come in the form of interaction networks and mRNA expression arrays. An important task in systems biology is functional module discovery, where the goal is to uncover well-connected sub-networks (modules). These discovered modules help to unravel the underlying mechanisms of the observed biological processes. While most of the existing module discovery methods use only the interaction data, in this work we propose CLARM, which discovers biological modules by incorporating gene profile data with protein-protein interaction networks. We demonstrate the effectiveness of CLARM on Yeast and Human interaction datasets, and gene expression and molecular function profiles. Experiments on these real datasets show that the CLARM approach is competitive with well-established functional module discovery methods.

  20. Knowledge Discovery in Biological Databases for Revealing Candidate Genes Linked to Complex Phenotypes.

    Science.gov (United States)

    Hassani-Pak, Keywan; Rawlings, Christopher

    2017-06-13

    Genetics and "omics" studies designed to uncover genotype to phenotype relationships often identify large numbers of potential candidate genes, among which the causal genes are hidden. Scientists generally lack the time and technical expertise to review all relevant information available from the literature, from key model species and from a potentially wide range of related biological databases in a variety of data formats with variable quality and coverage. Computational tools are needed for the integration and evaluation of heterogeneous information in order to prioritise candidate genes and components of interaction networks that, if perturbed through potential interventions, have a positive impact on the biological outcome in the whole organism without producing negative side effects. Here we review several bioinformatics tools and databases that play an important role in biological knowledge discovery and candidate gene prioritization. We conclude with several key challenges that need to be addressed in order to facilitate biological knowledge discovery in the future.

  1. Discovery AP2/ERF family genes in silico in Medicago truncatula

    African Journals Online (AJOL)

    aghomotsegin

    Discovery AP2/ERF family genes in silico in. Medicago truncatula. Zhifei Zhang*, Qian Zhou, Zhijian Yang and Jingpeng Jiang. College of Agronomy, Hunan Agricultural University, Furong District, Changsha, Hunan Province 410128, P.R. China. Accepted 27 May, 2013. Medicago truncatula is a legume model plant due to ...

  2. Effectiveness of Discovery Learning-Based Transformation Geometry Module

    Science.gov (United States)

    Febriana, R.; Haryono, Y.; Yusri, R.

    2017-09-01

    A transformation geometry module was developed because students had difficulty understanding the existing book. The purpose of the research was to determine the effectiveness of a discovery learning-based transformation geometry module with respect to student activity. The development followed the Plomp model, consisting of a preliminary research phase, a prototyping phase and an assessment phase. The research focused on the assessment phase, in which the effectiveness of the designed product was observed. The instrument was an observation sheet. The observed activities were visual activities, oral activities, listening activities, mental activities, emotional activities and motor activities. Based on the results of the research, student activity, including visual activities, learning activities and writing activities, met the "very effective" criterion. It can be concluded that use of the discovery learning-based transformation geometry module can increase positive student activity and decrease negative activity.

  3. Rule extraction in gene-disease relationship discovery.

    Science.gov (United States)

    Hou, Wen-Juan; Chen, Hsiao-Yuan

    2013-04-10

    Biomedical data available to researchers and clinicians have increased dramatically over the past years because of the exponential growth of knowledge in medical biology. It is difficult for curators to go through all of the unstructured documents so as to curate the information to the database. Associating genes with diseases is important because it is a fundamental challenge in human health with applications to understanding disease properties and developing new techniques for prevention, diagnosis and therapy. Our study uses the automatic rule-learning approach to gene-disease relationship extraction. We first prepare the experimental corpus from MEDLINE and OMIM. A parser is applied to produce some grammatical information. We then learn all possible rules that discriminate relevant from irrelevant sentences. After that, we compute the scores of the learned rules in order to select rules of interest. As a result, a set of rules is generated. We produce the learned rules automatically from the 1000 positive and 1000 negative sentences. The test set includes 400 sentences composed of 200 positives and 200 negatives. Precision, recall and F-score served as our evaluation metrics. The results reveal that the maximal precision rate is 77.8% and the maximal recall rate is 63.5%. The maximal F-score is 66.9% where the precision rate is 70.6% and the recall rate is 63.5%. We employ the rule-learning approach to extract gene-disease relationships. Our main contributions are to build rules automatically and to support a more complete set of rules than a manually generated one. The experiments show exhilarating results and some improving efforts will be made in the future. Crown Copyright © 2012. Published by Elsevier B.V. All rights reserved.
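
    The evaluation figures quoted above are tied together by the usual definition of the F-score as the harmonic mean of precision and recall. The short Python check below, with a hypothetical function name, plugs in the reported precision (70.6%) and recall (63.5%) to show that they are consistent with the reported maximal F-score.

```python
def f_score(precision, recall):
    """Balanced F-score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall as reported in the abstract.
    print(round(f_score(0.706, 0.635), 3))
```

    Because the harmonic mean is pulled toward the smaller of the two values, the configuration with the highest precision (77.8%) need not be the one with the highest F-score.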

  4. Profile-based short linear protein motif discovery

    Directory of Open Access Journals (Sweden)

    Haslam Niall J

    2012-05-01

    Full Text Available Abstract Background Short linear protein motifs are attracting increasing attention as functionally independent sites, typically 3–10 amino acids in length that are enriched in disordered regions of proteins. Multiple methods have recently been proposed to discover over-represented motifs within a set of proteins based on simple regular expressions. Here, we extend these approaches to profile-based methods, which provide a richer motif representation. Results The profile motif discovery method MEME performed relatively poorly for motifs in disordered regions of proteins. However, when we applied evolutionary weighting to account for redundancy amongst homologous proteins, and masked out poorly conserved regions of disordered proteins, the performance of MEME is equivalent to that of regular expression methods. However, the two approaches returned different subsets within both a benchmark dataset, and a more realistic discovery dataset. Conclusions Profile-based motif discovery methods complement regular expression based methods. Whilst profile-based methods are computationally more intensive, they are likely to discover motifs currently overlooked by regular expression methods.
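
    To make the contrast between the two motif representations concrete, the Python sketch below scores the same toy sequences in both ways: a regular expression either matches a sequence or does not, while a profile assigns a graded log-odds score to every window. It is an illustration only, not MEME or any of the published regular-expression tools; the motif `P.SP`, the profile probabilities, the uniform background of 0.05 and the pseudo-probability of 0.01 are all assumed values.

```python
import math
import re


def regex_support(pattern, sequences):
    """Number of sequences containing at least one match to the pattern."""
    compiled = re.compile(pattern)
    return sum(1 for seq in sequences if compiled.search(seq))


def best_profile_score(profile, sequence, background=0.05, floor=0.01):
    """Best log-odds score of a profile over all windows of a sequence.

    `profile` is a list of dicts, one per motif position, mapping amino
    acid -> probability; unlisted residues get the small `floor` value.
    """
    width = len(profile)
    best = float("-inf")
    for start in range(len(sequence) - width + 1):
        score = 0.0
        for offset, column in enumerate(profile):
            prob = column.get(sequence[start + offset], floor)
            score += math.log2(prob / background)
        best = max(best, score)
    return best


if __name__ == "__main__":
    proteins = ["MKPLSPAA", "GGPASPKK", "MKKKKK"]  # hypothetical sequences
    profile = [{"P": 0.9}, {"L": 0.4, "A": 0.4}, {"S": 0.8, "T": 0.1}, {"P": 0.9}]
    print("regex support:", regex_support(r"P.SP", proteins))
    for protein in proteins:
        print(protein, round(best_profile_score(profile, protein), 2))
```

    The profile can rank a near-miss such as a threonine in the third position above an unrelated window, whereas the regular expression simply rejects it, which is one way the two approaches can return different subsets of motifs.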

  5. Comparing gene discovery from Affymetrix GeneChip microarrays and Clontech PCR-select cDNA subtraction: a case study

    Science.gov (United States)

    Cao, Wuxiong; Epstein, Charles; Liu, Hong; DeLoughery, Craig; Ge, Nanxiang; Lin, Jieyi; Diao, Rong; Cao, Hui; Long, Fan; Zhang, Xin; Chen, Yangde; Wright, Paul S; Busch, Steve; Wenck, Michelle; Wong, Karen; Saltzman, Alan G; Tang, Zhihua; Liu, Li; Zilberstein, Asher

    2004-01-01

    Background Several high throughput technologies have been employed to identify differentially regulated genes that may be molecular targets for drug discovery. Here we compared the sets of differentially regulated genes discovered using two experimental approaches: a subtracted suppressive hybridization (SSH) cDNA library methodology and Affymetrix GeneChip® technology. In this "case study" we explored the transcriptional pattern changes during the in vitro differentiation of human monocytes to myeloid dendritic cells (DC), and evaluated the potential for novel gene discovery using the SSH methodology. Results The same RNA samples isolated from peripheral blood monocyte precursors and immature DC (iDC) were used for GeneChip microarray probing and SSH cDNA library construction. 10,000 clones from each of the two-way SSH libraries (iDC-monocytes and monocytes-iDC) were picked for sequencing. About 2000 transcripts were identified for each library from 8000 successful sequences. Only 70% to 75% of these transcripts were represented on the U95 series GeneChip microarrays, implying that 25% to 30% of these transcripts might not have been identified in a study based only on GeneChip microarrays. In addition, about 10% of these transcripts appeared to be "novel", although these have not yet been closely examined. Among the transcripts that are also represented on the chips, about a third were concordantly discovered as differentially regulated between iDC and monocytes by GeneChip microarray transcript profiling. The remaining two thirds were either not inferred as differentially regulated from GeneChip microarray data, or were called differentially regulated but in the opposite direction. This underscores the importance both of generating reciprocal pairs of SSH libraries, and of real-time RT-PCR confirmation of the results. Conclusions This study suggests that SSH could be used as an alternative and complementary transcript profiling tool to GeneChip microarrays

  6. From crystal to compound: structure-based antimalarial drug discovery.

    Science.gov (United States)

    Drinkwater, Nyssa; McGowan, Sheena

    2014-08-01

    Despite a century of control and eradication campaigns, malaria remains one of the world's most devastating diseases. Our once-powerful therapeutic weapons are losing the war against the Plasmodium parasite, whose ability to rapidly develop and spread drug resistance hampers past and present malaria-control efforts. Finding new and effective treatments for malaria is now a top global health priority, fuelling an increase in funding and promoting open-source collaborations between researchers and pharmaceutical consortia around the world. The result of this is rapid advances in drug discovery approaches and technologies, with three major methods for antimalarial drug development emerging: (i) chemistry-based, (ii) target-based, and (iii) cell-based. Common to all three of these approaches is the unique ability of structural biology to inform and accelerate drug development. Where possible, SBDD (structure-based drug discovery) is a foundation for antimalarial drug development programmes, and has been invaluable to the development of a number of current pre-clinical and clinical candidates. However, as we expand our understanding of the malarial life cycle and mechanisms of resistance development, SBDD as a field must continue to evolve in order to develop compounds that adhere to the ideal characteristics for novel antimalarial therapeutics and to avoid high attrition rates pre- and post-clinic. In the present review, we aim to examine the contribution that SBDD has made to current antimalarial drug development efforts, covering hit discovery to lead optimization and prevention of parasite resistance. Finally, the potential for structural biology, particularly high-throughput structural genomics programmes, to identify future targets for drug discovery is discussed.

  7. Biomarker discovery in mass spectrometry-based urinary proteomics.

    Science.gov (United States)

    Thomas, Samuel; Hao, Ling; Ricke, William A; Li, Lingjun

    2016-04-01

    Urinary proteomics has become one of the most attractive topics in disease biomarker discovery. MS-based proteomic analysis has advanced continuously and emerged as a prominent tool in the field of clinical bioanalysis. However, only a few protein biomarkers have made their way to validation and clinical practice. Biomarker discovery is challenged by many clinical and analytical factors including, but not limited to, the complexity of urine and the wide dynamic range of endogenous proteins in the sample. This article highlights promising technologies and strategies in the MS-based biomarker discovery process, including study design, sample preparation, protein quantification, instrumental platforms, and bioinformatics. Different proteomics approaches are discussed, and progress in maximizing urinary proteome coverage and standardization is emphasized in this review. MS-based urinary proteomics has great potential in the development of noninvasive diagnostic assays in the future, which will require collaborative efforts between analytical scientists, systems biologists, and clinicians. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  8. Context-driven discovery of gene cassettes in mobile integrons using a computational grammar.

    Science.gov (United States)

    Tsafnat, Guy; Coiera, Enrico; Partridge, Sally R; Schaeffer, Jaron; Iredell, Jon R

    2009-09-08

    Gene discovery algorithms typically examine sequence data for low level patterns. A novel method to computationally discover higher order DNA structures is presented, using a context sensitive grammar. The algorithm was applied to the discovery of gene cassettes associated with integrons. The discovery and annotation of antibiotic resistance genes in such cassettes is essential for effective monitoring of antibiotic resistance patterns and formulation of public health antibiotic prescription policies. We discovered two new putative gene cassettes using the method, from 276 integron features and 978 GenBank sequences. The system achieved kappa = 0.972 annotation agreement with an expert gold standard of 300 sequences. In rediscovery experiments, we deleted 789,196 cassette instances over 2030 experiments and correctly relabelled 85.6% (alpha >= 95%, E <= 1%, mean sensitivity = 0.86, specificity = 1, F-score = 0.93), with no false positives. Error analysis demonstrated that for 72,338 missed deletions, two adjacent deleted cassettes were labeled as a single cassette, increasing performance to 94.8% (mean sensitivity = 0.92, specificity = 1, F-score = 0.96). Using grammars we were able to represent heuristic background knowledge about large and complex structures in DNA. Importantly, we were also able to use the context embedded in the model to discover new putative antibiotic resistance gene cassettes. The method is complementary to existing automatic annotation systems which operate at the sequence level.

  9. Context-driven discovery of gene cassettes in mobile integrons using a computational grammar

    Directory of Open Access Journals (Sweden)

    Schaeffer Jaron

    2009-09-01

    Full Text Available Abstract Background Gene discovery algorithms typically examine sequence data for low level patterns. A novel method to computationally discover higher order DNA structures is presented, using a context sensitive grammar. The algorithm was applied to the discovery of gene cassettes associated with integrons. The discovery and annotation of antibiotic resistance genes in such cassettes is essential for effective monitoring of antibiotic resistance patterns and formulation of public health antibiotic prescription policies. Results We discovered two new putative gene cassettes using the method, from 276 integron features and 978 GenBank sequences. The system achieved κ = 0.972 annotation agreement with an expert gold standard of 300 sequences. In rediscovery experiments, we deleted 789,196 cassette instances over 2030 experiments and correctly relabelled 85.6% (α ≥ 95%, E ≤ 1%, mean sensitivity = 0.86, specificity = 1, F-score = 0.93), with no false positives. Error analysis demonstrated that for 72,338 missed deletions, two adjacent deleted cassettes were labeled as a single cassette, increasing performance to 94.8% (mean sensitivity = 0.92, specificity = 1, F-score = 0.96). Conclusion Using grammars we were able to represent heuristic background knowledge about large and complex structures in DNA. Importantly, we were also able to use the context embedded in the model to discover new putative antibiotic resistance gene cassettes. The method is complementary to existing automatic annotation systems which operate at the sequence level.
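
    As a rough consistency check on the figures quoted in this abstract, the sketch below recomputes an F-score from the reported mean sensitivities. It assumes the F-score is the usual F1 (harmonic mean of precision and sensitivity) and that "no false positives" implies a precision of 1.0; neither assumption is stated in the record, and the function name is illustrative.

```python
def f_score(sensitivity: float, precision: float) -> float:
    """F1: harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)


# Assumption: no false positives, so precision is taken as 1.0.
PRECISION = 1.0

before = f_score(0.86, PRECISION)  # initial rediscovery run
after = f_score(0.92, PRECISION)   # after merging adjacent deleted cassettes

print(f"before: {before:.3f}, after: {after:.3f}")
```

    Under these assumptions the values come out at about 0.925 and 0.958, close to the reported 0.93 and 0.96; the small gap in the first figure would be consistent with the sensitivity having been rounded before publication.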

  10. Marfan Syndrome and Related Disorders: 25 Years of Gene Discovery.

    Science.gov (United States)

    Verstraeten, Aline; Alaerts, Maaike; Van Laer, Lut; Loeys, Bart

    2016-06-01

    Marfan syndrome (MFS) is a rare, autosomal-dominant, multisystem disorder, presenting with skeletal, ocular, skin, and cardiovascular symptoms. Significant clinical overlap with other systemic connective tissue diseases, including Loeys-Dietz syndrome (LDS), Shprintzen-Goldberg syndrome (SGS), and the MASS phenotype, has been documented. In MFS and LDS, the cardiovascular manifestations account for the major cause of patient morbidity and mortality, rendering them the main target for therapeutic intervention. Over the past decades, gene identification studies confidently linked the aforementioned syndromes, as well as nonsyndromic aneurysmal disease, to genetic defects in proteins related to the transforming growth factor (TGF)-β pathway, greatly expanding our knowledge on the disease mechanisms and providing us with novel therapeutic targets. As a result, the focus of the developing pharmacological treatment strategies is shifting from hemodynamic stress management to TGF-β antagonism. In this review, we discuss the insights that have been gained in the molecular biology of MFS and related disorders over the past 25 years. © 2016 WILEY PERIODICALS, INC.

  11. Cross-pollination of research findings, although uncommon, may accelerate discovery of human disease genes

    Directory of Open Access Journals (Sweden)

    Duda Marlena

    2012-11-01

    Full Text Available Abstract Background Technological leaps in genome sequencing have resulted in a surge in discovery of human disease genes. These discoveries have led to increased clarity on the molecular pathology of disease and have also demonstrated considerable overlap in the genetic roots of human diseases. In light of this large genetic overlap, we tested whether cross-disease research approaches lead to faster, more impactful discoveries. Methods We leveraged several gene-disease association databases to calculate a Mutual Citation Score (MCS) for 10,853 pairs of genetically related diseases to measure the frequency of cross-citation between research fields. To assess the importance of cooperative research, we computed an Individual Disease Cooperation Score (ICS) and the average publication rate for each disease. Results For all disease pairs with one gene in common, we found that the degree of genetic overlap was a poor predictor of cooperation (r² = 0.3198) and that the vast majority of disease pairs (89.56%) never cited previous discoveries of the same gene in a different disease, irrespective of the level of genetic similarity between the diseases. A fraction (0.25%) of the pairs demonstrated cross-citation in greater than 5% of their published genetic discoveries and 0.037% cross-referenced discoveries more than 10% of the time. We found strong positive correlations between ICS and publication rate (r² = 0.7931), and an even stronger correlation between the publication rate and the number of cross-referenced diseases (r² = 0.8585). These results suggested that cross-disease research may have the potential to yield novel discoveries at a faster pace than singular disease research. Conclusions Our findings suggest that the frequency of cross-disease study is low despite the high level of genetic similarity among many human diseases, and that collaborative methods may accelerate and increase the impact of new genetic discoveries. Until we have a better

  12. Biomarker discovery for colon cancer using a 761 gene RT-PCR assay

    Directory of Open Access Journals (Sweden)

    Hackett James R

    2007-08-01

    Full Text Available Abstract Background Reverse transcription PCR (RT-PCR) is widely recognized to be the gold standard method for quantifying gene expression. Studies using RT-PCR technology as a discovery tool have historically been limited to relatively small gene sets compared to other gene expression platforms such as microarrays. We have recently shown that TaqMan® RT-PCR can be scaled up to profile expression for 192 genes in fixed paraffin-embedded (FPE) clinical study tumor specimens. This technology has also been used to develop and commercialize a widely used clinical test for breast cancer prognosis and prediction, the Oncotype DX™ assay. A similar need exists in colon cancer for a test that provides information on the likelihood of disease recurrence in colon cancer (prognosis) and the likelihood of tumor response to standard chemotherapy regimens (prediction). We have now scaled our RT-PCR assay to efficiently screen 761 biomarkers across hundreds of patient samples and applied this process to biomarker discovery in colon cancer. This screening strategy remains attractive due to the inherent advantages of maintaining platform consistency from discovery through clinical application. Results RNA was extracted from formalin-fixed paraffin-embedded (FPE) tissue, as old as 28 years, from 354 patients enrolled in NSABP C-01 and C-02 colon cancer studies. Multiplexed reverse transcription reactions were performed using a gene specific primer pool containing 761 unique primers. PCR was performed as independent TaqMan® reactions for each candidate gene. Hierarchical clustering demonstrates that genes expected to co-express form obvious, distinct and in certain cases very tightly correlated clusters, validating the reliability of this technical approach to biomarker discovery. Conclusion We have developed a high throughput, quantitatively precise multi-analyte gene expression platform for biomarker discovery that approaches low density DNA arrays in numbers of

  13. Wide-Area Publish/Subscribe Mobile Resource Discovery Based on IPv6 GeoNetworking

    OpenAIRE

    Noguchi, Satoru; Matsuura, Satoshi; Inomata, Atsuo; Fujikawa, Kazutoshi; Sunahara, Hideki

    2013-01-01

    Resource discovery is an essential function for distributed mobile applications integrated in vehicular communication systems. Key requirements of the mobile resource discovery are wide-area geographic-based discovery and scalable resource discovery not only inside a vehicular ad-hoc network but also through the Internet. While a number of resource discovery solutions have been proposed, most of them have focused on specific scale of network. Furthermore, managing a large number of mobile res...

  14. Graphene based gene transfection

    Science.gov (United States)

    Feng, Liangzhu; Zhang, Shuai; Liu, Zhuang

    2011-03-01

    Graphene as a star in materials research has been attracting tremendous attention in the past few years in various fields including biomedicine. In this work, for the first time we successfully use graphene as a non-toxic nano-vehicle for efficient gene transfection. Graphene oxide (GO) is bound with cationic polymers, polyethyleneimine (PEI) with two different molecular weights at 1.2 kDa and 10 kDa, forming GO-PEI-1.2k and GO-PEI-10k complexes, respectively, both of which are stable in physiological solutions. Cellular toxicity tests reveal that our GO-PEI-10k complex exhibits significantly reduced toxicity to the treated cells compared to the bare PEI-10k polymer. The positively charged GO-PEI complexes are able to further bind with plasmid DNA (pDNA) for intracellular transfection of the enhanced green fluorescence protein (EGFP) gene in HeLa cells. While EGFP transfection with PEI-1.2k appears to be ineffective, high EGFP expression is observed using the corresponding GO-PEI-1.2k as the transfection agent. On the other hand, GO-PEI-10k shows similar EGFP transfection efficiency but lower toxicity compared with PEI-10k. Our results suggest graphene to be a novel gene delivery nano-vector with low cytotoxicity and high transfection efficiency, promising for future applications in non-viral based gene therapy.

  15. Data Mining and Knowledge Discovery via Logic-Based Methods

    CERN Document Server

    Triantaphyllou, Evangelos

    2010-01-01

    There are many approaches to data mining and knowledge discovery (DM&KD), including neural networks, closest neighbor methods, and various statistical methods. This monograph, however, focuses on the development and use of a novel approach, based on mathematical logic, that the author and his research associates have worked on over the last 20 years. The methods presented in the book deal with key DM&KD issues in an intuitive manner and in a natural sequence. Compared to other DM&KD methods, those based on mathematical logic offer a direct and often intuitive approach for extracting easily int

  16. Functional Gene Discovery and Characterization of Genes and Alleles Affecting Wood Biomass Yield and Quality in Populus

    Energy Technology Data Exchange (ETDEWEB)

    Busov, Victor [Michigan Technological Univ., Houghton, MI (United States)

    2017-02-12

    Adoption of biofuels as an economically and environmentally viable alternative to fossil fuels would require development of specialized bioenergy varieties. A major goal in the breeding of such varieties is the improvement of lignocellulosic biomass yield and quality. These are complex traits and understanding the underpinning molecular mechanism can assist and accelerate their improvement. This is particularly important for tree bioenergy crops like poplars (species and hybrids from the genus Populus), for which breeding progress is extremely slow due to long generation cycles. A variety of approaches have already been undertaken to better understand the molecular bases of biomass yield and quality in poplar. An obvious void in these undertakings has been the application of mutagenesis. Mutagenesis has been instrumental in the discovery and characterization of many plant traits, including those that affect biomass yield and quality. In this proposal we use activation tagging to discover genes that can significantly affect biomass associated traits directly in poplar, a premier bioenergy crop. We screened a population of 5,000 independent poplar activation tagging lines under greenhouse conditions for a battery of biomass yield traits. These same plants were then analyzed for changes in wood chemistry using pyMBMS. As a result of these screens we have identified nearly 800 mutants, which are significantly (P<0.05) different when compared to wild type. Of these, the majority (~700) are affected in one of ten different biomass yield traits and 100 in biomass quality traits (e.g., lignin, S/G ratio and C6/C5 sugars). We successfully recovered the position of the tag in approximately 130 lines, showed activation in nearly half of them and performed recapitulation experiments with 20 genes prioritized by the significance of the phenotype. Recapitulation experiments are still ongoing for many of the genes but the results are encouraging. For example, we have shown successful

  17. ACFIS: a web server for fragment-based drug discovery

    Science.gov (United States)

    Hao, Ge-Fei; Jiang, Wen; Ye, Yuan-Nong; Wu, Feng-Xu; Zhu, Xiao-Lei; Guo, Feng-Biao; Yang, Guang-Fu

    2016-01-01

    In order to foster innovation and improve the effectiveness of drug discovery, there is a considerable interest in exploring unknown ‘chemical space’ to identify new bioactive compounds with novel and diverse scaffolds. Hence, fragment-based drug discovery (FBDD) was developed rapidly due to its advanced expansive search for ‘chemical space’, which can lead to a higher hit rate and ligand efficiency (LE). However, computational screening of fragments is always hampered by the promiscuous binding model. In this study, we developed a new web server Auto Core Fragment in silico Screening (ACFIS). It includes three computational modules, PARA_GEN, CORE_GEN and CAND_GEN. ACFIS can generate core fragment structure from the active molecule using fragment deconstruction analysis and perform in silico screening by growing fragments to the junction of core fragment structure. An integrated energy calculation rapidly identifies which fragments fit the binding site of a protein. We constructed a simple interface to enable users to view top-ranking molecules in 2D and the binding mode in 3D for further experimental exploration. This makes the ACFIS a highly valuable tool for drug discovery. The ACFIS web server is free and open to all users at http://chemyang.ccnu.edu.cn/ccb/server/ACFIS/. PMID:27150808

  18. Discovery of nutritional biomarkers: future directions based on omics technologies.

    Science.gov (United States)

    Odriozola, Leticia; Corrales, Fernado J

    2015-07-01

    Understanding the interactions between food and human biology is of utmost importance to facilitate the development of more efficient nutritional interventions that might improve our wellness status and future health outcomes by reducing risk factors for non-transmittable chronic diseases, such as cardiovascular diseases, cancer, obesity and metabolic syndrome. Dissection of the molecular mechanisms that mediate the physiological effects of diets and bioactive compounds is one of the main goals of current nutritional investigation and the food industry, as it might lead to the discovery of novel biomarkers. It is widely recognized that the availability of robust nutritional biomarkers represents a bottleneck that delays the innovation process of the food industry. In this regard, omics sciences have opened up new avenues of research and opportunities in nutrition. Advances in mass spectrometry, nuclear magnetic resonance, next generation sequencing and microarray technologies allow massive genome, gene expression, proteomic and metabolomic profiling, obtaining a global and in-depth analysis of physiological/pathological scenarios. For this reason, omics platforms are most suitable for the discovery and characterization of novel nutritional markers that will define the nutritional status of both individuals and populations in the near future, and to identify the nutritional bioactive compounds responsible for the health outcomes.

  19. Metabologenomics: Correlation of Microbial Gene Clusters with Metabolites Drives Discovery of a Nonribosomal Peptide with an Unusual Amino Acid Monomer.

    Science.gov (United States)

    Goering, Anthony W; McClure, Ryan A; Doroghazi, James R; Albright, Jessica C; Haverland, Nicole A; Zhang, Yongbo; Ju, Kou-San; Thomson, Regan J; Metcalf, William W; Kelleher, Neil L

    2016-02-24

    For more than half a century the pharmaceutical industry has sifted through natural products produced by microbes, uncovering new scaffolds and fashioning them into a broad range of vital drugs. We sought a strategy to reinvigorate the discovery of natural products with distinctive structures using bacterial genome sequencing combined with metabolomics. By correlating genetic content from 178 actinomycete genomes with mass spectrometry-enabled analyses of their exported metabolomes, we paired new secondary metabolites with their biosynthetic gene clusters. We report the use of this new approach to isolate and characterize tambromycin, a new chlorinated natural product, composed of several nonstandard amino acid monomeric units, including a unique pyrrolidine-containing amino acid we name tambroline. Tambromycin shows antiproliferative activity against cancerous human B- and T-cell lines. The discovery of tambromycin via large-scale correlation of gene clusters with metabolites (a.k.a. metabologenomics) illuminates a path for structure-based discovery of natural products at a sharply increased rate.
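
    The core idea of pairing gene clusters with metabolites by how consistently they co-occur across strains can be sketched as below. This is only an illustration: the Jaccard index is a stand-in chosen here, not the scoring the authors report, and the strain, cluster and metabolite names are invented.

```python
from itertools import product


def cooccurrence_scores(clusters_by_strain, metabolites_by_strain):
    """Score every (gene cluster, metabolite) pair by the Jaccard index of
    the strain sets in which each is observed. Illustrative stand-in only."""
    strains = clusters_by_strain.keys() & metabolites_by_strain.keys()
    all_clusters = set().union(*(clusters_by_strain[s] for s in strains))
    all_metabolites = set().union(*(metabolites_by_strain[s] for s in strains))

    scores = {}
    for cluster, metabolite in product(all_clusters, all_metabolites):
        has_cluster = {s for s in strains if cluster in clusters_by_strain[s]}
        has_metabolite = {s for s in strains if metabolite in metabolites_by_strain[s]}
        # Both sets are non-empty here, so the union is never empty.
        scores[(cluster, metabolite)] = (
            len(has_cluster & has_metabolite) / len(has_cluster | has_metabolite)
        )
    return scores


# Hypothetical presence/absence data for four strains.
clusters = {"s1": {"GCF_A", "GCF_B"}, "s2": {"GCF_A"}, "s3": {"GCF_B"}, "s4": set()}
metabolites = {"s1": {"m_x"}, "s2": {"m_x"}, "s3": {"m_y"}, "s4": {"m_y"}}

scores = cooccurrence_scores(clusters, metabolites)
best_pair = max(scores, key=scores.get)
print(best_pair, scores[best_pair])
```

    In this toy data the pair ("GCF_A", "m_x") scores highest because the cluster and the metabolite appear in exactly the same strains; a real analysis would need a statistic that accounts for how common each feature is across the genome collection.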

  20. Current NMR Techniques for Structure-Based Drug Discovery.

    Science.gov (United States)

    Sugiki, Toshihiko; Furuita, Kyoko; Fujiwara, Toshimichi; Kojima, Chojiro

    2018-01-12

    A variety of nuclear magnetic resonance (NMR) applications have been developed for structure-based drug discovery (SBDD). NMR provides many advantages over other methods, such as the ability to directly observe chemical compounds and target biomolecules, and to be used for ligand-based and protein-based approaches. NMR can also provide important information about the interactions in a protein-ligand complex, such as structure, dynamics, and affinity, even when the interaction is too weak to be detected by ELISA or fluorescence resonance energy transfer (FRET)-based high-throughput screening (HTS) or to be crystallized. In this study, we reviewed current NMR techniques. We focused on recent progress in NMR measurement and sample preparation techniques that have expanded the potential of NMR-based SBDD, such as fluorine NMR (19F-NMR) screening, structure modeling of weak complexes, and site-specific isotope labeling of challenging targets.

  1. The golden era of ocular disease gene discovery: Race to the finish

    Science.gov (United States)

    Swaroop, A; Sieving, PA

    2014-01-01

    Within the last decade, technological advances have led to amazing genetic insights into Mendelian and multifactorial ocular diseases. We provide a perspective of the progress in gene discovery and discuss the implications. We believe that the time has come to redefine the goals and begin utilizing the genetic knowledge for clinical management and treatment design. Unbelievable opportunities now exist for those nimble enough to seize them. PMID:23713688

  2. Abiotic Stress Tolerance: From Gene Discovery in Model Organisms to Crop Improvement

    OpenAIRE

    Bressan, Ray; Bohnert, Hans; Zhu, Jian-Kang

    2009-01-01

    Productive and sustainable agriculture necessitates growing plants in sub-optimal environments with less input of precious resources such as fresh water. For a better understanding and rapid improvement of abiotic stress tolerance, it is important to link physiological and biochemical work to molecular studies in genetically tractable model organisms. With the use of several technologies for the discovery of stress tolerance genes and their appropriate alleles, transgenic approaches to improv...

  3. A Metadata based Knowledge Discovery Methodology for Seeding Translational Research.

    Science.gov (United States)

    Kothari, Cartik R; Payne, Philip R O

    2015-01-01

    In this paper, we present a semantic, metadata based knowledge discovery methodology for identifying teams of researchers from diverse backgrounds who can collaborate on interdisciplinary research projects: projects in areas that have been identified as high-impact areas at The Ohio State University. This methodology involves the semantic annotation of keywords and the postulation of semantic metrics to improve the efficiency of the path exploration algorithm as well as to rank the results. Results indicate that our methodology can discover groups of experts from diverse areas who can collaborate on translational research projects.

  4. Toward Discovery Support Systems: A Replication, Re-examination, and Extension of Swanson's Work on Literature-Based Discovery of a Connection between Raynaud's and Fish Oil.

    Science.gov (United States)

    Gordon, Michael D.; Lindsay, Robert K.

    1996-01-01

    Describes the development of computer-based searching methods to support literature-based discoveries in medical literature through a replication of the discovery of a connection between Raynaud's disease and dietary fish oil. Topics include the logic of literature-based discovery, information retrieval methods for text analysis, and statistics…

  5. Exome sequencing for gene discovery in lethal fetal disorders--harnessing the value of extreme phenotypes.

    Science.gov (United States)

    Filges, Isabel; Friedman, Jan M

    2015-10-01

    Massively parallel sequencing has revolutionized our understanding of Mendelian disorders, and many novel genes have been discovered to cause disease phenotypes when mutant. At the same time, next-generation sequencing approaches have enabled non-invasive prenatal testing of free fetal DNA in maternal blood. However, little attention has been paid to using whole exome and genome sequencing strategies for gene identification in fetal disorders that are lethal in utero, because they can appear to be sporadic and Mendelian inheritance may be missed. We present challenges and advantages of applying next-generation sequencing approaches to gene discovery in fetal malformation phenotypes and review recent successful discovery approaches. We discuss the implication and significance of recessive inheritance and cross-species phenotyping in fetal lethal conditions. Whole exome sequencing can be used in individual families with undiagnosed lethal congenital anomaly syndromes to discover causal mutations, provided that prior to data analysis, the fetal phenotype can be correlated to a particular developmental pathway in embryogenesis. Cross-species phenotyping allows providing further evidence for causality of discovered variants in genes involved in those extremely rare phenotypes and will increase our knowledge about normal and abnormal human developmental processes. Ultimately, families will benefit from the option of early prenatal diagnosis. © 2014 John Wiley & Sons, Ltd.

  6. Discovery based and targeted Mass Spectrometry in farm animal proteomics

    DEFF Research Database (Denmark)

    Bendixen, Emøke

    2013-01-01

    Technological advances in mass spectrometry have greatly improved accuracy and speed of analyses of proteins and biochemical pathways. These proteome technologies have transformed research and diagnostic methods in the biomedical fields, and in food and farm animal sciences proteomics can be used ... be monitored to improve welfare in large industrial settings of current livestock industry. The combination of discovery based LC-MS/MS methods and the more hypothesis-based targeted mass spectrometry method commonly referred to as selected reaction monitoring or SRM, provide a powerful approach ... for investigating farm animal biology. SRM is particularly important for validation biomarker candidates. This talk will introduce the use of different mass spectrometry approaches through examples related to food quality and animal welfare, including studies of gut health in pigs, host pathogen interactions ...

  7. Systematic discovery of unannotated genes in 11 yeast species using a database of orthologous genomic segments

    LENUS (Irish Health Repository)

    OhEigeartaigh, Sean S

    2011-07-26

    Abstract Background In standard BLAST searches, no information other than the sequences of the query and the database entries is considered. However, in situations where two genes from different species have only borderline similarity in a BLAST search, the discovery that the genes are located within a region of conserved gene order (synteny) can provide additional evidence that they are orthologs. Thus, for interpreting borderline search results, it would be useful to know whether the syntenic context of a database hit is similar to that of the query. This principle has often been used in investigations of particular genes or genomic regions, but to our knowledge it has never been implemented systematically. Results We made use of the synteny information contained in the Yeast Gene Order Browser database for 11 yeast species to carry out a systematic search for protein-coding genes that were overlooked in the original annotations of one or more yeast genomes but which are syntenic with their orthologs. Such genes tend to have been overlooked because they are short, highly divergent, or contain introns. The key features of our software - called SearchDOGS - are that the database entries are classified into sets of genomic segments that are already known to be orthologous, and that very weak BLAST hits are retained for further analysis if their genomic location is similar to that of the query. Using SearchDOGS we identified 595 additional protein-coding genes among the 11 yeast species, including two new genes in Saccharomyces cerevisiae. We found additional genes for the mating pheromone a-factor in six species including Kluyveromyces lactis. Conclusions SearchDOGS has proven highly successful for identifying overlooked genes in the yeast genomes. We anticipate that our approach can be adapted for study of further groups of species, such as bacterial genomes. More generally, the concept of doing sequence similarity searches against databases to which external

  8. Strategies for exome and genome sequence data analysis in disease-gene discovery projects.

    Science.gov (United States)

    Robinson, Peter N; Krawitz, P; Mundlos, S

    2011-08-01

    In whole-exome sequencing (WES), target capture methods are used to enrich the sequences of the coding regions of genes from fragmented total genomic DNA, followed by massively parallel, 'next-generation' sequencing of the captured fragments. Since its introduction in 2009, WES has been successfully used in several disease-gene discovery projects, but the analysis of whole-exome sequence data can be challenging. In this overview, we present a summary of the main computational strategies that have been applied to identify novel disease genes in whole-exome data, including intersect filters, the search for de novo mutations, and the application of linkage mapping or inference of identity-by-descent (IBD) in family studies. © 2011 John Wiley & Sons A/S.
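
    Of the strategies listed in this abstract, the intersect filter is the simplest to illustrate: keep only variants shared by every affected individual, optionally discarding any also seen in unaffected controls. The sketch below is a minimal version under that reading; the variant tuples, sample contents and function name are hypothetical, and real pipelines typically also filter on allele frequency and predicted effect, or intersect at the gene level.

```python
def intersect_filter(affected, controls=()):
    """Return variants present in all affected samples and in no control.

    Each sample is an iterable of hashable variant keys, here
    (chromosome, position, ref, alt) tuples chosen for illustration.
    """
    sample_sets = [set(sample) for sample in affected]
    if not sample_sets:
        return set()
    shared = set.intersection(*sample_sets)
    for control in controls:
        shared -= set(control)
    return shared


# Hypothetical call sets for two affected relatives and one unaffected parent.
patient_1 = [("chr1", 100, "A", "T"), ("chr2", 200, "G", "A"), ("chr3", 300, "C", "G")]
patient_2 = [("chr1", 100, "A", "T"), ("chr2", 200, "G", "A")]
parent = [("chr1", 100, "A", "T")]

candidates = intersect_filter([patient_1, patient_2], controls=[parent])
print(candidates)
```

    Here only the chr2 variant survives: it is shared by both patients and absent from the unaffected parent.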

  9. Comparing gene discovery from Affymetrix GeneChip microarrays and Clontech PCR-select cDNA subtraction: a case study

    Directory of Open Access Journals (Sweden)

    Wright Paul S

    2004-04-01

    Full Text Available Abstract Background Several high throughput technologies have been employed to identify differentially regulated genes that may be molecular targets for drug discovery. Here we compared the sets of differentially regulated genes discovered using two experimental approaches: a subtracted suppressive hybridization (SSH) cDNA library methodology and Affymetrix GeneChip® technology. In this "case study" we explored the transcriptional pattern changes during the in vitro differentiation of human monocytes to myeloid dendritic cells (DC), and evaluated the potential for novel gene discovery using the SSH methodology. Results The same RNA samples isolated from peripheral blood monocyte precursors and immature DC (iDC) were used for GeneChip microarray probing and SSH cDNA library construction. 10,000 clones from each of the two-way SSH libraries (iDC-monocytes and monocytes-iDC) were picked for sequencing. About 2000 transcripts were identified for each library from 8000 successful sequences. Only 70% to 75% of these transcripts were represented on the U95 series GeneChip microarrays, implying that 25% to 30% of these transcripts might not have been identified in a study based only on GeneChip microarrays. In addition, about 10% of these transcripts appeared to be "novel", although these have not yet been closely examined. Among the transcripts that are also represented on the chips, about a third were concordantly discovered as differentially regulated between iDC and monocytes by GeneChip microarray transcript profiling. The remaining two thirds were either not inferred as differentially regulated from GeneChip microarray data, or were called differentially regulated but in the opposite direction. This underscores the importance both of generating reciprocal pairs of SSH libraries, and of real-time RT-PCR confirmation of the results. Conclusions This study suggests that SSH could be used as an alternative and complementary transcript profiling tool to

  10. Influence networks based on coexpression improve drug target discovery for the development of novel cancer therapeutics

    Science.gov (United States)

    2014-01-01

    Background The demand for novel molecularly targeted drugs will continue to rise as we move forward toward the goal of personalizing cancer treatment to the molecular signature of individual tumors. However, the identification of targets and combinations of targets that can be safely and effectively modulated is one of the greatest challenges facing the drug discovery process. A promising approach is to use biological networks to prioritize targets based on their relative positions to one another, a property that affects their ability to maintain network integrity and propagate information-flow. Here, we introduce influence networks and demonstrate how they can be used to generate influence scores as a network-based metric to rank genes as potential drug targets. Results We use this approach to prioritize genes as drug target candidates in a set of ER + breast tumor samples collected during the course of neoadjuvant treatment with the aromatase inhibitor letrozole. We show that influential genes, those with high influence scores, tend to be essential and include a higher proportion of essential genes than those prioritized based on their position (i.e. hubs or bottlenecks) within the same network. Additionally, we show that influential genes represent novel biologically relevant drug targets for the treatment of ER + breast cancers. Moreover, we demonstrate that gene influence differs between untreated tumors and residual tumors that have adapted to drug treatment. In this way, influence scores capture the context-dependent functions of genes and present the opportunity to design combination treatment strategies that take advantage of the tumor adaptation process. Conclusions Influence networks efficiently find essential genes as promising drug targets and combinations of targets to inform the development of molecularly targeted drugs and their use. PMID:24495353

  11. Emerging principles in protease-based drug discovery.

    Science.gov (United States)

    Drag, Marcin; Salvesen, Guy S

    2010-09-01

    Proteases have an important role in many signalling pathways, and represent potential drug targets for diseases ranging from cardiovascular disorders to cancer, as well as for combating many parasites and viruses. Although inhibitors of well-established protease targets such as angiotensin-converting enzyme and HIV protease have shown substantial therapeutic success, developing drugs for new protease targets has proved challenging in recent years. This in part could be due to issues such as the difficulty of achieving selectivity when targeting protease active sites. This Perspective discusses the general principles in protease-based drug discovery, highlighting the lessons learned and the emerging strategies, such as targeting allosteric sites, which could help harness the therapeutic potential of new protease targets.

  12. Application of nano-LC-based glycomics towards biomarker discovery.

    Science.gov (United States)

    Hua, Serenus; Lebrilla, Carlito; An, Hyun Joo

    2011-11-01

    The glycome, that is, the glycan components of a biological source, has been widely reported to change with disease states. However, mining the glycome for biomarkers is complicated by glycan structural heterogeneity. Nanoflow LC, or nano-LC, significantly addresses the problem by providing a highly sensitive and quantitative method of separating and profiling glycans. This review summarizes recent advances in analytical technology and methodology that enhance and augment the advantages offered by nano-LC. (e.g., reversed phase, hydrophilic interaction and porous graphitized carbon chromatography, as well as associated derivatization strategies), detectors (e.g., fluorescence and MS), and technology platforms (particularly chip-based nano-LC) are examined in detail, along with their application to biomarker discovery. Particular emphasis is placed on methods and technologies that allow structure-specific glycan profiling.

  13. Discovery of dominant and dormant genes from expression data using a novel generalization of SNR for multi-class problems

    Directory of Open Access Journals (Sweden)

    Chung I-Fang

    2008-10-01

    Full Text Available Abstract Background The Signal-to-Noise-Ratio (SNR) is often used for identification of biomarkers for two-class problems and no formal and useful generalization of SNR is available for multiclass problems. We propose innovative generalizations of SNR for multiclass cancer discrimination through introduction of two indices, Gene Dominant Index and Gene Dormant Index (GDIs). These two indices lead to the concepts of dominant and dormant genes with biological significance. We use these indices to develop methodologies for discovery of dominant and dormant biomarkers with interesting biological significance. The dominancy and dormancy of the identified biomarkers and their excellent discriminating power are also demonstrated pictorially using the scatterplot of individual gene and 2-D Sammon's projection of the selected set of genes. Using information from the literature we have shown that the GDI based method can identify dominant and dormant genes that play significant roles in cancer biology. These biomarkers are also used to design diagnostic prediction systems. Results and discussion To evaluate the effectiveness of the GDIs, we have used four multiclass cancer data sets (Small Round Blue Cell Tumors, Leukemia, Central Nervous System Tumors, and Lung Cancer). For each data set we demonstrate that the new indices can find biologically meaningful genes that can act as biomarkers. We then use six machine learning tools, Nearest Neighbor Classifier (NNC), Nearest Mean Classifier (NMC), Support Vector Machine (SVM) classifier with linear kernel, and SVM classifier with Gaussian kernel, where both SVMs are used in conjunction with one-vs-all (OVA) and one-vs-one (OVO) strategies. We found GDIs to be very effective in identifying biomarkers with strong class specific signatures. With all six tools and for all data sets we could achieve better or comparable prediction accuracies usually with fewer marker genes than results reported in the literature using the

  14. KBERG: KnowledgeBase for Estrogen Responsive Genes

    DEFF Research Database (Denmark)

    Tang, Suisheng; Zhang, Zhuo; Tan, Sin Lam

    2007-01-01

    Estrogen has a profound impact on human physiology affecting transcription of numerous genes. To decipher functional characteristics of estrogen responsive genes, we developed KnowledgeBase for Estrogen Responsive Genes (KBERG). Genes in KBERG were derived from Estrogen Responsive Gene Database...... (ERGDB) and were analyzed from multiple aspects. We explored the possible transcription regulation mechanism by capturing highly conserved promoter motifs across orthologous genes, using promoter regions that cover the range of [-1200, +500] relative to the transcription start sites. The motif detection...... is based on ab initio discovery of common cis-elements from the orthologous gene cluster from human, mouse and rat, thus reflecting a degree of promoter sequence preservation during evolution. The identified motifs are linked to transcription factor binding sites based on the TRANSFAC database. In addition...

  15. Evaluation of gene association methods for coexpression network construction and biological knowledge discovery.

    Directory of Open Access Journals (Sweden)

    Sapna Kumari

    Full Text Available BACKGROUND: Constructing coexpression networks and performing network analysis using large-scale gene expression data sets is an effective way to uncover new biological knowledge; however, the methods used for gene association in constructing these coexpression networks have not been thoroughly evaluated. Since different methods lead to structurally different coexpression networks and provide different information, selecting the optimal gene association method is critical. METHODS AND RESULTS: In this study, we compared eight gene association methods - Spearman rank correlation, Weighted Rank Correlation, Kendall, Hoeffding's D measure, Theil-Sen, Rank Theil-Sen, Distance Covariance, and Pearson - and focused on their true knowledge discovery rates in associating pathway genes and constructing coordination networks of regulatory genes. We also examined the behaviors of different methods on microarray data with different properties, and whether the biological processes affect the efficiency of different methods. CONCLUSIONS: We found that the Spearman, Hoeffding and Kendall methods are effective in identifying coexpressed pathway genes, whereas the Theil-Sen, Rank Theil-Sen, Spearman, and Weighted Rank methods perform well in identifying coordinated transcription factors that control the same biological processes and traits. Surprisingly, the widely used Pearson method is generally less efficient, and so is the Distance Covariance method that can find gene pairs of multiple relationships. Some of our analyses clearly show that the Pearson and Distance Covariance methods behave distinctly from the other six methods. The efficiencies of different methods vary with the data properties to some degree and are largely contingent upon the biological processes, which necessitates a pre-analysis to identify the best performing method for gene association and coexpression network construction.
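
    As an illustration of why rank-based association measures can behave differently from Pearson correlation, the following minimal sketch computes both coefficients in plain Python. It is not the authors' code; the function names and the toy expression values are hypothetical, and only two of the eight compared methods are shown.

```python
def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical expression profiles: gene_b rises monotonically but
# nonlinearly with gene_a.
gene_a = [1, 2, 3, 4, 5, 6]
gene_b = [v ** 3 for v in gene_a]
print("Pearson :", round(pearson(gene_a, gene_b), 3))
print("Spearman:", round(spearman(gene_a, gene_b), 3))
```

    On this toy input the rank-based coefficient treats the relationship as perfectly monotonic, while the Pearson coefficient is lower because the relationship is not linear.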

  16. An improved procedure for gene selection from microarray experiments using false discovery rate criterion

    Directory of Open Access Journals (Sweden)

    Yang Mark CK

    2006-01-01

    Full Text Available Abstract Background A large number of genes usually show differential expressions in a microarray experiment with two types of tissues, and the p-values of a proper statistical test are often used to quantify the significance of these differences. The genes with small p-values are then picked as the genes responsible for the differences in the tissue RNA expressions. One key question is what should be the threshold to consider the p-values small. There is always a trade-off between this threshold and the rate of false claims. Recent statistical literature shows that the false discovery rate (FDR) criterion is a powerful and reasonable criterion to pick those genes with differential expression. Moreover, the power of detection can be increased by knowing the number of non-differential expression genes. While this number is unknown in practice, there are methods to estimate it from data. The purpose of this paper is to present a new method of estimating this number and use it for the FDR procedure construction. Results A combination of test functions is used to estimate the number of differentially expressed genes. Simulation study shows that the proposed method has a higher power to detect these genes than other existing methods, while still keeping the FDR under control. The improvement can be substantial if the proportion of true differentially expressed genes is large. This procedure has also been tested with good results using a real dataset. Conclusion For a given expected FDR, the method proposed in this paper has better power to pick genes that show differentiation in their expression than two other well known methods.
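
    The role of the number of non-differentially expressed genes in an FDR procedure can be sketched as follows. This is a minimal illustration that assumes the standard Benjamini-Hochberg step-up rule; it does not reproduce the estimator proposed in the paper. The optional argument `m0` (a hypothetical name) stands in for any estimate of the number of true null genes, and the p-values are made up.

```python
def benjamini_hochberg(pvalues, alpha=0.05, m0=None):
    """Return sorted indices of hypotheses rejected by the BH step-up rule.

    m0 is an optional estimate of the number of true null hypotheses.
    When given, it replaces the total count m in the threshold
    rank * alpha / m, which can only make the procedure less conservative.
    """
    m = len(pvalues)
    if m == 0:
        return []
    n_null = m if m0 is None else max(1, min(m, m0))
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    # Step-up: keep the largest rank whose p-value is under its threshold.
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / n_null:
            cutoff = rank
    return sorted(order[:cutoff])


# Made-up p-values for ten genes.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(pvals))          # plain BH with m = 10
print(benjamini_hochberg(pvals, m0=4))    # assuming only 4 true nulls
```

    With a smaller assumed number of null genes the thresholds grow, so more genes are selected at the same nominal FDR, which is the power gain the abstract refers to.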

  17. Cancer Biomarker Discovery: Lectin-Based Strategies Targeting Glycoproteins

    Directory of Open Access Journals (Sweden)

    David Clark

    2012-01-01

    Full Text Available Biomarker discovery can identify molecular markers in various cancers that can be used for detection, screening, diagnosis, and monitoring of disease progression. Lectin-affinity is a technique that can be used for the enrichment of glycoproteins from a complex sample, facilitating the discovery of novel cancer biomarkers associated with a disease state.

  18. Context-aware, ontology-based, service discovery

    NARCIS (Netherlands)

    Broens, T.H.F.; Pokraev, S.; van Sinderen, Marten J.; Koolwaaij, Johan; Dockhorn Costa, P.; Markopoulos, Panos; Eggen, Berry; Aarts, Emile; Crowley, James L.

    2004-01-01

    Service discovery is a process of locating, or discovering, one or more documents, that describe a particular service. Most of the current service discovery approaches perform syntactic matching, that is, they retrieve services descriptions that contain particular keywords from the user’s query.

  19. TargetMine, an integrated data warehouse for candidate gene prioritisation and target discovery.

    Directory of Open Access Journals (Sweden)

    Yi-An Chen

    Full Text Available Prioritising candidate genes for further experimental characterisation is a non-trivial challenge in drug discovery and biomedical research in general. An integrated approach that combines results from multiple data types is best suited for optimal target selection. We developed TargetMine, a data warehouse for efficient target prioritisation. TargetMine utilises the InterMine framework, with new data models such as protein-DNA interactions integrated in a novel way. It enables complicated searches that are difficult to perform with existing tools and it also offers integration of custom annotations and in-house experimental data. We proposed an objective protocol for target prioritisation using TargetMine and set up a benchmarking procedure to evaluate its performance. The results show that the protocol can identify known disease-associated genes with high precision and coverage. A demonstration version of TargetMine is available at http://targetmine.nibio.go.jp/.

  20. Evolutionary signatures amongst disease genes permit novel methods for gene prioritization and construction of informative gene-based networks.

    Directory of Open Access Journals (Sweden)

    Nolan Priedigkeit

    2015-02-01

    Full Text Available Genes involved in the same function tend to have similar evolutionary histories, in that their rates of evolution covary over time. This coevolutionary signature, termed Evolutionary Rate Covariation (ERC), is calculated using only gene sequences from a set of closely related species and has demonstrated potential as a computational tool for inferring functional relationships between genes. To further define applications of ERC, we first established that roughly 55% of genetic diseases possess an ERC signature between their contributing genes. At a false discovery rate of 5% we report 40 such diseases including cancers, developmental disorders and mitochondrial diseases. Given these coevolutionary signatures between disease genes, we then assessed ERC's ability to prioritize known disease genes out of a list of unrelated candidates. We found that in the presence of an ERC signature, the true disease gene is effectively prioritized to the top 6% of candidates on average. We then apply this strategy to a melanoma-associated region on chromosome 1 and identify MCL1 as a potential causative gene. Furthermore, to gain global insight into disease mechanisms, we used ERC to predict molecular connections between 310 nominally distinct diseases. The resulting "disease map" network associates several diseases with related pathogenic mechanisms and unveils many novel relationships between clinically distinct diseases, such as between Hirschsprung's disease and melanoma. Taken together, these results demonstrate the utility of molecular evolution as a gene discovery platform and show that evolutionary signatures can be used to build informative gene-based networks.

  1. Leveraging gene-environment interactions and endotypes for asthma gene discovery

    DEFF Research Database (Denmark)

    Bønnelykke, Klaus; Ober, Carole

    2016-01-01

    Asthma is a heterogeneous clinical syndrome that includes subtypes of disease with different underlying causes and disease mechanisms. Asthma is caused by a complex interaction between genes and environmental exposures; early-life exposures in particular play an important role. Asthma is also...... heritable, and a number of susceptibility variants have been discovered in genome-wide association studies, although the known risk alleles explain only a small proportion of the heritability. In this review, we present evidence supporting the hypothesis that focusing on more specific asthma phenotypes......, such as childhood asthma with severe exacerbations, and on relevant exposures that are involved in gene-environment interactions (GEIs), such as rhinovirus infections, will improve detection of asthma genes and our understanding of the underlying mechanisms. We will discuss the challenges of considering GEIs...

  2. Proxy-Based IPv6 Neighbor Discovery Scheme for Wireless LAN Based Mesh Networks

    Science.gov (United States)

    Lee, Jihoon; Jeon, Seungwoo; Kim, Jaehoon

    Multi-hop Wireless LAN-based mesh network (WMN) provides high capacity and self-configuring capabilities. Due to data forwarding and path selection based on MAC address, WMN requires additional operations to achieve global connectivity using IPv6 address. The neighbor discovery operation over WLAN mesh networks requires repeated all-node broadcasting and this gives rise to a big burden in the entire mesh networks. In this letter, we propose the proxy neighbor discovery scheme for optimized IPv6 communication over WMN to reduce network overhead and communication latency. Using simulation experiments, we show that the control overhead and communication setup latency can be significantly reduced using the proxy-based neighbor discovery mechanism.

  3. Catecholamine receptors: prototypes for GPCR-based drug discovery.

    Science.gov (United States)

    Emery, Andrew C

    2013-01-01

    Drugs acting at G protein-coupled receptors (GPCRs) constitute ~40% of those in current clinical use. GPCR-based drug discovery remains at the forefront of drug development, especially for new treatments for psychiatric illness and neurological disease. Here, the basic framework of GPCR signaling learned through the elucidation of catecholamine receptor signaling through G proteins and β-arrestins, and X-ray crystallographic structure determination is reviewed. In silico docking studies developed in tandem with confirmatory empirical data gathering from binding and signaling experiments have allowed this basic framework to be expanded to drug hunting through predictive in silico searching as well as high-throughput and high-content screening approaches. For efforts moving forward for the deployment of new GPCR-acting drugs, collaborative efforts between industry and government/academic research in target validation at the molecular and cellular levels have become progressively more common. Polypharmacological approaches have become increasingly available for learning more about the mechanisms of GPCR-targeted drugs, based on interaction not with a single, but with a wide range of GPCR targets. These approaches are likely to aid in drug repurposing efforts, yield valuable insight on the side effects of currently employed drugs, and allow for a clearer picture of the actual targets of "atypical" drugs used in a variety of therapeutic contexts. © 2013 Elsevier Inc. All rights reserved.

  4. Bootstrapping of gene-expression data improves and controls the false discovery rate of differentially expressed genes

    Directory of Open Access Journals (Sweden)

    Goddard Mike E

    2004-03-01

    Full Text Available Abstract The ordinary-, penalized-, and bootstrap t-test, least squares and best linear unbiased prediction were compared for their false discovery rates (FDR), i.e. the fraction of falsely discovered genes, which was empirically estimated in a duplicate of the data set. The bootstrap-t-test yielded up to 80% lower FDRs than the alternative statistics, and its FDR was always as good as or better than any of the alternatives. Generally, the predicted FDR from the bootstrapped P-values agreed well with their empirical estimates, except when the number of mRNA samples is smaller than 16. In a cancer data set, the bootstrap-t-test discovered 200 differentially regulated genes at a FDR of 2.6%, and in a knock-out gene expression experiment 10 genes were discovered at a FDR of 3.2%. It is argued that, in the case of microarray data, control of the FDR takes sufficient account of the multiple testing, whilst being less stringent than Bonferroni-type multiple testing corrections. Extensions of the bootstrap simulations to more complicated test-statistics are discussed.
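
    For readers unfamiliar with the idea, a generic two-sample bootstrap t-test for a single gene can be sketched as below. This is an assumption-laden illustration, not the procedure evaluated in the paper: it uses a Welch t statistic, imposes the null hypothesis by centring each group on zero, and the function names, the number of resamples and the expression values are all hypothetical.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch t statistic for two samples, guarding against zero variance."""
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    diff = statistics.mean(a) - statistics.mean(b)
    if se == 0.0:
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def bootstrap_t_pvalue(a, b, n_boot=2000, seed=0):
    """Two-sided bootstrap p-value for a difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    t_obs = abs(welch_t(a, b))
    # Impose the null hypothesis: both centred groups have mean zero.
    a0 = [x - statistics.mean(a) for x in a]
    b0 = [x - statistics.mean(b) for x in b]
    hits = 0
    for _ in range(n_boot):
        ra = rng.choices(a0, k=len(a0))  # resample with replacement
        rb = rng.choices(b0, k=len(b0))
        if abs(welch_t(ra, rb)) >= t_obs:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_boot + 1)


# Made-up expression values for one gene in two conditions.
treated = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8]
control = [1.0, 1.2, 0.9, 1.1, 1.3, 0.8]
print(bootstrap_t_pvalue(treated, control))
```

    Per-gene p-values obtained this way would then be passed to an FDR procedure; the abstract does not state which resampling scheme was used, so the centring step here is only one common choice.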

  5. Targeted SNP discovery in Atlantic salmon (Salmo salar) genes using a 3'UTR-primed SNP detection approach

    Directory of Open Access Journals (Sweden)

    Høyheim Bjørn

    2010-12-01

    Full Text Available Abstract Background Single nucleotide polymorphisms (SNPs) represent the most widespread type of DNA variation in vertebrates and may be used as genetic markers for a range of applications. This has led to an increased interest in identification of SNP markers in non-model species and farmed animals. The in silico SNP mining method used for discovery of most known SNPs in Atlantic salmon (Salmo salar) has applied a global (genome-wide) approach. In this study we present a targeted 3'UTR-primed SNP discovery strategy that utilizes sequence data from Salmo salar full length sequenced cDNAs (FLIcs). We compare the efficiency of this new strategy to the in silico SNP mining method when using both methods for targeted SNP discovery. Results The SNP discovery efficiency of the two methods was tested in a set of FLIc target genes. The 3'UTR-primed SNP discovery method detected novel SNPs in 35% of the target genes while the in silico SNP mining method detected novel SNPs in 15% of the target genes. Furthermore, the 3'UTR-primed SNP discovery strategy was the less labor intensive one and revealed a higher success rate than the in silico SNP mining method in the initial amplification step. When testing the methods we discovered 112 novel bi-allelic polymorphisms (type I markers) in 88 salmon genes [dbSNP: ss179319972-179320081, ss250608647-250608648], and three of the SNPs discovered were missense substitutions. Conclusions Full length insert cDNAs (FLIcs) are important genomic resources that have been developed in many farmed animals. The 3'UTR-primed SNP discovery strategy successfully utilized FLIc data to detect novel SNPs in the partially tetraploid Atlantic salmon. This strategy may therefore be useful for targeted SNP discovery in several species, and particularly useful in species that, like salmonids, have duplicated genomes.

  6. Gene analysis techniques and susceptibility gene discovery in non-BRCA1/BRCA2 familial breast cancer.

    Science.gov (United States)

    Aloraifi, Fatima; Boland, Michael R; Green, Andrew J; Geraghty, James G

    2015-06-01

    Breast cancer is the leading cause of cancer deaths in females worldwide, occurring in both hereditary and sporadic forms. Women with inherited pathogenic mutations in the BRCA1 or BRCA2 genes have up to an 85% risk of developing breast cancer in their lifetimes. These patients are candidates for risk-reduction measures such as intensive radiological screening, prophylactic surgery or chemoprevention. However, only about 20% of familial breast cancer cases are attributed to mutations in BRCA1 and BRCA2, while a further 5-10% are attributed to mutations in other rare susceptibility genes such as TP53, STK11, PTEN, ATM and CHEK2. A multitude of genome wide association studies (GWAS) have been conducted confirming low-risk common variants associated with breast cancer in excess of 90 loci, which may contribute to a further 23% of the heritability. We currently find ourselves in "the next generation", with technologies offering deep sequencing at a fraction of the cost. Starting off primarily in a research setting, multi-gene panel testing is now utilized in the clinic to sequence multiple predisposing genes simultaneously (otherwise known as multi-gene panel testing). In this review, we focus on the hereditary breast cancer discoveries, techniques and the challenges we face in this complex disease, especially in the light of the vast amount of data we now have at hand. It has been 20 years since the first breast cancer susceptibility gene was discovered and there has been substantial progress in unraveling the genetic component of the disease. However, hereditary breast cancer remains a challenging topic subject to common debate. Copyright © 2015 Elsevier Ltd. All rights reserved.

  7. A Relational Database for the Discovery of Genes Encoding Amino Acid Biosynthetic Enzymes in Pathogenic Fungi

    Directory of Open Access Journals (Sweden)

    Nicholas J. Talbot

    2006-04-01

    Full Text Available Fungal phytopathogens continue to cause major economic impact, either directly, through crop losses, or due to the costs of fungicide application. Attempts to understand these organisms are hampered by a lack of fungal genome sequence data. A need exists, however, to develop specific bioinformatics tools to collate and analyse the sequence data that currently is available. A web-accessible gene discovery database (http://cogeme.ex.ac.uk/biosynthesis.html) was developed as a demonstration tool for the analysis of metabolic and signal transduction pathways in pathogenic fungi using incomplete gene inventories. Using Bayesian probability to analyse the currently available gene information from pathogenic fungi, we provide evidence that the obligate pathogen Blumeria graminis possesses all amino acid biosynthetic pathways found in free-living fungi, such as Saccharomyces cerevisiae. Phylogenetic analysis was also used to deduce a gene history of succinate-semialdehyde dehydrogenase, an enzyme in the glutamate and lysine biosynthesis pathways. The database provides a tool and methodology to researchers to direct experimentation towards predicting pathway conservation in pathogenic microorganisms.

  8. A hybrid computational method for the discovery of novel reproduction-related genes.

    Science.gov (United States)

    Chen, Lei; Chu, Chen; Kong, Xiangyin; Huang, Guohua; Huang, Tao; Cai, Yu-Dong

    2015-01-01

    Uncovering the molecular mechanisms underlying reproduction is of great importance to infertility treatment and to the generation of healthy offspring. In this study, we discovered novel reproduction-related genes with a hybrid computational method, integrating three different types of method, which offered new clues for further reproduction research. This method was first executed on a weighted graph, constructed based on known protein-protein interactions, to search the shortest paths connecting any two known reproduction-related genes. Genes occurring in these paths were deemed to have a special relationship with reproduction. These newly discovered genes were filtered with a randomization test. Then, the remaining genes were further selected according to their associations with known reproduction-related genes measured by protein-protein interaction score and alignment score obtained by BLAST. The in-depth analysis of the high confidence novel reproduction genes revealed hidden mechanisms of reproduction and provided guidelines for further experimental validations.
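
    The first step described above, searching shortest paths between known genes in a weighted protein-protein interaction graph and collecting the genes that lie on them, can be sketched with Dijkstra's algorithm. This is a minimal illustration under stated assumptions: the gene names, edge costs and function names are hypothetical, and the later randomization and BLAST-based filtering steps are not shown.

```python
import heapq
from itertools import combinations


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; graph maps node -> {neighbour: cost}.

    Returns the list of nodes on one shortest path, or None if the
    target cannot be reached.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == target:
            break
        for nbr, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def candidate_genes(graph, known):
    """Genes lying strictly inside shortest paths between known genes."""
    found = set()
    for a, b in combinations(sorted(known), 2):
        path = shortest_path(graph, a, b)
        if path:
            found.update(path[1:-1])
    return found - set(known)


def build_graph(edges):
    """Undirected adjacency map from (node, node, cost) triples."""
    graph = {}
    for u, v, cost in edges:
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph


# Hypothetical network: K1..K3 are known genes, X, Y, Z are unknown.
ppi = build_graph([
    ("K1", "X", 1.0), ("X", "K2", 1.0),
    ("K1", "Y", 5.0), ("Y", "K2", 5.0),
    ("K2", "Z", 1.0), ("Z", "K3", 1.0),
])
print(sorted(candidate_genes(ppi, {"K1", "K2", "K3"})))
```

    In this toy network the expensive detour through Y is never on a shortest path, so only X and Z are reported as candidates. How interaction confidence is converted into edge cost is left open here, since the abstract does not specify it.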

  9. The Tripod for Bacterial Natural Product Discovery: Genome Mining, Silent Pathway Induction, and Mass Spectrometry-Based Molecular Networking.

    Science.gov (United States)

    Trivella, Daniela B B; de Felicio, Rafael

    2018-01-01

    Natural products are the richest source of chemical compounds for drug discovery. Particularly, bacterial secondary metabolites are in the spotlight due to advances in genome sequencing and mining, as well as for the potential of biosynthetic pathway manipulation to awake silent (cryptic) gene clusters under laboratory cultivation. Further progress in compound detection, such as the development of the tandem mass spectrometry (MS/MS) molecular networking approach, has contributed to the discovery of novel bacterial natural products. The latter can be applied directly to bacterial crude extracts for identifying and dereplicating known compounds, therefore assisting the prioritization of extracts containing novel natural products, for example. In our opinion, these three approaches-genome mining, silent pathway induction, and MS-based molecular networking-compose the tripod for modern bacterial natural product discovery and will be discussed in this perspective.

  10. Reconstructing Sessions from Data Discovery and Access Logs to Build a Semantic Knowledge Base for Improving Data Discovery

    Directory of Open Access Journals (Sweden)

    Yongyao Jiang

    2016-04-01

    Full Text Available Big geospatial data are archived and made available through online web discovery and access. However, finding the right data for scientific research and application development is still a challenge. This paper aims to improve the data discovery by mining the user knowledge from log files. Specifically, user web session reconstruction is focused upon in this paper as a critical step for extracting usage patterns. However, reconstructing user sessions from raw web logs has always been difficult, as a session identifier tends to be missing in most data portals. To address this problem, we propose two session identification methods, including time-clustering-based and time-referrer-based methods. We also present the workflow of session reconstruction and discuss the approach of selecting appropriate thresholds for relevant steps in the workflow. The proposed session identification methods and workflow are proven to be able to extract data access patterns for further pattern analyses of user behavior and improvement of data discovery through more relevant data ranking, suggestion, and navigation.
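
    The basic idea of time-based session identification can be sketched as follows. This is a deliberately simplified stand-in for the methods named in the abstract: it splits each client's requests wherever the inactivity gap exceeds a fixed threshold, whereas the paper discusses clustering-based and referrer-based variants and how to choose thresholds. The record layout, the 30-minute default and the function name are assumptions.

```python
from datetime import datetime, timedelta


def reconstruct_sessions(records, gap=timedelta(minutes=30)):
    """Group (ip, timestamp, url) log records into sessions.

    A new session starts when the client IP changes or when the time
    since that client's previous request exceeds the inactivity gap.
    """
    sessions = []
    prev_ip, prev_ts = None, None
    for ip, ts, url in sorted(records, key=lambda r: (r[0], r[1])):
        if ip != prev_ip or ts - prev_ts > gap:
            sessions.append({"ip": ip, "requests": []})
        sessions[-1]["requests"].append((ts, url))
        prev_ip, prev_ts = ip, ts
    return sessions


# Hypothetical log entries from a data portal.
log = [
    ("10.0.0.1", datetime(2016, 4, 1, 9, 0), "/search?q=sst"),
    ("10.0.0.2", datetime(2016, 4, 1, 9, 1), "/search?q=ice"),
    ("10.0.0.1", datetime(2016, 4, 1, 9, 5), "/dataset/42"),
    ("10.0.0.1", datetime(2016, 4, 1, 11, 0), "/search?q=wind"),
]
for session in reconstruct_sessions(log):
    print(session["ip"], [url for _, url in session["requests"]])
```

    Identifying clients by IP address alone is itself an assumption; shared addresses would merge distinct users, which is one reason the abstract describes session reconstruction as difficult.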

  11. Discovery of core biotic stress responsive genes in Arabidopsis by weighted gene co-expression network analysis.

    Science.gov (United States)

    Amrine, Katherine C H; Blanco-Ulate, Barbara; Cantu, Dario

    2015-01-01

    Intricate signal networks and transcriptional regulators translate the recognition of pathogens into defense responses. In this study, we carried out a gene co-expression analysis of all currently publicly available microarray data, which were generated in experiments that studied the interaction of the model plant Arabidopsis thaliana with microbial pathogens. This work was conducted to identify (i) modules of functionally related co-expressed genes that are differentially expressed in response to multiple biotic stresses, and (ii) hub genes that may function as core regulators of disease responses. Using Weighted Gene Co-expression Network Analysis (WGCNA) we constructed an undirected network leveraging a rich curated expression dataset comprising 272 microarrays that involved microbial infections of Arabidopsis plants with a wide array of fungal and bacterial pathogens with biotrophic, hemibiotrophic, and necrotrophic lifestyles. WGCNA produced a network with scale-free and small-world properties composed of 205 distinct clusters of co-expressed genes. Modules of functionally related co-expressed genes that are differentially regulated in response to multiple pathogens were identified by integrating differential gene expression testing with functional enrichment analyses of gene ontology terms, known disease associated genes, transcriptional regulators, and cis-regulatory elements. The significance of functional enrichments was validated by comparisons with randomly generated networks. Network topology was then analyzed to identify intra- and inter-modular gene hubs. Based on high connectivity, and centrality in meta-modules that are clearly enriched in defense responses, we propose a list of 66 target genes for reverse genetic experiments to further dissect the Arabidopsis immune system. Our results show that statistical-based data trimming prior to network analysis allows the integration of expression datasets generated by different groups, under different

  12. Content-Based Discovery for Web Map Service using Support Vector Machine and User Relevance Feedback.

    Directory of Open Access Journals (Sweden)

    Kai Hu

    Full Text Available Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of geographic information discovery keeps improving. There are, however, still two problems in Web Map Service (WMS) discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in the metadata make discovery difficult for human users. End-users and computers comprehend WMSs differently, creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining Support Vector Machine (SVM) and user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery.
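
    The combination of an SVM with user relevance feedback described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional feature vectors, the layer names, and the functions `train_linear_svm` and `rank_by_relevance` are hypothetical, and a plain linear SVM trained by subgradient descent stands in for whatever SVM variant the paper uses. The user's relevant/irrelevant judgements serve as training labels, and unjudged layers are then ranked by the SVM decision value.

```python
def train_linear_svm(samples, labels, epochs=100, lr=0.1, lam=0.01):
    """Fit w, b for a linear SVM by subgradient descent on the L2-regularised hinge loss."""
    dim = len(samples[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Inside the margin: move toward classifying x correctly.
                w = [wi + lr * (y * xi - lam * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Outside the margin: apply only the regularisation shrinkage.
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def rank_by_relevance(w, b, candidates):
    """Sort candidate names by decreasing SVM decision value."""
    def score(name):
        return sum(wi * xi for wi, xi in zip(w, candidates[name])) + b
    return sorted(candidates, key=score, reverse=True)


# Hypothetical 2-D graphic-content features for layers the user has judged.
judged = [[1.0, 0.2], [0.9, 0.1], [0.1, 0.9], [0.2, 1.0]]
feedback = [1, 1, -1, -1]  # +1 = marked relevant, -1 = marked irrelevant

w, b = train_linear_svm(judged, feedback)
candidates = {"layer_a": [0.8, 0.3], "layer_b": [0.3, 0.8]}
ranking = rank_by_relevance(w, b, candidates)
print(ranking)
```

    In a feedback loop, each new round of user judgements would be appended to `judged` and `feedback` and the model retrained before the next ranking.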

  13. Content-Based Discovery for Web Map Service using Support Vector Machine and User Relevance Feedback

    Science.gov (United States)

    Cheng, Xiaoqiang; Qi, Kunlun; Zheng, Jie; You, Lan; Wu, Huayi

    2016-01-01

    Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of geographic information discovery keeps improving. There are, however, still two problems in Web Map Service (WMS) discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in the metadata make discovery difficult for human users. End-users and computers comprehend WMSs differently, creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining Support Vector Machine (SVM) and user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery. PMID:27861505

  14. Content-Based Discovery for Web Map Service using Support Vector Machine and User Relevance Feedback.

    Science.gov (United States)

    Hu, Kai; Gui, Zhipeng; Cheng, Xiaoqiang; Qi, Kunlun; Zheng, Jie; You, Lan; Wu, Huayi

    2016-01-01

    Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of geographic information discovery keeps improving. There are, however, still two problems in Web Map Service (WMS) discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in the metadata make discovery difficult for human users. End-users and computers comprehend WMSs differently, creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining Support Vector Machine (SVM) and user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery.

  15. A comprehensive resource of drought- and salinity-responsive ESTs for gene discovery and marker development in chickpea (Cicer arietinum L.)

    Directory of Open Access Journals (Sweden)

    Srinivasan Ramamurthy

    2009-11-01

    candidate genes and their expression profile showed predominance in specific stress-challenged libraries. Conclusion: The generated set of chickpea ESTs serves as a resource of high-quality transcripts for gene discovery and for the development of functional markers associated with abiotic stress tolerance, which will help facilitate chickpea breeding. Mapping of gene-based markers in chickpea will also add more anchoring points to align the genomes of chickpea and other legume species.

  16. Discovery and replication of gene influences on brain structure using LASSO regression

    Directory of Open Access Journals (Sweden)

    Omid Kohannim

    2012-08-01

    Full Text Available We implemented LASSO (least absolute shrinkage and selection operator) regression to evaluate gene effects in genome-wide association studies (GWAS) of brain images, using an MRI-derived temporal lobe volume measure from 729 subjects scanned as part of the Alzheimer’s Disease Neuroimaging Initiative (ADNI). Sparse groups of SNPs in individual genes were selected by LASSO, which identifies efficient sets of variants influencing the data. These SNPs were considered jointly when assessing their association with neuroimaging measures. We discovered 22 genes that passed genome-wide significance for influencing temporal lobe volume. This was a substantially greater number of significant genes compared to those found with standard, univariate GWAS. These top genes are all expressed in the brain and include genes previously related to brain function or neuropsychiatric disorders such as MACROD2, SORCS2, GRIN2B, MAGI2, NPAS3, CLSTN2, GABRG3, NRXN3, PRKAG2, GAS7, RBFOX1, ADARB2, CHD4 and CDH13. The top genes we identified with this method also displayed significant and widespread post-hoc effects on voxelwise, tensor-based morphometry (TBM) maps of the temporal lobes. The most significantly associated gene was an autism susceptibility gene known as MACROD2. We were able to successfully replicate the effect of the MACROD2 gene in an independent cohort of 564 young, Australian healthy adult twins and siblings scanned with MRI (mean age: 23.8±2.2 SD years). In exploratory analyses, three selected SNPs in the MACROD2 gene were also significantly associated with performance intelligence quotient (PIQ). Our approach powerfully complements univariate techniques in detecting influences of genes on the living brain.
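
    How LASSO selects a sparse group of SNPs within a gene can be sketched as follows. This is an illustrative toy under stated assumptions, not the study's pipeline: the centred genotype matrix `X`, the phenotype `y`, the penalty `lam`, and the function names are all hypothetical, and cyclic coordinate descent is one standard way to solve the LASSO problem. The L1 penalty drives the coefficients of uninformative SNPs exactly to zero, leaving a small set to be tested jointly.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent.

    X is a list of n rows (one per subject), each holding p centred SNP values.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of SNP j with the residual that excludes SNP j
            norm = 0.0  # squared length of column j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (norm / n) if norm > 0 else 0.0
    return w


# Hypothetical toy data: 6 subjects, 3 centred SNPs; only SNP 0 drives the phenotype.
X = [
    [-1.0, 1.0, 1.0],
    [-1.0, -1.0, 1.0],
    [0.0, 1.0, -2.0],
    [0.0, -1.0, -2.0],
    [1.0, 1.0, 1.0],
    [1.0, -1.0, 1.0],
]
y = [-2.1, -1.9, 0.1, -0.1, 1.9, 2.1]

weights = lasso_coordinate_descent(X, y, lam=0.1)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print(weights, selected)
```

    With these toy values only SNP 0 keeps a non-zero coefficient; a larger `lam` would shrink it further, and a smaller one would admit more SNPs into the selected set.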

  17. A new omics data resource of Pleurocybella porrigens for gene discovery.

    Directory of Open Access Journals (Sweden)

    Tomohiro Suzuki

    Full Text Available BACKGROUND: Pleurocybella porrigens is a mushroom-forming fungus, which has been consumed as a traditional food in Japan. In 2004, 55 people were poisoned by eating the mushroom and 17 people among them died of acute encephalopathy. Since then, the Japanese government has been alerting Japanese people to take precautions against eating the P. porrigens mushroom. Unfortunately, despite efforts, the molecular mechanism of the encephalopathy remains elusive. The genome and transcriptome sequence data of P. porrigens and the related species, however, are not stored in the public database. To gain the omics data in P. porrigens, we sequenced genome and transcriptome of its fruiting bodies and mycelia by next generation sequencing. METHODOLOGY/PRINCIPAL FINDINGS: Short read sequences of genomic DNAs and mRNAs in P. porrigens were generated by Illumina Genome Analyzer. Genome short reads were de novo assembled into scaffolds using Velvet. Comparisons of genome signatures among Agaricales showed that P. porrigens has a unique genome signature. Transcriptome sequences were assembled into contigs (unigenes). Biological functions of unigenes were predicted by Gene Ontology and KEGG pathway analyses. The majority of unigenes would be novel genes without significant counterparts in the public omics databases. CONCLUSIONS: Functional analyses of unigenes present the existence of numerous novel genes in the basidiomycetes division. The results mean that the omics information such as genome, transcriptome and metabolome in basidiomycetes is short in the current databases. The large-scale omics information on P. porrigens, provided from this research, will give a new data resource for gene discovery in basidiomycetes.

  18. A New Omics Data Resource of Pleurocybella porrigens for Gene Discovery

    Science.gov (United States)

    Dohra, Hideo; Someya, Takumi; Takano, Tomoyuki; Harada, Kiyonori; Omae, Saori; Hirai, Hirofumi; Yano, Kentaro; Kawagishi, Hirokazu

    2013-01-01

    Background Pleurocybella porrigens is a mushroom-forming fungus, which has been consumed as a traditional food in Japan. In 2004, 55 people were poisoned by eating the mushroom and 17 people among them died of acute encephalopathy. Since then, the Japanese government has been alerting Japanese people to take precautions against eating the P. porrigens mushroom. Unfortunately, despite efforts, the molecular mechanism of the encephalopathy remains elusive. The genome and transcriptome sequence data of P. porrigens and the related species, however, are not stored in the public database. To gain the omics data in P. porrigens, we sequenced genome and transcriptome of its fruiting bodies and mycelia by next generation sequencing. Methodology/Principal Findings Short read sequences of genomic DNAs and mRNAs in P. porrigens were generated by Illumina Genome Analyzer. Genome short reads were de novo assembled into scaffolds using Velvet. Comparisons of genome signatures among Agaricales showed that P. porrigens has a unique genome signature. Transcriptome sequences were assembled into contigs (unigenes). Biological functions of unigenes were predicted by Gene Ontology and KEGG pathway analyses. The majority of unigenes would be novel genes without significant counterparts in the public omics databases. Conclusions Functional analyses of unigenes present the existence of numerous novel genes in the basidiomycetes division. The results mean that the omics information such as genome, transcriptome and metabolome in basidiomycetes is short in the current databases. The large-scale omics information on P. porrigens, provided from this research, will give a new data resource for gene discovery in basidiomycetes. PMID:23936076

  19. Cluster-based service discovery for heterogeneous wireless sensor networks

    NARCIS (Netherlands)

    Marin Perianu, Raluca; Scholten, Johan; Havinga, Paul J.M.; Hartel, Pieter H.

    2007-01-01

    We propose an energy-efficient service discovery protocol for heterogeneous wireless sensor networks. Our solution exploits a cluster overlay, where the clusterhead nodes form a distributed service registry. A service lookup results in visiting only the clusterhead nodes. We aim for minimizing the

  20. Evolving towards a human-cell based and multiscale approach to drug discovery for CNS disorders

    Directory of Open Access Journals (Sweden)

    Eric Schadt

    2014-12-01

    Full Text Available A disruptive approach to therapeutic discovery and development is required in order to significantly improve the success rate of drug discovery for central nervous system (CNS) disorders. In this review, we first assess the key factors contributing to the frequent clinical failures for novel drugs. Second, we discuss cancer translational research paradigms that addressed key issues in drug discovery and development and have resulted in delivering drugs with significantly improved outcomes for patients. Finally, we discuss two emerging technologies that could improve the success rate of CNS therapies: human induced pluripotent stem cell (hiPSC)-based studies and multiscale biology models. Coincident with advances in cellular technologies that enable the generation of hiPSCs directly from patient blood or skin cells, together with methods to differentiate these hiPSC lines into specific neural cell types relevant to neurological disease, it is also now possible to combine data from large-scale forward genetics and post-mortem global epigenetic and expression studies in order to generate novel predictive models. The application of systems biology approaches to account for the multiscale nature of different data types, from genetic to molecular and cellular to clinical, can lead to new insights into human diseases that are emergent properties of biological networks, not the result of changes to single genes. Such studies have demonstrated the heterogeneity in etiological pathways and the need for studies on model systems that are patient-derived and thereby recapitulate neurological disease pathways with higher fidelity. In the context of two common and presumably representative neurological diseases, the neurodegenerative disease Alzheimer’s Disease (AD), and the psychiatric disorder schizophrenia (SZ), we propose the need for, and exemplify the impact of, a multiscale biology approach that can integrate panomic, clinical, imaging, and literature

  1. Using heuristics to facilitate experiential learning in a simulation-based discovery learning environment.

    NARCIS (Netherlands)

    Veermans, K.H.; Mason, L.; de Jong, Anthonius J.M.; Andreuzza, S.; Arfè, B.; van Joolingen, Wouter; del Favero, L.

    2003-01-01

    Learners are often reported to experience difficulties with simulation-based discovery learning. Heuristics for discovery learning (rules of thumb that guide decision-making) can help learners to overcome these difficulties. In addition, the heuristics themselves are open for transfer. One way to

  2. Promoting self directed learning in simulation based discovery learning environments through intelligent support.

    NARCIS (Netherlands)

    Veermans, K.H.; de Jong, Anthonius J.M.; van Joolingen, Wouter

    2000-01-01

    Providing learners with computer-generated feedback on their learning process in simulation-based discovery environments cannot be based on a detailed model of the learning process due to the “open” character of discovery learning. This paper describes a method for generating adaptive feedback for

  3. Network-Guided Key Gene Discovery for a Given Cellular Process

    DEFF Research Database (Denmark)

    He, Feng Q; Ollert, Markus

    2018-01-01

    Identification of key genes for a given physiological or pathological process is an essential but still very challenging task for the entire biomedical research community. Statistics-based approaches, such as genome-wide association study (GWAS)- or quantitative trait locus (QTL)-related analysis, have already made enormous contributions to identifying key genes associated with a given disease or phenotype, the success of which is however very much dependent on a huge number of samples. Recent advances in network biology, especially network inference directly from genome-scale data and the following-up network analysis, open up new avenues to predict key genes driving a given biological process or cellular function. Here we review and compare the current approaches in predicting key genes, which have no chances to stand out by classic differential expression analysis, from gene...

  4. Biomolecular Network-Based Synergistic Drug Combination Discovery

    Directory of Open Access Journals (Sweden)

    Xiangyi Li

    2016-01-01

    Full Text Available Drug combination is a powerful and promising approach for the therapy of complex diseases such as cancer and cardiovascular disease. However, the number of synergistic drug combinations approved by the Food and Drug Administration is very small. To bridge the gap between urgent need and low yield, researchers have constructed various models to identify synergistic drug combinations. Among these models, the biomolecular network-based model stands out because of its ability to reflect and illustrate the relationships among drugs, disease-related genes, therapeutic targets, and disease-specific signaling pathways as a system. In this review, we analyze and classify models for synergistic drug combination prediction from the past decade according to their respective algorithms. In addition, we collect useful resources, including databases and analysis tools, for synergistic drug combination prediction. The review should provide a quick resource for computational biologists who work on network medicine or synergistic drug combination design.

  5. Discovery of differentially expressed genes in cashmere goat (Capra hircus) hair follicles by RNA sequencing.

    Science.gov (United States)

    Qiao, X; Wu, J H; Wu, R B; Su, R; Li, C; Zhang, Y J; Wang, R J; Zhao, Y H; Fan, Y X; Zhang, W G; Li, J Q

    2016-09-02

    The mammalian hair follicle (HF) is a unique, highly regenerative organ with a distinct developmental cycle. Cashmere goat (Capra hircus) HFs can be divided into two categories based on structure and development time: primary and secondary follicles. To identify differentially expressed genes (DEGs) in the primary and secondary HFs of cashmere goats, the RNA sequencing of six individuals from Arbas, Inner Mongolia, was performed. A total of 617 DEGs were identified; 297 were upregulated while 320 were downregulated. Gene ontology analysis revealed that the main functions of the upregulated genes were electron transport, respiratory electron transport, mitochondrial electron transport, and gene expression. The downregulated genes were mainly involved in cell autophagy, protein complexes, neutrophil aggregation, and bacterial fungal defense reactions. According to the Kyoto Encyclopedia of Genes and Genomes database, these genes are mainly involved in the metabolism of cysteine and methionine, RNA polymerization, and the MAPK signaling pathway, and were enriched in primary follicles. A microRNA-target network revealed that secondary follicles are involved in several important biological processes, such as the synthesis of keratin-associated proteins and enzymes involved in amino acid biosynthesis. In summary, these findings will increase our understanding of the complex molecular mechanisms of HF development and cycling, and provide a basis for the further study of the genes and functions of HF development.

  6. Genome-wide target profiling of piggyBac and Tol2 in HEK 293: pros and cons for gene discovery and gene therapy

    Science.gov (United States)

    2011-01-01

    Background DNA transposons have emerged as indispensable tools for manipulating vertebrate genomes with applications ranging from insertional mutagenesis and transgenesis to gene therapy. To fully explore the potential of two highly active DNA transposons, piggyBac and Tol2, as mammalian genetic tools, we have conducted a side-by-side comparison of the two transposon systems in the same setting to evaluate their advantages and disadvantages for use in gene therapy and gene discovery. Results We have observed that (1) the Tol2 transposase (but not piggyBac) is highly sensitive to molecular engineering; (2) the piggyBac donor with only the 40 bp 3'- and 67 bp 5'-terminal repeat domain is sufficient for effective transposition; and (3) a small amount of piggyBac transposase results in robust transposition, suggesting the piggyBac transposase is highly active. Performing genome-wide target profiling on data sets obtained by retrieving chromosomal targeting sequences from individual clones, we have identified several piggyBac and Tol2 hotspots and observed that (4) piggyBac and Tol2 display a clear difference in targeting preferences in the human genome. Finally, we have observed that (5) only sites with a particular sequence context can be targeted by either piggyBac or Tol2. Conclusions The non-overlapping targeting preference of piggyBac and Tol2 makes them complementary research tools for manipulating mammalian genomes. PiggyBac is the most promising transposon-based vector system for achieving site-specific targeting of therapeutic genes due to the flexibility of its transposase for being molecularly engineered. Insights from this study will provide a basis for engineering piggyBac transposases to achieve site-specific therapeutic gene targeting. PMID:21447194

  7. Genome-wide target profiling of piggyBac and Tol2 in HEK 293: pros and cons for gene discovery and gene therapy

    Directory of Open Access Journals (Sweden)

    Yu Robert K

    2011-03-01

    Full Text Available Abstract Background DNA transposons have emerged as indispensable tools for manipulating vertebrate genomes with applications ranging from insertional mutagenesis and transgenesis to gene therapy. To fully explore the potential of two highly active DNA transposons, piggyBac and Tol2, as mammalian genetic tools, we have conducted a side-by-side comparison of the two transposon systems in the same setting to evaluate their advantages and disadvantages for use in gene therapy and gene discovery. Results We have observed that (1) the Tol2 transposase (but not piggyBac) is highly sensitive to molecular engineering; (2) the piggyBac donor with only the 40 bp 3'- and 67 bp 5'-terminal repeat domain is sufficient for effective transposition; and (3) a small amount of piggyBac transposase results in robust transposition, suggesting the piggyBac transposase is highly active. Performing genome-wide target profiling on data sets obtained by retrieving chromosomal targeting sequences from individual clones, we have identified several piggyBac and Tol2 hotspots and observed that (4) piggyBac and Tol2 display a clear difference in targeting preferences in the human genome. Finally, we have observed that (5) only sites with a particular sequence context can be targeted by either piggyBac or Tol2. Conclusions The non-overlapping targeting preference of piggyBac and Tol2 makes them complementary research tools for manipulating mammalian genomes. PiggyBac is the most promising transposon-based vector system for achieving site-specific targeting of therapeutic genes due to the flexibility of its transposase for being molecularly engineered. Insights from this study will provide a basis for engineering piggyBac transposases to achieve site-specific therapeutic gene targeting.

  8. Genome-wide target profiling of piggyBac and Tol2 in HEK 293: pros and cons for gene discovery and gene therapy.

    Science.gov (United States)

    Meir, Yaa-Jyuhn J; Weirauch, Matthew T; Yang, Herng-Shing; Chung, Pei-Cheng; Yu, Robert K; Wu, Sareina C-Y

    2011-03-30

    DNA transposons have emerged as indispensable tools for manipulating vertebrate genomes with applications ranging from insertional mutagenesis and transgenesis to gene therapy. To fully explore the potential of two highly active DNA transposons, piggyBac and Tol2, as mammalian genetic tools, we have conducted a side-by-side comparison of the two transposon systems in the same setting to evaluate their advantages and disadvantages for use in gene therapy and gene discovery. We have observed that (1) the Tol2 transposase (but not piggyBac) is highly sensitive to molecular engineering; (2) the piggyBac donor with only the 40 bp 3'- and 67 bp 5'-terminal repeat domain is sufficient for effective transposition; and (3) a small amount of piggyBac transposase results in robust transposition, suggesting the piggyBac transposase is highly active. Performing genome-wide target profiling on data sets obtained by retrieving chromosomal targeting sequences from individual clones, we have identified several piggyBac and Tol2 hotspots and observed that (4) piggyBac and Tol2 display a clear difference in targeting preferences in the human genome. Finally, we have observed that (5) only sites with a particular sequence context can be targeted by either piggyBac or Tol2. The non-overlapping targeting preference of piggyBac and Tol2 makes them complementary research tools for manipulating mammalian genomes. PiggyBac is the most promising transposon-based vector system for achieving site-specific targeting of therapeutic genes due to the flexibility of its transposase for being molecularly engineered. Insights from this study will provide a basis for engineering piggyBac transposases to achieve site-specific therapeutic gene targeting.

  9. Cracking the regulatory code of biosynthetic gene clusters as a strategy for natural product discovery.

    Science.gov (United States)

    Rigali, Sébastien; Anderssen, Sinaeda; Naômé, Aymeric; van Wezel, Gilles P

    2018-01-05

    The World Health Organization (WHO) describes antibiotic resistance as "one of the biggest threats to global health, food security, and development today", as the number of multi- and pan-resistant bacteria is rising dangerously. Acquired resistance phenomena also impair antifungal, antiviral, and anti-cancer drug therapy, while herbicide resistance in weeds threatens the crop industry. On the positive side, it is likely that the chemical space of natural products goes far beyond what has currently been discovered. This idea is fueled by genome sequencing of microorganisms, which unveiled numerous so-called cryptic biosynthetic gene clusters (BGCs), many of which are transcriptionally silent under laboratory culture conditions, and by the fact that most bacteria cannot yet be cultivated in the laboratory. However, brute force antibiotic discovery does not yield the same results as it did in the past, and researchers have had to develop creative strategies in order to unravel the hidden potential of microorganisms such as Streptomyces and other antibiotic-producing microorganisms. Identifying the cis elements and their corresponding transcription factor(s) involved in the control of BGCs through bioinformatic approaches is a promising strategy. Theoretically, we are a few 'clicks' away from unveiling the culturing conditions or genetic changes needed to activate the production of cryptic metabolites or increase the production yield of known compounds to make them economically viable. In this opinion article, we describe and illustrate the idea behind 'cracking' the regulatory code for natural product discovery, by presenting a series of proofs of concept, and discuss what still should be achieved to increase the rate of success of this strategy. Copyright © 2018 Elsevier Inc. All rights reserved.

  10. InFusion: Advancing Discovery of Fusion Genes and Chimeric Transcripts from Deep RNA-Sequencing Data.

    Directory of Open Access Journals (Sweden)

    Konstantin Okonechnikov

    Full Text Available Analysis of fusion transcripts has become increasingly important due to their link with cancer development. Since high-throughput sequencing approaches survey fusion events exhaustively, several computational methods for the detection of gene fusions from RNA-seq data have been developed. This kind of analysis, however, is complicated by native trans-splicing events, the splicing-induced complexity of the transcriptome and biases and artefacts introduced in experiments and data analysis. There are a number of tools available for the detection of fusions from RNA-seq data; however, certain differences in specificity and sensitivity between commonly used approaches have been found. The ability to detect gene fusions of different types, including isoform fusions and fusions involving non-coding regions, has not been thoroughly studied yet. Here, we propose a novel computational toolkit called InFusion for fusion gene detection from RNA-seq data. InFusion introduces several unique features, such as discovery of fusions involving intergenic regions, and detection of anti-sense transcription in chimeric RNAs based on strand-specificity. Our approach demonstrates superior detection accuracy on simulated data and several public RNA-seq datasets. This improved performance was also evident when evaluating data from RNA deep-sequencing of two well-established prostate cancer cell lines. InFusion identified 26 novel fusion events that were validated in vitro, including alternatively spliced gene fusion isoforms and chimeric transcripts that include intergenic regions. The toolkit is freely available to download from http://bitbucket.org/kokonech/infusion.

  11. InFusion: Advancing Discovery of Fusion Genes and Chimeric Transcripts from Deep RNA-Sequencing Data.

    Science.gov (United States)

    Okonechnikov, Konstantin; Imai-Matsushima, Aki; Paul, Lukas; Seitz, Alexander; Meyer, Thomas F; Garcia-Alcalde, Fernando

    2016-01-01

    Analysis of fusion transcripts has become increasingly important due to their link with cancer development. Since high-throughput sequencing approaches survey fusion events exhaustively, several computational methods for the detection of gene fusions from RNA-seq data have been developed. This kind of analysis, however, is complicated by native trans-splicing events, the splicing-induced complexity of the transcriptome and biases and artefacts introduced in experiments and data analysis. There are a number of tools available for the detection of fusions from RNA-seq data; however, certain differences in specificity and sensitivity between commonly used approaches have been found. The ability to detect gene fusions of different types, including isoform fusions and fusions involving non-coding regions, has not been thoroughly studied yet. Here, we propose a novel computational toolkit called InFusion for fusion gene detection from RNA-seq data. InFusion introduces several unique features, such as discovery of fusions involving intergenic regions, and detection of anti-sense transcription in chimeric RNAs based on strand-specificity. Our approach demonstrates superior detection accuracy on simulated data and several public RNA-seq datasets. This improved performance was also evident when evaluating data from RNA deep-sequencing of two well-established prostate cancer cell lines. InFusion identified 26 novel fusion events that were validated in vitro, including alternatively spliced gene fusion isoforms and chimeric transcripts that include intergenic regions. The toolkit is freely available to download from http://bitbucket.org/kokonech/infusion.

  12. Semantic based cluster content discovery in description first clustering algorithm

    International Nuclear Information System (INIS)

    Khan, M.W.; Asif, H.M.S.

    2017-01-01

    In the field of data analytics, grouping of like documents in textual data is a serious problem. A lot of work has been done in this field and many algorithms have been proposed. One category of algorithms first groups the documents on the basis of similarity and then assigns meaningful labels to those groups. Description-first clustering algorithms belong to the category in which a meaningful description is deduced first and then relevant documents are assigned to that description. LINGO (Label Induction Grouping Algorithm) is a description-first clustering algorithm used for the automatic grouping of documents obtained from search results. It uses LSI (Latent Semantic Indexing), an IR (Information Retrieval) technique, for induction of meaningful labels for clusters and VSM (Vector Space Model) for cluster content discovery. In this paper we present LINGO using LSI during both the cluster label induction and cluster content discovery phases. Finally, we compare results obtained from the algorithm when it uses VSM and latent semantic analysis during the cluster content discovery phase. (author)
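
    The VSM-based cluster content discovery step mentioned in this abstract can be sketched as follows, assuming the cluster labels have already been induced. It is a simplified illustration rather than the LINGO implementation: the labels, documents, the 0.5 threshold, and the helper names are hypothetical, and raw term frequencies are used in place of any weighting scheme the algorithm may apply. Each document is assigned to every label whose cosine similarity with it reaches the threshold.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector for a lower-cased, whitespace-split text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_documents(labels, documents, threshold=0.5):
    """Map each label to the indices of documents whose cosine score reaches the threshold."""
    clusters = {label: [] for label in labels}
    for index, doc in enumerate(documents):
        doc_vec = term_vector(doc)
        for label in labels:
            if cosine(term_vector(label), doc_vec) >= threshold:
                clusters[label].append(index)
    return clusters


# Hypothetical labels (assumed already induced) and search-result snippets.
labels = ["gene discovery", "service discovery"]
documents = [
    "gene discovery in chickpea",
    "service discovery for sensor networks",
    "drug combination prediction",
]
clusters = assign_documents(labels, documents)
print(clusters)
```

    The third snippet shares no terms with either label and stays unassigned. Replacing the term vectors with their projections onto a reduced latent space would correspond to the LSI-based variant that the paper compares against.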

  13. Semantic Based Cluster Content Discovery in Description First Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    Muhammad Waseem Khan

    2017-01-01

    Full Text Available In the field of data analytics, grouping of like documents in textual data is a serious problem. A lot of work has been done in this field and many algorithms have been proposed. One category of algorithms first groups the documents on the basis of similarity and then assigns meaningful labels to those groups. Description-first clustering algorithms belong to the category in which a meaningful description is deduced first and then relevant documents are assigned to that description. LINGO (Label Induction Grouping Algorithm) is a description-first clustering algorithm used for the automatic grouping of documents obtained from search results. It uses LSI (Latent Semantic Indexing), an IR (Information Retrieval) technique, for induction of meaningful labels for clusters and VSM (Vector Space Model) for cluster content discovery. In this paper we present LINGO using LSI during both the cluster label induction and cluster content discovery phases. Finally, we compare results obtained from the algorithm when it uses VSM and latent semantic analysis during the cluster content discovery phase.

  14. Discovery of Antibiotics-derived Polymers for Gene Delivery using Combinatorial Synthesis and Cheminformatics Modeling

    Science.gov (United States)

    Potta, Thrimoorthy; Zhen, Zhuo; Grandhi, Taraka Sai Pavan; Christensen, Matthew D.; Ramos, James; Breneman, Curt M.; Rege, Kaushal

    2014-01-01

    We describe the combinatorial synthesis and cheminformatics modeling of aminoglycoside antibiotics-derived polymers for transgene delivery and expression. Fifty-six polymers were synthesized by polymerizing aminoglycosides with diglycidyl ether cross-linkers. Parallel screening resulted in identification of several lead polymers that resulted in high transgene expression levels in cells. The role of polymer physicochemical properties in determining efficacy of transgene expression was investigated using Quantitative Structure-Activity Relationship (QSAR) cheminformatics models based on Support Vector Regression (SVR) and ‘building block’ polymer structures. The QSAR model exhibited high predictive ability, and investigation of descriptors in the model, using molecular visualization and correlation plots, indicated that physicochemical attributes related to both, aminoglycosides and diglycidyl ethers facilitated transgene expression. This work synergistically combines combinatorial synthesis and parallel screening with cheminformatics-based QSAR models for discovery and physicochemical elucidation of effective antibiotics-derived polymers for transgene delivery in medicine and biotechnology. PMID:24331709

  15. Discovery of antibiotics-derived polymers for gene delivery using combinatorial synthesis and cheminformatics modeling.

    Science.gov (United States)

    Potta, Thrimoorthy; Zhen, Zhuo; Grandhi, Taraka Sai Pavan; Christensen, Matthew D; Ramos, James; Breneman, Curt M; Rege, Kaushal

    2014-02-01

    We describe the combinatorial synthesis and cheminformatics modeling of aminoglycoside antibiotics-derived polymers for transgene delivery and expression. Fifty-six polymers were synthesized by polymerizing aminoglycosides with diglycidyl ether cross-linkers. Parallel screening resulted in identification of several lead polymers that resulted in high transgene expression levels in cells. The role of polymer physicochemical properties in determining efficacy of transgene expression was investigated using Quantitative Structure-Activity Relationship (QSAR) cheminformatics models based on Support Vector Regression (SVR) and 'building block' polymer structures. The QSAR model exhibited high predictive ability, and investigation of descriptors in the model, using molecular visualization and correlation plots, indicated that physicochemical attributes related to both, aminoglycosides and diglycidyl ethers facilitated transgene expression. This work synergistically combines combinatorial synthesis and parallel screening with cheminformatics-based QSAR models for discovery and physicochemical elucidation of effective antibiotics-derived polymers for transgene delivery in medicine and biotechnology. Copyright © 2013 Elsevier Ltd. All rights reserved.

  16. Mass Spectrometry–Based Biomarker Discovery: Toward a Global Proteome Index of Individuality

    Science.gov (United States)

    Hawkridge, Adam M.; Muddiman, David C.

    2011-01-01

    Biomarker discovery and proteomics have become synonymous with mass spectrometry in recent years. Although this conflation is an injustice to the many essential biomolecular techniques widely used in biomarker-discovery platforms, it underscores the power and potential of contemporary mass spectrometry. Numerous novel and powerful technologies have been developed around mass spectrometry, proteomics, and biomarker discovery over the past 20 years to globally study complex proteomes (e.g., plasma). However, very few large-scale longitudinal studies have been carried out using these platforms to establish the analytical variability relative to true biological variability. The purpose of this review is not to cover exhaustively the applications of mass spectrometry to biomarker discovery, but rather to discuss the analytical methods and strategies that have been developed for mass spectrometry–based biomarker-discovery platforms and to place them in the context of the many challenges and opportunities yet to be addressed. PMID:20636062

  17. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    Science.gov (United States)

    2010-01-01

    preferable, in particular if the gene selection is successful. However, this is an area that needs to be studied further in order to draw any general conclusions. Conclusions: The choice of cluster analysis, and in particular gene selection, has a large impact on the ability to cluster individuals correctly based on expression profiles. Normalization has a positive effect, but the relative performance of different normalizations is an area that needs more research. In summary, although clustering, gene selection and normalization are considered standard methods in bioinformatics, our comprehensive analysis shows that selecting the right methods, and the right combinations of methods, is far from trivial and that much is still unexplored in what is considered to be the most basic analysis of genomic data. PMID:20937082

  18. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    Directory of Open Access Journals (Sweden)

    Landfors Mattias

    2010-10-01

    background correction is preferable, in particular if the gene selection is successful. However, this is an area that needs to be studied further in order to draw any general conclusions. Conclusions: The choice of cluster analysis, and in particular gene selection, has a large impact on the ability to cluster individuals correctly based on expression profiles. Normalization has a positive effect, but the relative performance of different normalizations is an area that needs more research. In summary, although clustering, gene selection and normalization are considered standard methods in bioinformatics, our comprehensive analysis shows that selecting the right methods, and the right combinations of methods, is far from trivial and that much is still unexplored in what is considered to be the most basic analysis of genomic data.

  19. Using Phenomic Analysis of Photosynthetic Function for Abiotic Stress Response Gene Discovery.

    Science.gov (United States)

    Rungrat, Tepsuda; Awlia, Mariam; Brown, Tim; Cheng, Riyan; Sirault, Xavier; Fajkus, Jiri; Trtilek, Martin; Furbank, Bob; Badger, Murray; Tester, Mark; Pogson, Barry J; Borevitz, Justin O; Wilson, Pip

    2016-01-01

    Monitoring the photosynthetic performance of plants is a major key to understanding how plants adapt to their growth conditions. Stress tolerance traits have a high genetic complexity as plants are constantly, and unavoidably, exposed to numerous stress factors, which limits their growth rates in the natural environment. Arabidopsis thaliana, with its broad genetic diversity and wide climatic range, has been shown to successfully adapt to stressful conditions to ensure the completion of its life cycle. As a result, A. thaliana has become a robust and renowned plant model system for studying natural variation and conducting gene discovery studies. Genome wide association studies (GWAS) in restructured populations combining natural and recombinant lines is a particularly effective way to identify the genetic basis of complex traits. As most abiotic stresses affect photosynthetic activity, chlorophyll fluorescence measurements are a potential phenotyping technique for monitoring plant performance under stress conditions. This review focuses on the use of chlorophyll fluorescence as a tool to study genetic variation underlying the stress tolerance responses to abiotic stress in A. thaliana.

  20. Using Phenomic Analysis of Photosynthetic Function for Abiotic Stress Response Gene Discovery

    KAUST Repository

    Rungrat, Tepsuda

    2016-09-09

    Monitoring the photosynthetic performance of plants is a major key to understanding how plants adapt to their growth conditions. Stress tolerance traits have a high genetic complexity as plants are constantly, and unavoidably, exposed to numerous stress factors, which limits their growth rates in the natural environment. Arabidopsis thaliana, with its broad genetic diversity and wide climatic range, has been shown to successfully adapt to stressful conditions to ensure the completion of its life cycle. As a result, A. thaliana has become a robust and renowned plant model system for studying natural variation and conducting gene discovery studies. Genome wide association studies (GWAS) in restructured populations combining natural and recombinant lines is a particularly effective way to identify the genetic basis of complex traits. As most abiotic stresses affect photosynthetic activity, chlorophyll fluorescence measurements are a potential phenotyping technique for monitoring plant performance under stress conditions. This review focuses on the use of chlorophyll fluorescence as a tool to study genetic variation underlying the stress tolerance responses to abiotic stress in A. thaliana.

  1. Natural and man-made V-gene repertoires for antibody discovery

    Science.gov (United States)

    Finlay, William J. J.; Almagro, Juan C.

    2012-01-01

    Antibodies are the fastest-growing segment of the biologics market. The success of antibody-based drugs resides in their exquisite specificity, high potency, stability, solubility, safety, and relatively inexpensive manufacturing process in comparison with other biologics. We outline here the structural studies and fundamental principles that define how antibodies interact with diverse targets. We also describe the antibody repertoires and affinity maturation mechanisms of humans, mice, and chickens, plus the use of novel single-domain antibodies in camelids and sharks. These species all utilize diverse evolutionary solutions to generate specific and high affinity antibodies and illustrate the plasticity of natural antibody repertoires. In addition, we discuss the multiple variations of man-made antibody repertoires designed and validated in the last two decades, which have served as tools to explore how the size, diversity, and composition of a repertoire impact the antibody discovery process. PMID:23162556

  2. Network-based approaches to climate knowledge discovery

    Science.gov (United States)

    Budich, Reinhard; Nyberg, Per; Weigel, Tobias

    2011-11-01

    Climate Knowledge Discovery Workshop; Hamburg, Germany, 30 March to 1 April 2011 Do complex networks combined with semantic Web technologies offer the next generation of solutions in climate science? To address this question, a first Climate Knowledge Discovery (CKD) Workshop, hosted by the German Climate Computing Center (Deutsches Klimarechenzentrum (DKRZ)), brought together climate and computer scientists from major American and European laboratories, data centers, and universities, as well as representatives from industry, the broader academic community, and the semantic Web communities. The participants, representing six countries, were concerned with large-scale Earth system modeling and computational data analysis. The motivation for the meeting was the growing problem that climate scientists generate data faster than it can be interpreted and the need to prepare for further exponential data increases. Current analysis approaches are focused primarily on traditional methods, which are best suited for large-scale phenomena and coarse-resolution data sets. The workshop focused on the open discussion of ideas and technologies to provide the next generation of solutions to cope with the increasing data volumes in climate science.

  3. Crystallographic analysis of TPP riboswitch binding by small-molecule ligands discovered through fragment-based drug discovery approaches.

    Science.gov (United States)

    Warner, Katherine Deigan; Ferré-D'Amaré, Adrian R

    2014-01-01

    Riboswitches are structured mRNA elements that regulate gene expression in response to metabolite or second-messenger binding and are promising targets for drug discovery. Fragment-based drug discovery methods have identified weakly binding small molecule "fragments" that bind a thiamine pyrophosphate (TPP) riboswitch. However, these fragments require substantial chemical elaboration into more potent, drug-like molecules. Structure determination of the fragments bound to the riboswitch is the necessary next step. In this chapter, we describe the methods for co-crystallization and structure determination of fragment-bound TPP riboswitch structures. We focus on considerations for screening crystallization conditions across multiple crystal forms and provide guidance for building the fragment into the refined crystallographic model. These methods are broadly applicable for crystallographic analyses of any small molecules that bind structured RNAs.

  4. Discovery and characterization of nutritionally regulated genes associated with muscle growth in Atlantic salmon.

    Science.gov (United States)

    Bower, Neil I; Johnston, Ian A

    2010-10-01

    A genomics approach was used to identify nutritionally regulated genes involved in growth of fast skeletal muscle in Atlantic salmon (Salmo salar L.). Forward and reverse subtractive cDNA libraries were prepared comparing fish with zero growth rates to fish growing rapidly. We produced 7,420 ESTs and assembled them into nonredundant clusters prior to annotation. Contigs representing 40 potentially unrecognized nutritionally responsive candidate genes were identified. Twenty-three of the subtractive library candidates were also differentially regulated by nutritional state in an independent fasting-refeeding experiment and their expression placed in the context of 26 genes with established roles in muscle growth regulation. The expression of these genes was also determined during the maturation of a primary myocyte culture, identifying 13 candidates from the subtractive cDNA libraries with putative roles in the myogenic program. During early stages of refeeding DNAJA4, HSPA1B, HSP90A, and CHAC1 expression increased, indicating activation of unfolded protein response pathways. Four genes were considered inhibitory to myogenesis based on their in vivo and in vitro expression profiles (CEBPD, ASB2, HSP30, novel transcript GE623928). Other genes showed increased expression with feeding and highest in vitro expression during the proliferative phase of the culture (FOXD1, DRG1) or as cells differentiated (SMYD1, RTN1, MID1IP1, HSP90A, novel transcript GE617747). The genes identified were associated with chromatin modification (SMYD1, RTN1), microtubule stabilization (MID1IP1), cell cycle regulation (FOXD1, CEBPD, DRG1), and negative regulation of signaling (ASB2) and may play a role in the stimulation of myogenesis during the transition from a catabolic to anabolic state in skeletal muscle.

  5. An Affinity Propagation-Based DNA Motif Discovery Algorithm

    Directory of Open Access Journals (Sweden)

    Chunxiao Sun

    2015-01-01

    The planted (l,d) motif search (PMS) is one of the fundamental problems in bioinformatics, which plays an important role in locating transcription factor binding sites (TFBSs) in DNA sequences. Nowadays, identifying weak motifs and reducing the effect of local optimum are still important but challenging tasks for motif discovery. To solve the tasks, we propose a new algorithm, APMotif, which first applies the Affinity Propagation (AP) clustering in DNA sequences to produce informative and good candidate motifs and then employs Expectation Maximization (EM) refinement to obtain the optimal motifs from the candidate motifs. Experimental results both on simulated data sets and real biological data sets show that APMotif usually outperforms four other widely used algorithms in terms of high prediction accuracy.
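
    To make the planted (l,d) motif problem concrete, the sketch below checks whether a candidate motif of length l occurs in every input sequence with at most d mismatches (Hamming distance). This only illustrates the problem definition; it is not APMotif, which the abstract describes as Affinity Propagation clustering followed by EM refinement. All function names are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def best_occurrence(motif, sequence):
    """Return (distance, position) of the l-mer in `sequence` closest to `motif`.

    Returns None when the sequence is shorter than the motif.
    Ties are resolved in favour of the leftmost position.
    """
    length = len(motif)
    best = None
    for start in range(len(sequence) - length + 1):
        distance = hamming(motif, sequence[start:start + length])
        if best is None or distance < best[0]:
            best = (distance, start)
    return best


def is_planted_motif(motif, sequences, d):
    """True if every sequence contains an l-mer within d mismatches of `motif`."""
    for sequence in sequences:
        occurrence = best_occurrence(motif, sequence)
        if occurrence is None or occurrence[0] > d:
            return False
    return True
```

    For example, with the sequences TTACGTAA, GGACCTCC and ACGAATTT, the candidate ACGT satisfies the (4,1) condition but not the (4,0) condition, because two of the sequences contain it only with one mismatch.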

  6. Location Discovery Based on Fuzzy Geometry in Passive Sensor Networks

    Directory of Open Access Journals (Sweden)

    Rui Wang

    2011-01-01

    Location discovery with uncertainty using passive sensor networks in the nation's power grid is known to be challenging, due to the massive scale and inherent complexity. For bearings-only target localization in passive sensor networks, the approach of fuzzy geometry is introduced to investigate the fuzzy measurability for a moving target in R2 space. The fuzzy analytical bias expressions and the geometrical constraints are derived for bearings-only target localization. The interplay between the fuzzy geometry of target localization and the fuzzy estimation bias for the case of a fuzzy linear observer trajectory is analyzed in detail in sensor networks, which can realize the 3-dimensional localization, including the fuzzy estimated position and velocity of the target, by measuring the fuzzy azimuth angles at fixed time intervals. Simulation results show that the resulting estimated position outperforms the traditional least squares approach for localization with uncertainty.

  7. Applications of fiber-optics-based nanosensors to drug discovery.

    Science.gov (United States)

    Vo-Dinh, Tuan; Scaffidi, Jonathan; Gregas, Molly; Zhang, Yan; Seewaldt, Victoria

    2009-08-01

    Fiber-optic nanosensors are fabricated by heating and pulling optical fibers to yield sub-micron diameter tips and have been used for in vitro analysis of individual living mammalian cells. Immobilization of bioreceptors (e.g., antibodies, peptides, DNA) selective to targeting analyte molecules of interest provides molecular specificity. Excitation light can be launched into the fiber, and the resulting evanescent field at the tip of the nanofiber can be used to excite target molecules bound to the bioreceptor molecules. The fluorescence or surface-enhanced Raman scattering produced by the analyte molecules is detected using an ultra-sensitive photodetector. This article provides an overview of the development and application of fiber-optic nanosensors for drug discovery. The nanosensors provide minimally invasive tools to probe subcellular compartments inside single living cells for health effect studies (e.g., detection of benzopyrene adducts) and medical applications (e.g., monitoring of apoptosis in cells treated with anticancer drugs).

  8. Discovery and optimization of peptide-based anti-cobratoxins

    DEFF Research Database (Denmark)

    Sola, M.; Laustsen, Andreas Hougaard; Johannesen, J.

    More than 5.5 million people per year are victims of snake envenomation, resulting in 125,000 deaths and 400,000 amputations worldwide. Antivenoms are still produced by animal immunization procedures, and they are associated with a high risk of severe adverse reactions. Alternatively, synthetic peptides may open the possibility for new therapies with better efficacy and safety. Here, we report the discovery and optimization of a synthetic peptide directed against α-cobratoxin (α-CTX), the most toxic component of Monocled cobra (Naja kaouthia).

  9. Functional linkage between genes that regulate osmotic stress responses and multidrug resistance transporters: challenges and opportunities for antibiotic discovery.

    Science.gov (United States)

    Cohen, B Eleazar

    2014-01-01

    All cells need to protect themselves against the osmotic challenges of their environment by maintaining low permeability to ions across their cell membranes. This is a basic principle of cellular function, which is reflected in the interactions among ion transport and drug efflux genes that have arisen during cellular evolution. Thus, upon exposure to pore-forming antibiotics such as amphotericin B (AmB) or daptomycin (Dap), sensitive cells overexpress common resistance genes to protect themselves from added osmotic challenges. These genes share pathway interactions with the various types of multidrug resistance (MDR) transporter genes, which both preserve the native lipid membrane composition and at the same time eliminate disruptive hydrophobic molecules that partition excessively within the lipid bilayer. An increased understanding of the relationships between the genes (and their products) that regulate osmotic stress responses and MDR transporters will help to identify novel strategies and targets to overcome the current stalemate in drug discovery.

  10. Proposal and Evaluation of BLE Discovery Process Based on New Features of Bluetooth 5.0

    Directory of Open Access Journals (Sweden)

    Ángela Hernández-Solana

    2017-08-01

    The device discovery process is one of the most crucial aspects in real deployments of sensor networks. Recently, several works have analyzed the topic of Bluetooth Low Energy (BLE) device discovery through analytical or simulation models limited to version 4.x. Non-connectable and non-scannable undirected advertising has been shown to be a reliable alternative for discovering a high number of devices in a relatively short time period. However, new features of Bluetooth 5.0 allow us to define a variant on the device discovery process, based on BLE scannable undirected advertising events, which results in higher discovering capacities and also lower power consumption. In order to characterize this new device discovery process, we experimentally model the real device behavior of BLE scannable undirected advertising events. Non-detection packet probability, discovery probability, and discovery latency for a varying number of devices and parameters are compared by simulations and experimental measurements. We demonstrate that our proposal outperforms previous works, diminishing the discovery time and increasing the potential user device density. A mathematical model is also developed in order to easily obtain a measure of the potential capacity in high density scenarios.

  11. Proposal and Evaluation of BLE Discovery Process Based on New Features of Bluetooth 5.0.

    Science.gov (United States)

    Hernández-Solana, Ángela; Perez-Diaz-de-Cerio, David; Valdovinos, Antonio; Valenzuela, Jose Luis

    2017-08-30

    The device discovery process is one of the most crucial aspects in real deployments of sensor networks. Recently, several works have analyzed the topic of Bluetooth Low Energy (BLE) device discovery through analytical or simulation models limited to version 4.x. Non-connectable and non-scannable undirected advertising has been shown to be a reliable alternative for discovering a high number of devices in a relatively short time period. However, new features of Bluetooth 5.0 allow us to define a variant on the device discovery process, based on BLE scannable undirected advertising events, which results in higher discovering capacities and also lower power consumption. In order to characterize this new device discovery process, we experimentally model the real device behavior of BLE scannable undirected advertising events. Non-detection packet probability, discovery probability, and discovery latency for a varying number of devices and parameters are compared by simulations and experimental measurements. We demonstrate that our proposal outperforms previous works, diminishing the discovery time and increasing the potential user device density. A mathematical model is also developed in order to easily obtain a measure of the potential capacity in high density scenarios.

  12. PaGenBase: a pattern gene database for the global and dynamic understanding of gene function.

    Directory of Open Access Journals (Sweden)

    Jian-Bo Pan

    Pattern genes are a group of genes that have a modularized expression behavior under serial physiological conditions. The identification of pattern genes will provide a path toward a global and dynamic understanding of gene functions and their roles in particular biological processes or events, such as development and pathogenesis. In this study, we present PaGenBase, a novel repository for the collection of tissue- and time-specific pattern genes, including specific genes, selective genes, housekeeping genes and repressed genes. The PaGenBase database is now freely accessible at http://bioinf.xmu.edu.cn/PaGenBase/. In the current version (PaGenBase 1.0), the database contains 906,599 pattern genes derived from the literature or from data mining of more than 1,145,277 gene expression profiles in 1,062 distinct samples collected from 11 model organisms. Four statistical parameters were used to quantitatively evaluate the pattern genes. Moreover, three methods (quick search, advanced search and browse) were designed for rapid and customized data retrieval. The potential applications of PaGenBase are also briefly described. In summary, PaGenBase will serve as a resource for the global and dynamic understanding of gene function and will facilitate high-level investigations in a variety of fields, including the study of development, pathogenesis and novel drug discovery.

  13. De novo assembly and characterization of the transcriptome of broomcorn millet (Panicum miliaceum L.) for gene discovery and marker development

    Directory of Open Access Journals (Sweden)

    Hong Yue

    2016-07-01

    Broomcorn millet (Panicum miliaceum L.) is one of the world’s oldest cultivated cereals, which is well adapted to extreme environments such as drought, heat and salinity with an efficient C4 carbon fixation. Discovery and identification of genes involved in these processes will provide valuable information to improve the crop for meeting the challenge of global climate change. However, the lack of genetic resources and genomic information makes gene discovery and molecular mechanism studies very difficult. Here, we sequenced and assembled the transcriptome of broomcorn millet using Illumina sequencing technology. After sequencing, a total of 45,406,730 and 51,160,820 clean paired-end reads were obtained for the two genotypes Yumi No.2 and Yumi No.3, respectively. These reads were mixed and then assembled into 113,643 unigenes, with lengths ranging from 351 to 15,691 bp, of which 62,543 contigs could be assigned to 315 gene ontology (GO) categories. Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses assigned 51,020 unigenes to 25 COG categories and mapped 15,514 unigenes into 202 KEGG pathways, respectively. Furthermore, 35,216 simple sequence repeats (SSRs) were identified in 27,055 unigene sequences, of which trinucleotides were the most abundant repeat unit, accounting for 66.72% of SSRs. In addition, 292 differentially expressed genes (DEGs) were identified between the two genotypes, which were significantly enriched in 88 GO terms and 12 KEGG pathways. Finally, the expression patterns of 4 selected transcripts were validated through quantitative reverse transcription PCR (qRT-PCR) analysis. Our study for the first time sequenced and assembled the transcriptome of broomcorn millet, which not only provides a rich sequence resource for gene discovery and marker development in this important crop, but will also facilitate further investigation of the molecular mechanism of its favored agronomic traits and beyond.
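
    The abstract does not say which tool was used to identify SSRs, so the following is only a minimal sketch of the general idea: scanning a sequence for perfect tandem repeats of a short unit with a regular-expression backreference. The function name, the default unit length of 3 (trinucleotides) and the minimum of 5 repeats are assumptions made for illustration.

```python
import re


def find_ssrs(sequence, unit_len=3, min_repeats=5):
    """Return a list of (start, unit, repeat_count) for perfect tandem repeats.

    `unit_len` is the repeat-unit length in bases and `min_repeats` the
    minimum number of consecutive copies; both defaults are hypothetical.
    """
    # ([ACGT]{n}) captures one unit; \1{m,} requires at least m more copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    results = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        # A unit such as AAA is really a mononucleotide run, so skip it here.
        if unit_len > 1 and len(set(unit)) == 1:
            continue
        results.append((match.start(), unit, len(match.group(0)) // unit_len))
    return results
```

    Because the scan runs left to right, the reported unit is whichever rotation of the repeat starts first in the sequence; dedicated SSR tools usually normalize units and also handle imperfect and compound repeats.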

  14. Discovery of Novel Proline-Based Neuropeptide FF Receptor Antagonists.

    Science.gov (United States)

    Nguyen, Thuy; Decker, Ann M; Langston, Tiffany L; Mathews, Kelly M; Siemian, Justin N; Li, Jun-Xu; Harris, Danni L; Runyon, Scott P; Zhang, Yanan

    2017-10-18

    The neuropeptide FF (NPFF) system has been implicated in a number of physiological processes including modulating the pharmacological activity of opioid analgesics and several other classes of drugs of abuse. In this study, we report the discovery of a novel proline scaffold with antagonistic activity at the NPFF receptors through a high throughput screening campaign using a functional calcium mobilization assay. Focused structure-activity relationship studies on the initial hit 1 have resulted in several analogs with calcium mobilization potencies in the submicromolar range and modest selectivity for the NPFF1 receptor. Affinities and potencies of these compounds were confirmed in radioligand binding and functional cAMP assays. Two compounds, 16 and 33, had good solubility and blood-brain barrier permeability that fall within the range of CNS permeant candidates without the liability of being a P-glycoprotein substrate. Finally, both compounds reversed fentanyl-induced hyperalgesia in rats when administered intraperitoneally. Together, these results point to the potential of these proline analogs as promising NPFF receptor antagonists.

  15. Population-based structural variation discovery with Hydra-Multi.

    Science.gov (United States)

    Lindberg, Michael R; Hall, Ira M; Quinlan, Aaron R

    2015-04-15

    Current strategies for SNP and INDEL discovery incorporate sequence alignments from multiple individuals to maximize sensitivity and specificity. It is widely accepted that this approach also improves structural variant (SV) detection. However, multisample SV analysis has been stymied by the fundamental difficulties of SV calling, e.g. library insert size variability, SV alignment signal integration and detecting long-range genomic rearrangements involving disjoint loci. Extant tools suffer from poor scalability, which limits the number of genomes that can be co-analyzed and complicates analysis workflows. We have developed an approach that enables multisample SV analysis in hundreds to thousands of human genomes using commodity hardware. Here, we describe Hydra-Multi and measure its accuracy, speed and scalability using publicly available datasets provided by The 1000 Genomes Project and by The Cancer Genome Atlas (TCGA). Hydra-Multi is written in C++ and is freely available at https://github.com/arq5x/Hydra. Contact: aaronquinlan@gmail.com or ihall@genome.wustl.edu. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.

  16. Plant gravitropic signal transduction: A network analysis leads to gene discovery

    Science.gov (United States)

    Wyatt, Sarah

    Gravity plays a fundamental role in plant growth and development. Although a significant body of research has helped define the events of gravity perception, the role of the plant growth regulator auxin, and the mechanisms resulting in the gravity response, the events of signal transduction, those that link the biophysical action of perception to a biochemical signal that results in auxin redistribution, those that regulate the gravitropic effects on plant growth, remain, for the most part, a “black box.” Using a cold effect, dubbed the gravity persistent signal (GPS) response, we developed a mutant screen to specifically identify components of the signal transduction pathway. Cloning of the GPS genes has identified new proteins involved in gravitropic signaling. We have further exploited the GPS response using a multi-faceted approach including gene expression microarrays, proteomics analysis, bioinformatics analysis and continued mutant analysis to identify additional genes and physiological and biochemical processes. Gene expression data provided the foundation of a regulatory network for gravitropic signaling. Based on these gene expression data and related data sets and information from the literature and repositories, we constructed a gravitropic signaling network for Arabidopsis inflorescence stems. To generate the network, both a dynamic Bayesian network approach and a time-lagged correlation coefficient approach were used. The dynamic Bayesian network added existing information on protein-protein interaction, while the time-lagged correlation coefficient allowed incorporation of temporal regulation and thus could incorporate the time-course metric from the data set. Thus the methods complemented each other and provided us with a more comprehensive evaluation of connections. Each method generated a list of possible interactions associated with a statistical significance value. The two networks were then overlaid to generate a more rigorous, intersected
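
    The time-lagged correlation coefficient mentioned above can be sketched as a Pearson correlation between one gene's expression at time t and another gene's expression at time t + lag. The abstract gives no formula, so this is a generic formulation under that assumption; the function names and the choice to pick the lag with the largest absolute correlation are illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(regulator, target, lag):
    """Correlate regulator[t] with target[t + lag] over the overlapping window."""
    if len(regulator) != len(target):
        raise ValueError("time series must have the same length")
    if lag < 0 or len(regulator) - lag < 2:
        raise ValueError("lag leaves fewer than two overlapping time points")
    if lag == 0:
        return pearson(regulator, target)
    return pearson(regulator[:-lag], target[lag:])


def best_lag(regulator, target, max_lag):
    """Return (lag, scores), where lag maximizes the absolute correlation."""
    scores = {lag: lagged_correlation(regulator, target, lag)
              for lag in range(max_lag + 1)}
    return max(scores, key=lambda lag: abs(scores[lag])), scores
```

    A strong correlation at a positive lag is consistent with the first gene acting upstream of the second, but it does not establish regulation on its own, which is presumably why the authors intersect it with a second, independent method.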

  17. Human transporter database: comprehensive knowledge and discovery tools in the human transporter genes.

    Directory of Open Access Journals (Sweden)

    Adam Y Ye

    Full Text Available Transporters are essential in homeostatic exchange of endogenous and exogenous substances at the systematic, organic, cellular, and subcellular levels. Gene mutations of transporters are often related to pharmacogenetic traits. Recent developments in high-throughput technologies in genomics, transcriptomics and proteomics allow in-depth studies of transporter genes in normal cellular processes and diverse disease conditions. The flood of high-throughput data has resulted in an urgent need for an updated knowledgebase with curated, organized, and annotated human transporters in an easily accessible form. Using a pipeline combining automated keyword queries, sequence similarity search and manual curation of transporters, we collected 1,555 non-redundant human transporter genes to develop the Human Transporter Database (HTD) (http://htd.cbi.pku.edu.cn). Based on the extensive annotations, global properties of the transporter genes were illustrated, such as expression patterns and polymorphisms in relation to their ligands. We noted that the human transporters were enriched in many fundamental biological processes such as oxidative phosphorylation and cardiac muscle contraction, and significantly associated with Mendelian and complex diseases such as epilepsy and sudden infant death syndrome. Overall, HTD provides a well-organized interface that helps research communities search detailed molecular and genetic information on transporters for the development of personalized medicine.

  18. Augmented Reality-Based Simulators as Discovery Learning Tools: An Empirical Study

    Science.gov (United States)

    Ibáñez, María-Blanca; Di-Serio, Ángela; Villarán-Molina, Diego; Delgado-Kloos, Carlos

    2015-01-01

    This paper reports empirical evidence on having students use AR-SaBEr, a simulation tool based on augmented reality (AR), to discover the basic principles of electricity through a series of experiments. AR-SaBEr was enhanced with knowledge-based support and inquiry-based scaffolding mechanisms, which proved useful for discovery learning in…

  19. Discovery of pyridine-based agrochemicals by using Intermediate Derivatization Methods.

    Science.gov (United States)

    Guan, Ai-Ying; Liu, Chang-Ling; Sun, Xu-Feng; Xie, Yong; Wang, Ming-An

    2016-02-01

    Pyridine-based compounds have been playing a crucial role as agrochemicals or pesticides including fungicides, insecticides/acaricides and herbicides, etc. Since most of the agrochemicals listed in the Pesticide Manual were discovered through screening programs that relied on trial-and-error testing and new agrochemical discovery is not benefiting as much from the in silico new chemical compound identification/discovery techniques used in pharmaceutical research, it has become more important to find new methods to enhance the efficiency of discovering novel lead compounds in the agrochemical field to shorten the time of research phases in order to meet changing market requirements. In this review, we selected 18 representative known agrochemicals containing a pyridine moiety and extrapolate their discovery from the perspective of Intermediate Derivatization Methods in the hope that this approach will have greater appeal to researchers engaged in the discovery of agrochemicals and/or pharmaceuticals. Copyright © 2015 Elsevier Ltd. All rights reserved.

  20. Validating fragment-based drug discovery for biological RNAs: lead fragments bind and remodel the TPP riboswitch specifically.

    Science.gov (United States)

    Warner, Katherine Deigan; Homan, Philip; Weeks, Kevin M; Smith, Alison G; Abell, Chris; Ferré-D'Amaré, Adrian R

    2014-05-22

    Thiamine pyrophosphate (TPP) riboswitches regulate essential genes in bacteria by changing conformation upon binding intracellular TPP. Previous studies using fragment-based approaches identified small molecule "fragments" that bind this gene-regulatory mRNA domain. Crystallographic studies now show that, despite having micromolar Kds, four different fragments bind the TPP riboswitch site-specifically, occupying the pocket that recognizes the aminopyrimidine of TPP. Unexpectedly, the unoccupied site that would recognize the pyrophosphate of TPP rearranges into a structure distinct from that of the cognate complex. This idiosyncratic fragment-induced conformation, also characterized by small-angle X-ray scattering and chemical probing, represents a possible mechanism for adventitious ligand discrimination by the riboswitch, and suggests that off-pathway conformations of RNAs can be targeted for drug development. Our structures, together with previous screening studies, demonstrate the feasibility of fragment-based drug discovery against RNA targets. Copyright © 2014 Elsevier Ltd. All rights reserved.

  1. Using Concepts in Literature-based Discovery: Simulating Swanson's Raynaud-Fish Oil and Migraine-Magnesium Discoveries.

    Science.gov (United States)

    Weeber, Marc; Klein, Henny; de Jong-van den Berg, Lolkje T. W.; Vos, Rein

    2001-01-01

    Proposes a two-step model of discovery in which new scientific hypotheses can be generated and subsequently tested. Applying advanced natural language processing techniques to find biomedical concepts in text, the model is implemented in a versatile interactive discovery support tool. This tool is used to successfully simulate Don R. Swanson's…

  2. Discovery of CTCF-sensitive Cis-spliced fusion RNAs between adjacent genes in human prostate cells.

    Science.gov (United States)

    Qin, Fujun; Song, Zhenguo; Babiceanu, Mihaela; Song, Yansu; Facemire, Loryn; Singh, Ritambhara; Adli, Mazhar; Li, Hui

    2015-02-01

    Genes or their encoded products are not expected to mingle with each other unless in some disease situations. In cancer, a frequent mechanism that can produce gene fusions is chromosomal rearrangement. However, recent discoveries of RNA trans-splicing and cis-splicing between adjacent genes (cis-SAGe) support other mechanisms for generating fusion RNAs. In our transcriptome analyses of 28 prostate normal and cancer samples, on average 30% of fusion RNAs are transcripts that contain exons belonging to same-strand neighboring genes. These fusion RNAs may be the products of cis-SAGe, which was previously thought to be rare. To validate this finding and to better understand the phenomenon, we used LNCaP, a prostate cell line, as a model, and identified 16 additional cis-SAGe events by silencing transcription factor CTCF and paired-end RNA sequencing. About half of the fusions are expressed at a significant level compared to their parental genes. Silencing one of the in-frame fusions resulted in reduced cell motility. Most out-of-frame fusions are likely to function as non-coding RNAs. The majority of the 16 fusions are also detected in other prostate cell lines, as well as in the 14 clinical prostate normal and cancer pairs. By studying the features associated with these fusions, we developed a set of rules: 1) the parental genes are same-strand-neighboring genes; 2) the distance between the genes is within 30kb; 3) the 5' genes are actively transcribing; and 4) the chimeras tend to have the second-to-last exon in the 5' genes joined to the second exon in the 3' genes. We then randomly selected 20 neighboring genes in the genome, and detected four fusion events using these rules in prostate cancer and non-cancerous cells. These results suggest that splicing between neighboring gene transcripts is a rather frequent phenomenon, and it is not a feature unique to cancer cells.
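
    As an illustration only, rules 1) to 3) above could be applied as a simple filter over annotated gene pairs. The sketch below is not the authors' pipeline: the `Gene` record, the `expressed` flag and the function name are hypothetical, and rule 4) (the preferred exon junction) is left out because it describes the chimera rather than the gene pair.

```python
from dataclasses import dataclass

MAX_GAP_BP = 30_000  # rule 2: distance between the two genes within 30 kb


@dataclass(frozen=True)
class Gene:
    """Hypothetical minimal gene record used only for this sketch."""
    name: str
    chrom: str
    strand: str      # "+" or "-"
    start: int       # lowest genomic coordinate of the gene
    end: int         # highest genomic coordinate of the gene
    expressed: bool  # assumed flag: is the gene actively transcribed?


def is_cis_sage_candidate(five_prime: Gene, three_prime: Gene) -> bool:
    """Return True if the pair satisfies rules 1-3 from the abstract."""
    # Rule 1: same chromosome and same strand.
    if five_prime.chrom != three_prime.chrom:
        return False
    if five_prime.strand != three_prime.strand:
        return False
    # Rule 2: intergenic gap, measured in the direction of transcription.
    if five_prime.strand == "+":
        gap = three_prime.start - five_prime.end
    else:
        gap = five_prime.start - three_prime.end
    if not 0 <= gap <= MAX_GAP_BP:
        return False
    # Rule 3: the upstream (5') gene must be actively transcribed.
    return five_prime.expressed
```

    A pair that passes this filter is only a candidate; in the study, fusion events were detected experimentally.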

  3. Text mining-based in silico drug discovery in oral mucositis caused by high-dose cancer therapy.

    Science.gov (United States)

    Kirk, Jon; Shah, Nirav; Noll, Braxton; Stevens, Craig B; Lawler, Marshall; Mougeot, Farah B; Mougeot, Jean-Luc C

    2018-02-23

    Oral mucositis (OM) is a major dose-limiting side effect of chemotherapy and radiation used in cancer treatment. Due to the complex nature of OM, currently available drug-based treatments are of limited efficacy. Our objectives were (i) to determine genes and molecular pathways associated with OM and wound healing using computational tools and publicly available data and (ii) to identify drugs formulated for topical use targeting the relevant OM molecular pathways. OM and wound healing-associated genes were determined by text mining, and the intersection of the two gene sets was selected for gene ontology analysis using the GeneCodis program. Protein interaction network analysis was performed using STRING-db. Enriched gene sets belonging to the identified pathways were queried against the Drug-Gene Interaction database to find drug candidates for topical use in OM. Our analysis identified 447 genes common to both the "OM" and "wound healing" text mining concepts. Gene enrichment analysis yielded 20 genes representing six pathways and targetable by a total of 32 drugs which could possibly be formulated for topical application. A manual search on ClinicalTrials.gov confirmed no relevant pathway/drug candidate had been overlooked. Twenty-five of the 32 drugs can directly affect the PTGS2 (COX-2) pathway, the pathway that has been targeted in previous clinical trials with limited success. Drug discovery using in silico text mining and pathway analysis tools can facilitate the identification of existing drugs that have the potential of topical administration to improve OM treatment.
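
    The intersection step described above (genes returned for both text-mining concepts) amounts to a set operation. A minimal sketch follows, assuming gene symbols are compared case-insensitively; the function name is hypothetical and the symbols in the usage note are placeholders, not the study's gene lists.

```python
def shared_genes(genes_a, genes_b):
    """Return the sorted gene symbols present in both collections.

    Symbols are upper-cased first, which is an assumption of this
    sketch rather than something stated in the abstract.
    """
    normalized_a = {symbol.strip().upper() for symbol in genes_a}
    normalized_b = {symbol.strip().upper() for symbol in genes_b}
    return sorted(normalized_a & normalized_b)
```

    For example, `shared_genes(["gene1", "GENE2"], ["GENE1", "gene3"])` returns `["GENE1"]`; the resulting list would then be passed on to enrichment and network analysis.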

  4. Helping Students Understand Gene Regulation with Online Tools: A Review of MEME and Melina II, Motif Discovery Tools for Active Learning in Biology

    Directory of Open Access Journals (Sweden)

    David Treves

    2012-08-01

    Full Text Available Review of: MEME and Melina II, which are two free and easy-to-use online motif discovery tools that can be employed to actively engage students in learning about gene regulatory elements.

  5. Kerfdr: a semi-parametric kernel-based approach to local false discovery rate estimation

    Directory of Open Access Journals (Sweden)

    Robin Stephane

    2009-03-01

    Full Text Available Background: The use of current high-throughput genetic, genomic and post-genomic data leads to the simultaneous evaluation of a large number of statistical hypotheses and, at the same time, to the multiple-testing problem. As an alternative to the overly conservative Family-Wise Error Rate (FWER), the False Discovery Rate (FDR) has emerged over the last ten years as more appropriate for handling this problem. However, one drawback of the FDR is that it is tied to a given rejection region for the considered statistics, attributing the same value to those that are close to the boundary and those that are not. As a result, the local FDR has recently been proposed to quantify the specific probability for a given null hypothesis to be true. Results: In this context we present a semi-parametric approach based on kernel estimators, which is applied to different high-throughput biological data such as patterns in DNA sequences, gene expression and genome-wide association studies. Conclusion: The proposed method has the practical advantages over existing approaches of considering complex heterogeneities in the alternative hypothesis, taking into account prior information (from expert judgment or previous studies) by allowing a semi-supervised mode, and dealing with truncated distributions such as those obtained in Monte-Carlo simulations. This method has been implemented and is available through the R package kerfdr via CRAN or at http://stat.genopole.cnrs.fr/software/kerfdr.
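
    To make the local FDR concrete: under a two-component mixture f(x) = pi0 * f0(x) + (1 - pi0) * f1(x), the local FDR at x is pi0 * f0(x) / f(x). The sketch below is not the kerfdr implementation. It assumes a standard normal null density, a fixed user-supplied pi0, and a Gaussian kernel estimate of f1 weighted by the current posterior probabilities; all function and parameter names are hypothetical.

```python
import math


def norm_pdf(x, mean=0.0, sd=1.0):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def local_fdr(xs, pi0, bandwidth=0.5, n_iter=30):
    """Return one local FDR value per observation in xs.

    Assumptions of this sketch: the null density f0 is N(0, 1), pi0 is
    known and kept fixed, and f1 is a weighted Gaussian kernel estimate.
    """
    n = len(xs)
    f0 = [norm_pdf(x) for x in xs]
    # tau[i]: current posterior probability that observation i is non-null.
    tau = [1.0 - pi0] * n
    for _ in range(n_iter):
        total = sum(tau)
        # Kernel estimate of the alternative density, weighted by tau.
        f1 = [
            sum(t * norm_pdf(x, xj, bandwidth) for t, xj in zip(tau, xs)) / total
            for x in xs
        ]
        # Posterior update from the mixture model.
        tau = [
            (1.0 - pi0) * b / (pi0 * a + (1.0 - pi0) * b)
            for a, b in zip(f0, f1)
        ]
    return [1.0 - t for t in tau]
```

    Observations near the centre of the null distribution receive a local FDR close to 1, while observations far in the tail receive values close to 0, which is the per-observation distinction that a single FDR rejection region cannot make.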

  6. EZH2: biology, disease, and structure-based drug discovery

    Science.gov (United States)

    Tan, Jin-zhi; Yan, Yan; Wang, Xiao-xi; Jiang, Yi; Xu, H Eric

    2014-01-01

    EZH2 is the catalytic subunit of the polycomb repressive complex 2 (PRC2), which is a highly conserved histone methyltransferase that methylates lysine 27 of histone 3. Overexpression of EZH2 has been found in a wide range of cancers, including those of the prostate and breast. In this review, we address the current understanding of the oncogenic role of EZH2, including its PRC2-dependent transcriptional repression and PRC2-independent gene activation. We also discuss the connections between EZH2 and other silencing enzymes, such as DNA methyltransferase and histone deacetylase. We comprehensively address the architecture of the PRC2 complex and the crucial roles of each subunit. Finally, we summarize new progress in developing EZH2 inhibitors, which could be a new epigenetic therapy for cancers. PMID:24362326

  7. Rediscovering Don Swanson: The Past, Present and Future of Literature-based Discovery

    Directory of Open Access Journals (Sweden)

    Neil R. Smalheiser

    2017-12-01

    Full Text Available Purpose: The late Don R. Swanson was well appreciated during his lifetime as Dean of the Graduate Library School at University of Chicago, as winner of the American Society for Information Science Award of Merit for 2000, and as author of many seminal articles. In this informal essay, I will give my personal perspective on Don’s contributions to science, and outline some current and future directions in literature-based discovery that are rooted in concepts that he developed. Design/methodology/approach: Personal recollections and literature review. Findings: The Swanson A-B-C model of literature-based discovery has been successfully used by laboratory investigators analyzing their findings and hypotheses. It continues to be a fertile area of research in a wide range of application areas including text mining, drug repurposing, studies of scientific innovation, knowledge discovery in databases, and bioinformatics. Recently, additional modes of discovery that do not follow the A-B-C model have also been proposed and explored (e.g. so-called storytelling, gaps, analogies, link prediction, negative consensus, outliers, and revival of neglected or discarded research questions). Research limitations: This paper reflects the opinions of the author and is not a comprehensive or technically based review of literature-based discovery. Practical implications: The general scientific public is still not aware of the availability of tools for literature-based discovery. Our Arrowsmith project site maintains a suite of discovery tools that are free and open to the public (http://arrowsmith.psych.uic.edu), as does BITOLA, which is maintained by Dmitar Hristovski (http://ibmi.mf.uni-lj.si/bitola), and Epiphanet, which is maintained by Trevor Cohen (http://epiphanet.uth.tmc.edu/). Bringing user-friendly tools to the public should be a high priority, since even more than advancing basic research in informatics, it is vital that we ensure that scientists

  8. Simulating multiplexed SNP discovery rates using base-specific cleavage and mass spectrometry.

    Science.gov (United States)

    Böcker, Sebastian

    2007-01-15

    Single Nucleotide Polymorphisms (SNPs) are believed to contribute strongly to the genetic variability in living beings, and SNP and mutation discovery are of great interest in today's Life Sciences. A comparatively new method to discover such polymorphisms is based on base-specific cleavage, where resulting cleavage products are analyzed by mass spectrometry (MS). One particular advantage of this method is the possibility of multiplexing the biochemical reactions, i.e. examining multiple genomic regions in parallel. Simulations can help estimate the performance of a method for polymorphism discovery, and allow us to evaluate the influence of method parameters on the discovery rate, and also to investigate whether the method is well suited for a certain genomic region. We show how to efficiently conduct such simulations for polymorphism discovery using base-specific cleavage and MS. Simulating multiplexed polymorphism discovery leads us to the problem of uniformly drawing a multiplex: given a multiset of natural numbers, we want to uniformly draw a subset of fixed cardinality so that the elements sum up to some fixed total length. We show how to enumerate multiplex layouts using dynamic programming, which allows us to uniformly draw a multiplex.
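
    The counting-then-sampling idea can be sketched as follows. This is an illustrative reading of the problem statement, not the paper's algorithm: it assumes that elements of the multiset are distinguished by their position (so two regions of equal length count as different choices), and the function names are hypothetical. A table counts, for every prefix of the list, how many ways there are to pick a given number of elements with a given sum; walking the table backwards with proportional coin flips then yields a uniformly distributed subset.

```python
import random


def count_table(lengths, k, total):
    """ways[i][c][s]: number of ways to pick c of the first i items with sum s."""
    n = len(lengths)
    ways = [[[0] * (total + 1) for _ in range(k + 1)] for _ in range(n + 1)]
    ways[0][0][0] = 1
    for i, length in enumerate(lengths, start=1):
        for c in range(k + 1):
            for s in range(total + 1):
                w = ways[i - 1][c][s]  # item i is skipped
                if c > 0 and s >= length:
                    w += ways[i - 1][c - 1][s - length]  # item i is taken
                ways[i][c][s] = w
    return ways


def draw_multiplex(lengths, k, total, rng=random):
    """Uniformly draw k indices whose lengths sum to total, or None if impossible."""
    ways = count_table(lengths, k, total)
    if ways[len(lengths)][k][total] == 0:
        return None
    chosen = []
    c, s = k, total
    for i in range(len(lengths), 0, -1):
        length = lengths[i - 1]
        with_item = ways[i - 1][c - 1][s - length] if c > 0 and s >= length else 0
        # Take item i with probability (solutions using it) / (all solutions).
        if rng.randrange(ways[i][c][s]) < with_item:
            chosen.append(i - 1)
            c -= 1
            s -= length
    return sorted(chosen)
```

    With `lengths = [2, 3, 5, 5, 7]`, `k = 2` and `total = 10` there are exactly two valid layouts, indices `[1, 4]` and `[2, 3]`, and each is returned with probability 1/2. The table needs on the order of n * k * total entries, which is the cost of exact uniform sampling in this sketch.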

  9. A P2P Service Discovery Strategy Based on Content Catalogues

    Directory of Open Access Journals (Sweden)

    Lican Huang

    2007-08-01

    Full Text Available This paper presents a framework for distributed service discovery based on VIRGO P2P technologies. The services are classified as multi-layer, hierarchical catalogue domains according to their contents. The service providers, which have their own service registries such as UDDIs, register the services they provide and establish a virtual tree in a VIRGO network according to the domain of their service. The service location done by the proposed strategy is effective and guaranteed. This paper also discusses the primary implementation of service discovery based on Tomcat/Axis and jUDDI.

  10. Structure-guided, target-based drug discovery - exploiting genome information from HIV to mycobacterial infections.

    Science.gov (United States)

    Malhotra, Sony; Thomas, Sherine E; Ochoa Montano, Bernardo; Blundell, Tom L

    The use of protein crystallography in structure-guided drug discovery allows identification of potential inhibitor-binding sites and optimisation of interactions of hits and lead compounds with a target protein. An early example of this approach was the use of the structure of HIV protease in designing AIDS antivirals. More recently, use of structure-guided design with fragment-based drug discovery, which reduces the size of screening libraries by decreasing complexity, has improved ligand efficiency in drug design. Here, we discuss the use of structure-guided target identification and lead optimisation using fragment-based approaches in the development of new antimicrobials for mycobacterial infections.

  11. Genome wide prediction of protein function via a generic knowledge discovery approach based on evidence integration

    Directory of Open Access Journals (Sweden)

    Li Yinghui

    2006-05-01

    Full Text Available Background: The automation of many common molecular biology techniques has resulted in the accumulation of vast quantities of experimental data. One of the major challenges now facing researchers is how to process this data to yield useful information about a biological system (e.g. knowledge of genes and their products, and the biological roles of proteins, their molecular functions, localizations and interaction networks). We present a technique called Global Mapping of Unknown Proteins (GMUP) which uses the Gene Ontology Index to relate diverse sources of experimental data by creation of an abstraction layer of evidence data. This abstraction layer is used as input to a neural network which, once trained, can be used to predict function from the evidence data of unannotated proteins. The method allows us to include almost any experimental data set related to protein function, which incorporates the Gene Ontology, to our evidence data in order to seek relationships between the different sets. Results: We have demonstrated the capabilities of this method in two ways. We first collected various experimental datasets associated with yeast (Saccharomyces cerevisiae) and applied the technique to a set of previously annotated open reading frames (ORFs). These ORFs were divided into training and test sets and were used to examine the accuracy of the predictions made by our method. Then we applied GMUP to previously un-annotated ORFs and made 1980, 836 and 1969 predictions corresponding to the GO Biological Process, Molecular Function and Cellular Component sub-categories respectively. We found that GMUP was particularly successful at predicting ORFs with functions associated with the ribonucleoprotein complex, protein metabolism and transportation. Conclusion: This study presents a global and generic gene knowledge discovery approach based on evidence integration of various genome-scale data. It can be used to provide insight as to how certain

  12. IMG-ABC: An Atlas of Biosynthetic Gene Clusters to Fuel the Discovery of Novel Secondary Metabolites

    Energy Technology Data Exchange (ETDEWEB)

    Chen, I-Min; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Huang, Jinghua; Reddy, T. B.K.; Cimermancic, Peter; Fischbach, Michael; Ivanova, Natalia; Markowitz, Victor; Kyrpides, Nikos; Pati, Amrita

    2014-10-28

    In the discovery of secondary metabolites (SMs), large-scale analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of relevant computational resources. We present IMG-ABC (https://img.jgi.doe.gov/abc/) -- An Atlas of Biosynthetic gene Clusters within the Integrated Microbial Genomes (IMG) system. IMG-ABC is a rich repository of both validated and predicted biosynthetic clusters (BCs) in cultured isolates, single cells and metagenomes linked with the SM chemicals they produce and enhanced with focused analysis tools within IMG. The underlying scalable framework enables traversal of phylogenetic dark matter and chemical structure space -- serving as a doorway to a new era in the discovery of novel molecules.

  13. Cultivation of hard-to-culture subsurface mercury-resistant bacteria and discovery of new merA gene sequences

    DEFF Research Database (Denmark)

    Rasmussen, L D; Zawadsky, C; Binnerup, S J

    2008-01-01

    Mercury-resistant bacteria may be important players in mercury biogeochemistry. To assess the potential for mercury reduction by two subsurface microbial communities, resistant subpopulations and their merA genes were characterized by a combined molecular and cultivation-dependent approach... different 16S rRNA gene sequences were observed, including Alpha-, Beta-, and Gammaproteobacteria; Actinobacteria; Firmicutes; and Bacteroidetes. The diversity of isolates obtained by direct plating included eight different 16S rRNA gene sequences (Alpha- and Betaproteobacteria and Actinobacteria). Partial sequencing of merA of selected isolates led to the discovery of new merA sequences. With phylum-specific merA primers, PCR products were obtained for Alpha- and Betaproteobacteria and Actinobacteria but not for Bacteroidetes and Firmicutes. The similarity to known sequences ranged between 89 and 95%. One...

  14. The long (and winding) road to gene discovery for canine hip dysplasia.

    Science.gov (United States)

    Zhu, Lan; Zhang, Zhiwu; Friedenberg, Steven; Jung, Seung-Woo; Phavaphutanon, Janjira; Vernier-Singer, Margaret; Corey, Elizabeth; Mateescu, Raluca; Dykes, Nathan; Sandler, Jody; Acland, Gregory; Lust, George; Todhunter, Rory

    2009-08-01

    Hip dysplasia is a common inherited trait of dogs that results in secondary osteoarthritis. In this article the methods used to uncover the mutations contributing to this condition are reviewed, beginning with hip phenotyping. Coarse, genome-wide, microsatellite-based screens of pedigrees of greyhounds and dysplastic Labrador retrievers were used to identify linked quantitative trait loci (QTL). Fine-mapping across two chromosomes (CFA11 and 29) was employed using single nucleotide polymorphism (SNP) genotyping. Power analyses and preferential selection of dogs for ongoing SNP-based genotyping are described, with the aim of refining the QTL intervals to 1-2 megabases on these and several additional chromosomes prior to candidate gene screening. The review considers how a mutation or a genetic marker such as a SNP or haplotype of SNPs might be combined with pedigree and phenotype information to create a 'breeding value' that could improve the accuracy of predicting a dog's hip conformation.

  15. Carbohydrate-based vaccine adjuvants - discovery and development.

    Science.gov (United States)

    Hu, Jing; Qiu, Liying; Wang, Xiaoli; Zou, Xiaopeng; Lu, Mengji; Yin, Jian

    2015-10-01

    The addition of a suitable adjuvant to a vaccine can generate significant effective adaptive immune responses. There is an urgent need for the development of novel potent and safe adjuvants for human vaccines. Carbohydrate molecules are promising adjuvants for human vaccines due to their high biocompatibility and good tolerability in vivo. The present review covers a few promising carbohydrate-based adjuvants, with lipopolysaccharide, trehalose-6,6'-dibehenate, QS-21 and inulin as examples, which have been extensively studied in human vaccines in a number of preclinical and clinical studies. The authors discuss the current status, applications and development strategies of each adjuvant and of different adjuvant formulation systems. This information gives insight into the exciting prospects in the field of carbohydrate-based adjuvant research. Carbohydrate-based adjuvants are promising candidates as an alternative to alum salts for human vaccine development. Furthermore, combining two or more adjuvants in one formulation is one of the effective strategies in adjuvant development. However, further research efforts are needed to study and develop novel adjuvant systems, which can be more stable, potent and safe. The development of synthetic carbohydrate chemistry can improve the study of carbohydrate-based adjuvants.

  16. Discovery of technical methanation catalysts based on computational screening

    DEFF Research Database (Denmark)

    Sehested, Jens; Larsen, Kasper Emil; Kustov, Arkadii

    2007-01-01

    Methanation is a classical reaction in heterogeneous catalysis and significant effort has been put into improving the industrially preferred nickel-based catalysts. Recently, a computational screening study showed that nickel-iron alloys should be more active than the pure nickel catalyst...

  17. Antibiotic discovery throughout the Small World Initiative: A molecular strategy to identify biosynthetic gene clusters involved in antagonistic activity.

    Science.gov (United States)

    Davis, Elizabeth; Sloan, Tyler; Aurelius, Krista; Barbour, Angela; Bodey, Elijah; Clark, Brigette; Dennis, Celeste; Drown, Rachel; Fleming, Megan; Humbert, Allison; Glasgo, Elizabeth; Kerns, Trent; Lingro, Kelly; McMillin, MacKenzie; Meyer, Aaron; Pope, Breanna; Stalevicz, April; Steffen, Brittney; Steindl, Austin; Williams, Carolyn; Wimberley, Carmen; Zenas, Robert; Butela, Kristen; Wildschutte, Hans

    2017-06-01

    The emergence of bacterial pathogens resistant to all known antibiotics is a global health crisis. Adding to this problem is that major pharmaceutical companies have shifted away from antibiotic discovery due to low profitability. As a result, the pipeline of new antibiotics is essentially dry and many bacteria now resist the effects of most commonly used drugs. To address this global health concern, citizen science through the Small World Initiative (SWI) was formed in 2012. As part of SWI, students isolate bacteria from their local environments, characterize the strains, and assay for antibiotic production. During the 2015 fall semester at Bowling Green State University, students isolated 77 soil-derived bacteria and genetically characterized strains using the 16S rRNA gene, identified strains exhibiting antagonistic activity, and performed an expanded SWI workflow using transposon mutagenesis to identify a biosynthetic gene cluster involved in toxigenic compound production. We identified one mutant with loss of antagonistic activity and through subsequent whole-genome sequencing and linker-mediated PCR identified a 24.9 kb biosynthetic gene locus likely involved in inhibitory activity in that mutant. Further assessment against human pathogens demonstrated the inhibition of Bacillus cereus, Listeria monocytogenes, and methicillin-resistant Staphylococcus aureus in the presence of this compound, thus supporting our molecular strategy as an effective research pipeline for SWI antibiotic discovery and genetic characterization. © 2017 The Authors. MicrobiologyOpen published by John Wiley & Sons Ltd.

  18. Discovery and design of carbohydrate-based therapeutics.

    Science.gov (United States)

    Cipolla, Laura; Araújo, Ana C; Bini, Davide; Gabrielli, Luca; Russo, Laura; Shaikh, Nasrin

    2010-08-01

    Till now, the importance of carbohydrates has been underscored, if compared with the two other major classes of biopolymers such as oligonucleotides and proteins. Recent advances in glycobiology and glycochemistry have imparted a strong interest in the study of this enormous family of biomolecules. Carbohydrates have been shown to be implicated in recognition processes, such as cell-cell adhesion, cell-extracellular matrix adhesion and cell-intruder recognition phenomena. In addition, carbohydrates are recognized as differentiation markers and as antigenic determinants. Due to their relevant biological role, carbohydrates are promising candidates for drug design and disease treatment. However, the growing number of human disorders known as congenital disorders of glycosylation that are being identified as resulting from abnormalities in glycan structures and protein glycosylation strongly indicates that a fast development of glycobiology, glycochemistry and glycomedicine is highly desirable. The topics give an overview of different approaches that have been used to date for the design of carbohydrate-based therapeutics; this includes the use of native synthetic carbohydrates, the use of carbohydrate mimics designed on the basis of their native counterpart, the use of carbohydrates as scaffolds and finally the design of glyco-fused therapeutics, one of the most recent approaches. The review covers mainly literature that has appeared since 2000, except for a few papers cited for historical reasons. The reader will gain an overview of the current strategies applied to the design of carbohydrate-based therapeutics; in particular, the advantages/disadvantages of different approaches are highlighted. The topic is presented in a general, basic manner and will hopefully be a useful resource for all readers who are not familiar with it. In addition, in order to stress the potentialities of carbohydrates, several examples of carbohydrate-based marketed therapeutics are given

  19. Knowledge discovery based on experiential learning corporate culture management

    Science.gov (United States)

    Tu, Kai-Jan

    2014-10-01

    A good corporate culture based on humanistic theory can make an enterprise's management very effective and give all of its members strong cohesion and centripetal force. With an experiential learning model, the enterprise can establish a corporate culture with an enthusiastic learning spirit, gain the innovation ability needed for positive knowledge growth, and meet fierce global marketing competition. A case study of Trend's corporate culture offers support for the industry knowledge growth rate equation as a contribution to experiential learning corporate culture management.

  20. Solar Radiation Data Base for Nigeria | Chineke | Discovery and ...

    African Journals Online (AJOL)

    Solar Radiation Data Base for Nigeria. T C Chineke, J I Aina, S S Jagtap. http://dx.doi.org/10.4314/dai.v11i3.15556

  1. Discovery of the porcine NGN3 gene and testing its endocrine function in the pig

    Science.gov (United States)

    Neurogenin 3 (NGN3) is a member of the basic helix-loop-helix transcription factor family. NGN3 is both necessary and sufficient to drive endocrine differentiation in the developing pancreas in mouse and humans. Until now, the sequence for NGN3 eluded discovery despite completion of the pig genome a...

  2. Ataxin1L is a regulator of HSC function highlighting the utility of cross-tissue comparisons for gene discovery.

    Directory of Open Access Journals (Sweden)

    Juliette J Kahle

    2013-03-01

    Hematopoietic stem cells (HSCs) are rare quiescent cells that continuously replenish the cellular components of the peripheral blood. Observing that the ataxia-associated gene Ataxin-1-like (Atxn1L) was highly expressed in HSCs, we examined its role in HSC function through in vitro and in vivo assays. Mice lacking Atxn1L had greater numbers of HSCs that regenerated the blood more quickly than their wild-type counterparts. Molecular analyses indicated Atxn1L null HSCs had gene expression changes that regulate a program consistent with their higher level of proliferation, suggesting that Atxn1L is a novel regulator of HSC quiescence. To determine if additional brain-associated genes were candidates for hematologic regulation, we examined genes encoding proteins from autism- and ataxia-associated protein-protein interaction networks for their representation in hematopoietic cell populations. The interactomes were found to be highly enriched for proteins encoded by genes specifically expressed in HSCs relative to their differentiated progeny. Our data suggest a heretofore unappreciated similarity between regulatory modules in the brain and HSCs, offering a new strategy for novel gene discovery in both systems.

  3. Transcriptomics Analysis of Crassostrea hongkongensis for the Discovery of Reproduction-Related Genes.

    Directory of Open Access Journals (Sweden)

    Ying Tong

    The reproductive mechanisms of mollusk species have been interesting targets in biological research because of the diverse reproductive strategies observed in this phylum. These species have also been studied for the development of fishery technologies in molluscan aquaculture. Although the molecular mechanisms underlying the reproductive process have been well studied in animal models, the relevant information from mollusks remains limited, particularly in species of great commercial interest. Crassostrea hongkongensis is the dominant oyster species distributed along the coast of the South China Sea, and little genomic information on this species is available. Currently, high-throughput sequencing techniques have been widely used for investigating the basis of physiological processes and facilitating the establishment of adequate genetic selection programs. The C. hongkongensis transcriptome included a total of 1,595,855 reads, which were generated by 454 sequencing and were assembled into 41,472 contigs using de novo methods. Contigs were clustered into 33,920 isotigs and further grouped into 22,829 isogroups. Approximately 77.6% of the isogroups were successfully annotated by the Nr database. More than 1,910 genes were identified as being related to reproduction. Some key genes involved in germline development, sex determination and differentiation were identified for the first time in C. hongkongensis (nanos, piwi, ATRX, FoxL2, β-catenin, etc.). Gene expression analysis indicated that vasa, nanos, piwi, ATRX, FoxL2, β-catenin and SRD5A1 were highly or specifically expressed in C. hongkongensis gonads. Additionally, 94,056 single nucleotide polymorphisms (SNPs) and 1,699 simple sequence repeats (SSRs) were compiled. Our study significantly increased C. hongkongensis genomic information based on transcriptomics analysis. The group of reproduction-related genes identified in the present study constitutes a new tool for research on bivalve

  4. Direct: Ontology based discovery of responsibility and causality in legal case descriptions

    NARCIS (Netherlands)

    Breuker, J.A.P.J.; Hoekstra, R.J.; Gordon, T.

    2004-01-01

    In this paper we present DIRECT, a system for automatic discovery of responsibility and causal relations in legal case descriptions based on LRI-Core, a core ontology that covers the main concepts that are common to all legal domains. These domains have a predominant common-sense character - the law

  5. Infrared and Raman Spectroscopy: A Discovery-Based Activity for the General Chemistry Curriculum

    Science.gov (United States)

    Borgsmiller, Karen L.; O'Connell, Dylan J.; Klauenberg, Kathryn M.; Wilson, Peter M.; Stromberg, Christopher J.

    2012-01-01

    A discovery-based method is described for incorporating the concepts of IR and Raman spectroscopy into the general chemistry curriculum. Students use three sets of springs to model the properties of single, double, and triple covalent bonds. Then, Gaussian 03W molecular modeling software is used to illustrate the relationship between bond…

  6. Performance Evaluation of a Cluster-Based Service Discovery Protocol for Heterogeneous Wireless Sensor Networks

    NARCIS (Netherlands)

    Marin Perianu, Raluca; Scholten, Johan; Havinga, Paul J.M.; Hartel, Pieter H.

    2006-01-01

    Abstract—This paper evaluates the performance in terms of resource consumption of a service discovery protocol proposed for heterogeneous Wireless Sensor Networks (WSNs). The protocol is based on a clustering structure, which facilitates the construction of a distributed directory. Nodes with higher

  7. Microwave-Assisted Esterification: A Discovery-Based Microscale Laboratory Experiment

    Science.gov (United States)

    Reilly, Maureen K.; King, Ryan P.; Wagner, Alexander J.; King, Susan M.

    2014-01-01

    An undergraduate organic chemistry laboratory experiment has been developed that features a discovery-based microscale Fischer esterification utilizing a microwave reactor. Students individually synthesize a unique ester from known sets of alcohols and carboxylic acids. Each student identifies the best reaction conditions given their particular…

  8. Combining human and machine expertise for self-directed learning in simulation-based discovery environments

    NARCIS (Netherlands)

    de Jong, Anthonius J.M.; van Joolingen, Wouter; Swaak, Janine; Veermans, K.H.; Limbach, R.; King, S.; Gureghian, D.

    1998-01-01

    SIMQUEST is an authoring system for designing and creating simulation-based learning environments. The special character of SIMQUEST learning environments is that they include cognitive support for learners which means that they provide learners with support in the discovery process. In SIMQUEST

  9. Can Invalid Bioactives Undermine Natural Product-Based Drug Discovery?

    Science.gov (United States)

    2015-01-01

    High-throughput biology has contributed a wealth of data on chemicals, including natural products (NPs). Recently, attention was drawn to certain, predominantly synthetic, compounds that are responsible for disproportionate percentages of hits but are false actives. Spurious bioassay interference led to their designation as pan-assay interference compounds (PAINS). NPs lack comparable scrutiny, which this study aims to rectify. Systematic mining of 80+ years of the phytochemistry and biology literature, using the NAPRALERT database, revealed that only 39 compounds represent the NPs most reported by occurrence, activity, and distinct activity. Over 50% are not explained by phenomena known for synthetic libraries, and all had manifold ascribed bioactivities, designating them as invalid metabolic panaceas (IMPs). Cumulative distributions of ∼200,000 NPs uncovered that NP research follows power-law characteristics typical for behavioral phenomena. Projection into occurrence–bioactivity–effort space produces the hyperbolic black hole of NPs, where IMPs populate the high-effort base. PMID:26505758

  10. FPGA-Based Pulse Parameter Discovery for Positron Emission Tomography.

    Science.gov (United States)

    Haselman, Michael; Hauck, Scott; Lewellen, Thomas K; Miyaoka, Robert S

    2009-10-24

    Modern Field Programmable Gate Arrays (FPGAs) are capable of performing complex digital signal processing algorithms with clock rates well above 100 MHz. This, combined with FPGAs' low expense and ease of use, makes them an ideal technology for a data acquisition system for a positron emission tomography (PET) scanner. The University of Washington is producing a series of high-resolution, small-animal PET scanners that utilize FPGAs as the core of the front-end electronics. For these next generation scanners, functions that are typically performed in dedicated circuits, or offline, are being migrated to the FPGA. This will not only simplify the electronics, but the features of modern FPGAs can be utilized to add significant signal processing power to produce higher resolution images. In this paper we report how we utilize the reconfigurable property of an FPGA to self-calibrate and determine pulse parameters necessary for some of the pulse processing steps. Specifically, we show how the FPGA can generate a reference pulse based on actual pulse data instead of a model. We also report how other properties of the photodetector pulse (baseline, pulse length, average pulse energy and event triggers) can be determined automatically by the FPGA.

  11. Discovery of new Gyrase β inhibitors via structure based modeling.

    Science.gov (United States)

    Al-Nadaf, Afaf H; Salah, Sajeda A; Taha, Mutasem O

    2018-03-19

    Gyrase B is an essential enzyme in prokaryotes that has become an attractive target for antibacterial agents. In our study, we implemented a wide range of docking configurations to dock 120 inhibitors into the ATP-binding pocket of the Gyrase B enzyme (PDB code: 4GEE). LigandFit docking engines and six scoring functions were utilized in the study. Furthermore, the ligands were docked in their ionized and unionized forms into the hydrous and anhydrous binding pocket. We used docking-based Comparative Intermolecular Contacts Analysis (db-CICA), a novel methodology, to validate and identify the optimal docking configurations. Three docking configurations were found to achieve self-consistent db-CICA models. The resulting db-CICA models were used to construct corresponding pharmacophoric models that were used to screen the National Cancer Institute (NCI) list of compounds. An in vitro study showed antibacterial activity for twelve hit molecules, the most active having an IC50 of 20.9 μM. Copyright © 2018. Published by Elsevier Ltd.

  12. Two combinatorial optimization problems for SNP discovery using base-specific cleavage and mass spectrometry.

    Science.gov (United States)

    Chen, Xin; Wu, Qiong; Sun, Ruimin; Zhang, Louxin

    2012-01-01

    The discovery of single-nucleotide polymorphisms (SNPs) has important implications in a variety of genetic studies on human diseases and biological functions. One valuable approach proposed for SNP discovery is based on base-specific cleavage and mass spectrometry. However, it is still very challenging to achieve the full potential of this SNP discovery approach. In this study, we formulate two new combinatorial optimization problems. While both problems are aimed at reconstructing the sample sequence that would attain the minimum number of SNPs, they search over different candidate sequence spaces. The first problem, denoted as SNP - MSP, limits its search to sequences whose in silico predicted mass spectra have all their signals contained in the measured mass spectra. In contrast, the second problem, denoted as SNP - MSQ, limits its search to sequences whose in silico predicted mass spectra instead contain all the signals of the measured mass spectra. We present an exact dynamic programming algorithm for solving the SNP - MSP problem and also show that the SNP - MSQ problem is NP-hard by a reduction from a restricted variation of the 3-partition problem. We believe that an efficient solution to either problem above could offer a seamless integration of information in four complementary base-specific cleavage reactions, thereby improving the capability of the underlying biotechnology for sensitive and accurate SNP discovery.

  13. Discovery of a novel gene involved in autolysis of Clostridium cells.

    Science.gov (United States)

    Yang, Liejian; Bao, Guanhui; Zhu, Yan; Dong, Hongjun; Zhang, Yanping; Li, Yin

    2013-06-01

    Cell autolysis plays important physiological roles in the life cycle of clostridial cells. Understanding the genetic basis of the autolysis phenomenon of pathogenic Clostridium or solvent producing Clostridium cells might provide new insights into this important species. Genes that might be involved in autolysis of Clostridium acetobutylicum, a model clostridial species, were investigated in this study. Twelve putative autolysin genes were predicted in C. acetobutylicum DSM 1731 genome through bioinformatics analysis. Of these 12 genes, gene SMB_G3117 was selected for testing the intracellular autolysin activity, growth profile, viable cell numbers, and cellular morphology. We found that overexpression of SMB_G3117 gene led to earlier ceased growth, significantly increased number of dead cells, and clear electrolucent cavities, while disruption of SMB_G3117 gene exhibited remarkably reduced intracellular autolysin activity. These results indicate that SMB_G3117 is a novel gene involved in cellular autolysis of C. acetobutylicum.

  14. An integrative data analysis platform for gene set analysis and knowledge discovery in a data warehouse framework.

    Science.gov (United States)

    Chen, Yi-An; Tripathi, Lokesh P; Mizuguchi, Kenji

    2016-01-01

    Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org. © The Author(s) 2016. Published by Oxford University Press.

  15. Discovery and characterization of the first genuine avian leptin gene in the rock dove (Columba livia).

    Science.gov (United States)

    Friedman-Einat, Miriam; Cogburn, Larry A; Yosefi, Sara; Hen, Gideon; Shinder, Dmitry; Shirak, Andrey; Seroussi, Eyal

    2014-09-01

    Leptin, the key regulator of mammalian energy balance, has been at the center of a great controversy in avian biology for the last 15 years since initial reports of a putative leptin gene (LEP) in chickens. Here, we characterize a novel LEP in rock dove (Columba livia) with low similarity of the predicted protein sequence (30% identity, 47% similarity) to the human ortholog. Searching the Sequence-Read-Archive database revealed leptin transcripts, in the dove's liver, with 2 noncoding exons preceding 2 coding exons. This unusual 4-exon structure was validated by sequencing of a GC-rich product (76% GC, 721 bp) amplified from liver RNA by RT-PCR. Sequence alignment of the dove leptin with orthologous leptins indicated that it consists of a leader peptide (21 amino acids; aa) followed by the mature protein (160 aa), which has a putative structure typical of 4-helical-bundle cytokines except that it is 12 aa longer than human leptin. Extra residues (10 aa) were located within the loop between 2 5'-helices, interrupting the amino acid motif that is conserved in tetrapods and considered essential for activation of leptin receptor (LEPR) but not for receptor binding per se. Quantitative RT-PCR of 11 tissues showed highest (P < .05) expression of LEP in the dove's liver, whereas the dove LEPR peaked (P < .01) in the pituitary. Both genes were prominently expressed in the gonads and at lower levels in tissues involved in mammalian leptin signaling (adipose; hypothalamus). A bioassay based on activation of the chicken LEPR in vitro showed leptin activity in the dove's circulation, suggesting that dove LEP encodes an active protein, despite the interrupted loop motif. Providing tools to study energy-balance control at an evolutionary perspective, our original demonstration of leptin signaling in dove predicts a more ancient role of leptin in growth and reproduction in birds, rather than appetite control.

  16. Discovery by the Epistasis Project of an epistatic interaction between the GSTM3 gene and the HHEX/IDE/KIF11 locus in the risk of Alzheimer's disease

    NARCIS (Netherlands)

    J.M. Bullock (James); C. Medway (Christopher); M. Cortina-Borja (Mario); J.C. Turton (James); J.A. Prince (Jonathan); C.A. Ibrahim-Verbaas (Carla); M. Schuur (Maaike); M.M.B. Breteler (Monique); C.M. van Duijn (Cornelia); P.G. Kehoe (Patrick); R. Barber (Rachel); E. Coto (Eliecer); V. Alvarez (Victoria); P. Deloukas (Panagiotis); N. Hammond (Naomi); O. Combarros (Onofre); I. Mateo (Ignacio); D.R. Warden (Donald); M.G. Lehmann (Michael); O. Belbin (Olivia); K. Brown (Kristelle); G.K. Wilcock (Gordon); R. Heun (Reinhard); H. Kölsch (Heike); A.D. Smith; D.J. Lehmann (Donald); K. Morgan (Kevin)

    2013-01-01

    Despite recent discoveries in the genetics of sporadic Alzheimer's disease, there remains substantial "hidden heritability." It is thought that some of this missing heritability may be because of gene-gene, i.e., epistatic, interactions. We examined potential epistasis between 110

  17. Computational strategies for genome-based natural product discovery and engineering in fungi.

    Science.gov (United States)

    van der Lee, Theo A J; Medema, Marnix H

    2016-04-01

    Fungal natural products possess biological activities that are of great value to medicine, agriculture and manufacturing. Recent metagenomic studies accentuate the vastness of fungal taxonomic diversity, and the accompanying specialized metabolic diversity offers a great and still largely untapped resource for natural product discovery. Although fungal natural products show an impressive variation in chemical structures and biological activities, their biosynthetic pathways share a number of key characteristics. First, genes encoding successive steps of a biosynthetic pathway tend to be located adjacently on the chromosome in biosynthetic gene clusters (BGCs). Second, these BGCs are often located on specific regions of the genome and show a discontinuous distribution among evolutionarily related species and isolates. Third, the same enzyme (super)families are often involved in the production of widely different compounds. Fourth, genes that function in the same pathway are often co-regulated, and therefore co-expressed across various growth conditions. In this mini-review, we describe how these partly interlinked characteristics can be exploited to computationally identify BGCs in fungal genomes and to connect them to their products. Particular attention will be given to novel algorithms to identify unusual classes of BGCs, as well as integrative pan-genomic approaches that use a combination of genomic and metabolomic data for parallelized natural product discovery across multiple strains. Such novel technologies will not only expedite the natural product discovery process, but will also allow the assembly of a high-quality toolbox for the re-design or even de novo design of biosynthetic pathways using synthetic biology approaches. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  18. Improving Interpretation of Cardiac Phenotypes and Enhancing Discovery With Expanded Knowledge in the Gene Ontology.

    Science.gov (United States)

    Lovering, Ruth C; Roncaglia, Paola; Howe, Douglas G; Laulederkind, Stanley J F; Khodiyar, Varsha K; Berardini, Tanya Z; Tweedie, Susan; Foulger, Rebecca E; Osumi-Sutherland, David; Campbell, Nancy H; Huntley, Rachael P; Talmud, Philippa J; Blake, Judith A; Breckenridge, Ross; Riley, Paul R; Lambiase, Pier D; Elliott, Perry M; Clapp, Lucie; Tinker, Andrew; Hill, David P

    2018-02-01

    A systems biology approach to cardiac physiology requires a comprehensive representation of how coordinated processes operate in the heart, as well as the ability to interpret relevant transcriptomic and proteomic experiments. The Gene Ontology (GO) Consortium provides structured, controlled vocabularies of biological terms that can be used to summarize and analyze functional knowledge for gene products. In this study, we created a computational resource to facilitate genetic studies of cardiac physiology by integrating literature curation with attention to an improved and expanded ontological representation of heart processes in the Gene Ontology. As a result, the Gene Ontology now contains terms that comprehensively describe the roles of proteins in cardiac muscle cell action potential, electrical coupling, and the transmission of the electrical impulse from the sinoatrial node to the ventricles. Evaluating the effectiveness of this approach to inform data analysis demonstrated that Gene Ontology annotations, analyzed within an expanded ontological context of heart processes, can help to identify candidate genes associated with arrhythmic disease risk loci. We determined that a combination of curation and ontology development for heart-specific genes and processes supports the identification and downstream analysis of genes responsible for the spread of the cardiac action potential through the heart. Annotating these genes and processes in a structured format facilitates data analysis and supports effective retrieval of gene-centric information about cardiac defects. © 2018 The Authors.

  19. A hybrid network-based method for the detection of disease-related genes

    Science.gov (United States)

    Cui, Ying; Cai, Meng; Dai, Yang; Stanley, H. Eugene

    2018-02-01

    Detecting disease-related genes is crucial in disease diagnosis and drug design. The accepted view is that neighbors of a disease-causing gene in a molecular network tend to cause the same or similar diseases, and network-based methods have been recently developed to identify novel hereditary disease-genes in available biomedical networks. Despite the steady increase in the discovery of disease-associated genes, there is still a large fraction of disease genes that remains under the tip of the iceberg. In this paper we exploit the topological properties of the protein-protein interaction (PPI) network to detect disease-related genes. We compute, analyze, and compare the topological properties of disease genes with non-disease genes in PPI networks. We also design an improved random forest classifier based on these network topological features, and a cross-validation test confirms that our method performs better than previous similar studies.
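    As a rough illustration of the kind of topological properties the abstract refers to, the sketch below computes two common per-node features (degree and local clustering coefficient) from an undirected edge list. The abstract does not say which features the authors used, so the feature choice, the function names and the toy edge list are all assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from (u, v) pairs, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def topological_features(adj):
    """Return degree and local clustering coefficient for every node."""
    features = {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            clustering = 0.0  # undefined for fewer than two neighbors; use 0 by convention
        else:
            # Count edges that exist between pairs of this node's neighbors.
            links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
            clustering = 2.0 * links / (k * (k - 1))
        features[node] = {"degree": k, "clustering": clustering}
    return features


# Hypothetical toy network; node names are placeholders, not real proteins.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
adj = build_graph(edges)
features = topological_features(adj)
```

    Feature vectors of this shape could then be passed to any classifier; the improved random forest described in the abstract is not reproduced here.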

  20. Using Just-in-Time Information to Support Scientific Discovery Learning in a Computer-Based Simulation

    Science.gov (United States)

    Hulshof, Casper D.; de Jong, Ton

    2006-01-01

    Students encounter many obstacles during scientific discovery learning with computer-based simulations. It is hypothesized that an effective type of support, that does not interfere with the scientific discovery learning process, should be delivered on a "just-in-time" base. This study explores the effect of facilitating access to…

  1. Discovery of functional genes for systemic acquired resistance in Arabidopsis thaliana through integrated data mining.

    Science.gov (United States)

    Pan, Youlian; Pylatuik, Jeffrey D; Ouyang, Junjun; Famili, A Fazel; Fobert, Pierre R

    2004-12-01

    Various data mining techniques combined with sequence motif information in the promoter region of genes were applied to discover functional genes that are involved in the defense mechanism of systemic acquired resistance (SAR) in Arabidopsis thaliana. A series of K-Means clustering with difference-in-shape as distance measure was initially applied. A stability measure was used to validate this clustering process. A decision tree algorithm with the discover-and-mask technique was used to identify a group of most informative genes. Appearance and abundance of various transcription factor binding sites in the promoter region of the genes were studied. Through the combination of these techniques, we were able to identify 24 candidate genes involved in the SAR defense mechanism. The candidate genes fell into 2 highly resolved categories, each category showing significantly unique profiles of regulatory elements in their promoter regions. This study demonstrates the strength of such integration methods and suggests a broader application of this approach.

  2. SNP discovery in candidate adaptive genes using exon capture in a free-ranging alpine ungulate

    Science.gov (United States)

    Gretchen H. Roffler; Stephen J. Amish; Seth Smith; Ted Cosart; Marty Kardos; Michael K. Schwartz; Gordon Luikart

    2016-01-01

    Identification of genes underlying genomic signatures of natural selection is key to understanding adaptation to local conditions. We used targeted resequencing to identify SNP markers in 5321 candidate adaptive genes associated with known immunological, metabolic and growth functions in ovids and other ungulates. We selectively targeted 8161 exons in protein-coding...

  3. Thesaurus-based disambiguation of gene symbols

    Directory of Open Access Journals (Sweden)

    Wain Hester M

    2005-06-01

    Background: Massive text mining of the biological literature holds great promise of relating disparate information and discovering new knowledge. However, disambiguation of gene symbols is a major bottleneck. Results: We developed a simple thesaurus-based disambiguation algorithm that can operate with very little training data. The thesaurus comprises the information from five human genetic databases and MeSH. The extent of the homonym problem for human gene symbols is shown to be substantial (33% of the genes in our combined thesaurus had one or more ambiguous symbols), not only because one symbol can refer to multiple genes, but also because a gene symbol can have many non-gene meanings. A test set of 52,529 Medline abstracts, containing 690 ambiguous human gene symbols taken from OMIM, was automatically generated. Overall accuracy of the disambiguation algorithm was up to 92.7% on the test set. Conclusion: The ambiguity of human gene symbols is substantial, not only because one symbol may denote multiple genes but particularly because many symbols have other, non-gene meanings. The proposed disambiguation approach resolves most ambiguities in our test set with high accuracy, including the important gene/not a gene decisions. The algorithm is fast and scalable, enabling gene-symbol disambiguation in massive text mining applications.
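    To make the two sources of ambiguity named in the abstract concrete, the sketch below counts how many genes in a small thesaurus carry at least one ambiguous symbol, where a symbol is treated as ambiguous if it is shared by several genes or also has a non-gene meaning. This is not the authors' algorithm; the thesaurus contents, gene identifiers and function names are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical toy thesaurus: gene identifier -> set of symbols used for it.
thesaurus = {
    "GENE1": {"ABC1", "CAT"},
    "GENE2": {"ABC1"},
    "GENE3": {"XYZ9"},
}
# Hypothetical set of symbols that also have an everyday, non-gene meaning.
non_gene_terms = {"CAT"}


def ambiguous_symbols(thesaurus, non_gene_terms):
    """Symbols shared by more than one gene, or that also have a non-gene meaning."""
    owners = defaultdict(set)
    for gene, symbols in thesaurus.items():
        for symbol in symbols:
            owners[symbol].add(gene)
    return {s for s, genes in owners.items() if len(genes) > 1 or s in non_gene_terms}


def fraction_of_genes_affected(thesaurus, ambiguous):
    """Share of genes that have at least one ambiguous symbol."""
    affected = [g for g, symbols in thesaurus.items() if symbols & ambiguous]
    return len(affected) / len(thesaurus)


ambiguous = ambiguous_symbols(thesaurus, non_gene_terms)
fraction = fraction_of_genes_affected(thesaurus, ambiguous)
```

    On a real thesaurus the same count would yield a figure comparable in kind to the 33% reported above, although the exact definition of ambiguity used by the authors may differ.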

  4. Comparison of Mathematical Resilience among Students with Problem Based Learning and Guided Discovery Learning Model

    Science.gov (United States)

    Hafiz, M.; Darhim; Dahlan, J. A.

    2017-09-01

    Mathematical resilience is very important in learning mathematics. It is a positive attitude that keeps students from giving up easily in the face of adversity when solving mathematics problems through discussion and research about mathematics. The purpose of this study was to compare mathematical resilience between students who received a problem based learning model and students who received a guided discovery learning model. This research was conducted at one junior high school in Jakarta. The method used in this study was quasi-experimental, with 66 students as the sample. The instrument used was a mathematical resilience scale with 24 statement items. The result of this research is that the mathematical resilience of students who received the problem based learning model is better than that of students who received the guided discovery learning model. Based on this result, the authors suggest that: 1) problem based learning and guided discovery learning can both develop mathematical resilience, but problem based learning is more recommended; 2) to achieve better mathematical resilience than in these findings, further research is needed that combines problem based learning with other treatments.

  5. Mass Spectrometry-Based Proteomics in Molecular Diagnostics: Discovery of Cancer Biomarkers Using Tissue Culture

    Science.gov (United States)

    Paul, Debasish; Kumar, Avinash; Gajbhiye, Akshada; Santra, Manas K.; Srikanth, Rapole

    2013-01-01

    Accurate diagnosis and proper monitoring of cancer patients remain a key obstacle for successful cancer treatment and prevention. Therein comes the need for biomarker discovery, which is crucial to the current oncological and other clinical practices having the potential to impact the diagnosis and prognosis. In fact, most of the biomarkers have been discovered utilizing the proteomics-based approaches. Although high-throughput mass spectrometry-based proteomic approaches like SILAC, 2D-DIGE, and iTRAQ are filling up the pitfalls of the conventional techniques, still serum proteomics importunately poses hurdle in overcoming a wide range of protein concentrations, and also the availability of patient tissue samples is a limitation for the biomarker discovery. Thus, researchers have looked for alternatives, and profiling of candidate biomarkers through tissue culture of tumor cell lines comes up as a promising option. It is a rich source of tumor cell-derived proteins, thereby, representing a wide array of potential biomarkers. Interestingly, most of the clinical biomarkers in use today (CA 125, CA 15.3, CA 19.9, and PSA) were discovered through tissue culture-based system and tissue extracts. This paper tries to emphasize the tissue culture-based discovery of candidate biomarkers through various mass spectrometry-based proteomic approaches. PMID:23586059

  6. Mass Spectrometry-Based Proteomics in Molecular Diagnostics: Discovery of Cancer Biomarkers Using Tissue Culture

    Directory of Open Access Journals (Sweden)

    Debasish Paul

    2013-01-01

    Accurate diagnosis and proper monitoring of cancer patients remain a key obstacle for successful cancer treatment and prevention. Therein comes the need for biomarker discovery, which is crucial to the current oncological and other clinical practices having the potential to impact the diagnosis and prognosis. In fact, most of the biomarkers have been discovered utilizing the proteomics-based approaches. Although high-throughput mass spectrometry-based proteomic approaches like SILAC, 2D-DIGE, and iTRAQ are filling up the pitfalls of the conventional techniques, still serum proteomics importunately poses hurdle in overcoming a wide range of protein concentrations, and also the availability of patient tissue samples is a limitation for the biomarker discovery. Thus, researchers have looked for alternatives, and profiling of candidate biomarkers through tissue culture of tumor cell lines comes up as a promising option. It is a rich source of tumor cell-derived proteins, thereby, representing a wide array of potential biomarkers. Interestingly, most of the clinical biomarkers in use today (CA 125, CA 15.3, CA 19.9, and PSA) were discovered through tissue culture-based system and tissue extracts. This paper tries to emphasize the tissue culture-based discovery of candidate biomarkers through various mass spectrometry-based proteomic approaches.

  7. The Utility of Next-Generation Sequencing in Gene Discovery for Mutation-Negative Patients with Rett Syndrome

    Science.gov (United States)

    Gold, Wendy Anne; Christodoulou, John

    2015-01-01

    Rett syndrome (RTT) is a rare, severe disorder of neuronal plasticity that predominantly affects girls. Girls with RTT usually appear asymptomatic in the first 6–18 months of life, but gradually develop severe motor, cognitive, and behavioral abnormalities that persist for life. A predominance of neuronal and synaptic dysfunction, with altered excitatory–inhibitory neuronal synaptic transmission and synaptic plasticity, are overarching features of RTT in children and in mouse models. Over 90% of patients with classical RTT have mutations in the X-linked methyl-CpG-binding (MECP2) gene, while other genes, including cyclin-dependent kinase-like 5 (CDKL5), Forkhead box protein G1 (FOXG1), myocyte-specific enhancer factor 2C (MEF2C), and transcription factor 4 (TCF4), have been associated with phenotypes overlapping with RTT. However, there remain a proportion of patients who carry a clinical diagnosis of RTT, but who are mutation negative. In recent years, next-generation sequencing technologies have revolutionized approaches to genetic studies, making whole-exome and even whole-genome sequencing possible strategies for the detection of rare and de novo mutations, aiding the discovery of novel disease genes. Here, we review the recent progress that is emerging in identifying pathogenic variations, specifically from exome sequencing in RTT patients, and emphasize the need for the use of this technology to identify known and new disease genes in RTT patients. PMID:26236194

  8. Exploring the role of receptor flexibility in structure-based drug discovery.

    Science.gov (United States)

    Feixas, Ferran; Lindert, Steffen; Sinko, William; McCammon, J Andrew

    2014-02-01

    The proper understanding of biomolecular recognition mechanisms that take place in a drug target is of paramount importance to improve the efficiency of drug discovery and development. The intrinsic dynamic character of proteins has a strong influence on biomolecular recognition mechanisms and models such as conformational selection have been widely used to account for this dynamic association process. However, conformational changes occurring in the receptor prior and upon association with other molecules are diverse and not obvious to predict when only a few structures of the receptor are available. In view of the prominent role of protein flexibility in ligand binding and its implications for drug discovery, it is of great interest to identify receptor conformations that play a major role in biomolecular recognition before starting rational drug design efforts. In this review, we discuss a number of recent advances in computer-aided drug discovery techniques that have been proposed to incorporate receptor flexibility into structure-based drug design. The allowance for receptor flexibility provided by computational techniques such as molecular dynamics simulations or enhanced sampling techniques helps to improve the accuracy of methods used to estimate binding affinities and, thus, such methods can contribute to the discovery of novel drug leads. Copyright © 2013 Elsevier B.V. All rights reserved.

  9. Discovery of Unusual Biaryl Polyketides by Activation of a Silent Streptomyces venezuelae Biosynthetic Gene Cluster.

    Science.gov (United States)

    Thanapipatsiri, Anyarat; Gomez-Escribano, Juan Pablo; Song, Lijiang; Bibb, Maureen J; Al-Bassam, Mahmoud; Chandra, Govind; Thamchaipenet, Arinthip; Challis, Gregory L; Bibb, Mervyn J

    2016-11-17

    Comparative transcriptional profiling of a ΔbldM mutant of Streptomyces venezuelae with its unmodified progenitor revealed that the expression of a cryptic biosynthetic gene cluster containing both type I and type III polyketide synthase genes is activated in the mutant. The 29.5 kb gene cluster, which was predicted to encode an unusual biaryl metabolite, which we named venemycin, and potentially halogenated derivatives, contains 16 genes including one, vemR, that encodes a transcriptional activator of the large ATP-binding LuxR-like (LAL) family. Constitutive expression of vemR in the ΔbldM mutant led to the production of sufficient venemycin for structural characterisation, confirming its unusual biaryl structure. Co-expression of the venemycin biosynthetic gene cluster and vemR in the heterologous host Streptomyces coelicolor also resulted in venemycin production. Although the gene cluster encodes two halogenases and a flavin reductase, constitutive expression of all three genes led to the accumulation only of a monohalogenated venemycin derivative, both in the native producer and the heterologous host. A competition experiment in which equimolar quantities of sodium chloride and sodium bromide were fed to the venemycin-producing strains resulted in the preferential incorporation of bromine, thus suggesting that bromide is the preferred substrate for one or both halogenases. © 2016 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA.

  10. Gene discovery for the bark beetle-vectored fungal tree pathogen Grosmannia clavigera

    Directory of Open Access Journals (Sweden)

    Robertson Gordon

    2010-10-01

    Full Text Available Background: Grosmannia clavigera is a bark beetle-vectored fungal pathogen of pines that causes wood discoloration and may kill trees by disrupting nutrient and water transport. Trees respond to attacks from beetles and associated fungi by releasing terpenoid and phenolic defense compounds. It is unclear which genes are important for G. clavigera's ability to overcome antifungal pine terpenoids and phenolics. Results: We constructed seven cDNA libraries from eight G. clavigera isolates grown under various culture conditions, and Sanger sequenced the 5' and 3' ends of 25,000 cDNA clones, resulting in 44,288 high quality ESTs. The assembled dataset of unique transcripts (unigenes) consists of 6,265 contigs and 2,459 singletons that mapped to 6,467 locations on the G. clavigera reference genome, representing ~70% of the predicted G. clavigera genes. Although only 54% of the unigenes matched characterized proteins at the NCBI database, this dataset extensively covers major metabolic pathways, cellular processes, and genes necessary for response to environmental stimuli and genetic information processing. Furthermore, we identified genes expressed in spores prior to germination, and genes involved in response to treatment with lodgepole pine phloem extract (LPPE). Conclusions: We provide a comprehensively annotated EST dataset for G. clavigera that represents a rich resource for gene characterization in this and other ophiostomatoid fungi. Genes expressed in response to LPPE treatment are indicative of fungal oxidative stress response. We identified two clusters of potentially functionally related genes responsive to LPPE treatment. Furthermore, we report a simple method for identifying contig misassemblies in de novo assembled EST collections caused by gene overlap on the genome.

  11. Handling Neighbor Discovery and Rendezvous Consistency with Weighted Quorum-Based Approach

    Directory of Open Access Journals (Sweden)

    Chung-Ming Own

    2015-09-01

    Full Text Available Neighbor discovery and the power of sensors play an important role in the formation of Wireless Sensor Networks (WSNs) and mobile networks. Many asynchronous protocols based on wake-up time scheduling have been proposed to enable neighbor discovery among neighboring nodes for energy saving, especially where clock synchronization is difficult. However, existing research on neighbor-discovery methods is divided into two parts: quorum-based protocols and co-primality-based protocols. Their distinction lies in the arrangement of time slots: the former uses quorums in a matrix, while the latter adopts numerical analysis. In our study, we propose the weighted heuristic quorum system (WQS), which is based on the quorum algorithm to eliminate redundant paths of active slots. We demonstrate the specification of our system: fewer active slots are required, the referring rate is balanced, and remaining power is considered particularly when a device maintains rendezvous with discovered neighbors. The evaluation results showed that our proposed method can effectively reschedule the active slots and save the computing time of the network system.
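
    As a rough illustration of the quorum-based family described in this abstract (not the authors' WQS, whose details the abstract does not give), the sketch below uses the classic grid quorum: the slots of one cycle are laid out in an n × n matrix and each node stays awake for one full row and one full column. The function names, and the assumption that both nodes' cycles are aligned, are ours.

```python
from itertools import product


def grid_quorum(n, row, col):
    """Active slots for a node that picks one row and one column
    of an n x n grid of time slots (slot index = r * n + c)."""
    slots = {row * n + c for c in range(n)}    # the whole chosen row
    slots |= {r * n + col for r in range(n)}   # the whole chosen column
    return slots


def always_overlap(n):
    """Check that every pair of grid quorums shares at least one slot.
    Assumes the two nodes' cycles are aligned (an illustrative simplification)."""
    quorums = [grid_quorum(n, r, c) for r, c in product(range(n), repeat=2)]
    return all(a & b for a in quorums for b in quorums)


if __name__ == "__main__":
    n = 4
    q = grid_quorum(n, 1, 2)
    print(sorted(q))          # active slots of one node
    print(len(q), "of", n * n, "slots active")
    print(always_overlap(n))  # rendezvous guaranteed for aligned cycles
```

    Each node is awake in only 2n − 1 of n² slots, yet any two nodes meet because one node's row always crosses the other node's column. Reducing the number of active slots further is the kind of saving the abstract attributes to its weighted scheme.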

  12. Harvest: a web-based biomedical data discovery and reporting application development platform.

    Science.gov (United States)

    Italia, Michael J; Pennington, Jeffrey W; Ruth, Byron; Wrazien, Stacey; Loutrel, Jennifer G; Crenshaw, E Bryan; Miller, Jeffrey; White, Peter S

    2013-01-01

    Biomedical researchers share a common challenge of making complex data understandable and accessible. This need is increasingly acute as investigators seek opportunities for discovery amidst an exponential growth in the volume and complexity of laboratory and clinical data. To address this need, we developed Harvest, an open source framework that provides a set of modular components to aid the rapid development and deployment of custom data discovery software applications. Harvest incorporates visual representations of multidimensional data types in an intuitive, web-based interface that promotes a real-time, iterative approach to exploring complex clinical and experimental data. The Harvest architecture capitalizes on standards-based, open source technologies to address multiple functional needs critical to a research and development environment, including domain-specific data modeling, abstraction of complex data models, and a customizable web client.

  13. Analysis of cassava (Manihot esculenta) ESTs: A tool for the discovery of genes

    International Nuclear Information System (INIS)

    Zapata, Andres; Neme, Rafik; Sanabria, Carolina; Lopez, Camilo

    2011-01-01

    Cassava (Manihot esculenta) is the main source of calories for more than 1,000 million people around the world and has been consolidated as the fourth most important crop after rice, corn and wheat. Cassava is considered tolerant to abiotic and biotic stress conditions; nevertheless, these characteristics are mainly present in non-commercial varieties. Genetic breeding strategies represent an alternative to introduce the desirable characteristics into commercial varieties. A fundamental step for accelerating the genetic breeding process in cassava requires the identification of genes associated with these characteristics. One rapid strategy for the identification of genes is to have a large collection of ESTs (expressed sequence tags). In this study, a complete analysis of cassava ESTs was done. The cassava ESTs represent 80,459 sequences, which were assembled into a set of 29,231 unique genes (unigenes), comprising 10,945 contigs and 18,286 singletons. These 29,231 unique genes represent about 80% of the genes of the cassava genome. Between 5% and 10% of the cassava unigenes do not show similarity to any sequences present in the NCBI database and could be considered cassava-specific genes. A functional category was assigned to a group of sequences of the unigene set (29%) following the Gene Ontology vocabulary. The molecular function component was the best represented with 43% of the sequences, followed by the biological process component (38%) and finally the cellular component with 19%. In the cassava EST collection, 3,709 microsatellites were identified and they could be used as molecular markers. This study represents an important contribution to the knowledge of the functional genomic structure of cassava and constitutes an important tool for the identification of genes associated with agricultural characteristics of interest that could be employed in cassava breeding programs.

  14. Discovery and characterization of two new stem rust resistance genes in Aegilops sharonensis.

    Science.gov (United States)

    Yu, Guotai; Champouret, Nicolas; Steuernagel, Burkhard; Olivera, Pablo D; Simmons, Jamie; Williams, Cole; Johnson, Ryan; Moscou, Matthew J; Hernández-Pinzón, Inmaculada; Green, Phon; Sela, Hanan; Millet, Eitan; Jones, Jonathan D G; Ward, Eric R; Steffenson, Brian J; Wulff, Brande B H

    2017-06-01

    We identified two novel wheat stem rust resistance genes, Sr-1644-1Sh and Sr-1644-5Sh in Aegilops sharonensis that are effective against widely virulent African races of the wheat stem rust pathogen. Stem rust is one of the most important diseases of wheat in the world. When single stem rust resistance (Sr) genes are deployed in wheat, they are often rapidly overcome by the pathogen. To this end, we initiated a search for novel sources of resistance in diverse wheat relatives and identified the wild goatgrass species Aegilops sharonensis (Sharon goatgrass) as a rich reservoir of resistance to wheat stem rust. The objectives of this study were to discover and map novel Sr genes in Ae. sharonensis and to explore the possibility of identifying new Sr genes by genome-wide association study (GWAS). We developed two biparental populations between resistant and susceptible accessions of Ae. sharonensis and performed QTL and linkage analysis. In an F6 recombinant inbred line and an F2 population, two genes were identified that mapped to the short arm of chromosome 1Ssh, designated as Sr-1644-1Sh, and the long arm of chromosome 5Ssh, designated as Sr-1644-5Sh. The gene Sr-1644-1Sh confers a high level of resistance to race TTKSK (a member of the Ug99 race group), while the gene Sr-1644-5Sh conditions strong resistance to TRTTF, another widely virulent race found in Yemen. Additionally, GWAS was conducted on 125 diverse Ae. sharonensis accessions for stem rust resistance. The gene Sr-1644-1Sh was detected by GWAS, while Sr-1644-5Sh was not detected, indicating that the effectiveness of GWAS might be affected by marker density, population structure, low allele frequency and other factors.

  15. A Gene-Based Analysis of Acoustic Startle Latency

    Science.gov (United States)

    Smith, Alicia K.; Jovanovic, Tanja; Kilaru, Varun; Lori, Adriana; Gensler, Lauren; Lee, Samuel S.; Norrholm, Seth Davin; Massa, Nicholas; Cuthbert, Bruce; Bradley, Bekh; Ressler, Kerry J.; Duncan, Erica

    2017-01-01

    Latency of the acoustic startle response is the time required from the presentation of startling auditory stimulus until the startle response is elicited and provides an index of neural processing speed. Latency is prolonged in subjects with schizophrenia compared to controls in some but not all studies and is 68–90% heritable in baseline startle trials. In order to determine the genetic association with latency as a potential inroad into genetically based vulnerability to psychosis, we conducted a gene-based study of latency followed by an independent replication study of significant gene findings with a single-nucleotide polymorphism (SNP)-based analysis of schizophrenia and control subjects. 313 subjects from an urban population of low socioeconomic status with mixed psychiatric diagnoses were included in the gene-based study. Startle testing was conducted using a Biopac M150 system according to our published methods. Genotyping was performed with the Omni-Quad 1M or the Omni Express BeadChip. The replication study was conducted on 154 schizophrenia subjects and 123 psychiatric controls. Genetic analyses were conducted with Illumina Human Omni1-Quad and OmniExpress BeadChips. Twenty-nine SNPs were selected from four genes that were significant in the gene-based analysis and also associated with startle and/or schizophrenia in the literature. Linear regressions on latency were conducted, controlling for age, race, and diagnosis as a dichotomous variable. In the gene-based study, 2,870 genes demonstrated the evidence of association after correction for multiple comparisons (false discovery rate < 0.05). Pathway analysis of these genes revealed enrichment for relevant biological processes including neural transmission (p = 0.0029), synaptic transmission (p = 0.0032), and neuronal development (p = 0.024). The subsequent SNP-based replication analysis revealed a strong association of onset latency with the SNP rs901561 on the neuregulin gene (NRG1

  16. Discovery and characterization of novel vascular and hematopoietic genes downstream of etsrp in zebrafish.

    Directory of Open Access Journals (Sweden)

    Gustavo A Gomez

    Full Text Available The transcription factor Etsrp is required for vasculogenesis and primitive myelopoiesis in zebrafish. When ectopically expressed, etsrp is sufficient to induce the expression of many vascular and myeloid genes in zebrafish. The mammalian homolog of etsrp, ER71/Etv2, is also essential for vascular and hematopoietic development. To identify genes downstream of etsrp, gain-of-function experiments were performed for etsrp in zebrafish embryos followed by transcription profile analysis by microarray. Subsequent in vivo expression studies resulted in the identification of fourteen genes with blood and/or vascular expression, six of these being completely novel. Regulation of these genes by etsrp was confirmed by ectopic induction in etsrp overexpressing embryos and decreased expression in etsrp deficient embryos. Additional functional analysis of two newly discovered genes, hapln1b and sh3gl3, demonstrates their importance in embryonic vascular development. The results described here identify a group of genes downstream of etsrp likely to be critical for vascular and/or myeloid development.

  17. [Unexpected discovery of a fetus with DMD gene deletion using single nucleotide polymorphism array].

    Science.gov (United States)

    Lin, Shaobin; Zhou, Yu; Zhou, Bingyi; Gu, Heng

    2017-08-10

    To investigate the value of single nucleotide polymorphism array (SNP array) for the identification of de novo mutations in the DMD gene among fetuses. G-banded karyotyping and SNP array were performed on a fetus with intrauterine growth restriction but without family history of Duchenne/Becker muscular dystrophy (DMD/BMD). Multiplex ligation-dependent probe amplification (MLPA) was subsequently applied on amniocytes and maternal peripheral blood sample to detect DMD gene deletion/duplication mutations. Karyotyping of amniocytes showed a normal 46, XY karyotype. SNP array on amniocytes detected a 116 kb deletion (chrX: 32 455 741-32 571 504) at Xp21.1 with breakpoints at introns 16 and 30 respectively, encompassing exons 17-29 of the DMD gene. In addition, MLPA analysis of the DMD gene on amniocytes confirmed the deletion of exons 17 to 29 identified by SNP array. However, no deletion/duplication mutation was detected by MLPA in the mother. The de novo deletion of exons 17 to 29 of the DMD gene detected in the fetus may result in BMD or DMD. SNP array can improve the efficiency for detecting genomic disorders in fetuses with unidentified pathogenic genes, negative family history and nonspecific phenotypes.
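
    The reported deletion size can be checked directly from the coordinates quoted in this abstract. The short sketch below does that arithmetic; treating both endpoints as inclusive is our assumption, and the helper name is hypothetical.

```python
def span_bp(start, end):
    """Length in base pairs of a genomic interval, assuming both
    endpoints are inclusive (1-based, closed coordinates)."""
    return end - start + 1


if __name__ == "__main__":
    # Coordinates quoted in the abstract: chrX: 32 455 741-32 571 504
    start, end = 32_455_741, 32_571_504
    length = span_bp(start, end)
    print(length)                 # length in bp
    print(round(length / 1000))   # length in kb, rounded
```

    Under that assumption the interval is 115,764 bp, which rounds to the 116 kb stated in the abstract.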

  18. In silico tools used for compound selection during target-based drug discovery and development.

    Science.gov (United States)

    Caldwell, Gary W

    2015-01-01

    The target-based drug discovery process, including target selection, screening, hit-to-lead (H2L) and lead optimization stage gates, is the most common approach used in drug development. The full integration of in vitro and/or in vivo data with in silico tools across the entire process would be beneficial to R&D productivity by developing effective selection criteria and drug-design optimization strategies. This review focuses on understanding the impact and extent in the past 5 years of in silico tools on the various stage gates of the target-based drug discovery approach. There are a large number of in silico tools available for establishing selection criteria and drug-design optimization strategies in the target-based approach. However, the inconsistent use of in vitro and/or in vivo data integrated with predictive in silico multiparameter models throughout the process is contributing to R&D productivity issues. In particular, the lack of reliable in silico tools at the H2L stage gate is contributing to the suboptimal selection of viable lead compounds. It is suggested that further development of in silico multiparameter models and organizing biologists, medicinal and computational chemists into one team with a single accountable objective to expand the utilization of in silico tools in all phases of drug discovery would improve R&D productivity.

  19. Structure-Based Drug Discovery for Prion Disease Using a Novel Binding Simulation

    Directory of Open Access Journals (Sweden)

    Daisuke Ishibashi

    2016-07-01

    Full Text Available The accumulation of abnormal prion protein (PrPSc) converted from the normal cellular isoform of PrP (PrPC) is assumed to induce pathogenesis in prion diseases. Therefore, drug discovery studies for these diseases have focused on the protein conversion process. We used a structure-based drug discovery algorithm (termed Nagasaki University Docking Engine: NUDE) that ran on an intensive supercomputer with a graphic-processing unit to identify several compounds with anti-prion effects. Among the candidates showing a high-binding score, the compounds exhibited direct interaction with recombinant PrP in vitro, and drastically reduced PrPSc and protein-aggresomes in the prion-infected cells. The fragment molecular orbital calculation showed that the van der Waals interaction played a key role in PrPC binding as the intermolecular interaction mode. Furthermore, PrPSc accumulation and microgliosis were significantly reduced in the brains of treated mice, suggesting that the drug candidates provided protection from prion disease, although further in vivo tests are needed to confirm these findings. This NUDE-based structure-based drug discovery for normal protein structures is likely useful for the development of drugs to treat other conformational disorders, such as Alzheimer's disease.

  20. Complementary Approaches to Existing Target Based Drug Discovery for Identifying Novel Drug Targets

    Directory of Open Access Journals (Sweden)

    Suhas Vasaikar

    2016-11-01

    Full Text Available In the past decade, it was observed that the relationship between the emerging New Molecular Entities and the quantum of R&D investment has not been favorable. There might be numerous reasons, but a few studies stress the introduction of the target-based drug discovery approach as one of the factors. Although a number of drugs have been developed with an emphasis on a single protein target, the identification of a valid target is complex. The approach focuses on an in vitro single target, which overlooks the complexity of the cell and makes the process of validating drug targets uncertain. Thus, it is imperative to search for alternatives rather than looking at success stories of target-based drug discovery. It would be beneficial if the drugs were developed to target multiple components. New approaches like reverse engineering and translational research need to take into account both system- and target-based approaches. This review evaluates the strengths and limitations of known drug discovery approaches and proposes alternative approaches for increasing efficiency against treatment.

  1. Serious limitations of the QTL/Microarray approach for QTL gene discovery

    Directory of Open Access Journals (Sweden)

    Warden Craig H

    2010-07-01

    Full Text Available Background: It has been proposed that the use of gene expression microarrays in nonrecombinant parental or congenic strains can accelerate the process of isolating individual genes underlying quantitative trait loci (QTL). However, the effectiveness of this approach has not been assessed. Results: Thirty-seven studies that have implemented the QTL/microarray approach in rodents were reviewed. About 30% of studies showed enrichment for QTL candidates, mostly in comparisons between congenic and background strains. Three studies led to the identification of an underlying QTL gene. To complement the literature results, a microarray experiment was performed using three mouse congenic strains isolating the effects of at least 25 biometric QTL. Results show that genes in the congenic donor regions were preferentially selected. However, within donor regions, the distribution of differentially expressed genes was homogeneous once gene density was accounted for. Genes within identical-by-descent (IBD) regions were less likely to be differentially expressed in chromosome 2, but not in chromosomes 11 and 17. Furthermore, expression of QTL regulated in cis (cis eQTL) showed higher expression in the background genotype, which was partially explained by the presence of single nucleotide polymorphisms (SNP). Conclusions: The literature shows limited successes from the QTL/microarray approach to identify QTL genes. Our own results from microarray profiling of three congenic strains revealed a strong tendency to select cis-eQTL over trans-eQTL. IBD regions had little effect on rate of differential expression, and we provide several reasons why IBD should not be used to discard eQTL candidates. In addition, mismatch probes produced false cis-eQTL that could not be completely removed with the current strains genotypes and low probe density microarrays. The reviewed studies did not account for lack of coverage from the platforms used and therefore removed genes

  2. Discovery of Phytophthora infestans Genes Expressed in Planta through Mining of cDNA Libraries

    Science.gov (United States)

    Chaves, Diego; Pinzón, Andrés; Grajales, Alejandro; Rojas, Alejandro; Mutis, Gabriel; Cárdenas, Martha; Burbano, Daniel; Jiménez, Pedro; Bernal, Adriana; Restrepo, Silvia

    2010-01-01

    Background: Phytophthora infestans (Mont.) de Bary causes late blight of potato and tomato, and has a broad host range within the Solanaceae family. Most studies of the Phytophthora – Solanum pathosystem have focused on gene expression in the host and have not analyzed pathogen gene expression in planta. Methodology/Principal Findings: We describe in detail an in silico approach to mine ESTs from inoculated host plants deposited in a database in order to identify particular pathogen sequences associated with disease. We identified candidate effector genes through mining of 22,795 ESTs corresponding to P. infestans cDNA libraries in compatible and incompatible interactions with hosts from the Solanaceae family. Conclusions/Significance: We annotated genes of P. infestans expressed in planta associated with late blight using different approaches and assigned putative functions to 373 out of the 501 sequences found in the P. infestans genome draft, including putative secreted proteins, domains associated with pathogenicity and poorly characterized proteins ideal for further experimental studies. Our study provides a methodology for analyzing cDNA libraries and provides an understanding of the plant – oomycete pathosystems that is independent of the host, condition, or type of sample by identifying genes of the pathogen expressed in planta. PMID:20352100

  3. Discovery of Phytophthora infestans genes expressed in planta through mining of cDNA libraries.

    Directory of Open Access Journals (Sweden)

    Roberto Sierra

    Full Text Available BACKGROUND: Phytophthora infestans (Mont.) de Bary causes late blight of potato and tomato, and has a broad host range within the Solanaceae family. Most studies of the Phytophthora--Solanum pathosystem have focused on gene expression in the host and have not analyzed pathogen gene expression in planta. METHODOLOGY/PRINCIPAL FINDINGS: We describe in detail an in silico approach to mine ESTs from inoculated host plants deposited in a database in order to identify particular pathogen sequences associated with disease. We identified candidate effector genes through mining of 22,795 ESTs corresponding to P. infestans cDNA libraries in compatible and incompatible interactions with hosts from the Solanaceae family. CONCLUSIONS/SIGNIFICANCE: We annotated genes of P. infestans expressed in planta associated with late blight using different approaches and assigned putative functions to 373 out of the 501 sequences found in the P. infestans genome draft, including putative secreted proteins, domains associated with pathogenicity and poorly characterized proteins ideal for further experimental studies. Our study provides a methodology for analyzing cDNA libraries and provides an understanding of the plant--oomycete pathosystems that is independent of the host, condition, or type of sample by identifying genes of the pathogen expressed in planta.

  4. Gene-Expression-Guided Selection of Candidate Loci and Molecular Phenotype Analyses Enhance Genetic Discovery in Systemic Lupus Erythematosus

    Directory of Open Access Journals (Sweden)

    Yelena Koldobskaya

    2012-01-01

    Full Text Available Systemic lupus erythematosus (SLE) is a highly heterogeneous autoimmune disorder characterized by differences in autoantibody profiles, serum cytokines, and clinical manifestations. We have previously conducted a case-case genome-wide association study (GWAS) of SLE patients to detect associations with autoantibody profile and serum interferon alpha (IFN-α). In this study, we used public gene expression data sets to rationally select additional single nucleotide polymorphisms (SNPs) for validation. The top 200 GWAS SNPs were searched in a database which compares genome-wide expression data to genome-wide SNP genotype data in HapMap cell lines. SNPs were chosen for validation if they were associated with differential expression of 15 or more genes at a significance of P<9×10^−5. This resulted in 11 SNPs which were genotyped in 453 SLE patients and 418 matched controls. Three SNPs were associated with SLE-associated autoantibodies, and one of these SNPs was also associated with serum IFN-α (P<4.5×10^−3 for all). One additional SNP was associated exclusively with serum IFN-α. Case-control analysis was insensitive to these molecular subphenotype associations. This study illustrates the use of gene expression data to rationally select candidate loci in autoimmune disease, and the utility of stratification by molecular phenotypes in the discovery of additional genetic associations in SLE.

  5. Gene-expression-guided selection of candidate loci and molecular phenotype analyses enhance genetic discovery in systemic lupus erythematosus.

    Science.gov (United States)

    Koldobskaya, Yelena; Ko, Kichul; Kumar, Akaash A; Agik, Sandra; Arrington, Jasmine; Kariuki, Silvia N; Franek, Beverly S; Kumabe, Marissa; Utset, Tammy O; Jolly, Meenakshi; Skol, Andrew D; Niewold, Timothy B

    2012-01-01

    Systemic lupus erythematosus (SLE) is a highly heterogeneous autoimmune disorder characterized by differences in autoantibody profiles, serum cytokines, and clinical manifestations. We have previously conducted a case-case genome-wide association study (GWAS) of SLE patients to detect associations with autoantibody profile and serum interferon alpha (IFN-α). In this study, we used public gene expression data sets to rationally select additional single nucleotide polymorphisms (SNPs) for validation. The top 200 GWAS SNPs were searched in a database which compares genome-wide expression data to genome-wide SNP genotype data in HapMap cell lines. SNPs were chosen for validation if they were associated with differential expression of 15 or more genes at a significance of P < 9 × 10^-5. This resulted in 11 SNPs which were genotyped in 453 SLE patients and 418 matched controls. Three SNPs were associated with SLE-associated autoantibodies, and one of these SNPs was also associated with serum IFN-α (P < 4.5 × 10^-3 for all). One additional SNP was associated exclusively with serum IFN-α. Case-control analysis was insensitive to these molecular subphenotype associations. This study illustrates the use of gene expression data to rationally select candidate loci in autoimmune disease, and the utility of stratification by molecular phenotypes in the discovery of additional genetic associations in SLE.
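
    The selection rule stated in this abstract (keep a SNP if it is associated with differential expression of 15 or more genes at P < 9 × 10^-5) can be sketched as a simple filter. The input layout, the function name and the toy records below are hypothetical; the abstract does not describe the database's actual format.

```python
from collections import defaultdict


def select_snps(associations, p_cutoff=9e-5, min_genes=15):
    """Return SNP ids linked to at least `min_genes` distinct genes
    with p-value below `p_cutoff`.

    `associations` is an iterable of (snp_id, gene_id, p_value) tuples,
    a hypothetical layout chosen for this illustration."""
    genes_per_snp = defaultdict(set)
    for snp, gene, p in associations:
        if p < p_cutoff:
            genes_per_snp[snp].add(gene)
    return sorted(s for s, genes in genes_per_snp.items() if len(genes) >= min_genes)


if __name__ == "__main__":
    # Made-up records: snpA passes, snpB falls one gene short.
    toy = [("snpA", f"gene{i}", 1e-6) for i in range(15)]
    toy += [("snpB", f"gene{i}", 1e-6) for i in range(14)]
    toy += [("snpB", f"gene{i}", 1e-3) for i in range(14, 20)]  # not significant
    print(select_snps(toy))
```

    Counting distinct genes per SNP, instead of raw association rows, keeps duplicate probe-level records from inflating the tally; whether the original analysis did the same is not stated.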

  6. Transcriptome analysis and discovery of genes involved in immune pathways from hepatopancreas of microbial challenged mitten crab Eriocheir sinensis.

    Directory of Open Access Journals (Sweden)

    Xihong Li

    Full Text Available BACKGROUND: The Chinese mitten crab Eriocheir sinensis is an economically important crustacean that has been seriously affected by various diseases, which calls for more information on immune-relevant genes at the genome level. Recently, high-throughput RNA sequencing (RNA-seq) technology has provided a powerful and efficient method for transcript analysis and immune gene discovery. METHODS/PRINCIPAL FINDINGS: A cDNA library from the hepatopancreas of E. sinensis challenged by a mixture of three pathogen strains (Gram-positive bacterium Micrococcus luteus, Gram-negative bacterium Vibrio alginolyticus and fungus Pichia pastoris; 10^8 cfu·mL^-1) was constructed and randomly sequenced using the Illumina technique. In total, 39.76 million clean reads were assembled into 70,300 unigenes. After ruling out short-length and low-quality sequences, 52,074 non-redundant unigenes were compared to public databases for homology searching and 17,617 of them showed high similarity to sequences in the NCBI non-redundant protein (Nr) database. For function classification and pathway assignment, 18,734 (36.00%) unigenes were categorized into three Gene Ontology (GO) categories, 12,243 (23.51%) were classified into 25 Clusters of Orthologous Groups (COG), and 8,983 (17.25%) were assigned to six Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Potentially, 24, 14, 47 and 132 unigenes were characterized as being involved in the Toll, IMD, JAK-STAT and MAPK pathways, respectively. CONCLUSIONS/SIGNIFICANCE: This is the first systematic transcriptome analysis of components relating to innate immune pathways in E. sinensis. The functional genes and putative pathways identified here will contribute to a better understanding of the immune system and to the prevention of various diseases in crab.

  7. Gene/QTL discovery for Anthracnose in common bean (Phaseolus vulgaris L.) from North-western Himalayas.

    Science.gov (United States)

    Choudhary, Neeraj; Bawa, Vanya; Paliwal, Rajneesh; Singh, Bikram; Bhat, Mohd Ashraf; Mir, Javid Iqbal; Gupta, Moni; Sofi, Parvaze A; Thudi, Mahendar; Varshney, Rajeev K; Mir, Reyazul Rouf

    2018-01-01

    Common bean (Phaseolus vulgaris L.) is one of the most important grain legume crops in the world. The beans grown in the north-western Himalayas possess huge diversity in seed color, shape and size but are mostly susceptible to Anthracnose disease caused by the seed-borne fungus Colletotrichum lindemuthianum. Dozens of QTLs/genes have already been identified for this disease in common bean world-wide. However, this is the first report of gene/QTL discovery for Anthracnose using bean germplasm from the north-western Himalayas of the state of Jammu & Kashmir, India. A core set of 96 bean lines comprising 54 indigenous local landraces from 11 hot-spots and 42 exotic lines from 10 different countries were phenotyped at two locations (SKUAST-Jammu and Bhaderwah, Jammu) for Anthracnose resistance. The core set was also genotyped with genome-wide (91) random and trait linked SSR markers. The study of marker-trait associations (MTAs) led to the identification of 10 QTLs/genes for Anthracnose resistance. Among the 10 QTLs/genes identified, two MTAs are stable (BM45 & BM211), two MTAs (PVctt1 & BM211) are major, explaining more than 20% of the phenotypic variation for Anthracnose, and one MTA (BM211) is both stable and major. Six (06) genomic regions are reported for the first time, whereas four (04) genomic regions validated the already known QTL/gene regions/clusters for Anthracnose. The major, stable and validated markers associated with Anthracnose resistance reported in the present study will prove useful in common bean molecular breeding programs aimed at enhancing Anthracnose resistance of local bean landraces grown in the north-western Himalayas of the state of Jammu and Kashmir.

  8. A probabilistic approach for automated discovery of perturbed genes using expression data from microarray or RNA-Seq.

    Science.gov (United States)

    Sundaramurthy, Gopinath; Eghbalnia, Hamid R

    2015-12-01

    In complex diseases, alterations of multiple molecular and cellular components in response to perturbations are indicative of disease physiology. While the expression level of genes from high-throughput analysis can vary among patients, the common path of disease progression suggests that the underlying cellular sub-processes involving associated genes follow similar fates. Motivated by the interconnected nature of sub-processes, we have developed an automated methodology that combines ideas from biological networks, statistical models, and game theory, to probe connected cellular processes. The core concept in our approach uses probability of change (POC) to indicate the probability that a gene's expression level has changed between two conditions. POC facilitates the definition of change at the neighborhood, pathway, and network levels and enables evaluation of the influence of diseases on the expression. The 'connected' disease-related genes (DRG) identified display coherent and concomitant differential expression levels along paths. RNA-Seq and microarray breast cancer subtyping expression data sets were used to identify DRG between subtypes. A machine-learning algorithm was trained for subtype discrimination using the DRG, and the training yielded a set of biomarkers. The discriminative power of the biomarkers was tested using an unseen data set. The biomarkers identified overlap with genes identified as disease-specific, and we were able to classify disease subtypes with 100% and 80% agreement with PAM50 for the microarray and RNA-Seq data sets, respectively. We present an automated probabilistic approach that offers unbiased and reproducible results, thus complementing existing methods in DRG and biomarker discovery for complex diseases. Copyright © 2015. Published by Elsevier Ltd.

  9. The CHRNA5-A3-B4 Gene Cluster and Smoking: From Discovery to Therapeutics.

    Science.gov (United States)

    Lassi, Glenda; Taylor, Amy E; Timpson, Nicholas J; Kenny, Paul J; Mather, Robert J; Eisen, Tim; Munafò, Marcus R

    2016-12-01

    Genome-wide association studies (GWASs) have identified associations between the CHRNA5-CHRNA3-CHRNB4 gene cluster and smoking heaviness and nicotine dependence. Studies in rodents have described the anatomical localisation and function of the nicotinic acetylcholine receptors (nAChRs) formed by the subunits encoded by this gene cluster. Further investigations that complemented these studies highlighted the variability of individuals' smoking behaviours and their ability to adjust nicotine intake. GWASs of smoking-related health outcomes have also identified this signal in the CHRNA5-CHRNA3-CHRNB4 gene cluster. This insight underpins approaches to strengthen causal inference in observational data. Combining genetic and mechanistic studies of nicotine dependence and smoking heaviness may reveal novel targets for medication development. Validated targets can inform genetic therapeutic interventions for smoking cessation and tobacco-related diseases. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.

  10. Comparison of seven methods for producing Affymetrix expression scores based on False Discovery Rates in disease profiling data

    Directory of Open Access Journals (Sweden)

    Gruber Stephen B

    2005-02-01

    Full Text Available Abstract Background A critical step in processing oligonucleotide microarray data is combining the information in multiple probes to produce a single number that best captures the expression level of a RNA transcript. Several systematic studies comparing multiple methods for array processing have used tightly controlled calibration data sets as the basis for comparison. Here we compare performances for seven processing methods using two data sets originally collected for disease profiling studies. An emphasis is placed on understanding sensitivity for detecting differentially expressed genes in terms of two key statistical determinants: test statistic variability for non-differentially expressed genes, and test statistic size for truly differentially expressed genes. Results In the two data sets considered here, up to seven-fold variation across the processing methods was found in the number of genes detected at a given false discovery rate (FDR). The best performing methods called up to 90% of the same genes differentially expressed, had less variable test statistics under randomization, and had a greater number of large test statistics in the experimental data. Poor performance of one method was directly tied to a tendency to produce highly variable test statistic values under randomization. Based on an overall measure of performance, two of the seven methods (Dchip and a trimmed mean approach) are superior in the two data sets considered here. Two other methods (MAS5 and GCRMA-EB) are inferior, while results for the other three methods are mixed. Conclusions Choice of processing method has a major impact on differential expression analysis of microarray data. Previously reported performance analyses using tightly controlled calibration data sets are not highly consistent with results reported here using data from human tissue samples. Performance of array processing methods in disease profiling and other realistic biological studies should be

  11. Discovery and characterization of two new stem rust resistance genes in Aegilops sharonensis

    OpenAIRE

    Yu, Guotai; Champouret, Nicolas; Steuernagel, Burkhard; Olivera, Pablo D.; Simmons, Jamie; Williams, Cole; Johnson, Ryan; Moscou, Matthew J.; Hernández-Pinzón, Inmaculada; Green, Phon; Sela, Hanan; Millet, Eitan; Jones, Jonathan D. G.; Ward, Eric R.; Steffenson, Brian J.

    2017-01-01

    Key message We identified two novel wheat stem rust resistance genes, Sr-1644-1Sh and Sr-1644-5Sh in Aegilops sharonensis that are effective against widely virulent African races of the wheat stem rust pathogen. Abstract Stem rust is one of the most important diseases of wheat in the world. When single stem rust resistance (Sr) genes are deployed in wheat, they are often rapidly overcome by the pathogen. To this end, we initiated a search for novel sources of resistance in diverse wheat relat...

  12. Contributions of computational chemistry and biophysical techniques to fragment-based drug discovery.

    Science.gov (United States)

    Gozalbes, Rafael; Carbajo, Rodrigo J; Pineda-Lucena, Antonio

    2010-01-01

    In the last decade, fragment-based drug discovery (FBDD) has evolved from a novel approach in the search of new hits to a valuable alternative to the high-throughput screening (HTS) campaigns of many pharmaceutical companies. The increasing relevance of FBDD in the drug discovery universe has been concomitant with an implementation of the biophysical techniques used for the detection of weak inhibitors, e.g. NMR, X-ray crystallography or surface plasmon resonance (SPR). At the same time, computational approaches have also been progressively incorporated into the FBDD process and nowadays several computational tools are available. These stretch from the filtering of huge chemical databases in order to build fragment-focused libraries comprising compounds with adequate physicochemical properties, to more evolved models based on different in silico methods such as docking, pharmacophore modelling, QSAR and virtual screening. In this paper we will review the parallel evolution and complementarities of biophysical techniques and computational methods, providing some representative examples of drug discovery success stories by using FBDD.

  13. Research on hotspot discovery in internet public opinions based on improved K-means.

    Science.gov (United States)

    Wang, Gensheng

    2013-01-01

    Effectively discovering hotspots in Internet public opinion is an active research field, and it plays a key role in helping governments and corporations find useful information in the mass of data on the Internet. An improved K-means algorithm for hotspot discovery in Internet public opinion is presented, based on an analysis of the defects and calculation principle of the original K-means algorithm. First, new methods are designed to preprocess website texts, to select and express their characteristics, and to define the similarity between two website texts. Second, the clustering principle and the method of selecting initial classification centers are analyzed and improved in order to overcome the limitations of the original K-means algorithm. Finally, experimental results verify that the improved algorithm can improve the clustering stability and classification accuracy of hotspot discovery in Internet public opinion when used in practice.
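
    For orientation, the original K-means procedure that this abstract sets out to improve can be sketched as follows. This is a plain Lloyd-style baseline under stated assumptions: points are already numeric feature vectors, distance is Euclidean, and the initial centres are simply the first k points. The paper's text preprocessing, similarity definition and improved centre selection are not described in the abstract and are not reproduced here; the function name and toy data are hypothetical.

```python
import math


def kmeans(points, k, iterations=20):
    """Baseline K-means; the first k points serve as the initial centres."""
    centres = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centre.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centres[c]))
            for p in points
        ]
        # Update step: move each centre to the mean of its members.
        new_centres = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centres.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centres.append(centres[c])  # an empty cluster keeps its centre
        if new_centres == centres:
            break  # converged: centres no longer move
        centres = new_centres
    return labels, centres


# Toy 2-D vectors standing in for text feature vectors.
points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0)]
labels, centres = kmeans(points, k=2)
```

    Because the result depends on which points seed the centres, a different ordering of the input can give a different clustering, which is the kind of instability that motivates improving the initial centre selection.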

  14. Enhancing service discovery using cat swarm optimisation based web service clustering

    Directory of Open Access Journals (Sweden)

    Sunaina Kotekar

    2016-09-01

    Full Text Available Web service discovery is a critical task in service oriented application development. Due to extensive proliferation in the number of available services, it is challenging to obtain all the relevant services available for a given task. For the retrieval of most relevant Web services, a user would have to use those service-specific terms that best describe and match the natural language documentation contained within a service description. This process can be time intensive, due to functional diversity of available services in a repository. Domain specific clustering of Web Services based on the similarities of their functionalities would greatly boost the ability of a Web service search engine to retrieve the most relevant service. In this paper, we propose a novel technique to cluster service documents into functionally similar service groups using the Cat Swarm Optimisation Algorithm. We present experimental results that show that the proposed technique was effective and enhanced the process of service discovery.

  15. Matrix- and tensor-based recommender systems for the discovery of currently unknown inorganic compounds

    Science.gov (United States)

    Seko, Atsuto; Hayashi, Hiroyuki; Kashima, Hisashi; Tanaka, Isao

    2018-01-01

    Chemically relevant compositions (CRCs) and atomic arrangements of inorganic compounds have been collected as inorganic crystal structure databases. Machine learning is a unique approach to search for currently unknown CRCs from vast candidates. Herein we propose matrix- and tensor-based recommender system approaches to predict currently unknown CRCs from database entries of CRCs. Firstly, the performance of the recommender system approaches to discover currently unknown CRCs is examined. A Tucker decomposition recommender system shows the best discovery rate of CRCs as the majority of the top 100 recommended ternary and quaternary compositions correspond to CRCs. Secondly, systematic density functional theory (DFT) calculations are performed to investigate the phase stability of the recommended compositions. The phase stability of the 27 compositions reveals that 23 currently unknown compounds are newly found to be stable. These results indicate that the recommender system has great potential to accelerate the discovery of new compounds.

  16. Discovery of new Syk inhibitors through structure-based virtual screening.

    Science.gov (United States)

    Huang, Yahui; Zhang, Youjun; Fan, Kexin; Dong, Guoqiang; Li, Bohua; Zhang, Wannian; Li, Jian; Sheng, Chunquan

    2017-04-15

    Spleen tyrosine kinase (Syk) is an attractive target for the discovery of new treatments for inflammatory and autoimmune disorders. Structure-based virtual screening was performed to identify novel scaffolds of Syk inhibitors. A total of 16 hits were discovered in the enzyme assay and 8 compounds had an IC50 value lower than 10 μM. In particular, compound 11 (IC50 = 3.2 μM) was active in the cellular Syk assay and could inhibit lymphocyte proliferation in a dose-dependent manner, which could be used as a good starting point for the discovery of a new class of Syk inhibitors. Copyright © 2017 Elsevier Ltd. All rights reserved.

  17. Binding thermodynamics discriminates fragments from druglike compounds: a thermodynamic description of fragment-based drug discovery.

    Science.gov (United States)

    Williams, Glyn; Ferenczy, György G; Ulander, Johan; Keserű, György M

    2017-04-01

    Small is beautiful - reducing the size and complexity of chemical starting points for drug design allows better sampling of chemical space, reveals the most energetically important interactions within protein-binding sites and can lead to improvements in the physicochemical properties of the final drug. The impact of fragment-based drug discovery (FBDD) on recent drug discovery projects and our improved knowledge of the structural and thermodynamic details of ligand binding has prompted us to explore the relationships between ligand-binding thermodynamics and FBDD. Information on binding thermodynamics can give insights into the contributions to protein-ligand interactions and could therefore be used to prioritise compounds with a high degree of specificity in forming key interactions. Copyright © 2016 Elsevier Ltd. All rights reserved.

  18. Using Osteoclast Differentiation as a Model for Gene Discovery in an Undergraduate Cell Biology Laboratory

    Science.gov (United States)

    Birnbaum, Mark J.; Picco, Jenna; Clements, Meghan; Witwicka, Hanna; Yang, Meiheng; Hoey, Margaret T.; Odgren, Paul R.

    2010-01-01

    A key goal of molecular/cell biology/biotechnology is to identify essential genes in virtually every physiological process to uncover basic mechanisms of cell function and to establish potential targets of drug therapy combating human disease. This article describes a semester-long, project-oriented molecular/cellular/biotechnology laboratory…

  19. Discovery and functional assessment of gene variants in the vascular endothelial growth factor pathway.

    Science.gov (United States)

    Paré-Brunet, Laia; Glubb, Dylan; Evans, Patrick; Berenguer-Llergo, Antoni; Etheridge, Amy S; Skol, Andrew D; Di Rienzo, Anna; Duan, Shiwei; Gamazon, Eric R; Innocenti, Federico

    2014-02-01

    Angiogenesis is a host-mediated mechanism in disease pathophysiology. The vascular endothelial growth factor (VEGF) pathway is a major determinant of angiogenesis, and a comprehensive annotation of the functional variation in this pathway is essential to understand the genetic basis of angiogenesis-related diseases. We assessed the allelic heterogeneity of gene expression, population specificity of cis expression quantitative trait loci (eQTLs), and eQTL function in luciferase assays in CEU and Yoruba people of Ibadan, Nigeria (YRI) HapMap lymphoblastoid cell lines in 23 resequenced genes. Among 356 cis-eQTLs, 155 and 174 were unique to CEU and YRI, respectively, and 27 were shared between CEU and YRI. Two cis-eQTLs provided mechanistic evidence for two genome-wide association study findings. Five eQTLs were tested for function in luciferase assays and the effect of two KRAS variants was concordant with the eQTL effect. Two eQTLs found in each of PRKCE, PIK3C2A, and MAP2K6 could predict 44%, 37%, and 45% of the variance in gene expression, respectively. This is the first analysis focusing on the pattern of functional genetic variation of the VEGF pathway genes in CEU and YRI populations and providing mechanistic evidence for genetic association studies of diseases for which angiogenesis plays a pathophysiologic role. © 2013 WILEY PERIODICALS, INC.

  20. Molecular mapping of soybean rust (Phakopsora pachyrhizi) resistance genes: discovery of a novel locus and alleles.

    Science.gov (United States)

    Garcia, Alexandre; Calvo, Eberson Sanches; de Souza Kiihl, Romeu Afonso; Harada, Arlindo; Hiromoto, Dario Minoru; Vieira, Luiz Gonzaga Esteves

    2008-08-01

    Soybean production in South and North America has recently been threatened by the widespread dissemination of soybean rust (SBR) caused by the fungus Phakopsora pachyrhizi. Currently, chemical spray containing fungicides is the only effective method to control the disease. This strategy increases production costs and exposes the environment to higher levels of fungicides. As a first step towards the development of SBR resistant cultivars, we studied the genetic basis of SBR resistance in five F2 populations derived from crossing the Brazilian-adapted susceptible cultivar CD 208 to each of five different plant introductions (PI 200487, PI 200526, PI 230970, PI 459025, PI 471904) carrying SBR-resistant genes (Rpp). Molecular mapping of SBR-resistance genes was performed in three of these PIs (PI 459025, PI 200526, PI 471904), and also in two other PIs (PI 200456 and 224270). The strategy mapped two genes present in PI 230970 and PI 459025, the original sources of Rpp2 and Rpp4, to linkage groups (LG) J and G, respectively. A new SBR resistance locus, rpp5 was mapped in the LG-N. Together, the genetic and molecular analysis suggested multiple alleles or closely linked genes that govern SBR resistance in soybean.

  1. Genome-Wide Discovery of Genes Required for Capsule Production by Uropathogenic Escherichia coli.

    Science.gov (United States)

    Goh, Kelvin G K; Phan, Minh-Duy; Forde, Brian M; Chong, Teik Min; Yin, Wai-Fong; Chan, Kok-Gan; Ulett, Glen C; Sweet, Matthew J; Beatson, Scott A; Schembri, Mark A

    2017-10-24

    Uropathogenic Escherichia coli (UPEC) is a major cause of urinary tract and bloodstream infections and possesses an array of virulence factors for colonization, survival, and persistence. One such factor is the polysaccharide K capsule. Among the different K capsule types, the K1 serotype is strongly associated with UPEC infection. In this study, we completely sequenced the K1 UPEC urosepsis strain PA45B and employed a novel combination of a lytic K1 capsule-specific phage, saturated Tn5 transposon mutagenesis, and high-throughput transposon-directed insertion site sequencing (TraDIS) to identify the complement of genes required for capsule production. Our analysis identified known genes involved in capsule biosynthesis, as well as two additional regulatory genes (mprA and lrhA) that we characterized at the molecular level. Mutation of mprA resulted in protection against K1 phage-mediated killing, a phenotype restored by complementation. We also identified a significantly increased unidirectional Tn5 insertion frequency upstream of the lrhA gene and showed that strong expression of LrhA induced by a constitutive Pcl promoter led to loss of capsule production. Further analysis revealed loss of MprA or overexpression of LrhA affected the transcription of capsule biosynthesis genes in PA45B and increased sensitivity to killing in whole blood. Similar phenotypes were also observed in UPEC strains UTI89 (K1) and CFT073 (K2), demonstrating that the effects were neither strain nor capsule type specific. Overall, this study defined the genome of a UPEC urosepsis isolate and identified and characterized two new regulatory factors that affect UPEC capsule production. IMPORTANCE Urinary tract infections (UTIs) are among the most common bacterial infections in humans and are primarily caused by uropathogenic Escherichia coli (UPEC). Many UPEC strains express a polysaccharide K capsule that provides protection against host innate immune factors and contributes to survival

  2. Gene expression and epigenetic discovery screen reveal methylation of SFRP2 in prostate cancer.

    LENUS (Irish Health Repository)

    Perry, Antoinette S

    2013-04-15

    Aberrant activation of Wnts is common in human cancers, including prostate. Hypermethylation associated transcriptional silencing of Wnt antagonist genes SFRPs (Secreted Frizzled-Related Proteins) is a frequent oncogenic event. The significance of this is not known in prostate cancer. The objectives of our study were to (i) profile Wnt signaling related gene expression and (ii) investigate methylation of Wnt antagonist genes in prostate cancer. Using TaqMan Low Density Arrays, we identified 15 Wnt signaling related genes with significantly altered expression in prostate cancer; the majority of which were upregulated in tumors. Notably, histologically benign tissue from men with prostate cancer appeared more similar to tumor (r = 0.76) than to benign prostatic hyperplasia (BPH; r = 0.57, p < 0.001). Overall, the expression profile was highly similar between tumors of high (≥ 7) and low (≤ 6) Gleason scores. Pharmacological demethylation of PC-3 cells with 5-Aza-CdR reactivated 39 genes (≥ 2-fold); 40% of which inhibit Wnt signaling. Methylation frequencies in prostate cancer were 10% (2/20) (SFRP1), 64.86% (48/74) (SFRP2), 0% (0/20) (SFRP4) and 60% (12/20) (SFRP5). SFRP2 methylation was detected at significantly lower frequencies in high-grade prostatic intraepithelial neoplasia (HGPIN; 30%, (6/20), p = 0.0096), tumor adjacent benign areas (8.82%, (7/69), p < 0.0001) and BPH (11.43% (4/35), p < 0.0001). The quantitative level of SFRP2 methylation (normalized index of methylation) was also significantly higher in tumors (116) than in the other samples (HGPIN = 7.45, HB = 0.47, and BPH = 0.12). We show that SFRP2 hypermethylation is a common event in prostate cancer. SFRP2 methylation in combination with other epigenetic markers may be a useful biomarker of prostate cancer.

  3. Gene discovery in the threatened elkhorn coral: 454 sequencing of the Acropora palmata transcriptome.

    Directory of Open Access Journals (Sweden)

    Nicholas R Polato

    Full Text Available BACKGROUND: Cnidarians, including corals and anemones, offer unique insights into metazoan evolution because they harbor genetic similarities with vertebrates beyond that found in model invertebrates and retain genes known only from non-metazoans. Cataloging genes expressed in Acropora palmata, a foundation-species of reefs in the Caribbean and western Atlantic, will advance our understanding of the genetic basis of ecologically important traits in corals and comes at a time when sequencing efforts in other cnidarians allow for multi-species comparisons. RESULTS: A cDNA library from a sample enriched for symbiont free larval tissue was sequenced on the 454 GS-FLX platform. Over 960,000 reads were obtained and assembled into 42,630 contigs. Annotation data was acquired for 57% of the assembled sequences. Analysis of the assembled sequences indicated that 83-100% of all A. palmata transcripts were tagged, and provided a rough estimate of the total number of genes expressed in our samples (~18,000-20,000). The coral annotation data contained many of the same molecular components as in the Bilateria, particularly in pathways associated with oxidative stress and DNA damage repair, and provided evidence that homologs of p53, a key player in DNA repair pathways, have experienced selection along the branch separating Cnidaria and Bilateria. Transcriptome wide screens of paralog groups and transition/transversion ratios highlighted genes including: green fluorescent proteins, carbonic anhydrase, and oxidative stress proteins; and functional groups involved in protein and nucleic acid metabolism, and the formation of structural molecules. These results provide a starting point for study of adaptive evolution in corals. CONCLUSIONS: Currently available transcriptome data now make comparative studies of the mechanisms underlying coral's evolutionary success possible. Here we identified candidate genes that enable corals to maintain genomic integrity despite

  4. Gene discovery in the threatened elkhorn coral: 454 sequencing of the Acropora palmata transcriptome.

    Science.gov (United States)

    Polato, Nicholas R; Vera, J Cristobal; Baums, Iliana B

    2011-01-01

    Cnidarians, including corals and anemones, offer unique insights into metazoan evolution because they harbor genetic similarities with vertebrates beyond that found in model invertebrates and retain genes known only from non-metazoans. Cataloging genes expressed in Acropora palmata, a foundation-species of reefs in the Caribbean and western Atlantic, will advance our understanding of the genetic basis of ecologically important traits in corals and comes at a time when sequencing efforts in other cnidarians allow for multi-species comparisons. A cDNA library from a sample enriched for symbiont free larval tissue was sequenced on the 454 GS-FLX platform. Over 960,000 reads were obtained and assembled into 42,630 contigs. Annotation data was acquired for 57% of the assembled sequences. Analysis of the assembled sequences indicated that 83-100% of all A. palmata transcripts were tagged, and provided a rough estimate of the total number of genes expressed in our samples (~18,000-20,000). The coral annotation data contained many of the same molecular components as in the Bilateria, particularly in pathways associated with oxidative stress and DNA damage repair, and provided evidence that homologs of p53, a key player in DNA repair pathways, have experienced selection along the branch separating Cnidaria and Bilateria. Transcriptome wide screens of paralog groups and transition/transversion ratios highlighted genes including: green fluorescent proteins, carbonic anhydrase, and oxidative stress proteins; and functional groups involved in protein and nucleic acid metabolism, and the formation of structural molecules. These results provide a starting point for study of adaptive evolution in corals. Currently available transcriptome data now make comparative studies of the mechanisms underlying coral's evolutionary success possible. Here we identified candidate genes that enable corals to maintain genomic integrity despite considerable exposure to genotoxic stress over long life

  5. Comparative GO: a web application for comparative gene ontology and gene ontology-based gene selection in bacteria.

    Directory of Open Access Journals (Sweden)

    Mario Fruzangohar

    Full Text Available The primary means of classifying new functions for genes and proteins relies on Gene Ontology (GO), which defines genes/proteins using a controlled vocabulary in terms of their Molecular Function, Biological Process and Cellular Component. The challenge is to present this information to researchers to compare and discover patterns in multiple datasets using visually comprehensible and user-friendly statistical reports. Importantly, while there are many GO resources available for eukaryotes, there are none suitable for simultaneous, graphical and statistical comparison between multiple datasets. In addition, none of them supports comprehensive resources for bacteria. By using Streptococcus pneumoniae as a model, we identified and collected GO resources including genes, proteins, taxonomy and GO relationships from NCBI, UniProt and GO organisations. Then, we designed database tables in PostgreSQL database server and developed a Java application to extract data from source files and loaded into database automatically. We developed a PHP web application based on Model-View-Control architecture, used a specific data structure as well as current and novel algorithms to estimate GO graphs parameters. We designed different navigation and visualization methods on the graphs and integrated these into graphical reports. This tool is particularly significant when comparing GO groups between multiple samples (including those of pathogenic bacteria) from different sources simultaneously. Comparing GO protein distribution among up- or down-regulated genes from different samples can improve understanding of biological pathways, and mechanism(s) of infection. It can also aid in the discovery of genes associated with specific function(s) for investigation as a novel vaccine or therapeutic targets. http://turing.ersa.edu.au/BacteriaGO.

  6. Comparative GO: a web application for comparative gene ontology and gene ontology-based gene selection in bacteria.

    Science.gov (United States)

    Fruzangohar, Mario; Ebrahimie, Esmaeil; Ogunniyi, Abiodun D; Mahdi, Layla K; Paton, James C; Adelson, David L

    2013-01-01

    The primary means of classifying new functions for genes and proteins relies on Gene Ontology (GO), which defines genes/proteins using a controlled vocabulary in terms of their Molecular Function, Biological Process and Cellular Component. The challenge is to present this information to researchers to compare and discover patterns in multiple datasets using visually comprehensible and user-friendly statistical reports. Importantly, while there are many GO resources available for eukaryotes, there are none suitable for simultaneous, graphical and statistical comparison between multiple datasets. In addition, none of them supports comprehensive resources for bacteria. By using Streptococcus pneumoniae as a model, we identified and collected GO resources including genes, proteins, taxonomy and GO relationships from NCBI, UniProt and GO organisations. Then, we designed database tables in PostgreSQL database server and developed a Java application to extract data from source files and loaded into database automatically. We developed a PHP web application based on Model-View-Control architecture, used a specific data structure as well as current and novel algorithms to estimate GO graphs parameters. We designed different navigation and visualization methods on the graphs and integrated these into graphical reports. This tool is particularly significant when comparing GO groups between multiple samples (including those of pathogenic bacteria) from different sources simultaneously. Comparing GO protein distribution among up- or down-regulated genes from different samples can improve understanding of biological pathways, and mechanism(s) of infection. It can also aid in the discovery of genes associated with specific function(s) for investigation as a novel vaccine or therapeutic targets. http://turing.ersa.edu.au/BacteriaGO.

  7. Semantic MEDLINE for discovery browsing: using semantic predications and the literature-based discovery paradigm to elucidate a mechanism for the obesity paradox.

    Science.gov (United States)

    Cairelli, Michael J; Miller, Christopher M; Fiszman, Marcelo; Workman, T Elizabeth; Rindflesch, Thomas C

    2013-01-01

    Applying the principles of literature-based discovery (LBD), we elucidate the paradox that obesity is beneficial in critical care despite contributing to disease generally. Our approach enhances a previous extension to LBD, called "discovery browsing," and is implemented using Semantic MEDLINE, which summarizes the results of a PubMed search into an interactive graph of semantic predications. The methodology allows a user to construct argumentation underpinning an answer to a biomedical question by engaging the user in an iterative process between system output and user knowledge. Components of the Semantic MEDLINE output graph identified as "interesting" by the user both contribute to subsequent searches and are constructed into a logical chain of relationships constituting an explanatory network in answer to the initial question. Based on this methodology we suggest that phthalates leached from plastic in critical care interventions activate PPAR gamma, which is anti-inflammatory and abundant in obese patients.

  8. Sports Stars: Analyzing the Performance of Astronomers at Visualization-based Discovery

    Science.gov (United States)

    Fluke, C. J.; Parrington, L.; Hegarty, S.; MacMahon, C.; Morgan, S.; Hassan, A. H.; Kilborn, V. A.

    2017-05-01

    In this data-rich era of astronomy, there is a growing reliance on automated techniques to discover new knowledge. The role of the astronomer may change from being a discoverer to being a confirmer. But what do astronomers actually look at when they distinguish between “sources” and “noise?” What are the differences between novice and expert astronomers when it comes to visual-based discovery? Can we identify elite talent or coach astronomers to maximize their potential for discovery? By looking to the field of sports performance analysis, we consider an established, domain-wide approach, where the expertise of the viewer (i.e., a member of the coaching team) plays a crucial role in identifying and determining the subtle features of gameplay that provide a winning advantage. As an initial case study, we investigate whether the SportsCode performance analysis software can be used to understand and document how an experienced H I astronomer makes discoveries in spectral data cubes. We find that the process of timeline-based coding can be applied to spectral cube data by mapping spectral channels to frames within a movie. SportsCode provides a range of easy to use methods for annotation, including feature-based codes and labels, text annotations associated with codes, and image-based drawing. The outputs, including instance movies that are uniquely associated with coded events, provide the basis for a training program or team-based analysis that could be used in unison with discipline specific analysis software. In this coordinated approach to visualization and analysis, SportsCode can act as a visual notebook, recording the insight and decisions in partnership with established analysis methods. Alternatively, in situ annotation and coding of features would be a valuable addition to existing and future visualization and analysis packages.

  9. Harvest: an open platform for developing web-based biomedical data discovery and reporting applications.

    Science.gov (United States)

    Pennington, Jeffrey W; Ruth, Byron; Italia, Michael J; Miller, Jeffrey; Wrazien, Stacey; Loutrel, Jennifer G; Crenshaw, E Bryan; White, Peter S

    2014-01-01

    Biomedical researchers share a common challenge of making complex data understandable and accessible as they seek inherent relationships between attributes in disparate data types. Data discovery in this context is limited by a lack of query systems that efficiently show relationships between individual variables, but without the need to navigate underlying data models. We have addressed this need by developing Harvest, an open-source framework of modular components, and using it for the rapid development and deployment of custom data discovery software applications. Harvest incorporates visualizations of highly dimensional data in a web-based interface that promotes rapid exploration and export of any type of biomedical information, without exposing researchers to underlying data models. We evaluated Harvest with two cases: clinical data from pediatric cardiology and demonstration data from the OpenMRS project. Harvest's architecture and public open-source code offer a set of rapid application development tools to build data discovery applications for domain-specific biomedical data repositories. All resources, including the OpenMRS demonstration, can be found at http://harvest.research.chop.edu.

  10. NASA's GeneLab Phase II: Federated Search and Data Discovery

    Science.gov (United States)

    Berrios, Daniel C.; Costes, Sylvain V.; Tran, Peter B.

    2017-01-01

    GeneLab is currently being developed by NASA to accelerate 'open science' biomedical research in support of the human exploration of space and the improvement of life on earth. Phase I of the four-phase GeneLab Data Systems (GLDS) project emphasized capabilities for submission, curation, search, and retrieval of genomics, transcriptomics and proteomics ('omics') data from biomedical research of space environments. The focus of development of the GLDS for Phase II has been federated data search for and retrieval of these kinds of data across other open-access systems, so that users are able to conduct biological meta-investigations using data from a variety of sources. Such meta-investigations are key to corroborating findings from many kinds of assays and translating them into systems biology knowledge and, eventually, therapeutics.

  11. NASA's GeneLab Phase II: Federated Search and Data Discovery

    Science.gov (United States)

    Berrios, Daniel C.; Costes, Sylvain; Tran, Peter

    2017-01-01

    GeneLab is currently being developed by NASA to accelerate open science biomedical research in support of the human exploration of space and the improvement of life on earth. Phase I of the four-phase GeneLab Data Systems (GLDS) project emphasized capabilities for submission, curation, search, and retrieval of genomics, transcriptomics and proteomics (omics) data from biomedical research of space environments. The focus of development of the GLDS for Phase II has been federated data search for and retrieval of these kinds of data across other open-access systems, so that users are able to conduct biological meta-investigations using data from a variety of sources. Such meta-investigations are key to corroborating findings from many kinds of assays and translating them into systems biology knowledge and, eventually, therapeutics.

  12. Gene discovery for improvement of kernel quality-related traits in maize

    Directory of Open Access Journals (Sweden)

    Motto M.

    2010-01-01

    Full Text Available Developing maize plants with improved kernel quality traits involves the ability to use existing genetic variation and to identify and manipulate commercially important genes. This will open avenues for designing novel variation in grain composition and will provide the basis for the development of the next generation of specialty maize. This paper provides an overview of current knowledge on the identification and exploitation of genes affecting the composition, development, and structure of the maize kernel with particular emphasis on pathways relevant to endosperm growth and development, differentiation of starch-filled cells, and biosynthesis of starches, storage proteins, lipids, and carotenoids. The potential that the new technologies of cell and molecular biology will provide for the creation of new variation in the future are also indicated and discussed.

  13. Omics-based natural product discovery and the lexicon of genome mining.

    Science.gov (United States)

    Machado, Henrique; Tuttle, Robert N; Jensen, Paul R

    2017-10-01

    Genome sequencing and the application of omic techniques are driving many important advances in the field of microbial natural products research. Despite these gains, there remain aspects of the natural product discovery pipeline where our knowledge remains poor. These include the extent to which biosynthetic gene clusters are transcriptionally active in native microbes, the temporal dynamics of transcription, translation, and natural product assembly, as well as the relationships between small molecule production and detection. Here we touch on a number of these concepts in the context of continuing efforts to unlock the natural product potential revealed in genome sequence data and discuss nomenclatural issues that warrant consideration as the field moves forward. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. Strategies for enhancing the effectiveness of metagenomic-based enzyme discovery in lignocellulolytic microbial communities

    Energy Technology Data Exchange (ETDEWEB)

    DeAngelis, K.M.; Gladden, J.G.; Allgaier, M.; D' haeseleer, P.; Fortney, J.L.; Reddy, A.; Hugenholtz, P.; Singer, S.W.; Vander Gheynst, J.; Silver, W.L.; Simmons, B.; Hazen, T.C.

    2010-03-01

    Producing cellulosic biofuels from plant material has recently emerged as a key U.S. Department of Energy goal. For this technology to be commercially viable on a large scale, it is critical to make production cost efficient by streamlining both the deconstruction of lignocellulosic biomass and fuel production. Many natural ecosystems efficiently degrade lignocellulosic biomass and harbor enzymes that, when identified, could be used to increase the efficiency of commercial biomass deconstruction. However, ecosystems most likely to yield relevant enzymes, such as tropical rain forest soil in Puerto Rico, are often too complex for enzyme discovery using current metagenomic sequencing technologies. One potential strategy to overcome this problem is to selectively cultivate the microbial communities from these complex ecosystems on biomass under defined conditions, generating less complex biomass-degrading microbial populations. To test this premise, we cultivated microbes from Puerto Rican soil or green waste compost under precisely defined conditions in the presence of dried ground switchgrass (Panicum virgatum L.) or lignin, respectively, as the sole carbon source. Phylogenetic profiling of the two feedstock-adapted communities using SSU rRNA gene amplicon pyrosequencing or phylogenetic microarray analysis revealed that the adapted communities were significantly simplified compared to the natural communities from which they were derived. Several members of the lignin-adapted and switchgrass-adapted consortia are related to organisms previously characterized as biomass degraders, while others were from less well-characterized phyla. The decrease in complexity of these communities makes them good candidates for metagenomic sequencing and will likely enable the reconstruction of a greater number of full-length genes, leading to the discovery of novel lignocellulose-degrading enzymes adapted to feedstocks and conditions of interest.

  15. Diversity of ribulose-1,5-bisphosphate carboxylase/oxygenase large-subunit genes in the MgCl2-dominated deep hypersaline anoxic basin discovery

    NARCIS (Netherlands)

    van der Wielen, PWJJ

    Partial sequences of the form I (cbbL) and form II (cbbM) of the ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO) large subunit genes were obtained from the brine and interface of the MgCl2-dominated deep hypersaline anoxic basin Discovery. CbbL and cbbM genes were found in both brine and

  16. Immediate dissemination of student discoveries to a model organism database enhances classroom-based research experiences.

    Science.gov (United States)

    Wiley, Emily A; Stover, Nicholas A

    2014-01-01

    Use of inquiry-based research modules in the classroom has soared over recent years, largely in response to national calls for teaching that provides experience with scientific processes and methodologies. To increase the visibility of in-class studies among interested researchers and to strengthen their impact on student learning, we have extended the typical model of inquiry-based labs to include a means for targeted dissemination of student-generated discoveries. This initiative required: 1) creating a set of research-based lab activities with the potential to yield results that a particular scientific community would find useful and 2) developing a means for immediate sharing of student-generated results. Working toward these goals, we designed guides for course-based research aimed to fulfill the need for functional annotation of the Tetrahymena thermophila genome, and developed an interactive Web database that links directly to the official Tetrahymena Genome Database for immediate, targeted dissemination of student discoveries. This combination of research via the course modules and the opportunity for students to immediately "publish" their novel results on a Web database actively used by outside scientists culminated in a motivational tool that enhanced students' efforts to engage the scientific process and pursue additional research opportunities beyond the course.

  17. Fish Suppressors of Cytokine Signaling (SOCS): Gene Discovery, Modulation of Expression and Function

    Science.gov (United States)

    Wang, Tiehui; Gorgoglione, Bartolomeo; Maehr, Tanja; Holland, Jason W.; Vecino, Jose L. González; Wadsworth, Simon; Secombes, Christopher J.

    2011-01-01

    The intracellular suppressors of cytokine signaling (SOCS) family members, including CISH and SOCS1 to 7 in mammals, are important regulators of cytokine signaling pathways. So far, the orthologues of all the eight mammalian SOCS members have been identified in fish, with several of them having multiple copies. Whilst fish CISH, SOCS3, and SOCS5 paralogues are possibly the result of the fish-specific whole genome duplication event, gene duplication or lineage-specific genome duplication may also contribute to some paralogues, as with the three trout SOCS2s and three zebrafish SOCS5s. Fish SOCS genes are broadly expressed and also show species-specific expression patterns. They can be upregulated by cytokines, such as IFN-γ, TNF-α, IL-1β, IL-6, and IL-21, by immune stimulants such as LPS, poly I:C, and PMA, as well as by viral, bacterial, and parasitic infections in member- and species-dependent manners. Initial functional studies demonstrate conserved mechanisms of fish SOCS action via JAK/STAT pathways. PMID:22203897

  18. AutoDrug: fully automated macromolecular crystallography workflows for fragment-based drug discovery.

    Science.gov (United States)

    Tsai, Yingssu; McPhillips, Scott E; González, Ana; McPhillips, Timothy M; Zinn, Daniel; Cohen, Aina E; Feese, Michael D; Bushnell, David; Tiefenbrunn, Theresa; Stout, C David; Ludaescher, Bertram; Hedman, Britt; Hodgson, Keith O; Soltis, S Michael

    2013-05-01

    AutoDrug is software based upon the scientific workflow paradigm that integrates the Stanford Synchrotron Radiation Lightsource macromolecular crystallography beamlines and third-party processing software to automate the crystallography steps of the fragment-based drug-discovery process. AutoDrug screens a cassette of fragment-soaked crystals, selects crystals for data collection based on screening results and user-specified criteria and determines optimal data-collection strategies. It then collects and processes diffraction data, performs molecular replacement using provided models and detects electron density that is likely to arise from bound fragments. All processes are fully automated, i.e. are performed without user interaction or supervision. Samples can be screened in groups corresponding to particular proteins, crystal forms and/or soaking conditions. A single AutoDrug run is only limited by the capacity of the sample-storage dewar at the beamline: currently 288 samples. AutoDrug was developed in conjunction with RestFlow, a new scientific workflow-automation framework. RestFlow simplifies the design of AutoDrug by managing the flow of data and the organization of results and by orchestrating the execution of computational pipeline steps. It also simplifies the execution and interaction of third-party programs and the beamline-control system. Modeling AutoDrug as a scientific workflow enables multiple variants that meet the requirements of different user groups to be developed and supported. A workflow tailored to mimic the crystallography stages comprising the drug-discovery pipeline of CoCrystal Discovery Inc. has been deployed and successfully demonstrated. This workflow was run once on the same 96 samples that the group had examined manually and the workflow cycled successfully through all of the samples, collected data from the same samples that were selected manually and located the same peaks of unmodeled density in the resulting difference Fourier

  19. Cultivation of Hard-To-Culture Subsurface Mercury-Resistant Bacteria and Discovery of New merA Gene Sequences

    Science.gov (United States)

    Rasmussen, L. D.; Zawadsky, C.; Binnerup, S. J.; Øregaard, G.; Sørensen, S. J.; Kroer, N.

    2008-01-01

    Mercury-resistant bacteria may be important players in mercury biogeochemistry. To assess the potential for mercury reduction by two subsurface microbial communities, resistant subpopulations and their merA genes were characterized by a combined molecular and cultivation-dependent approach. The cultivation method simulated natural conditions by using polycarbonate membranes as a growth support and a nonsterile soil slurry as a culture medium. Resistant bacteria were pregrown to microcolony-forming units (mCFU) before being plated on standard medium. Compared to direct plating, culturability was increased up to 2,800 times and numbers of mCFU were similar to the total number of mercury-resistant bacteria in the soils. Denaturing gradient gel electrophoresis analysis of DNA extracted from membranes suggested stimulation of growth of hard-to-culture bacteria during the preincubation. A total of 25 different 16S rRNA gene sequences were observed, including Alpha-, Beta-, and Gammaproteobacteria; Actinobacteria; Firmicutes; and Bacteroidetes. The diversity of isolates obtained by direct plating included eight different 16S rRNA gene sequences (Alpha- and Betaproteobacteria and Actinobacteria). Partial sequencing of merA of selected isolates led to the discovery of new merA sequences. With phylum-specific merA primers, PCR products were obtained for Alpha- and Betaproteobacteria and Actinobacteria but not for Bacteroidetes and Firmicutes. The similarity to known sequences ranged between 89 and 95%. One of the sequences did not result in a match in the BLAST search. The results illustrate the power of integrating advanced cultivation methodology with molecular techniques for the characterization of the diversity of mercury-resistant populations and assessing the potential for mercury reduction in contaminated environments. PMID:18441111

  20. Novel Technology for Protein-Protein Interaction-based Targeted Drug Discovery

    Directory of Open Access Journals (Sweden)

    Jung Me Hwang

    2011-12-01

    Full Text Available We have developed a simple but highly efficient in-cell protein-protein interaction (PPI) discovery system based on the translocation properties of protein kinase C- and its C1a domain in live cells. This system allows the visual detection of trimeric and dimeric protein interactions including cytosolic, nuclear, and/or membrane proteins with their cognate ligands. In addition, this system can be used to identify pharmacological small compounds that inhibit specific PPIs. These properties make this PPI system an attractive tool for screening drug candidates and mapping the protein interactome.

  1. AutoDrug: fully automated macromolecular crystallography workflows for fragment-based drug discovery

    International Nuclear Information System (INIS)

    Tsai, Yingssu; McPhillips, Scott E.; González, Ana; McPhillips, Timothy M.; Zinn, Daniel; Cohen, Aina E.; Feese, Michael D.; Bushnell, David; Tiefenbrunn, Theresa; Stout, C. David; Ludaescher, Bertram; Hedman, Britt; Hodgson, Keith O.; Soltis, S. Michael

    2013-01-01

    New software has been developed for automating the experimental and data-processing stages of fragment-based drug discovery at a macromolecular crystallography beamline. A new workflow-automation framework orchestrates beamline-control and data-analysis software while organizing results from multiple samples. AutoDrug is software based upon the scientific workflow paradigm that integrates the Stanford Synchrotron Radiation Lightsource macromolecular crystallography beamlines and third-party processing software to automate the crystallography steps of the fragment-based drug-discovery process. AutoDrug screens a cassette of fragment-soaked crystals, selects crystals for data collection based on screening results and user-specified criteria and determines optimal data-collection strategies. It then collects and processes diffraction data, performs molecular replacement using provided models and detects electron density that is likely to arise from bound fragments. All processes are fully automated, i.e. are performed without user interaction or supervision. Samples can be screened in groups corresponding to particular proteins, crystal forms and/or soaking conditions. A single AutoDrug run is only limited by the capacity of the sample-storage dewar at the beamline: currently 288 samples. AutoDrug was developed in conjunction with RestFlow, a new scientific workflow-automation framework. RestFlow simplifies the design of AutoDrug by managing the flow of data and the organization of results and by orchestrating the execution of computational pipeline steps. It also simplifies the execution and interaction of third-party programs and the beamline-control system. Modeling AutoDrug as a scientific workflow enables multiple variants that meet the requirements of different user groups to be developed and supported. A workflow tailored to mimic the crystallography stages comprising the drug-discovery pipeline of CoCrystal Discovery Inc. has been deployed and successfully

  2. UPnP-Based Discovery And Management Of Hypervisors And Virtual Machines

    Directory of Open Access Journals (Sweden)

    Sławomir Zieliński

    2011-01-01

    Full Text Available The paper introduces a Universal Plug and Play based discovery and management toolkit that facilitates collaboration between cloud infrastructure providers and users. The presented tools construct a unified hierarchy of devices and their management-related services, that represents the current deployment of users’ (virtual) infrastructures in the provider’s (physical) infrastructure as well as the management interfaces of respective devices. The hierarchy can be used to enhance the capabilities of the provider’s infrastructure management system. To maintain user independence, the set of management operations exposed by a particular device is always defined by the device owner (either the provider or user).

  3. Yeast homologous recombination-based promoter engineering for the activation of silent natural product biosynthetic gene clusters.

    Science.gov (United States)

    Montiel, Daniel; Kang, Hahk-Soo; Chang, Fang-Yuan; Charlop-Powers, Zachary; Brady, Sean F

    2015-07-21

    Large-scale sequencing of prokaryotic (meta)genomic DNA suggests that most bacterial natural product gene clusters are not expressed under common laboratory culture conditions. Silent gene clusters represent a promising resource for natural product discovery and the development of a new generation of therapeutics. Unfortunately, the characterization of molecules encoded by these clusters is hampered owing to our inability to express these gene clusters in the laboratory. To address this bottleneck, we have developed a promoter-engineering platform to transcriptionally activate silent gene clusters in a model heterologous host. Our approach uses yeast homologous recombination, an auxotrophy complementation-based yeast selection system and sequence orthogonal promoter cassettes to exchange all native promoters in silent gene clusters with constitutively active promoters. As part of this platform, we constructed and validated a set of bidirectional promoter cassettes consisting of orthogonal promoter sequences, Streptomyces ribosome binding sites, and yeast selectable marker genes. Using these tools we demonstrate the ability to simultaneously insert multiple promoter cassettes into a gene cluster, thereby expediting the reengineering process. We apply this method to model active and silent gene clusters (rebeccamycin and tetarimycin) and to the silent, cryptic pseudogene-containing, environmental DNA-derived Lzr gene cluster. Complete promoter refactoring and targeted gene exchange in this "dead" cluster led to the discovery of potent indolotryptoline antiproliferative agents, lazarimides A and B. This potentially scalable and cost-effective promoter reengineering platform should streamline the discovery of natural products from silent natural product biosynthetic gene clusters.

  4. Discovery of Metastatic Breast Cancer Suppressor Genes Using Functional Genome Analysis

    Science.gov (United States)

    2012-07-01

    al., 2008; Cheung,H.W., et al., 2011; Barbie,D.A., et al., 2009]. To identify genes whose essentiality could be associated specifically with...Reference Barbie,D.A., Tamayo,P., Boehm,J.S., Kim,S.Y., Moody,S.E., Dunn,I.F., Schinzel,A.C., Sandy,P., Meylan,E., Scholl,C., Frohling,S., Chan,E.M... Barbie,D.A., Awad,T., Zhou,X., Nguyen,T., Piqani,B., Li,C., Golub,T.R., Meyerson,M., Hacohen,N., Hahn,W.C., Lander,E.S., Sabatini,D.M., and Root

  5. A Novel Mobile Video Community Discovery Scheme Using Ontology-Based Semantical Interest Capture

    Directory of Open Access Journals (Sweden)

    Ruiling Zhang

    2016-01-01

    Full Text Available Leveraging network virtualization technologies, community-based video systems rely on the measurement of common interests to define and stabilize relationships between community members, which promotes video sharing performance and improves the scalability of the community structure. In this paper, we propose a novel mobile Video Community discovery scheme using ontology-based semantical interest capture (VCOSI). An ontology-based semantical extension approach is proposed, which describes video content and measures video similarity according to video keyword selection methods. In order to reduce the calculation load of video similarity, VCOSI designs a prefix-filtering-based estimation algorithm to decrease the energy consumption of mobile nodes. VCOSI further proposes a member relationship estimation method to construct scalable and resilient node communities, which promotes the video sharing capacity of video systems with flexible and economical community maintenance. Extensive tests show that VCOSI obtains better performance results in comparison with other state-of-the-art solutions.
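
    The abstract above does not describe VCOSI's estimation algorithm in detail. As a rough illustration of the general prefix-filtering idea only, the sketch below applies the textbook prefix filter for Jaccard similarity to keyword sets. The function names, the sample data, and the choice of Jaccard similarity are all assumptions made for this example and are not taken from the paper.

```python
import math


def jaccard(a, b):
    """Exact Jaccard similarity of two keyword collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def prefix(tokens, threshold, freq):
    """Return the filtering prefix of a keyword set.

    Keywords are sorted by a global order (rarest first, ties broken
    alphabetically). Two sets whose Jaccard similarity reaches the
    threshold must share at least one keyword in these prefixes.
    """
    ordered = sorted(set(tokens), key=lambda t: (freq[t], t))
    keep = len(ordered) - math.ceil(threshold * len(ordered)) + 1
    return set(ordered[:keep])


def similar_pairs(videos, threshold):
    """Hypothetical helper: find video pairs with Jaccard >= threshold.

    Pairs whose prefixes do not intersect are skipped without computing
    the exact similarity, which is where the saving comes from.
    """
    freq = {}
    for keywords in videos.values():
        for k in set(keywords):
            freq[k] = freq.get(k, 0) + 1
    prefixes = {v: prefix(k, threshold, freq) for v, k in videos.items()}
    names = sorted(videos)
    result = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if prefixes[a] & prefixes[b]:  # candidate pair
                score = jaccard(videos[a], videos[b])
                if score >= threshold:
                    result.append((a, b, score))
    return result


# Made-up keyword sets for three videos.
videos = {
    "a": ["cat", "dog", "fish"],
    "b": ["cat", "dog", "bird"],
    "c": ["car", "bus"],
}
pairs = similar_pairs(videos, 0.5)
```

    In this toy data only the pair ("a", "b") passes the filter and is scored exactly; pairs involving "c" are discarded from the prefixes alone.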

  6. Gene discovery using massively parallel pyrosequencing to develop ESTs for the flesh fly Sarcophaga crassipalpis

    Directory of Open Access Journals (Sweden)

    Hahn Daniel A

    2009-05-01

    Full Text Available Background: Flesh flies in the genus Sarcophaga are important models for investigating endocrinology, diapause, cold hardiness, reproduction, and immunity. Despite the prominence of Sarcophaga flesh flies as models for insect physiology and biochemistry, and in forensic studies, little genomic or transcriptomic data are available for members of this genus. We used massively parallel pyrosequencing on the Roche 454-FLX platform to produce a substantial EST dataset for the flesh fly Sarcophaga crassipalpis. To maximize sequence diversity, we pooled RNA extracted from whole bodies of all life stages and normalized the cDNA pool after reverse transcription. Results: We obtained 207,110 ESTs with an average read length of 241 bp. These reads assembled into 20,995 contigs and 31,056 singletons. Using BLAST searches of the NR and NT databases we were able to identify 11,757 unique gene elements (ES. crassipalpis unigenes among GO Biological Process functional groups with that of the Drosophila melanogaster transcriptome suggests that our ESTs are broadly representative of the flesh fly transcriptome. Insertion and deletion errors in 454 sequencing present a serious hurdle to comparative transcriptome analysis. Aided by a new approach to correcting for these errors, we performed a comparative analysis of genetic divergence across GO categories among S. crassipalpis, D. melanogaster, and Anopheles gambiae. The results suggest that non-synonymous substitutions occur at similar rates across categories, although genes related to response to stimuli may evolve slightly faster. In addition, we identified over 500 potential microsatellite loci and more than 12,000 SNPs among our ESTs. Conclusion: Our data provide the first large-scale EST project for flesh flies, a much-needed resource for exploring this model species. In addition, we identified a large number of potential microsatellite and SNP markers that could be used in population and systematic

  7. Discovery of Gene Sources for Economic Traits in Hanwoo by Whole-genome Resequencing

    Directory of Open Access Journals (Sweden)

    Younhee Shin

    2016-09-01

    Full Text Available Hanwoo, a Korean native cattle (Bos taurus coreana), has great economic value due to high meat quality. Also, the breed has genetic variations that are associated with production traits such as health, disease resistance, reproduction, growth as well as carcass quality. In this study, next generation sequencing technologies and the availability of an appropriate reference genome were applied to discover a large number of single nucleotide polymorphisms (SNPs) in ten Hanwoo bulls. Analysis of whole-genome resequencing generated a total of 26.5 Gb data, of which 594,716,859 and 592,990,750 reads covered 98.73% and 93.79% of the bovine reference genomes of UMD 3.1 and Btau 4.6.1, respectively. In total, 2,473,884 and 2,402,997 putative SNPs were discovered, of which 1,095,922 (44.3%) and 982,674 (40.9%) novel SNPs were discovered against UMD 3.1 and Btau 4.6.1, respectively. Among the SNPs, the 46,301 (UMD 3.1) and 28,613 SNPs (Btau 4.6.1) that were identified as Hanwoo-specific SNPs were included in the functional genes that may be involved in the mechanisms of milk production, tenderness, juiciness, marbling of Hanwoo beef and yellow hair. Most of the Hanwoo-specific SNPs were identified in the promoter region, suggesting that the SNPs influence differential expression of the regulated genes relative to the relevant traits. In particular, the non-synonymous (ns) SNPs found in CORIN, which is a negative regulator of Agouti, might be a causal variant to determine yellow hair of Hanwoo. Our results will provide abundant genetic sources of variation to characterize Hanwoo genetics and for subsequent breeding.

  8. An evaluation of the utility of physiologically based models of pharmacokinetics in early drug discovery.

    Science.gov (United States)

    Parrott, Neil; Paquereau, Nicolas; Coassolo, Philippe; Lavé, Thierry

    2005-10-01

    Generic physiologically-based models of pharmacokinetics were evaluated for early drug discovery. Plasma profiles after intravenous and oral dosing were simulated in rat for 68 compounds from six chemical classes. Input data consisted of structure based predictions of lipophilicity, ionization, and protein binding plus intrinsic clearance measured in rat hepatocytes, single measured values of aqueous solubility, and artificial membrane permeability. LogP of compounds was high with a mean of 3.9 while free fraction in plasma (mean 9%) and solubility (mean 37 microg/mL) were low. Predicted and observed clearance and volume showed mean fold-error and R2 of 1.8, 0.56, and 1.9, 0.25 respectively. Predicted bioavailability showed a strong bias toward underprediction, correlated with very low aqueous solubility, and a theoretical correction for bile salt solubilization in vivo brought some improvement in average prediction error (to 31%). Overall, this evaluation shows that generic simulation may be applicable for typical drug-like compounds to predict differences in pharmacokinetic parameters of more than twofold based upon minimal measured input data. However, verification of the simulations with in vivo data for a few compounds of each compound class is recommended since recent discovery compounds may have properties beyond the scope of the current generic models. Copyright (c) 2005 Wiley-Liss, Inc. and the American Pharmacists Association
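
    As an illustration of the accuracy metrics quoted in this abstract, the following minimal sketch computes a mean fold-error and an R2 for predicted versus observed values. The abstract does not define either metric; the geometric fold-error (10 raised to the mean absolute log10 ratio) and the squared Pearson correlation used here are assumptions, and the clearance numbers are made-up placeholders, not data from the study.

```python
import math

def mean_fold_error(predicted, observed):
    """Geometric mean fold-error: 10 ** mean(|log10(pred / obs)|).

    Assumed definition; the abstract does not state which form was used.
    """
    logs = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(logs) / len(logs))

def r_squared(predicted, observed):
    """Squared Pearson correlation between predicted and observed values."""
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov * cov / (var_p * var_o)

# Hypothetical clearance values (placeholders for illustration only).
predicted_cl = [10.0, 20.0, 5.0, 40.0]
observed_cl = [20.0, 10.0, 5.0, 80.0]

mfe = mean_fold_error(predicted_cl, observed_cl)
r2 = r_squared(predicted_cl, observed_cl)
```

    Under this definition a fold-error of 1 means perfect agreement, and over- and underprediction by the same factor are penalized equally.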

  9. A Performance/Cost Evaluation for a GPU-Based Drug Discovery Application on Volunteer Computing

    Science.gov (United States)

    Guerrero, Ginés D.; Imbernón, Baldomero; García, José M.

    2014-01-01

    Bioinformatics is an interdisciplinary research field that develops tools for the analysis of large biological databases, and, thus, the use of high performance computing (HPC) platforms is mandatory for the generation of useful biological knowledge. The latest generation of graphics processing units (GPUs) has democratized the use of HPC as they push desktop computers to cluster-level performance. Many applications within this field have been developed to leverage these powerful and low-cost architectures. However, these applications still need to scale to larger GPU-based systems to enable remarkable advances in the fields of healthcare, drug discovery, genome research, etc. The inclusion of GPUs in HPC systems exacerbates power and temperature issues, increasing the total cost of ownership (TCO). This paper explores the benefits of volunteer computing to scale bioinformatics applications as an alternative to owning large local GPU-based infrastructures. As a benchmark, we use a GPU-based drug discovery application called BINDSURF, whose computational requirements go beyond a single desktop machine. Volunteer computing is presented as a cheap and valid HPC system for those bioinformatics applications that need to process huge amounts of data and where the response time is not a critical factor. PMID:25025055

  10. Natural product proteomining, a quantitative proteomics platform, allows rapid discovery of biosynthetic gene clusters for different classes of natural products.

    Science.gov (United States)

    Gubbens, Jacob; Zhu, Hua; Girard, Geneviève; Song, Lijiang; Florea, Bogdan I; Aston, Philip; Ichinose, Koji; Filippov, Dmitri V; Choi, Young H; Overkleeft, Herman S; Challis, Gregory L; van Wezel, Gilles P

    2014-06-19

    Information on gene clusters for natural product biosynthesis is accumulating rapidly because of the current boom of available genome sequencing data. However, linking a natural product to a specific gene cluster remains challenging. Here, we present a widely applicable strategy for the identification of gene clusters for specific natural products, which we name natural product proteomining. The method is based on using fluctuating growth conditions that ensure differential biosynthesis of the bioactivity of interest. Subsequent combination of metabolomics and quantitative proteomics establishes correlations between abundance of natural products and concomitant changes in the protein pool, which allows identification of the relevant biosynthetic gene cluster. We used this approach to elucidate gene clusters for different natural products in Bacillus and Streptomyces, including a novel juglomycin-type antibiotic. Natural product proteomining does not require prior knowledge of the gene cluster or secondary metabolite and therefore represents a general strategy for identification of all types of gene clusters. Copyright © 2014 Elsevier Ltd. All rights reserved.

  11. Gene discovery and transcript analyses in the corn smut pathogen Ustilago maydis: expressed sequence tag and genome sequence comparison

    Directory of Open Access Journals (Sweden)

    Saville Barry J

    2007-09-01

    Full Text Available Abstract Background Ustilago maydis is the basidiomycete fungus responsible for common smut of corn and is a model organism for the study of fungal phytopathogenesis. To aid in the annotation of the genome sequence of this organism, several expressed sequence tag (EST) libraries were generated from a variety of U. maydis cell types. In addition to utility in the context of gene identification and structure annotation, the ESTs were analyzed to identify differentially abundant transcripts and to detect evidence of alternative splicing and anti-sense transcription. Results Four cDNA libraries were constructed using RNA isolated from U. maydis diploid teliospores (U. maydis strains 518 × 521) and haploid cells of strain 521 grown under nutrient rich, carbon starved, and nitrogen starved conditions. Using the genome sequence as a scaffold, the 15,901 ESTs were assembled into 6,101 contiguous expressed sequences (contigs); among these, 5,482 corresponded to predicted genes in the MUMDB (MIPS Ustilago maydis database), while 619 aligned to regions of the genome not yet designated as genes in MUMDB. A comparison of EST abundance identified numerous genes that may be regulated in a cell type or starvation-specific manner. The transcriptional response to nitrogen starvation was assessed using RT-qPCR. The results of this suggest that there may be cross-talk between the nitrogen and carbon signalling pathways in U. maydis. Bioinformatic analysis identified numerous examples of alternative splicing and anti-sense transcription. While intron retention was the predominant form of alternative splicing in U. maydis, other varieties were also evident (e.g. exon skipping). Selected instances of both alternative splicing and anti-sense transcription were independently confirmed using RT-PCR. Conclusion Through this work: 1) substantial sequence information has been provided for U. maydis genome annotation; 2) new genes were identified through the discovery of 619

  12. Gene- and evidence-based candidate gene selection for schizophrenia and gene feature analysis.

    Science.gov (United States)

    Sun, Jingchun; Han, Leng; Zhao, Zhongming

    2010-01-01

    Schizophrenia is a chronic psychiatric disorder that affects about 1% of the population globally. A tremendous amount of effort has been expended in the past decade, including more than 2400 association studies, to identify genes influencing susceptibility to the disorder. However, few genes or markers have been reliably replicated. The wealth of this information calls for an integration of gene association data, evidence-based gene ranking, and follow-up replication in large samples. The objective of this study is to develop and evaluate evidence-based gene ranking methods and to examine the features of top-ranking candidate genes for schizophrenia. We proposed a gene-based approach for selecting and prioritizing candidate genes by combining odds ratios (ORs) of multiple markers in each association study and then combining ORs in multiple studies of a gene. We named it combination-combination OR method (CCOR). CCOR is similar to our recently published method, which first selects the largest OR of the markers in each study and then combines these ORs in multiple studies (i.e., selection-combination OR method, SCOR), but differs in how the representative OR is chosen in each study. Features of top-ranking genes were examined by Gene Ontology terms and gene expression in tissues. Our evaluation suggested that the SCOR method overall outperforms the CCOR method. Using the SCOR, a list of 75 top-ranking genes was selected for schizophrenia candidate genes (SZGenes). We found that SZGenes had strong correlation with neuro-related functional terms and were highly expressed in brain-related tissues. The scientific landscape for schizophrenia genetics and other complex disease studies is expected to change dramatically in the next few years; thus, the gene-based combined OR method is useful in candidate gene selection for follow-up association studies and in further artificial intelligence in medicine. This method for prioritization of candidate genes can be applied to other

  13. Discovery of PPi-type Phosphoenolpyruvate Carboxykinase Genes in Eukaryotes and Bacteria

    Science.gov (United States)

    Chiba, Yoko; Kamikawa, Ryoma; Nakada-Tsukui, Kumiko; Saito-Nakano, Yumiko; Nozaki, Tomoyoshi

    2015-01-01

    Phosphoenolpyruvate carboxykinase (PEPCK) is one of the pivotal enzymes that regulates the carbon flow of the central metabolism by fixing CO2 to phosphoenolpyruvate (PEP) to produce oxaloacetate or vice versa. Whereas ATP- and GTP-type PEPCKs have been well studied, and their protein identities are established, inorganic pyrophosphate (PPi)-type PEPCK (PPi-PEPCK) is poorly characterized. Despite extensive enzymological studies, its protein identity and encoding gene remain unknown. In this study, PPi-PEPCK has been identified for the first time from a eukaryotic human parasite, Entamoeba histolytica, by conventional purification and mass spectrometric identification of the native enzyme, followed by demonstration of its enzymatic activity. A homolog of the amebic PPi-PEPCK from an anaerobic bacterium Propionibacterium freudenreichii subsp. shermanii also exhibited PPi-PEPCK activity. The primary structure of PPi-PEPCK has no similarity to the functional homologs ATP/GTP-PEPCKs and PEP carboxylase, strongly suggesting that PPi-PEPCK arose independently from the other functional homologues and very likely has unique catalytic sites. PPi-PEPCK homologs were found in a variety of bacteria and some eukaryotes but not in archaea. The molecular identification of this long forgotten enzyme shows us the diversity and functional redundancy of enzymes involved in the central metabolism and can help us to understand the central metabolism more deeply. PMID:26269598

  14. Discovery of PPi-type Phosphoenolpyruvate Carboxykinase Genes in Eukaryotes and Bacteria.

    Science.gov (United States)

    Chiba, Yoko; Kamikawa, Ryoma; Nakada-Tsukui, Kumiko; Saito-Nakano, Yumiko; Nozaki, Tomoyoshi

    2015-09-25

    Phosphoenolpyruvate carboxykinase (PEPCK) is one of the pivotal enzymes that regulates the carbon flow of the central metabolism by fixing CO2 to phosphoenolpyruvate (PEP) to produce oxaloacetate or vice versa. Whereas ATP- and GTP-type PEPCKs have been well studied, and their protein identities are established, inorganic pyrophosphate (PPi)-type PEPCK (PPi-PEPCK) is poorly characterized. Despite extensive enzymological studies, its protein identity and encoding gene remain unknown. In this study, PPi-PEPCK has been identified for the first time from a eukaryotic human parasite, Entamoeba histolytica, by conventional purification and mass spectrometric identification of the native enzyme, followed by demonstration of its enzymatic activity. A homolog of the amebic PPi-PEPCK from an anaerobic bacterium Propionibacterium freudenreichii subsp. shermanii also exhibited PPi-PEPCK activity. The primary structure of PPi-PEPCK has no similarity to the functional homologs ATP/GTP-PEPCKs and PEP carboxylase, strongly suggesting that PPi-PEPCK arose independently from the other functional homologues and very likely has unique catalytic sites. PPi-PEPCK homologs were found in a variety of bacteria and some eukaryotes but not in archaea. The molecular identification of this long forgotten enzyme shows us the diversity and functional redundancy of enzymes involved in the central metabolism and can help us to understand the central metabolism more deeply. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.

  15. The Analysis of Image Segmentation Hierarchies with a Graph-based Knowledge Discovery System

    Science.gov (United States)

    Tilton, James C.; Cooke, Diane J.; Ketkar, Nikhil; Aksoy, Selim

    2008-01-01

    Currently available pixel-based analysis techniques do not effectively extract the information content from the increasingly available high spatial resolution remotely sensed imagery data. A general consensus is that object-based image analysis (OBIA) is required to effectively analyze this type of data. OBIA is usually a two-stage process: image segmentation followed by an analysis of the segmented objects. We are exploring an approach to OBIA in which hierarchical image segmentations provided by the Recursive Hierarchical Segmentation (RHSEG) software developed at NASA GSFC are analyzed by the Subdue graph-based knowledge discovery system developed by a team at Washington State University. In this paper we discuss our initial approach to representing the RHSEG-produced hierarchical image segmentations in a graphical form understandable by Subdue, and provide results on real and simulated data. We also discuss planned improvements designed to more effectively and completely convey the hierarchical segmentation information to Subdue and to improve processing efficiency.

  16. Systems pharmacology-based drug discovery for marine resources: an example using sea cucumber (Holothurians).

    Science.gov (United States)

    Guo, Yingying; Ding, Yan; Xu, Feifei; Liu, Baoyue; Kou, Zinong; Xiao, Wei; Zhu, Jingbo

    2015-05-13

    Sea cucumbers, a kind of marine animal, have long been used as tonics and traditional remedies in the Middle East and Asia because of their effectiveness against hypertension, asthma, rheumatism, cuts and burns, impotence, and constipation. In this study, a comprehensive investigation of sea cucumber was used as an example to show drug discovery from marine resources using a systems pharmacology model. The value of marine natural resources has been extensively considered because these resources can be potentially used to treat and prevent human diseases. However, the discovery of drugs from oceans is difficult because of complex environments in terms of composition and active mechanisms. Thus, a comprehensive systems approach is needed that can discover active constituents and their targets from marine resources and explain the biological basis of their pharmacological properties. In this study, a feasible pharmacological model based on systems pharmacology was established to investigate marine medicine by incorporating active compound screening, target identification, and network and pathway analysis. As a result, 106 candidate components of sea cucumber and 26 potential targets were identified. Furthermore, the functions of sea cucumber in health improvement and disease treatment were elucidated in a holistic way based on the established compound-target and target-disease networks, and incorporated pathways. This study established a novel strategy that could be used to explore specific active mechanisms and discover new drugs from marine sources. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  17. Developing a distributed HTML5-based search engine for geospatial resource discovery

    Science.gov (United States)

    ZHOU, N.; XIA, J.; Nebert, D.; Yang, C.; Gui, Z.; Liu, K.

    2013-12-01

    With the explosive growth of data, Geospatial Cyberinfrastructure (GCI) components are developed to manage geospatial resources, such as data discovery and data publishing. However, efficient discovery of geospatial resources remains challenging because: (1) existing GCIs are usually developed for users of specific domains, so users may have to visit a number of GCIs to find appropriate resources; (2) the complexity of the decentralized network environment usually results in slow response and poor user experience; (3) users who use different browsers and devices may have very different user experiences because of the diversity of front-end platforms (e.g. Silverlight, Flash or HTML). To address these issues, we developed a distributed and HTML5-based search engine. Specifically, (1) the search engine adopts a brokering approach to retrieve geospatial metadata from various and distributed GCIs; (2) the asynchronous record retrieval mode enhances the search performance and user interactivity; (3) the search engine based on HTML5 is able to provide unified access capabilities for users with different devices (e.g. tablet and smartphone).
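
    The brokering and asynchronous retrieval ideas described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the catalog names, latencies, and records are hypothetical, and `asyncio.sleep` stands in for real network requests to remote catalogs.

```python
import asyncio

async def query_catalog(name, delay, records):
    """Stand-in for one remote catalog search; `delay` simulates network latency."""
    await asyncio.sleep(delay)
    return [(name, record) for record in records]

async def brokered_search(catalogs):
    """Query all catalogs concurrently and merge each result set as it arrives."""
    tasks = [asyncio.create_task(query_catalog(*c)) for c in catalogs]
    merged = []
    for finished in asyncio.as_completed(tasks):
        # A real front end could render these records immediately.
        merged.extend(await finished)
    return merged

# Hypothetical catalogs: (name, simulated latency in seconds, records).
CATALOGS = [
    ("catalog_a", 0.2, ["sea surface temperature"]),
    ("catalog_b", 0.01, ["land cover", "elevation"]),
]

results = asyncio.run(brokered_search(CATALOGS))
```

    Because results are consumed in completion order, records from the faster catalog appear first and a slow catalog does not block the others.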

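  18. Perbandingan antara Keefektifan Model Guided Discovery Learning dan Project-Based Learning pada Matakuliah Geometri [Comparison of the Effectiveness of the Guided Discovery Learning and Project-Based Learning Models in a Geometry Course]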

    Directory of Open Access Journals (Sweden)

    Okky Riswandha Imawan

    2015-12-01

    Abstract This research aims to describe the effectiveness of, and the differences in effectiveness between, the Guided Discovery Learning (GDL) Model and the Project Based Learning (PjBL) Model in terms of achievement, self-confidence, and critical thinking skills of students on the Solid Geometry subject. This research was quasi-experimental. The research subjects were two undergraduate classes of the Mathematics Education Program, Ahmad Dahlan University, in their second semester, selected at random. The effectiveness of the GDL and PjBL Models in terms of each of the dependent variables was tested with the t-test. The difference in effectiveness between the GDL Model and the PjBL Model was tested with MANOVA. The results show that, in terms of achievement, self-confidence, and critical thinking skills of the students, the application of the GDL Model on the Solid Geometry subject is effective, the application of the PjBL Model on the Solid Geometry subject is effective, and there is no difference in the effectiveness of the GDL and PjBL Models. Keywords: guided discovery learning model, project-based learning model, achievement, self-confidence, critical thinking skills

  19. A Smart Web-Based Geospatial Data Discovery System with Oceanographic Data as an Example

    Directory of Open Access Journals (Sweden)

    Yongyao Jiang

    2018-02-01

    Full Text Available Discovering and accessing geospatial data presents a significant challenge for the Earth sciences community as massive amounts of data are being produced on a daily basis. In this article, we report a smart web-based geospatial data discovery system that mines and utilizes data relevancy from metadata and user behavior. Specifically, (1) the system enables semantic query expansion and suggestion to assist users in finding more relevant data; (2) machine-learned ranking is utilized to provide the optimal search ranking based on a number of identified ranking features that can reflect users’ search preferences; (3) a hybrid recommendation module is designed to allow users to discover related data considering metadata attributes and user behavior; (4) an integrated graphical user interface design is developed to quickly and intuitively guide data consumers to the appropriate data resources. As a proof of concept, we focus on a well-defined domain, oceanography, and use oceanographic data discovery as an example. Experiments and a search example show that the proposed system can improve the scientific community’s data search experience by providing query expansion, suggestion, better search ranking, and data recommendation via a user-friendly interface.
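
    One way to picture a hybrid recommendation that considers both metadata attributes and user behavior is sketched below. The abstract does not describe the actual scoring, so everything here is an assumption made for illustration: Jaccard similarity over keyword sets as the metadata score, normalized co-click counts as the behavior score, an equal weighting, and the dataset names and counts themselves.

```python
def jaccard(a, b):
    """Jaccard similarity between two keyword sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def recommend(target, metadata, co_clicks, weight=0.5, top_n=2):
    """Rank other datasets by a weighted mix of metadata and behavior scores."""
    max_clicks = max(co_clicks.values(), default=0) or 1
    scored = []
    for other in metadata:
        if other == target:
            continue
        content = jaccard(metadata[target], metadata[other])
        behavior = co_clicks.get(frozenset((target, other)), 0) / max_clicks
        scored.append((weight * content + (1 - weight) * behavior, other))
    # Highest score first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:top_n]]

# Hypothetical metadata keywords and co-click counts (not from the paper).
METADATA = {
    "sst_daily": {"ocean", "temperature", "daily"},
    "sst_monthly": {"ocean", "temperature", "monthly"},
    "wind_daily": {"ocean", "wind", "daily"},
    "land_cover": {"land", "vegetation"},
}
CO_CLICKS = {
    frozenset(("sst_daily", "wind_daily")): 8,
    frozenset(("sst_daily", "sst_monthly")): 2,
}

recommendations = recommend("sst_daily", METADATA, CO_CLICKS)
```

    In this toy data the two ocean datasets share the same keyword overlap with the target, so the behavior signal decides their order.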

  20. Microarray-based ultra-high resolution discovery of genomic deletion mutations

    Science.gov (United States)

    2014-01-01

    Background Oligonucleotide microarray-based comparative genomic hybridization (CGH) offers an attractive possible route for the rapid and cost-effective genome-wide discovery of deletion mutations. CGH typically involves comparison of the hybridization intensities of genomic DNA samples with microarray chip representations of entire genomes, and has widespread potential application in experimental research and medical diagnostics. However, the power to detect small deletions is low. Results Here we use a graduated series of Arabidopsis thaliana genomic deletion mutations (of sizes ranging from 4 bp to ~5 kb) to optimize CGH-based genomic deletion detection. We show that the power to detect smaller deletions (4, 28 and 104 bp) depends upon oligonucleotide density (essentially the number of genome-representative oligonucleotides on the microarray chip), and determine the oligonucleotide spacings necessary to guarantee detection of deletions of specified size. Conclusions Our findings will enhance a wide range of research and clinical applications, and in particular will aid in the discovery of genomic deletions in the absence of a priori knowledge of their existence. PMID:24655320

  1. Student Responses Toward Student Worksheets Based on Discovery Learning for Students with Intrapersonal and Interpersonal Intelligence

    Science.gov (United States)

    Yerizon, Y.; Putra, A. A.; Subhan, M.

    2018-04-01

    Students have low mathematical ability because they are used to learning by listening to the teacher's explanation. For that reason, students are given activities to sharpen their ability in math. One way to do that is to create discovery-learning-based worksheets. The development of this worksheet took into account specific student learning styles, including in schools that have classified students based on multiple intelligences. The dominant learning styles in the classroom were intrapersonal and interpersonal. The purpose of this study was to discover students’ responses to junior high school mathematics worksheets with a discovery learning approach suitable for students with Intrapersonal and Interpersonal Intelligence. This tool was developed using a development model adapted from the Plomp model. The development process of this tool consists of 3 phases: front-end analysis/preliminary research, development/prototype phase and assessment phase. The results show that students responded well to the resulting worksheet. The worksheet was well understood by students and helped them understand the concepts learned.

  2. The Fragile X Mental Retardation Syndrome 20 Years After the FMR1 Gene Discovery: an Expanding Universe of Knowledge

    Science.gov (United States)

    Rousseau, François; Labelle, Yves; Bussières, Johanne; Lindsay, Carmen

    2011-01-01

    The fragile X mental retardation (FXMR) syndrome is one of the most frequent causes of mental retardation. Affected individuals display a wide range of additional characteristic features including behavioural and physical phenotypes, and the extent to which individuals are affected is highly variable. For these reasons, elucidation of the pathophysiology of this disease has been an important challenge to the scientific community. 1991 marks the year of the discovery of both the FMR1 gene mutations involved in this disease, and of their dynamic nature. Although a mouse model for the disease has been available for 16 years and extensive research has been performed on the FMR1 protein (FMRP), we still understand little about how the disease develops, and no treatment has yet been shown to be effective. In this review, we summarise current knowledge on FXMR with an emphasis on the technical challenges of molecular diagnostics, on its prevalence and dynamics among populations, and on the potential of screening for FMR1 mutations. PMID:21912443

  3. Developing computer-based training programs for basic mammalian histology: Didactic versus discovery-based design

    Science.gov (United States)

    Fabian, Henry Joel

    Educators have long tried to understand what stimulates students to learn. The Swiss psychologist and zoologist, Jean Piaget, suggested that students are stimulated to learn when they attempt to resolve confusion. He reasoned that students try to explain the world with the knowledge they have acquired in life. When they find their own explanations to be inadequate to explain phenomena, students find themselves in a temporary state of confusion. This prompts students to seek more plausible explanations. At this point, students are primed for learning (Piaget 1964). The Piagetian approach described above is called learning by discovery. To promote discovery learning, a teacher must first allow the student to recognize his misconception and then provide a plausible explanation to replace that misconception (Chinn and Brewer 1993). One application of this method is found in the various learning cycles, which have been demonstrated to be effective means for teaching science (Renner and Lawson 1973, Lawson 1986, Marek and Methven 1991, and Glasson & Lalik 1993). In contrast to the learning cycle, tutorial computer programs are generally not designed to correct student misconceptions, but rather follow a passive, didactic method of teaching. In the didactic or expositional method, the student is told about a phenomenon, but is neither encouraged to explore it, nor explain it in his own terms (Schneider and Renner 1980).

  4. Identification of fever and vaccine-associated gene interaction networks using ontology-based literature mining.

    Science.gov (United States)

    Hur, Junguk; Ozgür, Arzucan; Xiang, Zuoshuang; He, Yongqun

    2012-12-20

    Fever is one of the most common adverse events of vaccines. The detailed mechanisms of fever and vaccine-associated gene interaction networks are not fully understood. In the present study, we employed a genome-wide, Centrality and Ontology-based Network Discovery using Literature data (CONDL) approach to analyse the genes and gene interaction networks associated with fever or vaccine-related fever responses. Over 170,000 fever-related articles from PubMed abstracts and titles were retrieved and analysed at the sentence level using natural language processing techniques to identify genes and vaccines (including 186 Vaccine Ontology terms) as well as their interactions. This resulted in a generic fever network consisting of 403 genes and 577 gene interactions. A vaccine-specific fever sub-network consisting of 29 genes and 28 gene interactions was extracted from articles that are related to both fever and vaccines. In addition, gene-vaccine interactions were identified. Vaccines (including 4 specific vaccine names) were found to directly interact with 26 genes. Gene set enrichment analysis was performed using the genes in the generated interaction networks. Moreover, the genes in these networks were prioritized using network centrality metrics. Making scientific discoveries and generating new hypotheses were possible by using network centrality and gene set enrichment analyses. For example, our study found that the genes in the generic fever network were more enriched in cell death and responses to wounding, and the vaccine sub-network had more gene enrichment in leukocyte activation and phosphorylation regulation. The most central genes in the vaccine-specific fever network are predicted to be highly relevant to vaccine-induced fever, whereas genes that are central only in the generic fever network are likely to be highly relevant to generic fever responses. Interestingly, no Toll-like receptors (TLRs) were found in the gene-vaccine interaction network. Since

  5. A Survey on Data Storage and Information Discovery in the WSANs-Based Edge Computing Systems

    Science.gov (United States)

    Liang, Junbin; Liu, Renping; Ni, Wei; Li, Yin; Li, Ran; Ma, Wenpeng; Qi, Chuanda

    2018-01-01

    In the post-Cloud era, the proliferation of Internet of Things (IoT) has pushed the horizon of Edge computing, which is a new computing paradigm with data processed at the edge of the network. As the important systems of Edge computing, wireless sensor and actuator networks (WSANs) play an important role in collecting and processing the sensing data from the surrounding environment as well as taking actions on the events happening in the environment. In WSANs, in-network data storage and information discovery schemes with high energy efficiency, high load balance and low latency are needed because of the limited resources of the sensor nodes and the real-time requirement of some specific applications, such as putting out a big fire in a forest. In this article, the existing schemes of WSANs on data storage and information discovery are surveyed with detailed analysis on their advancements and shortcomings, and possible solutions are proposed on how to achieve high efficiency, good load balance, and perfect real-time performances at the same time, hoping that it can provide a good reference for the future research of the WSANs-based Edge computing systems. PMID:29439442

  6. SNP discovery in nonmodel organisms: strand bias and base-substitution errors reduce conversion rates.

    Science.gov (United States)

    Gonçalves da Silva, Anders; Barendse, William; Kijas, James W; Barris, Wes C; McWilliam, Sean; Bunch, Rowan J; McCullough, Russell; Harrison, Blair; Hoelzel, A Rus; England, Phillip R

    2015-07-01

    Single nucleotide polymorphisms (SNPs) have become the marker of choice for genetic studies in organisms of conservation, commercial or biological interest. Most SNP discovery projects in nonmodel organisms apply a strategy for identifying putative SNPs based on filtering rules that account for random sequencing errors. Here, we analyse data used to develop 4723 novel SNPs for the commercially important deep-sea fish, orange roughy (Hoplostethus atlanticus), to assess the impact of not accounting for systematic sequencing errors when filtering identified polymorphisms during SNP discovery. We used SAMtools to identify polymorphisms in a Velvet assembly of genomic DNA sequence data from seven individuals. The resulting set of polymorphisms was filtered to minimize 'bycatch', that is, polymorphisms caused by sequencing or assembly error. An Illumina Infinium SNP chip was used to genotype a final set of 7714 polymorphisms across 1734 individuals. Five predictors were examined for their effect on the probability of obtaining an assayable SNP: depth of coverage, number of reads that support a variant, polymorphism type (e.g. A/C), strand-bias and Illumina SNP probe design score. Our results indicate that filtering out systematic sequencing errors could substantially improve the efficiency of SNP discovery. We show that BLASTX can be used as an efficient tool to identify single-copy genomic regions in the absence of a reference genome. The results have implications for research aiming to identify assayable SNPs and build SNP genotyping assays for nonmodel organisms. © 2014 John Wiley & Sons Ltd.
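
    A filtering rule that uses three of the predictors named in this abstract (depth of coverage, number of supporting reads, and strand bias) might look like the sketch below. The abstract does not give the study's actual filters: the strand-bias measure, the thresholds, and the candidate records are all hypothetical and chosen only to illustrate the idea.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    depth: int        # total reads covering the site
    alt_forward: int  # variant-supporting reads on the forward strand
    alt_reverse: int  # variant-supporting reads on the reverse strand

def strand_bias(c):
    """Fraction of variant reads on the majority strand (0.5 balanced, 1.0 one strand only)."""
    total = c.alt_forward + c.alt_reverse
    return max(c.alt_forward, c.alt_reverse) / total if total else 1.0

def passes(c, min_depth=10, min_alt=3, max_bias=0.9):
    """Keep a candidate with enough coverage, enough support, and no extreme strand bias."""
    alt = c.alt_forward + c.alt_reverse
    return c.depth >= min_depth and alt >= min_alt and strand_bias(c) <= max_bias

# Hypothetical candidates (illustrative values, not data from the study).
candidates = [
    Candidate("snp_1", depth=30, alt_forward=6, alt_reverse=5),
    Candidate("snp_2", depth=30, alt_forward=10, alt_reverse=0),  # one strand only
    Candidate("snp_3", depth=6, alt_forward=2, alt_reverse=2),    # too shallow
]

kept = [c.name for c in candidates if passes(c)]
```

    A variant seen on only one strand is a typical signature of systematic rather than random error, which is why a rule like this can remove candidates that a depth-only filter would keep.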

  7. Discovery of accessible locations using region-based geo-social data

    KAUST Repository

    Wang, Yan

    2018-03-17

    Geo-social data plays a significant role in location discovery and recommendation. In this light, we propose and study a novel problem of discovering accessible locations in spatial networks using region-based geo-social data. Given a set Q of query regions, the top-k accessible location discovery query (k ALDQ) finds k locations that have the highest spatial-density correlations to Q. Both the spatial distances between locations and regions and the POI (point of interest) density within the regions are taken into account. We believe that this type of k ALDQ query can bring significant benefit to many applications such as travel planning, facility allocation, and urban planning. Three challenges exist in k ALDQ: (1) how to model the spatial-density correlation practically, (2) how to prune the search space effectively, and (3) how to schedule the searches from multiple query regions. To tackle the challenges and process k ALDQ effectively and efficiently, we first define a series of spatial and density metrics to model the spatial-density correlation. Then we propose a novel three-phase solution with a pair of upper and lower bounds of the spatial-density correlation and a heuristic scheduling strategy to schedule multiple query regions. Finally, we conduct extensive experiments on real and synthetic spatial data to demonstrate the performance of the developed solutions.

  8. HPV vaccines: their pathology-based discovery, benefits, and adverse effects.

    Science.gov (United States)

    Nicol, Alcina F; de Andrade, Cecilia V; Russomano, Fabio B; Rodrigues, Luana S L; Oliveira, Nathalia S; Provance, David William; Nuovo, Gerard J

    2015-12-01

    The discovery of the human papillomavirus (HPV) vaccine illustrates the power of in situ-based pathologic analysis in better understanding and curing diseases. The 2 available HPV vaccines have markedly reduced the incidence of cervical intraepithelial neoplasias, genital warts, and cervical cancer throughout the world. Concerns about HPV vaccine safety have led some physicians, health care officials, and parents to refuse to provide the recommended vaccination to the target population. The aims of the study were to discuss the discovery of the HPV vaccine and review scientific data related to measurable outcomes from the use of HPV vaccines. The strong type-specific immunity against HPV in humans has been known for more than 25 years. Multiple studies confirm the positive risk benefit of HPV vaccination with minimal documented adverse effects. The most common adverse effect, injection site pain, occurred in about 10% of girls and was less than the rate reported for other vaccines. Use of HPV vaccine should be expanded into more diverse populations, mainly in low-resource settings. Copyright © 2015 Elsevier Inc. All rights reserved.

  9. A Survey on Data Storage and Information Discovery in the WSANs-Based Edge Computing Systems

    Directory of Open Access Journals (Sweden)

    Xingpo Ma

    2018-02-01

    Full Text Available In the post-Cloud era, the proliferation of Internet of Things (IoT) has pushed the horizon of Edge computing, which is a new computing paradigm with data processed at the edge of the network. As the important systems of Edge computing, wireless sensor and actuator networks (WSANs) play an important role in collecting and processing the sensing data from the surrounding environment as well as taking actions on the events happening in the environment. In WSANs, in-network data storage and information discovery schemes with high energy efficiency, high load balance and low latency are needed because of the limited resources of the sensor nodes and the real-time requirement of some specific applications, such as putting out a big fire in a forest. In this article, the existing schemes of WSANs on data storage and information discovery are surveyed with detailed analysis on their advancements and shortcomings, and possible solutions are proposed on how to achieve high efficiency, good load balance, and perfect real-time performances at the same time, hoping that it can provide a good reference for the future research of the WSANs-based Edge computing systems.

  10. A Survey on Data Storage and Information Discovery in the WSANs-Based Edge Computing Systems.

    Science.gov (United States)

    Ma, Xingpo; Liang, Junbin; Liu, Renping; Ni, Wei; Li, Yin; Li, Ran; Ma, Wenpeng; Qi, Chuanda

    2018-02-10

    In the post-Cloud era, the proliferation of Internet of Things (IoT) has pushed the horizon of Edge computing, which is a new computing paradigm with data processed at the edge of the network. As the important systems of Edge computing, wireless sensor and actuator networks (WSANs) play an important role in collecting and processing the sensing data from the surrounding environment as well as taking actions on the events happening in the environment. In WSANs, in-network data storage and information discovery schemes with high energy efficiency, high load balance and low latency are needed because of the limited resources of the sensor nodes and the real-time requirement of some specific applications, such as putting out a big fire in a forest. In this article, the existing schemes of WSANs on data storage and information discovery are surveyed with detailed analysis on their advancements and shortcomings, and possible solutions are proposed on how to achieve high efficiency, good load balance, and perfect real-time performances at the same time, hoping that it can provide a good reference for the future research of the WSANs-based Edge computing systems.

  11. Whole genome shotgun sequencing of Brassica oleracea and its application to gene discovery and annotation in Arabidopsis.

    Science.gov (United States)

    Ayele, Mulu; Haas, Brian J; Kumar, Nikhil; Wu, Hank; Xiao, Yongli; Van Aken, Susan; Utterback, Teresa R; Wortman, Jennifer R; White, Owen R; Town, Christopher D

    2005-04-01

    Through comparative studies of the model organism Arabidopsis thaliana and its close relative Brassica oleracea, we have identified conserved regions that represent potentially functional sequences overlooked by previous Arabidopsis genome annotation methods. A total of 454,274 whole genome shotgun sequences covering 283 Mb (0.44 x) of the estimated 650 Mb Brassica genome were searched against the Arabidopsis genome, and conserved Arabidopsis genome sequences (CAGSs) were identified. Of these 229,735 conserved regions, 167,357 fell within or intersected existing gene models, while 60,378 were located in previously unannotated regions. After removal of sequences matching known proteins, CAGSs that were close to one another were chained together as potentially comprising portions of the same functional unit. This resulted in 27,347 chains of which 15,686 were sufficiently distant from existing gene annotations to be considered a novel conserved unit. Of 192 conserved regions examined, 58 were found to be expressed in our cDNA populations. Rapid amplification of cDNA ends (RACE) was used to obtain potentially full-length transcripts from these 58 regions. The resulting sequences led to the creation of 21 gene models at 17 new Arabidopsis loci and the addition of splice variants or updates to another 19 gene structures. In addition, CAGSs overlapping already annotated genes in Arabidopsis can provide guidance for manual improvement of existing gene models. Published genome-wide expression data based on whole genome tiling arrays and massively parallel signature sequencing were overlaid on the Brassica-Arabidopsis conserved sequences, and 1399 regions of intersection were identified. Collectively our results and these data sets suggest that several thousand new Arabidopsis genes remain to be identified and annotated.

  12. From SOMAmer-based biomarker discovery to diagnostic and clinical applications: a SOMAmer-based, streamlined multiplex proteomic assay.

    Directory of Open Access Journals (Sweden)

    Stephan Kraemer

    Full Text Available Recently, we reported a SOMAmer-based, highly multiplexed assay for the purpose of biomarker identification. To enable seamless transition from highly multiplexed biomarker discovery assays to a format suitable and convenient for diagnostic and life-science applications, we developed a streamlined, plate-based version of the assay. The plate-based version of the assay is robust, sensitive (sub-picomolar), rapid, can be highly multiplexed (upwards of 60 analytes), and fully automated. We demonstrate that quantification by microarray-based hybridization, Luminex bead-based methods, and qPCR are each compatible with our platform, further expanding the breadth of proteomic applications for a wide user community.

  13. A DNA vector-based RNAi technology to suppress gene expression in mammalian cells.

    Science.gov (United States)

    Sui, Guangchao; Soohoo, Christina; Affar, El Bachir; Gay, Frédérique; Shi, Yujiang; Forrester, William C; Shi, Yang

    2002-04-16

    Double-stranded RNA-mediated interference (RNAi) has recently emerged as a powerful reverse genetic tool to silence gene expression in multiple organisms including plants, Caenorhabditis elegans, and Drosophila. The discovery that synthetic double-stranded, 21-nt small interfering RNA triggers gene-specific silencing in mammalian cells has further expanded the utility of RNAi into mammalian systems. Here we report a technology that allows synthesis of small interfering RNAs from DNA templates in vivo to efficiently inhibit endogenous gene expression. Significantly, we were able to use this approach to demonstrate, in multiple cell lines, robust inhibition of several endogenous genes of diverse functions. These findings highlight the general utility of this DNA vector-based RNAi technology in suppressing gene expression in mammalian cells.

  14. Systems-based Discovery of Tomatidine as a Natural Small Molecule Inhibitor of Skeletal Muscle Atrophy*

    Science.gov (United States)

    Dyle, Michael C.; Ebert, Scott M.; Cook, Daniel P.; Kunkel, Steven D.; Fox, Daniel K.; Bongers, Kale S.; Bullard, Steven A.; Dierdorff, Jason M.; Adams, Christopher M.

    2014-01-01

    Skeletal muscle atrophy is a common and debilitating condition that lacks an effective therapy. To address this problem, we used a systems-based discovery strategy to search for a small molecule whose mRNA expression signature negatively correlates to mRNA expression signatures of human skeletal muscle atrophy. This strategy identified a natural small molecule from tomato plants, tomatidine. Using cultured skeletal myotubes from both humans and mice, we found that tomatidine stimulated mTORC1 signaling and anabolism, leading to accumulation of protein and mitochondria, and ultimately, cell growth. Furthermore, in mice, tomatidine increased skeletal muscle mTORC1 signaling, reduced skeletal muscle atrophy, enhanced recovery from skeletal muscle atrophy, stimulated skeletal muscle hypertrophy, and increased strength and exercise capacity. Collectively, these results identify tomatidine as a novel small molecule inhibitor of muscle atrophy. Tomatidine may have utility as a therapeutic agent or lead compound for skeletal muscle atrophy. PMID:24719321

  15. BioTextQuest: a web-based biomedical text mining suite for concept discovery.

    Science.gov (United States)

    Papanikolaou, Nikolas; Pafilis, Evangelos; Nikolaou, Stavros; Ouzounis, Christos A; Iliopoulos, Ioannis; Promponas, Vasilis J

    2011-12-01

    BioTextQuest combines automated discovery of significant terms in article clusters with structured knowledge annotation, via Named Entity Recognition services, offering interactive user-friendly visualization. A tag-cloud-based illustration of the terms labeling each document cluster, semantically annotated according to biological entity, and a list of document titles enable users to simultaneously compare terms and documents of each cluster, facilitating concept association and hypothesis generation. BioTextQuest allows customization of analysis parameters, e.g. clustering/stemming algorithms, exclusion of documents/significant terms, to better match the biological question addressed. Availability: http://biotextquest.biol.ucy.ac.cy Contact: vprobon@ucy.ac.cy; iliopj@med.uoc.gr Supplementary data are available at Bioinformatics online.

  16. Comparison between project-based learning and discovery learning toward students' metacognitive strategies on global warming concept

    Science.gov (United States)

    Tumewu, Widya Anjelia; Wulan, Ana Ratna; Sanjaya, Yayan

    2017-05-01

    The purpose of this study was to compare the effectiveness of project-based learning (PjBL) and discovery learning (DL) with respect to students' metacognitive strategies on the global warming concept. A quasi-experimental research design, the matching-only pretest-posttest control group design, was used in this study. The subjects were 7th-grade students from two classes at a junior high school in Bandung City, West Java, in the 2015/2016 academic year. The study was conducted on two experimental classes: experimental class I received the project-based learning treatment and experimental class II received the discovery learning treatment. The data were collected through a questionnaire on students' metacognitive strategies. The statistical analysis showed statistically significant differences in students' metacognitive strategies between project-based learning and discovery learning.

  17. Network-Based Method for Identifying Co- Regeneration Genes in Bone, Dentin, Nerve and Vessel Tissues.

    Science.gov (United States)

    Chen, Lei; Pan, Hongying; Zhang, Yu-Hang; Feng, Kaiyan; Kong, XiangYin; Huang, Tao; Cai, Yu-Dong

    2017-10-02

    Bone and dental diseases are serious public health problems. Most current clinical treatments for these diseases can produce side effects. Regeneration is a promising therapy for bone and dental diseases, yielding natural tissue recovery with few side effects. Because soft tissues inside the bone and dentin are densely populated with nerves and vessels, the study of bone and dentin regeneration should also consider the co-regeneration of nerves and vessels. In this study, a network-based method to identify co-regeneration genes for bone, dentin, nerve and vessel was constructed based on an extensive network of protein-protein interactions. Three procedures were applied in the network-based method. The first procedure, searching, sought the shortest paths connecting regeneration genes of one tissue type with regeneration genes of other tissues, thereby extracting possible co-regeneration genes. The second procedure, testing, employed a permutation test to evaluate whether possible genes were false discoveries; these genes were excluded by the testing procedure. The last procedure, screening, employed two rules, the betweenness ratio rule and interaction score rule, to select the most essential genes. A total of seventeen genes were inferred by the method, which were deemed to contribute to co-regeneration of at least two tissues. All these seventeen genes were extensively discussed to validate the utility of the method.
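
    The searching procedure described in this abstract can be illustrated with a minimal sketch. It assumes an unweighted, undirected interaction network stored as an adjacency dictionary; the abstract does not say whether the authors weighted the edges, and the function names shortest_path and candidate_genes are hypothetical, not taken from the paper. The permutation test and the screening rules are not reproduced.

```python
from collections import deque


def shortest_path(graph, source, target):
    """Breadth-first search for one shortest path in an unweighted graph.

    graph maps each node to an iterable of its neighbours.
    Returns the path as a list of nodes, or None if target is unreachable.
    """
    if source == target:
        return [source]
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in parent:
                continue
            parent[neighbour] = node
            if neighbour == target:
                # Walk back through the parents to rebuild the path.
                path = [neighbour]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


def candidate_genes(graph, genes_a, genes_b):
    """Collect interior nodes of shortest paths linking two seed gene sets.

    Hypothetical helper: seed genes themselves are excluded from the result.
    """
    seeds = set(genes_a) | set(genes_b)
    found = set()
    for a in genes_a:
        for b in genes_b:
            path = shortest_path(graph, a, b)
            if path:
                found.update(n for n in path[1:-1] if n not in seeds)
    return found
```

    Breadth-first search is used here only because it is the simplest way to obtain shortest paths when all edges count equally; a weighted network would call for Dijkstra's algorithm instead.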

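  18. Keefektifan setting TPS dalam pendekatan discovery learning dan problem-based learning pada pembelajaran materi lingkaran SMP [The effectiveness of the TPS setting in the discovery learning and problem-based learning approaches for teaching circles in junior high school]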

    Directory of Open Access Journals (Sweden)

    Rahmi Hidayati

    2017-05-01

    The purpose of this study was to describe the effectiveness of the Think Pair Share (TPS) setting in the discovery learning and problem-based learning approaches in terms of student achievement, mathematical communication skills, and interpersonal skills of the students. This study was a quasi-experimental study using the pretest-posttest nonequivalent group design. The research population comprised all Year VIII students of SMP Negeri 1 Yogyakarta. The research sample consisted of two classes randomly selected from eight classes. The instruments used in this study were a learning achievement test, a test of mathematical communication skills, and a student interpersonal skills questionnaire. To test the effectiveness of the Think Pair Share (TPS) setting in the discovery learning and problem-based learning approaches, the one-sample t-test was carried out. Then, to investigate the difference in effectiveness between the Think Pair Share (TPS) setting in the discovery learning approach and in problem-based learning, the Multivariate Analysis of Variance (MANOVA) was carried out. The research findings indicate that the TPS setting in both the discovery learning approach and the problem-based learning (PBL) approach is effective in terms of learning achievement, mathematical communication skills, and interpersonal skills of the students. There is no difference in effectiveness between the TPS setting in the discovery learning approach and in problem-based learning (PBL) in terms of learning achievement, mathematical communication skills, and interpersonal skills of the students. Keywords: TPS setting in discovery learning approach, in problem-based learning, academic achievement, mathematical communication skills, and interpersonal skills of the student

  19. Topic model-based mass spectrometric data analysis in cancer biomarker discovery studies.

    Science.gov (United States)

    Wang, Minkun; Tsai, Tsung-Heng; Di Poto, Cristina; Ferrarini, Alessia; Yu, Guoqiang; Ressom, Habtom W

    2016-08-18

    A fundamental challenge in quantitation of biomolecules for cancer biomarker discovery is owing to the heterogeneous nature of human biospecimens. Although this issue has been a subject of discussion in cancer genomic studies, it has not yet been rigorously investigated in mass spectrometry based proteomic and metabolomic studies. Purification of mass spectrometric data is highly desired prior to subsequent analysis, e.g., quantitative comparison of the abundance of biomolecules in biological samples. We investigated topic models to computationally analyze mass spectrometric data considering both integrated peak intensities and scan-level features, i.e., extracted ion chromatograms (EICs). Probabilistic generative models enable flexible representation in data structure and infer sample-specific pure resources. Scan-level modeling helps alleviate information loss during data preprocessing. We evaluated the capability of the proposed models in capturing mixture proportions of contaminants and cancer profiles on LC-MS based serum proteomic and GC-MS based tissue metabolomic datasets acquired from patients with hepatocellular carcinoma (HCC) and liver cirrhosis as well as synthetic data we generated based on the serum proteomic data. The results we obtained by analysis of the synthetic data demonstrated that both intensity-level and scan-level purification models can accurately infer the mixture proportions and the underlying true cancerous sources with small average error ratios (data, we found more proteins and metabolites with significant changes between HCC cases and cirrhotic controls. Candidate biomarkers selected after purification yielded biologically meaningful pathway analysis results and improved disease discrimination power in terms of the area under ROC curve compared to the results found prior to purification. We investigated topic model-based inference methods to computationally address the heterogeneity issue in samples analyzed by LC/GC-MS. We observed

  20. Prediction of operon-like gene clusters in the Arabidopsis thaliana genome based on co-expression analysis of neighboring genes.

    Science.gov (United States)

    Wada, Masayoshi; Takahashi, Hiroki; Altaf-Ul-Amin, Md; Nakamura, Kensuke; Hirai, Masami Y; Ohta, Daisaku; Kanaya, Shigehiko

    2012-07-15

    Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or neighboring positions within immediate vicinity of chromosomal regions. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step in revealing the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should contribute to functional annotation efforts and also provide novel insight into the evolutionary aspects of acquiring certain biological functions. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficient of neighboring genes and randomly selected gene pairs, based on a statistical method that takes false discovery rate (FDR) into consideration for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of EOperon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary metabolic pathways, lipid and fatty-acid metabolism, and the lipid transfer system. Copyright © 2012 Elsevier B.V. All rights reserved.
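
    The two ingredients named in this abstract, a Pearson correlation between expression profiles and a false discovery rate threshold, can be sketched as follows. The abstract does not name the FDR procedure, so the Benjamini-Hochberg step-up rule used here is an assumption, and the function names pearson and benjamini_hochberg are hypothetical. How the p-values are derived from the neighbor versus random-pair comparison is left out.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def benjamini_hochberg(pvalues, fdr=0.1):
    """Indices of hypotheses rejected by the Benjamini-Hochberg procedure.

    Assumption: the source only says FDR is taken into consideration,
    it does not say that this particular procedure was used.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value passes its step-up threshold.
        if pvalues[i] <= fdr * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])
```

    With fdr=0.1, the threshold quoted in the abstract, a gene pair is retained when its p-value ranks within the largest prefix that satisfies the step-up condition.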

  1. Functional gene-guided discovery of type II polyketides from culturable actinomycetes associated with soft coral Scleronephthya sp.

    Directory of Open Access Journals (Sweden)

    Wei Sun

    Full Text Available Compared with the actinomycetes in stone corals, the phylogenetic diversity of soft coral-associated culturable actinomycetes is essentially unexplored. Meanwhile, the knowledge of the natural products from coral-associated actinomycetes is very limited. In this study, thirty-two strains were isolated from the tissue of the soft coral Scleronephthya sp. in the East China Sea, which were grouped into eight genera by 16S rDNA phylogenetic analysis: Micromonospora, Gordonia, Mycobacterium, Nocardioides, Streptomyces, Cellulomonas, Dietzia and Rhodococcus. 6 Micromonospora strains and 4 Streptomyces strains were found to be with the potential for producing aromatic polyketides based on the analysis of KSα (ketoacyl-synthase) gene in the PKS II (type II polyketides synthase) gene cluster. Among the 6 Micromonospora strains, angucycline cyclase gene was amplified in 2 strains (A5-1 and A6-2), suggesting their potential in synthesizing angucyclines e.g. jadomycin. Under the guidance of functional gene prediction, one jadomycin B analogue (7b, 13-dihydro-7-O-methyl jadomycin B) was detected in the fermentation broth of Micromonospora sp. strain A5-1. This study highlights the phylogenetically diverse culturable actinomycetes associated with the tissue of soft coral Scleronephthya sp. and the potential of coral-derived actinomycetes especially Micromonospora in producing aromatic polyketides.

  2. Functional Gene-Guided Discovery of Type II Polyketides from Culturable Actinomycetes Associated with Soft Coral Scleronephthya sp

    Science.gov (United States)

    Sun, Wei; Peng, Chongsheng; Zhao, Yunyu; Li, Zhiyong

    2012-01-01

    Compared with the actinomycetes in stone corals, the phylogenetic diversity of soft coral-associated culturable actinomycetes is essentially unexplored. Meanwhile, the knowledge of the natural products from coral-associated actinomycetes is very limited. In this study, thirty-two strains were isolated from the tissue of the soft coral Scleronephthya sp. in the East China Sea, which were grouped into eight genera by 16S rDNA phylogenetic analysis: Micromonospora, Gordonia, Mycobacterium, Nocardioides, Streptomyces, Cellulomonas, Dietzia and Rhodococcus. 6 Micromonospora strains and 4 Streptomyces strains were found to be with the potential for producing aromatic polyketides based on the analysis of KSα (ketoacyl-synthase) gene in the PKS II (type II polyketides synthase) gene cluster. Among the 6 Micromonospora strains, angucycline cyclase gene was amplified in 2 strains (A5-1 and A6-2), suggesting their potential in synthesizing angucyclines e.g. jadomycin. Under the guidance of functional gene prediction, one jadomycin B analogue (7b, 13-dihydro-7-O-methyl jadomycin B) was detected in the fermentation broth of Micromonospora sp. strain A5-1. This study highlights the phylogenetically diverse culturable actinomycetes associated with the tissue of soft coral Scleronephthya sp. and the potential of coral-derived actinomycetes especially Micromonospora in producing aromatic polyketides. PMID:22880121

  3. Dynamic Structure-Based Pharmacophore Model Development: A New and Effective Addition in the Histone Deacetylase 8 (HDAC8) Inhibitor Discovery

    Directory of Open Access Journals (Sweden)

    Keun Woo Lee

    2011-12-01

    Full Text Available Histone deacetylase 8 (HDAC8) is an enzyme involved in deacetylating the amino groups of terminal lysine residues, thereby repressing the transcription of various genes including tumor suppressor genes. The overexpression of HDAC8 has been observed in many cancers, and thus inhibition of this enzyme has emerged as an efficient cancer therapeutic strategy. In an effort to facilitate the future discovery of HDAC8 inhibitors, we developed two pharmacophore models containing six and five pharmacophoric features, respectively, using the representative structures from two molecular dynamic (MD) simulations performed in the Gromacs 4.0.5 package. Various analyses of trajectories obtained from MD simulations have displayed the changes upon inhibitor binding. Thus utilization of the dynamically-responded protein structures in pharmacophore development has the added advantage of considering the conformational flexibility of the protein. The MD trajectories were clustered using the single-linkage method, and representative structures were taken for use in the pharmacophore model development. Active site complementing structure-based pharmacophore models were developed using the Discovery Studio 2.5 program and validated using a dataset of known HDAC8 inhibitors. Virtual screening of a chemical database coupled with a drug-like filter identified drug-like hit compounds that match the pharmacophore models. Molecular docking of these hits reduced the false positives and identified two potential compounds to be used in future HDAC8 inhibitor design.

  4. Comprehensive analysis of differential genes and miRNA profiles for discovery of topping-responsive genes in flue-cured tobacco roots.

    Science.gov (United States)

    Qi, Yuancheng; Guo, Hongxiang; Li, Ke; Liu, Weiqun

    2012-03-01

    Decapitation/topping is an important cultivating measure for flue-cured tobacco, and diverse biology processes are changed to respond to the topping, such as hormonal balance, root development, source-sink relationship, ability of nicotine synthesis and stress tolerance. The purpose of this study was to clarify the molecular mechanism involved in the response of flue-cured tobacco to topping. The differentially expressed genes and micro RNAs (miRNAs) before and after topping were screened with a combination of suppression subtractive hybridization (SSH) and miRNA deep sequencing. In all, 560 differently expressed clones were sequenced by SSH, and then 129 high quality expressed sequence tags were acquired. These expressed sequence tags were mainly involved in secondary metabolism (13.5%), hormone metabolism (4%), signaling/transcription (17.5%), stress/defense (20%), protein metabolism (13%), carbon metabolism (7%), other metabolism (12%) and unknown function (13%). The results contribute new data to the list of possible candidate genes involved in the response of flue-cured tobacco to topping. NAC transcription factor, a differential gene identified by SSH, had been proved to have a role in the regulation of nicotine biosynthesis. High-throughput sequencing of two small RNA libraries in combination with SSH screening revealed 15 differential miRNAs whose target genes were identical to some differential genes identified in SSH, suggesting that miRNAs play a critical role in post-transcriptional gene regulation in the response of flue-cured tobacco to decapitation. Based on the role of these miRNAs and differential genes identified from SSH in response to topping, an miRNA mediated model for flue-cured tobacco in response to topping is proposed. © 2012 The Authors Journal compilation © 2012 FEBS.

  5. Agent Based Evidence Marshaling: Discovery-Based Enhancement Tools for C2 Systems

    Science.gov (United States)

    2003-12-01

    www5conf.inria.fr/fich_html/papers/P5/Overview.html, accessed on 1/15/2001. Dawkins, R., The Selfish Gene, Oxford University Press, Oxford, UK, 1989. Eco, U...Charles S. Peirce and Richard Dawkins argued that ideas can be alive and propagated through human life. Dawkins called these living ideas memes... Dawkins, 1989], while Peirce characterized them as “substantial things” [Buchler, 1955, 340]. This concept and the study of memetics that accompanies it

  6. Comparison of hybridization-based and sequencing-based gene expression technologies on biological replicates.

    Science.gov (United States)

    Liu, Fang; Jenssen, Tor-Kristian; Trimarchi, Jeff; Punzo, Claudio; Cepko, Connie L; Ohno-Machado, Lucila; Hovig, Eivind; Kuo, Winston Patrick

    2007-06-07

    High-throughput systems for gene expression profiling have been developed and have matured rapidly through the past decade. Broadly, these can be divided into two categories: hybridization-based and sequencing-based approaches. With data from different technologies being accumulated, concerns and challenges are raised about the level of agreement across technologies. As part of an ongoing large-scale cross-platform data comparison framework, we report here a comparison based on identical samples between one-dye DNA microarray platforms and MPSS (Massively Parallel Signature Sequencing). The DNA microarray platforms generally provided highly correlated data, while moderate correlations between microarrays and MPSS were obtained. Disagreements between the two types of technologies can be attributed to limitations inherent to both technologies. The variation found between pooled biological replicates underlines the importance of exercising caution in identification of differential expression, especially for the purposes of biomarker discovery. Based on different principles, hybridization-based and sequencing-based technologies should be considered complementary to each other, rather than competitive alternatives for measuring gene expression, and currently, both are important tools for transcriptome profiling.

  7. Comparison of hybridization-based and sequencing-based gene expression technologies on biological replicates

    Directory of Open Access Journals (Sweden)

    Cepko Connie L

    2007-06-01

    Full Text Available Abstract Background: High-throughput systems for gene expression profiling have been developed and have matured rapidly through the past decade. Broadly, these can be divided into two categories: hybridization-based and sequencing-based approaches. With data from different technologies being accumulated, concerns and challenges are raised about the level of agreement across technologies. As part of an ongoing large-scale cross-platform data comparison framework, we report here a comparison based on identical samples between one-dye DNA microarray platforms and MPSS (Massively Parallel Signature Sequencing). Results: The DNA microarray platforms generally provided highly correlated data, while moderate correlations between microarrays and MPSS were obtained. Disagreements between the two types of technologies can be attributed to limitations inherent to both technologies. The variation found between pooled biological replicates underlines the importance of exercising caution in identification of differential expression, especially for the purposes of biomarker discovery. Conclusion: Based on different principles, hybridization-based and sequencing-based technologies should be considered complementary to each other, rather than competitive alternatives for measuring gene expression, and currently, both are important tools for transcriptome profiling.

  8. Recent mass spectrometry-based proteomics for biomarker discovery in lung cancer, COPD, and asthma.

    Science.gov (United States)

    Fujii, Kiyonaga; Nakamura, Haruhiko; Nishimura, Toshihide

    2017-04-01

    Lung cancer and related diseases are among the most common causes of death worldwide. Genomic-based biomarkers may hardly reflect the underlying dynamic molecular mechanism of functional protein interactions, which is at the center of a disease. Recent developments in mass spectrometry (MS) have made it possible to analyze disease-relevant proteins expressed in clinical specimens by proteomic approaches. Areas covered: To understand the molecular mechanisms of lung cancer and its subtypes, chronic obstructive pulmonary disease (COPD), asthma and others, great efforts have been made to identify numerous relevant proteins by MS-based clinical proteomic approaches. Since lung cancer is a multifactorial disease that is biologically associated with asthma and COPD among various lung diseases, this study focused on proteomic studies on biomarker discovery using various clinical specimens for lung cancer, COPD, and asthma. Expert commentary: MS-based exploratory proteomics utilizing clinical specimens, which can incorporate both experimental and bioinformatic analysis of protein-protein interaction and can also adopt proteogenomic approaches, makes it possible to reveal molecular networks that are relevant to a disease subgroup and that could differentiate between drug responders and non-responders, good and poor prognoses, drug resistance, and so on.

  9. Gun possession among American youth: a discovery-based approach to understand gun violence.

    Science.gov (United States)

    Ruggles, Kelly V; Rajan, Sonali

    2014-01-01

    To apply discovery-based computational methods to nationally representative data from the Centers for Disease Control and Prevention's Youth Risk Behavior Surveillance System to better understand and visualize the behavioral factors associated with gun possession among adolescent youth. Our study uncovered the multidimensional nature of gun possession across nearly five million unique data points over a ten-year period (2001-2011). Specifically, we automated odds ratio calculations for 55 risk behaviors to assemble a comprehensive table of associations for every behavior combination. Downstream analyses included the hierarchical clustering of risk behaviors based on their association "fingerprint" to 1) visualize and assess which behaviors frequently co-occur and 2) evaluate which risk behaviors are consistently found to be associated with gun possession. From these analyses, we identified more than 40 behavioral factors, including heroin use, using snuff on school property, having been injured in a fight, and having been a victim of sexual violence, that have been and continue to be strongly associated with gun possession. Additionally, we identified six behavioral clusters based on association similarities: 1) physical activity and nutrition; 2) disordered eating, suicide and sexual violence; 3) weapon carrying and physical safety; 4) alcohol, marijuana and cigarette use; 5) drug use on school property and 6) overall drug use. Use of computational methodologies identified multiple risk behaviors, beyond more commonly discussed indicators of poor mental health, that are associated with gun possession among youth. Implications for prevention efforts and future interdisciplinary work applying computational methods to behavioral science data are described.
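
    As a rough illustration of the automated odds ratio step mentioned in this abstract, the sketch below builds a table of pairwise odds ratios from binary survey responses. The behavior names and toy data are hypothetical, and the 0.5 continuity correction (Haldane-Anscombe) is an assumption made here so that empty cells do not break the calculation; the abstract does not say how the authors handled such cases.

```python
from itertools import combinations


def odds_ratio(x, y):
    """Odds ratio for two equal-length binary (0/1) sequences.

    Adds 0.5 to every cell of the 2x2 table (Haldane-Anscombe
    correction, an assumption of this sketch) to avoid division by zero.
    """
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))


def association_table(responses):
    """Map every unordered pair of behaviors to its odds ratio."""
    return {
        (p, q): odds_ratio(responses[p], responses[q])
        for p, q in combinations(sorted(responses), 2)
    }


# Hypothetical toy data: one 0/1 entry per respondent.
toy = {
    "gun_possession": [1, 1, 1, 0, 0, 0, 0, 0],
    "injured_in_fight": [1, 1, 0, 0, 0, 0, 0, 1],
    "daily_breakfast": [0, 1, 0, 1, 0, 1, 0, 1],
}
table = association_table(toy)
```

    Each row of such a table could then serve as the association "fingerprint" of a behavior for hierarchical clustering; that step is not shown here.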

  10. Sensor Network-Based and User-Friendly User Location Discovery for Future Smart Homes.

    Science.gov (United States)

    Ahvar, Ehsan; Lee, Gyu Myoung; Han, Son N; Crespi, Noel; Khan, Imran

    2016-06-27

    User location is crucial context information for future smart homes where many location based services will be proposed. This location necessarily means that User Location Discovery (ULD) will play an important role in future smart homes. Concerns about privacy and the need to carry a mobile or a tag device within a smart home currently make conventional ULD systems uncomfortable for users. Future smart homes will need a ULD system to consider these challenges. This paper addresses the design of such a ULD system for context-aware services in future smart homes stressing the following challenges: (i) users' privacy; (ii) device-/tag-free; and (iii) fault tolerance and accuracy. On the other hand, emerging new technologies, such as the Internet of Things, embedded systems, intelligent devices and machine-to-machine communication, are penetrating into our daily life with more and more sensors available for use in our homes. Considering this opportunity, we propose a ULD system that is capitalizing on the prevalence of sensors for the home while satisfying the aforementioned challenges. The proposed sensor network-based and user-friendly ULD system relies on different types of inexpensive sensors, as well as a context broker with a fuzzy-based decision-maker. The context broker receives context information from different types of sensors and evaluates that data using the fuzzy set theory. We demonstrate the performance of the proposed system by illustrating a use case, utilizing both an analytical model and simulation.

  11. Sensor Network-Based and User-Friendly User Location Discovery for Future Smart Homes

    Directory of Open Access Journals (Sweden)

    Ehsan Ahvar

    2016-06-01

    User location is crucial context information for future smart homes where many location based services will be proposed. This location necessarily means that User Location Discovery (ULD) will play an important role in future smart homes. Concerns about privacy and the need to carry a mobile or a tag device within a smart home currently make conventional ULD systems uncomfortable for users. Future smart homes will need a ULD system to consider these challenges. This paper addresses the design of such a ULD system for context-aware services in future smart homes stressing the following challenges: (i) users’ privacy; (ii) device-/tag-free; and (iii) fault tolerance and accuracy. On the other hand, emerging new technologies, such as the Internet of Things, embedded systems, intelligent devices and machine-to-machine communication, are penetrating into our daily life with more and more sensors available for use in our homes. Considering this opportunity, we propose a ULD system that is capitalizing on the prevalence of sensors for the home while satisfying the aforementioned challenges. The proposed sensor network-based and user-friendly ULD system relies on different types of inexpensive sensors, as well as a context broker with a fuzzy-based decision-maker. The context broker receives context information from different types of sensors and evaluates that data using the fuzzy set theory. We demonstrate the performance of the proposed system by illustrating a use case, utilizing both an analytical model and simulation.

  12. De Novo Transcriptomic Analysis of an Oleaginous Microalga: Pathway Description and Gene Discovery for Production of Next-Generation Biofuels

    Science.gov (United States)

    Wan, LingLin; Han, Juan; Sang, Min; Li, AiFen; Wu, Hong; Yin, ShunJi; Zhang, ChengWu

    2012-01-01

    Background: Eustigmatos cf. polyphem is a yellow-green unicellular soil microalga belonging to the eustigmatophytes, with high biomass and considerable production of triacylglycerols (TAGs) for biofuels, which is thus referred to as an oleaginous microalga. The paucity of microalgae genome sequences, however, limits development of gene-based biofuel feedstock optimization studies. Here we describe the sequencing and de novo transcriptome assembly for a non-model microalgae species, E. cf. polyphem, and identify pathways and genes of importance related to biofuel production. Results: We performed the de novo assembly of the E. cf. polyphem transcriptome using Illumina paired-end sequencing technology. In a single run, we produced 29,199,432 sequencing reads corresponding to 2.33 Gb total nucleotides. These reads were assembled into 75,632 unigenes with a mean size of 503 bp and an N50 of 663 bp, ranging from 100 bp to >3,000 bp. Assembled unigenes were subjected to BLAST similarity searches and annotated with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology identifiers. These analyses identified the majority of carbohydrate, fatty acid, TAG and carotenoid biosynthesis and catabolism pathways in E. cf. polyphem. Conclusions: Our data provide the construction of metabolic pathways involved in the biosynthesis and catabolism of carbohydrate, fatty acids, TAG and carotenoids in E. cf. polyphem and provide a foundation for the molecular genetics and functional genomics required to direct metabolic engineering efforts that seek to enhance the quantity and character of microalgae-based biofuel feedstock. PMID:22536352
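
    The abstract reports both a mean unigene size and an N50, which summarize an assembly differently. The minimal sketch below computes N50 under its usual definition (the length L such that sequences of length at least L together cover half of the assembled bases); the toy lengths are hypothetical and are not the study's data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Walks the lengths from longest to shortest and returns the first
    length at which the running total reaches half of all bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in bp).
toy_lengths = [100, 150, 200, 500, 700, 900, 1200]
toy_n50 = n50(toy_lengths)
toy_mean = sum(toy_lengths) / len(toy_lengths)
```

    In this toy case the N50 exceeds the mean length, the same direction as the 663 bp versus 503 bp figures above, because N50 weights long sequences by the bases they contribute.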

  13. Validity and Practitality of Acid-Base Module Based on Guided Discovery Learning for Senior High School

    Science.gov (United States)

    Yerimadesi; Bayharti; Jannah, S. M.; Lufri; Festiyed; Kiram, Y.

    2018-04-01

    This Research and Development (R&D) study aims to produce a guided discovery learning-based module on the topic of acid-base and to determine its validity and practicality in learning. Module development used the Four-D (4-D) model (define, design, develop and disseminate). This research was carried out up to the development stage. The research instruments were validity and practicality questionnaires. The module was validated by five experts (three chemistry lecturers of Universitas Negeri Padang and two chemistry teachers of SMAN 9 Padang). The practicality test was done by two chemistry teachers and 30 students of SMAN 9 Padang. Cohen’s kappa was used to analyze validity and practicality. The average moment kappa was 0.86 for validity, and for practicality it was 0.85 by teachers and 0.76 by students, all in the high category. It can be concluded that the module's validity and practicality were demonstrated for high school chemistry learning.
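
    For readers unfamiliar with the statistic, the sketch below computes the standard two-rater Cohen's kappa, (observed agreement - chance agreement) / (1 - chance agreement). This is an assumption about the general technique only: the abstract does not give the formula behind its "moment kappa", which may be computed differently, and the ratings shown are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Standard Cohen's kappa for two raters labelling the same items."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must label the same non-empty item set")
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical questionnaire judgements from two raters.
a = ["yes", "yes", "yes", "no", "no", "yes", "no", "yes", "yes", "no"]
b = ["yes", "yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes"]
kappa = cohens_kappa(a, b)
```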

  14. Organ/body-on-a-chip based on microfluidic technology for drug discovery.

    Science.gov (United States)

    Kimura, Hiroshi; Sakai, Yasuyuki; Fujii, Teruo

    2018-02-01

    Although animal experiments are indispensable for preclinical screening in the drug discovery process, various issues such as ethical considerations and species differences remain. To solve these issues, cell-based assays using human-derived cells have been actively pursued. However, it remains difficult to accurately predict drug efficacy, toxicity, and organ interactions, because cultivated cells often do not retain their original organ functions and morphologies in conventional in vitro cell culture systems. In the μTAS research field, which is a part of biochemical engineering, the technologies of organ-on-a-chip, based on microfluidic devices built using microfabrication, have been widely studied recently as a novel in vitro organ model. Since it is possible to physically and chemically mimic the in vitro environment by using microfluidic device technology, maintenance of cellular function and morphology, and replication of organ interactions can be realized using organ-on-a-chip devices. So far, functions of various organs and tissues, such as the lung, liver, kidney, and gut have been reproduced as in vitro models. Furthermore, a body-on-a-chip, integrating multi-organ functions on a microfluidic device, has also been proposed for prediction of organ interactions. We herein provide a background of microfluidic systems, organ-on-a-chip and body-on-a-chip technologies, and their future challenges. Copyright © 2017 The Japanese Society for the Study of Xenobiotics. Published by Elsevier Ltd. All rights reserved.

  15. An SDN-Based Authentication Mechanism for Securing Neighbor Discovery Protocol in IPv6

    Directory of Open Access Journals (Sweden)

    Yiqin Lu

    2017-01-01

    The Neighbor Discovery Protocol (NDP) is one of the main protocols in the Internet Protocol version 6 (IPv6) suite, and it provides many basic functions for the normal operation of IPv6 in a local area network (LAN), such as address autoconfiguration and address resolution. However, it has many vulnerabilities that can be used by malicious nodes to launch attacks, because the NDP messages are easily spoofed without protection. Surrounding this problem, many solutions have been proposed for securing NDP, but these solutions either proposed new protocols that need to be supported by all nodes or built mechanisms that require the cooperation of all nodes, which is inevitable in the traditional distributed networks. Nevertheless, Software-Defined Networking (SDN) provides a new perspective to think about protecting NDP. In this paper, we proposed an SDN-based authentication mechanism to verify the identity of NDP packets transmitted in a LAN. Using the centralized control and programmability of SDN, it can effectively prevent the spoofing attacks and other derived attacks based on spoofing. In addition, this mechanism needs no additional protocol supporting or configuration at hosts and routers and does not introduce any dedicated devices.

  16. Paper-based Synthetic Gene Networks

    Science.gov (United States)

    Pardee, Keith; Green, Alexander A.; Ferrante, Tom; Cameron, D. Ewen; DaleyKeyser, Ajay; Yin, Peng; Collins, James J.

    2014-01-01

    Synthetic gene networks have wide-ranging uses in reprogramming and rewiring organisms. To date, there has not been a way to harness the vast potential of these networks beyond the constraints of a laboratory or in vivo environment. Here, we present an in vitro paper-based platform that provides a new venue for synthetic biologists to operate, and a much-needed medium for the safe deployment of engineered gene circuits beyond the lab. Commercially available cell-free systems are freeze-dried onto paper, enabling the inexpensive, sterile and abiotic distribution of synthetic biology-based technologies for the clinic, global health, industry, research and education. For field use, we create circuits with colorimetric outputs for detection by eye, and fabricate a low-cost, electronic optical interface. We demonstrate this technology with small molecule and RNA actuation of genetic switches, rapid prototyping of complex gene circuits, and programmable in vitro diagnostics, including glucose sensors and strain-specific Ebola virus sensors. PMID:25417167

  17. Computational Materials Science and Chemistry: Accelerating Discovery and Innovation through Simulation-Based Engineering and Science

    Energy Technology Data Exchange (ETDEWEB)

    Crabtree, George [Argonne National Lab. (ANL), Argonne, IL (United States); Glotzer, Sharon [University of Michigan; McCurdy, Bill [University of California Davis; Roberto, Jim [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2010-07-26

    This report is based on a SC Workshop on Computational Materials Science and Chemistry for Innovation on July 26-27, 2010, to assess the potential of state-of-the-art computer simulations to accelerate understanding and discovery in materials science and chemistry, with a focus on potential impacts in energy technologies and innovation. The urgent demand for new energy technologies has greatly exceeded the capabilities of today's materials and chemical processes. To convert sunlight to fuel, efficiently store energy, or enable a new generation of energy production and utilization technologies requires the development of new materials and processes of unprecedented functionality and performance. New materials and processes are critical pacing elements for progress in advanced energy systems and virtually all industrial technologies. Over the past two decades, the United States has developed and deployed the world's most powerful collection of tools for the synthesis, processing, characterization, and simulation and modeling of materials and chemical systems at the nanoscale, dimensions of a few atoms to a few hundred atoms across. These tools, which include world-leading x-ray and neutron sources, nanoscale science facilities, and high-performance computers, provide an unprecedented view of the atomic-scale structure and dynamics of materials and the molecular-scale basis of chemical processes. For the first time in history, we are able to synthesize, characterize, and model materials and chemical behavior at the length scale where this behavior is controlled. This ability is transformational for the discovery process and, as a result, confers a significant competitive advantage. Perhaps the most spectacular increase in capability has been demonstrated in high performance computing. Over the past decade, computational power has increased by a factor of a million due to advances in hardware and software. This rate of improvement, which shows no sign of

  18. Interestingness measures and strategies for mining multi-ontology multi-level association rules from gene ontology annotations for the discovery of new GO relationships.

    Science.gov (United States)

    Manda, Prashanti; McCarthy, Fiona; Bridges, Susan M

    2013-10-01

    The Gene Ontology (GO), a set of three sub-ontologies, is one of the most popular bio-ontologies used for describing gene product characteristics. GO annotation data containing terms from multiple sub-ontologies and at different levels in the ontologies is an important source of implicit relationships between terms from the three sub-ontologies. Data mining techniques such as association rule mining that are tailored to mine from multiple ontologies at multiple levels of abstraction are required for effective knowledge discovery from GO annotation data. We present a data mining approach, Multi-ontology data mining at All Levels (MOAL) that uses the structure and relationships of the GO to mine multi-ontology multi-level association rules. We introduce two interestingness measures: Multi-ontology Support (MOSupport) and Multi-ontology Confidence (MOConfidence) customized to evaluate multi-ontology multi-level association rules. We also describe a variety of post-processing strategies for pruning uninteresting rules. We use publicly available GO annotation data to demonstrate our methods with respect to two applications (1) the discovery of co-annotation suggestions and (2) the discovery of new cross-ontology relationships. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
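
    The abstract names MOSupport and MOConfidence without defining them, so the sketch below shows only the classical support and confidence measures of association rule mining on which such measures are presumably modelled. It does not use the GO hierarchy, and the term labels are hypothetical placeholders rather than real GO identifiers.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Classical confidence of the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        raise ValueError("antecedent never occurs in the transactions")
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / base


# Hypothetical annotation sets, one per gene product; labels mix terms
# from different sub-ontologies (MF, BP, CC) to mimic cross-ontology rules.
genes = [
    {"MF:a", "BP:x"},
    {"MF:a", "BP:x", "CC:m"},
    {"MF:a", "BP:y"},
    {"MF:b", "BP:x"},
]
rule_support = support(genes, {"MF:a", "BP:x"})
rule_confidence = confidence(genes, {"MF:a"}, {"BP:x"})
```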

  19. A family-based test for correlation between gene expression and trait values.

    Science.gov (United States)

    Kraft, Peter; Schadt, Eric; Aten, Jason; Horvath, Steve

    2003-05-01

    Advances in microarray technology have made it attractive to combine information on clinical traits, marker genotypes, and comprehensive gene expression from family studies to dissect complex disease genetics. Without accounting for family structure, methods that test for association between a trait and gene-expression levels can be misleading. We demonstrate that the standard unstratified test based on Pearson's correlation coefficient can produce spurious results when applied to family data, and we present a stratified family expression association test (FEXAT). We illustrate the utility of the FEXAT via simulation and an application to gene-expression data from lymphoblastoid cell lines from four CEPH families. The FEXAT has a smaller estimated false-discovery rate than the standard test when within-family correlations are of interest, and it detects biologically plausible correlations between beta catenin and genes in the WNT-activation pathway in humans that the standard test does not.

  20. The Goal Specificity Effect on Strategy Use and Instructional Efficiency during Computer-Based Scientific Discovery Learning

    Science.gov (United States)

    Kunsting, Josef; Wirth, Joachim; Paas, Fred

    2011-01-01

    Using a computer-based scientific discovery learning environment on buoyancy in fluids we investigated the "effects of goal specificity" (nonspecific goals vs. specific goals) for two goal types (problem solving goals vs. learning goals) on "strategy use" and "instructional efficiency". Our empirical findings close an important research gap,…

  1. Mobile STEMship Discovery Center: K-12 Aerospace-Based Science, Technology, Engineering, and Mathematics (STEM) Mobile Teaching Vehicle

    Science.gov (United States)

    2015-08-03

    ...combat the STEM crisis in the areas math proficiencies, a lack of desire to participate STEM curriculum and careers as well as racial and gender

  2. Facilitating Students' Interaction with Real Gas Properties Using a Discovery-Based Approach and Molecular Dynamics Simulations

    Science.gov (United States)

    Sweet, Chelsea; Akinfenwa, Oyewumi; Foley, Jonathan J., IV

    2018-01-01

    We present an interactive discovery-based approach to studying the properties of real gases using simple, yet realistic, molecular dynamics software. Use of this approach opens up a variety of opportunities for students to interact with the behaviors and underlying theories of real gases. Students can visualize gas behavior under a variety of…

  3. The effect of discovery learning and problem-based learning on middle school students’ self-regulated learning

    Science.gov (United States)

    Miatun, A.; Muntazhimah

    2018-01-01

    The aim of this research was to determine the effect of learning models on mathematics achievement viewed from students’ self-regulated learning. The learning models compared were discovery learning and problem-based learning. The population was all grade VIII students of junior high schools in Boyolali regency. The samples were students of SMPN 4 Boyolali, SMPN 6 Boyolali, and SMPN 4 Mojosongo. The instruments used were mathematics achievement tests and a self-regulated learning questionnaire. The data were analyzed using unbalanced two-way ANOVA. The conclusions were as follows: (1) Discovery learning gives better achievement than problem-based learning. (2) Achievement of students who have high self-regulated learning was better than that of students who have medium and low self-regulated learning. (3) For discovery learning, achievement of students who have high self-regulated learning was better than that of students who have medium and low self-regulated learning. For problem-based learning, students who have high and medium self-regulated learning have the same achievement. (4) For students who have high self-regulated learning, discovery learning gives better achievement than problem-based learning. For students who have medium and low self-regulated learning, both learning models give the same achievement.

  4. Handling hybrid and missing data in constraint-based causal discovery to study the etiology of ADHD.

    Science.gov (United States)

    Sokolova, Elena; von Rhein, Daniel; Naaijen, Jilly; Groot, Perry; Claassen, Tom; Buitelaar, Jan; Heskes, Tom

    2017-01-01

    Causal discovery is an increasingly important method for data analysis in the field of medical research. In this paper, we consider two challenges in causal discovery that occur very often when working with medical data: a mixture of discrete and continuous variables and a substantial amount of missing values. To the best of our knowledge, there are no methods that can handle both challenges at the same time. In this paper, we develop a new method that can handle these challenges based on the assumption that data are missing at random and that continuous variables obey a non-paranormal distribution. We demonstrate the validity of our approach for causal discovery on simulated data as well as on two real-world data sets from a monetary incentive delay task and a reversal learning task. Our results help in the understanding of the etiology of attention-deficit/hyperactivity disorder (ADHD).

  5. A cell-based high throughput screening assay for the discovery of cGAS-STING pathway agonists.

    Science.gov (United States)

    Liu, Bowei; Tang, Liudi; Zhang, Xiaohui; Ma, Julia; Sehgal, Mohit; Cheng, Junjun; Zhang, Xuexiang; Zhou, Yan; Du, Yanming; Kulp, John; Guo, Ju-Tao; Chang, Jinhong

    2017-11-01

    Stimulator of interferon genes (STING) is an endoplasmic reticulum transmembrane protein that serves as a molecular hub for activation of interferon and inflammatory cytokine response by multiple cellular DNA sensors. Not surprisingly, STING has been demonstrated to play an important role in host defense against microorganisms, and pharmacologic activation of STING is considered an attractive strategy to treat viral diseases and boost antitumor immunity. In light of this, we established a HepAD38-derived reporter cell line that expresses firefly luciferase in response to the activation of the cyclic GMP-AMP synthase (cGAS)-STING pathway for high throughput screening (HTS) of small-molecule human STING agonists. This cell-based reporter assay required only 4 h treatment with a reference STING agonist to induce a robust luciferase signal and was demonstrated to have an excellent performance in HTS format. By screening 16,000 compounds, a dispiro diketopiperazine (DSDP) compound was identified to induce cytokine response in a manner dependent on the expression of functional human STING, but not mouse STING. Moreover, we showed that DSDP induced an interferon-dominant cytokine response in human skin fibroblasts and peripheral blood mononuclear cells, which in turn potently suppressed the replication of yellow fever virus, dengue virus and Zika virus. We have thus established a robust cell-based assay system suitable for rapid discovery and mechanistic analyses of cGAS-STING pathway agonists. Identification of DSDP as a human STING agonist enriches the pipelines of STING-targeting drug development for treatment of viral infections and cancers. Copyright © 2017. Published by Elsevier B.V.

  6. Grid-based Continual Analysis of Molecular Interior for Drug Discovery, QSAR and QSPR.

    Science.gov (United States)

    Potemkin, Andrey V; Grishina, Maria A; Potemkin, Vladimir A

    2017-01-01

    In 1979, R.D. Cramer and M. Milne made a first realization of 3D comparison of molecules by aligning them in space and by mapping their molecular fields to a 3D grid. Further, this approach was developed as the DYLOMMS (Dynamic Lattice-Oriented Molecular Modelling System) approach. In 1984, H. Wold and S. Wold proposed the use of partial least squares (PLS) analysis, instead of principal component analysis, to correlate the field values with biological activities. Then, in 1988, the method which was called CoMFA (Comparative Molecular Field Analysis) was introduced and the appropriate software became commercially available. Since 1988, many 3D QSAR methods, algorithms and their modifications have been introduced for solving virtual drug discovery problems (e.g., CoMSIA, CoMMA, HINT, HASL, GOLPE, GRID, PARM, Raptor, BiS, CiS, ConGO). All the methods can be divided into two groups (classes): (1) methods studying the exterior of molecules; (2) methods studying the interior of molecules. A series of grid-based computational technologies for Continual Molecular Interior analysis (CoMIn) is presented in the current paper. The grid-based analysis is carried out by means of a lattice construction, analogously to many other grid-based methods. The further continual elucidation of molecular structure is performed in various ways. (i) In terms of intermolecular interaction potentials. This can be represented as a superposition of Coulomb, Van der Waals interactions and hydrogen bonds. All the potentials are well-known continual functions and their values can be determined in all lattice points for a molecule. (ii) In terms of quantum functions such as electron density distribution, Laplacian and Hamiltonian of electron density distribution, potential energy distribution, the highest occupied and the lowest unoccupied molecular orbitals distribution and their superposition. To reduce time of calculations using quantum methods based on the first principles, an original quantum

  7. 3D profile-based approach to proteome-wide discovery of novel human chemokines.

    Directory of Open Access Journals (Sweden)

    Aurelie Tomczak

    Chemokines are small secreted proteins with important roles in immune responses. They consist of a conserved three-dimensional (3D) structure, the so-called IL8-like chemokine fold, which is supported by disulfide bridges characteristic of this protein family. Sequence- and profile-based computational methods have been proficient in discovering novel chemokines by making use of their sequence-conserved cysteine patterns. However, it has been recently shown that some chemokines escaped annotation by these methods due to low sequence similarity to known chemokines and to different arrangement of cysteines in sequence and in 3D. Innovative methods overcoming the limitations of current techniques may allow the discovery of new remote homologs in the still functionally uncharacterized fraction of the human genome. We report a novel computational approach for proteome-wide identification of remote homologs of the chemokine family that uses fold recognition techniques in combination with a scaffold-based automatic mapping of disulfide bonds to define a 3D profile of the chemokine protein family. By applying our methodology to all currently uncharacterized human protein sequences, we have discovered two novel proteins that, without having significant sequence similarity to known chemokines or characteristic cysteine patterns, show strong structural resemblance to known anti-HIV chemokines. Detailed computational analysis and experimental structural investigations based on mass spectrometry and circular dichroism support our structural predictions and highlight several other chemokine-like features. The results obtained support their functional annotation as putative novel chemokines and encourage further experimental characterization. The identification of remote homologs of human chemokines may provide new insights into the molecular mechanisms causing pathologies such as cancer or AIDS, and may contribute to the development of novel treatments. Besides

  8. Evidence based selection of housekeeping genes

    NARCIS (Netherlands)

    de Jonge, Hendrik J. M.; Fehrmann, Rudolf S. N.; de Bont, Eveline S. J. M.; Hofstra, Robert M. W.; Gerbens, Frans; Kamps, Willem A.; de Vries, Elisabeth G. E.; van der Zee, Ate G. J.; te Meerman, Gerard J.; ter Elst, Arja

    2007-01-01

    For accurate and reliable gene expression analysis, normalization of gene expression data against housekeeping genes (reference or internal control genes) is required. It is known that commonly used housekeeping genes (e.g., ACTB, GAPDH, HPRT1, and B2M) vary considerably under different experimental

  9. Literature-based discovery of diabetes- and ROS-related targets

    Directory of Open Access Journals (Sweden)

    Pande Manjusha

    2010-10-01

    Background: Reactive oxygen species (ROS) are known mediators of cellular damage in multiple diseases including diabetic complications. Despite its importance, no comprehensive database is currently available for the genes associated with ROS. Methods: We present ROS- and diabetes-related targets (genes/proteins) collected from the biomedical literature through a text mining technology. A web-based literature mining tool, SciMiner, was applied to 1,154 biomedical papers indexed with diabetes and ROS by PubMed to identify relevant targets. Over-represented targets in the ROS-diabetes literature were obtained through comparisons against randomly selected literature. The expression levels of nine genes, selected from the top ranked ROS-diabetes set, were measured in the dorsal root ganglia (DRG) of diabetic and non-diabetic DBA/2J mice in order to evaluate the biological relevance of literature-derived targets in the pathogenesis of diabetic neuropathy. Results: SciMiner identified 1,026 ROS- and diabetes-related targets from the 1,154 biomedical papers (http://jdrf.neurology.med.umich.edu/ROSDiabetes/). Fifty-three targets were significantly over-represented in the ROS-diabetes literature compared to randomly selected literature. These over-represented targets included well-known members of the oxidative stress response including catalase, the NADPH oxidase family, and the superoxide dismutase family of proteins. Eight of the nine selected genes exhibited significant differential expression between diabetic and non-diabetic mice. For six genes, the direction of expression change in diabetes paralleled enhanced oxidative stress in the DRG. Conclusions: Literature mining compiled ROS-diabetes related targets from the biomedical literature and led us to evaluate the biological relevance of selected targets in the pathogenesis of diabetic neuropathy.

  10. Endophytes : Exploiting biodiversity for the improvement of natural product-based drug discovery

    NARCIS (Netherlands)

    Staniek, Agata; Woerdenbag, Herman J.; Kayser, Oliver

    2008-01-01

    Endophytes, microorganisms that colonize internal tissues of all plant species, create a huge biodiversity with yet unknown novel natural products, presumed to push forward the frontiers of drug discovery. Next to the clinically acknowledged antineoplastic agent, paclitaxel, endophyte research has

  11. A dual transcript-discovery approach to improve the delimitation of gene features from RNA-seq data in the chicken model

    Directory of Open Access Journals (Sweden)

    Mickael Orgeur

    2018-01-01

    The sequence of the chicken genome, like several other draft genome sequences, is presently not fully covered. Gaps, contigs assigned with low confidence and uncharacterized chromosomes result in gene fragmentation and imprecise gene annotation. Transcript abundance estimation from RNA sequencing (RNA-seq) data relies on read quality, library complexity and expression normalization. In addition, the quality of the genome sequence used to map sequencing reads, and the gene annotation that defines gene features, must also be taken into account. A partially covered genome sequence causes the loss of sequencing reads from the mapping step, while an inaccurate definition of gene features induces imprecise read counts from the assignment step. Both steps can significantly bias interpretation of RNA-seq data. Here, we describe a dual transcript-discovery approach combining a genome-guided gene prediction and a de novo transcriptome assembly. This dual approach enabled us to increase the assignment rate of RNA-seq data by nearly 20% as compared to when using only the chicken reference annotation, contributing therefore to a more accurate estimation of transcript abundance. More generally, this strategy could be applied to any organism with partial genome sequence and/or lacking a manually-curated reference annotation in order to improve the accuracy of gene expression studies.

  12. A dual transcript-discovery approach to improve the delimitation of gene features from RNA-seq data in the chicken model.

    Science.gov (United States)

    Orgeur, Mickael; Martens, Marvin; Börno, Stefan T; Timmermann, Bernd; Duprez, Delphine; Stricker, Sigmar

    2018-01-17

    The sequence of the chicken genome, like several other draft genome sequences, is presently not fully covered. Gaps, contigs assigned with low confidence and uncharacterized chromosomes result in gene fragmentation and imprecise gene annotation. Transcript abundance estimation from RNA sequencing (RNA-seq) data relies on read quality, library complexity and expression normalization. In addition, the quality of the genome sequence used to map sequencing reads, and the gene annotation that defines gene features, must also be taken into account. A partially covered genome sequence causes the loss of sequencing reads from the mapping step, while an inaccurate definition of gene features induces imprecise read counts from the assignment step. Both steps can significantly bias interpretation of RNA-seq data. Here, we describe a dual transcript-discovery approach combining a genome-guided gene prediction and a de novo transcriptome assembly. This dual approach enabled us to increase the assignment rate of RNA-seq data by nearly 20% as compared to when using only the chicken reference annotation, contributing therefore to a more accurate estimation of transcript abundance. More generally, this strategy could be applied to any organism with partial genome sequence and/or lacking a manually-curated reference annotation in order to improve the accuracy of gene expression studies. © 2018. Published by The Company of Biologists Ltd.

  13. The application of mass-spectrometry-based protein biomarker discovery to theragnostics

    OpenAIRE

    Street, Jonathan M; Dear, James W

    2010-01-01

    Over the last decade rapid developments in mass spectrometry have allowed the identification of multiple proteins in complex biological samples. This proteomic approach has been applied to biomarker discovery in the context of clinical pharmacology (the combination of biomarker and drug now being termed ‘theragnostics’). In this review we provide a roadmap for early protein biomarker discovery studies, focusing on some key questions that regularly confront researchers.

  14. Machine Learning Models and Pathway Genome Data Base for Trypanosoma cruzi Drug Discovery

    Science.gov (United States)

    McCall, Laura-Isobel; Sarker, Malabika; Yadav, Maneesh; Ponder, Elizabeth L.; Kallel, E. Adam; Kellar, Danielle; Chen, Steven; Arkin, Michelle; Bunin, Barry A.; McKerrow, James H.; Talcott, Carolyn

    2015-01-01

    Background Chagas disease is a neglected tropical disease (NTD) caused by the eukaryotic parasite Trypanosoma cruzi. The current clinical and preclinical pipeline for T. cruzi is extremely sparse and lacks drug target diversity. Methodology/Principal Findings In the present study we developed a computational approach that utilized data from several public whole-cell, phenotypic high throughput screens that have been completed for T. cruzi by the Broad Institute, including a single screen of over 300,000 molecules in the search for chemical probes as part of the NIH Molecular Libraries program. We have also compiled and curated relevant biological and chemical compound screening data including (i) compounds and biological activity data from the literature, (ii) high throughput screening datasets, and (iii) predicted metabolites of T. cruzi metabolic pathways. This information was used to help us identify compounds and their potential targets. We have constructed a Pathway Genome Data Base for T. cruzi. In addition, we have developed Bayesian machine learning models that were used to virtually screen libraries of compounds. Ninety-seven compounds were selected for in vitro testing, and 11 of these were found to have EC50 discovery can bring interesting in vivo active molecules to light that may have been overlooked. The approach we have taken is broadly applicable to other NTDs. PMID:26114876

  15. Contextual Approach with Guided Discovery Learning and Brain Based Learning in Geometry Learning

    Science.gov (United States)

    Kartikaningtyas, V.; Kusmayadi, T. A.; Riyadi

    2017-09-01

    The aim of this study was to combine the contextual approach with Guided Discovery Learning (GDL) and Brain Based Learning (BBL) in geometry learning at junior high school. Furthermore, this study analysed the effect of the contextual approach with GDL and BBL in geometry learning. The GDL-contextual and BBL-contextual models were built from the steps of GDL and BBL combined with the principles of the contextual approach. To validate the models, a quasi-experiment with two experimental groups was used. The sample, chosen by stratified cluster random sampling, comprised 150 grade 8 junior high school students. The data were collected through a mathematics achievement test given to the students after the treatment of each group. The data were analysed using one-way ANOVA with unequal cell sizes. The results show that GDL-contextual does not have a different effect from BBL-contextual on mathematics achievement in geometry learning. This means that both models could be used in mathematics learning as innovative approaches to geometry learning.
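
    The one-way ANOVA mentioned in this abstract can be illustrated with a short sketch. It computes the F statistic for groups of possibly different sizes; the function name is an assumption made for illustration, and no data from the study are reproduced here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers.

    Groups may have different sizes (unequal cells).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares around each group's own mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

    Calling `one_way_anova_f([scores_gdl, scores_bbl])` with two hypothetical lists of test scores returns the F statistic and its degrees of freedom; a small F relative to the critical value would be consistent with the "no different effect" finding reported above.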

  16. Systems-based discovery of tomatidine as a natural small molecule inhibitor of skeletal muscle atrophy.

    Science.gov (United States)

    Dyle, Michael C; Ebert, Scott M; Cook, Daniel P; Kunkel, Steven D; Fox, Daniel K; Bongers, Kale S; Bullard, Steven A; Dierdorff, Jason M; Adams, Christopher M

    2014-05-23

    Skeletal muscle atrophy is a common and debilitating condition that lacks an effective therapy. To address this problem, we used a systems-based discovery strategy to search for a small molecule whose mRNA expression signature negatively correlates to mRNA expression signatures of human skeletal muscle atrophy. This strategy identified a natural small molecule from tomato plants, tomatidine. Using cultured skeletal myotubes from both humans and mice, we found that tomatidine stimulated mTORC1 signaling and anabolism, leading to accumulation of protein and mitochondria, and ultimately, cell growth. Furthermore, in mice, tomatidine increased skeletal muscle mTORC1 signaling, reduced skeletal muscle atrophy, enhanced recovery from skeletal muscle atrophy, stimulated skeletal muscle hypertrophy, and increased strength and exercise capacity. Collectively, these results identify tomatidine as a novel small molecule inhibitor of muscle atrophy. Tomatidine may have utility as a therapeutic agent or lead compound for skeletal muscle atrophy. © 2014 by The American Society for Biochemistry and Molecular Biology, Inc.

  17. Discovery of an Oxybenzylglycine Based Peroxisome Proliferator Activated Receptor Alpha Selective

    Energy Technology Data Exchange (ETDEWEB)

    Li, J.; Kennedy, L; Shi, Y; Tao, S; Ye, X; Chen, S; Wang, Y; Hernandez, A; Wang, W; et al.

    2010-01-01

    A 1,3-oxybenzylglycine based compound 2 (BMS-687453) was discovered to be a potent and selective peroxisome proliferator activated receptor (PPAR) α agonist, with an EC50 of 10 nM for human PPARα and ~410-fold selectivity vs human PPARγ in PPAR-GAL4 transactivation assays. Similar potencies and selectivity were also observed in the full length receptor co-transfection assays. Compound 2 has negligible cross-reactivity against a panel of human nuclear hormone receptors including PPARδ. Compound 2 demonstrated an excellent pharmacological and safety profile in preclinical studies and thus was chosen as a development candidate for the treatment of atherosclerosis and dyslipidemia. The X-ray cocrystal structures of the early lead compound 12 and compound 2 in complex with PPARα ligand binding domain (LBD) were determined. The role of the crystal structure of compound 12 with PPARα in the development of the SAR that ultimately resulted in the discovery of compound 2 is discussed.

  18. Metabolomics-based chemotaxonomy of root endophytic fungi for natural products discovery.

    Science.gov (United States)

    Maciá-Vicente, Jose G; Shi, Yan-Ni; Cheikh-Ali, Zakaria; Grün, Peter; Glynou, Kyriaki; Kia, Sevda Haghi; Piepenbring, Meike; Bode, Helge B

    2018-03-01

    Fungi are prolific producers of natural products routinely screened for biotechnological applications, and those living endophytically within plants attract particular attention because of their purported chemical diversity. However, the harnessing of their biosynthetic potential is hampered by a large and often cryptic phylogenetic and ecological diversity, coupled with a lack of large-scale natural products' dereplication studies. To guide efforts to discover new chemistries among root-endophytic fungi, we analyzed the natural products produced by 822 strains using an untargeted UPLC-ESI-MS/MS-based approach and linked the patterns of chemical features to fungal lineages. We detected 17 809 compounds of which 7951 were classified in 1992 molecular families, whereas the remaining were considered unique chemistries. Our approach allowed to annotate 1191 compounds with different degrees of accuracy, many of which had known fungal origins. Approximately 61% of the compounds were specific of a fungal order, and differences were observed across lineages in the diversity and characteristics of their chemistries. Chemical profiles also showed variable chemosystematic values across lineages, ranging from relative homogeneity to high heterogeneity among related fungi. Our results provide an extensive resource to dereplicate fungal natural products and may assist future discovery programs by providing a guide for the selection of target fungi. © 2018 Society for Applied Microbiology and John Wiley & Sons Ltd.

  19. Gene discovery in EST sequences from the wheat leaf rust fungus Puccinia triticina sexual spores, asexual spores and haustoria, compared to other rust and corn smut fungi

    Directory of Open Access Journals (Sweden)

    Wynhoven Brian

    2011-03-01

    Background: Rust fungi are biotrophic basidiomycete plant pathogens that cause major diseases on plants and trees world-wide, affecting agriculture and forestry. Their biotrophic nature precludes many established molecular genetic manipulations and lines of research. The generation of genomic resources for these microbes is leading to novel insights into biology such as interactions with the hosts and guiding directions for breakthrough research in plant pathology. Results: To support gene discovery and gene model verification in the genome of the wheat leaf rust fungus, Puccinia triticina (Pt), we have generated Expressed Sequence Tags (ESTs) by sampling several life cycle stages. We focused on several spore stages and isolated haustorial structures from infected wheat, generating 17,684 ESTs. We produced sequences from both the sexual (pycniospores, aeciospores and teliospores) and asexual (germinated urediniospores) stages of the life cycle. From pycniospores and aeciospores, produced by infecting the alternate host, meadow rue (Thalictrum speciosissimum), 4,869 and 1,292 reads were generated, respectively. We generated 3,703 ESTs from teliospores produced on the senescent primary wheat host. Finally, we generated 6,817 reads from haustoria isolated from infected wheat as well as 1,003 sequences from germinated urediniospores. Along with 25,558 previously generated ESTs, we compiled a database of 13,328 non-redundant sequences (4,506 singlets and 8,822 contigs). Fungal genes were predicted using the EST version of the self-training GeneMarkS algorithm. To refine the EST database, we compared EST sequences by BLASTN to a set of 454 pyrosequencing-generated contigs and Sanger BAC-end sequences derived both from the Pt genome, and to ESTs and genome reads from wheat. A collection of 6,308 fungal genes was identified and compared to sequences of the cereal rusts, Puccinia graminis f. sp. tritici (Pgt) and stripe rust, P. striiformis f. sp

  20. Immunophenotype Discovery, Hierarchical Organization, and Template-based Classification of Flow Cytometry Samples

    Directory of Open Access Journals (Sweden)

    Ariful Azad

    2016-08-01

    We describe algorithms for discovering immunophenotypes from large collections of flow cytometry (FC) samples, and using them to organize the samples into a hierarchy based on phenotypic similarity. The hierarchical organization is helpful for effective and robust cytometry data mining, including the creation of collections of cell populations characteristic of different classes of samples, robust classification, and anomaly detection. We summarize a set of samples belonging to a biological class or category with a statistically derived template for the class. Whereas individual samples are represented in terms of their cell populations (clusters), a template consists of generic meta-populations (a group of homogeneous cell populations obtained from the samples in a class) that describe key phenotypes shared among all those samples. We organize an FC data collection in a hierarchical data structure that supports the identification of immunophenotypes relevant to clinical diagnosis. A robust template-based classification scheme is also developed, but our primary focus is in the discovery of phenotypic signatures and inter-sample relationships in an FC data collection. This collective analysis approach is more efficient and robust since templates describe phenotypic signatures common to cell populations in several samples, while ignoring noise and small sample-specific variations. We have applied the template-based scheme to analyze several data sets including one representing a healthy immune system, and one of Acute Myeloid Leukemia (AML) samples. The last task is challenging due to the phenotypic heterogeneity of the several subtypes of AML. However, we identified thirteen immunophenotypes corresponding to subtypes of AML, and were able to distinguish Acute Promyelocytic Leukemia from other subtypes of AML.

  1. Optimal design of cluster-based ad-hoc networks using probabilistic solution discovery

    Energy Technology Data Exchange (ETDEWEB)

    Cook, Jason L. [B62, QESA-ARDEC, Picatinny, NJ 07806 (United States)], E-mail: Jason.Cook1@us.army.mil; Ramirez-Marquez, Jose Emmanuel [Babbio Center, School of Systems and Enterprises, Stevens Institute of Technology, Castle Point on Hudson, Hoboken, NJ 07030 (United States)

    2009-02-15

    The reliability of ad-hoc networks is gaining popularity in two areas: as a topic of academic interest and as a key performance parameter for defense systems employing this type of network. The ad-hoc network is dynamic and scalable and these descriptions are what attract its users. However, these descriptions are also synonymous with undefined and unpredictable when considering the impacts to the reliability of the system. The configuration of an ad-hoc network changes continuously and this fact implies that no single mathematical expression or graphical depiction can describe the system reliability-wise. Previous research has used mobility and stochastic models to address this challenge successfully. In this paper, the authors leverage the stochastic approach and build upon it a probabilistic solution discovery (PSD) algorithm to optimize the topology for a cluster-based mobile ad-hoc wireless network (MAWN). Specifically, the membership of nodes within the back-bone network or networks will be assigned in such a way as to maximize reliability subject to a constraint on cost. The constraint may also be considered as a non-monetary cost, such as weight, volume, power, or the like. When a cost is assigned to each component, a maximum cost threshold is assigned to the network, and the method is run; the result is an optimized allocation of the radios enabling back-bone network(s) to provide the most reliable network possible without exceeding the allowable cost. The method is intended for use directly as part of the architectural design process of a cluster-based MAWN to efficiently determine an optimal or near-optimal design solution. It is capable of optimizing the topology based upon all-terminal reliability (ATR), all-operating terminal reliability (AoTR), or two-terminal reliability (2TR)

  2. Gun possession among American youth: a discovery-based approach to understand gun violence.

    Directory of Open Access Journals (Sweden)

    Kelly V Ruggles

    OBJECTIVE: To apply discovery-based computational methods to nationally representative data from the Centers for Disease Control and Prevention's Youth Risk Behavior Surveillance System to better understand and visualize the behavioral factors associated with gun possession among adolescent youth. RESULTS: Our study uncovered the multidimensional nature of gun possession across nearly five million unique data points over a ten year period (2001-2011). Specifically, we automated odds ratio calculations for 55 risk behaviors to assemble a comprehensive table of associations for every behavior combination. Downstream analyses included the hierarchical clustering of risk behaviors based on their association "fingerprint" to 1) visualize and assess which behaviors frequently co-occur and 2) evaluate which risk behaviors are consistently found to be associated with gun possession. From these analyses, we identified more than 40 behavioral factors, including heroin use, using snuff on school property, having been injured in a fight, and having been a victim of sexual violence, that have and continue to be strongly associated with gun possession. Additionally, we identified six behavioral clusters based on association similarities: 1) physical activity and nutrition; 2) disordered eating, suicide and sexual violence; 3) weapon carrying and physical safety; 4) alcohol, marijuana and cigarette use; 5) drug use on school property and 6) overall drug use. CONCLUSIONS: Use of computational methodologies identified multiple risk behaviors, beyond more commonly discussed indicators of poor mental health, that are associated with gun possession among youth. Implications for prevention efforts and future interdisciplinary work applying computational methods to behavioral science data are described.

  3. The development of a valid discovery-based learning module to improve students' mathematical connection

    Science.gov (United States)

    Kuneni, Erna; Mardiyana, Pramudya, Ikrar

    2017-08-01

    Geometry is the most important branch in mathematics. The purpose of teaching this material is to develop students' level of thinking for a better understanding. However, geometry in particular has contributed to students' failure in mathematics examinations. This problem occurs because of a special feature of geometry: the complexity of the correlations among its concepts. This relates to mathematical connection. It is still difficult for students to improve this ability, because teachers fall short in facilitating students towards it. Nevertheless, facilitation can take the form of teaching material. A learning module can be a solution because it consists of a series of activities that students should undertake to achieve a certain goal. The series of activities in this case is adapted from the phases of the discovery-based learning model. Through this module, students are facilitated to discover concepts through deep instruction and guidance. It can build mathematical habits of mind and also strengthen mathematical connection. The method used in this research was the ten stages of research and development proposed by Borg and Gall. The research purpose is to create a valid learning module to improve students' mathematical connection in teaching quadrilaterals. The validity scores based on media expert judgment are 2.43 for the eligibility chart aspect, 2.60 for the eligibility presentation aspect, and 3.00 for the eligibility contents aspect. The validity scores based on material expert judgment are 3.10 for the eligibility content aspect, 2.87 for the eligibility presentation aspect, and 2.80 for the eligibility language and legibility aspect.

  4. Optimal design of cluster-based ad-hoc networks using probabilistic solution discovery

    International Nuclear Information System (INIS)

    Cook, Jason L.; Ramirez-Marquez, Jose Emmanuel

    2009-01-01

    The reliability of ad-hoc networks is gaining popularity in two areas: as a topic of academic interest and as a key performance parameter for defense systems employing this type of network. The ad-hoc network is dynamic and scalable and these descriptions are what attract its users. However, these descriptions are also synonymous with undefined and unpredictable when considering the impacts to the reliability of the system. The configuration of an ad-hoc network changes continuously and this fact implies that no single mathematical expression or graphical depiction can describe the system reliability-wise. Previous research has used mobility and stochastic models to address this challenge successfully. In this paper, the authors leverage the stochastic approach and build upon it a probabilistic solution discovery (PSD) algorithm to optimize the topology for a cluster-based mobile ad-hoc wireless network (MAWN). Specifically, the membership of nodes within the back-bone network or networks will be assigned in such a way as to maximize reliability subject to a constraint on cost. The constraint may also be considered as a non-monetary cost, such as weight, volume, power, or the like. When a cost is assigned to each component, a maximum cost threshold is assigned to the network, and the method is run; the result is an optimized allocation of the radios enabling back-bone network(s) to provide the most reliable network possible without exceeding the allowable cost. The method is intended for use directly as part of the architectural design process of a cluster-based MAWN to efficiently determine an optimal or near-optimal design solution. It is capable of optimizing the topology based upon all-terminal reliability (ATR), all-operating terminal reliability (AoTR), or two-terminal reliability (2TR)
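
    To make the two-terminal reliability (2TR) measure concrete, the sketch below estimates it by plain Monte Carlo sampling. This is not the authors' PSD algorithm: it assumes a fixed topology with independent link failures, and the function names, link probabilities and node labels are hypothetical.

```python
import random


def connected(nodes, edges, s, t):
    """Depth-first search: is t reachable from s over the given edges?"""
    adj = {n: [] for n in nodes}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        if u == t:
            return True
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return False


def two_terminal_reliability(nodes, link_rel, s, t, trials=10000, seed=0):
    """Estimate P(s and t are connected) when each link (u, v) in
    link_rel is independently up with the given probability."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Sample which links are operating in this trial.
        up = [e for e, p in link_rel.items() if rng.random() < p]
        hits += connected(nodes, up, s, t)
    return hits / trials
```

    For two links in series, `two_terminal_reliability(["a", "b", "c"], {("a", "b"): 0.9, ("b", "c"): 0.9}, "a", "c")` should come out close to the analytic value 0.9 × 0.9 = 0.81.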

  5. Gun Possession among American Youth: A Discovery-Based Approach to Understand Gun Violence

    Science.gov (United States)

    Ruggles, Kelly V.; Rajan, Sonali

    2014-01-01

    Objective To apply discovery-based computational methods to nationally representative data from the Centers for Disease Control and Prevention’s Youth Risk Behavior Surveillance System to better understand and visualize the behavioral factors associated with gun possession among adolescent youth. Results Our study uncovered the multidimensional nature of gun possession across nearly five million unique data points over a ten year period (2001–2011). Specifically, we automated odds ratio calculations for 55 risk behaviors to assemble a comprehensive table of associations for every behavior combination. Downstream analyses included the hierarchical clustering of risk behaviors based on their association “fingerprint” to 1) visualize and assess which behaviors frequently co-occur and 2) evaluate which risk behaviors are consistently found to be associated with gun possession. From these analyses, we identified more than 40 behavioral factors, including heroin use, using snuff on school property, having been injured in a fight, and having been a victim of sexual violence, that have and continue to be strongly associated with gun possession. Additionally, we identified six behavioral clusters based on association similarities: 1) physical activity and nutrition; 2) disordered eating, suicide and sexual violence; 3) weapon carrying and physical safety; 4) alcohol, marijuana and cigarette use; 5) drug use on school property and 6) overall drug use. Conclusions Use of computational methodologies identified multiple risk behaviors, beyond more commonly discussed indicators of poor mental health, that are associated with gun possession among youth. Implications for prevention efforts and future interdisciplinary work applying computational methods to behavioral science data are described. PMID:25372864
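
    The automated odds ratio table described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each survey response is a dictionary of yes/no answers, it skips any pair with an empty cell, and the function names and behavior labels are hypothetical.

```python
import math
from itertools import combinations


def odds_ratio(a, b, c, d):
    """Odds ratio and approximate 95% CI for a 2x2 table.

    a: both behaviors, b: first only, c: second only, d: neither.
    """
    ratio = (a * d) / (b * c)
    # Standard error of log(OR), Woolf's method.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(ratio) - 1.96 * se)
    high = math.exp(math.log(ratio) + 1.96 * se)
    return ratio, low, high


def pairwise_odds_ratios(records, behaviors):
    """Odds ratio for every pair of behaviors across all records."""
    table = {}
    for x, y in combinations(behaviors, 2):
        a = sum(1 for r in records if r[x] and r[y])
        b = sum(1 for r in records if r[x] and not r[y])
        c = sum(1 for r in records if not r[x] and r[y])
        d = sum(1 for r in records if not r[x] and not r[y])
        if min(a, b, c, d) > 0:  # skip pairs where the ratio is undefined
            table[(x, y)] = odds_ratio(a, b, c, d)[0]
    return table
```

    Each row of the resulting table (one behavior's odds ratios against all others) is the kind of association "fingerprint" that could then be passed to a hierarchical clustering routine.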

  6. Transcriptome Analysis of the Oriental River Prawn, Macrobrachium nipponense Using 454 Pyrosequencing for Discovery of Genes and Markers

    Science.gov (United States)

    Ma, Keyi; Qiu, Gaofeng; Feng, Jianbin; Li, Jiale

    2012-01-01

    Background The oriental river prawn, Macrobrachium nipponense, is an economically and nutritionally important species of the Palaemonidae family of decapod crustaceans. To date, the sequencing of its whole genome is unavailable as a non-model organism. Transcriptomic information is also scarce for this species. In this study, we performed de novo transcriptome sequencing to produce the first comprehensive expressed sequence tag (EST) dataset for M. nipponense using high-throughput sequencing technologies. Methodology and Principal Findings Total RNA was isolated from eyestalk, gill, heart, ovary, testis, hepatopancreas, muscle, and embryos at the cleavage, gastrula, nauplius and zoea stages. Equal quantities of RNA from each tissue and stage were pooled to construct a cDNA library. Using 454 pyrosequencing technology, we generated a total of 984,204 high quality reads (338.59Mb) with an average length of 344 bp. Clustering and assembly of these reads produced a non-redundant set of 81,411 unique sequences, comprising 42,551 contigs and 38,860 singletons. All of the unique sequences were involved in the molecular function (30,425), cellular component (44,112) and biological process (67,679) categories by GO analysis. Potential genes and their functions were predicted by KEGG pathway mapping and COG analysis. Based on our sequence analysis and published literature, many putative genes involved in sex determination, including DMRT1, FTZ-F1, FOXL2, FEM1 and other potentially important candidate genes, were identified for the first time in this prawn. Furthermore, 6,689 SSRs and 18,107 high-confidence SNPs were identified in this EST dataset. Conclusions The transcriptome provides an invaluable new data for a functional genomics resource and future biological research in M. nipponense. The molecular markers identified in this study will provide a material basis for future genetic linkage and quantitative trait loci analyses, and will be essential for accelerating

  7. Transcriptome analysis of the oriental river prawn, Macrobrachium nipponense using 454 pyrosequencing for discovery of genes and markers.

    Directory of Open Access Journals (Sweden)

    Keyi Ma

    BACKGROUND: The oriental river prawn, Macrobrachium nipponense, is an economically and nutritionally important species of the Palaemonidae family of decapod crustaceans. To date, the sequencing of its whole genome is unavailable as a non-model organism. Transcriptomic information is also scarce for this species. In this study, we performed de novo transcriptome sequencing to produce the first comprehensive expressed sequence tag (EST) dataset for M. nipponense using high-throughput sequencing technologies. METHODOLOGY AND PRINCIPAL FINDINGS: Total RNA was isolated from eyestalk, gill, heart, ovary, testis, hepatopancreas, muscle, and embryos at the cleavage, gastrula, nauplius and zoea stages. Equal quantities of RNA from each tissue and stage were pooled to construct a cDNA library. Using 454 pyrosequencing technology, we generated a total of 984,204 high quality reads (338.59 Mb) with an average length of 344 bp. Clustering and assembly of these reads produced a non-redundant set of 81,411 unique sequences, comprising 42,551 contigs and 38,860 singletons. All of the unique sequences were involved in the molecular function (30,425), cellular component (44,112) and biological process (67,679) categories by GO analysis. Potential genes and their functions were predicted by KEGG pathway mapping and COG analysis. Based on our sequence analysis and published literature, many putative genes involved in sex determination, including DMRT1, FTZ-F1, FOXL2, FEM1 and other potentially important candidate genes, were identified for the first time in this prawn. Furthermore, 6,689 SSRs and 18,107 high-confidence SNPs were identified in this EST dataset. CONCLUSIONS: The transcriptome provides an invaluable new data for a functional genomics resource and future biological research in M. nipponense. The molecular markers identified in this study will provide a material basis for future genetic linkage and quantitative trait loci analyses, and will be essential for

  8. RNAi-Mediated Specific Gene Silencing as a Tool for the Discovery of New Drug Targets in Giardia lamblia; Evaluation Using the NADH Oxidase Gene

    Directory of Open Access Journals (Sweden)

    Jaime Marcial-Quino

    2017-11-01

    The microaerophilic protozoan Giardia lamblia is the agent causing giardiasis, an intestinal parasitosis of worldwide distribution. Different pharmacotherapies have been employed against giardiasis; however, side effects in the host and reports of drug resistant strains generate the need to develop new strategies that identify novel biological targets for drug design. To support this requirement, we have designed and evaluated a vector containing a cassette for the synthesis of double-stranded RNA (dsRNA), which can silence expression of a target gene through the RNA interference (RNAi) pathway. Small silencing RNAs were detected and quantified in transformants expressing dsRNA by a stem-loop RT-qPCR approach. The results showed that, in transformants expressing dsRNA of 100–200 base pairs, the level of NADHox mRNA was reduced by around 30%, concomitant with a decrease in enzyme activity and a reduction in the number of trophozoites with respect to the wild type strain, indicating that NADHox is indeed an important enzyme for Giardia viability. These results suggest that it is possible to induce the G. lamblia RNAi machinery for attenuating the expression of genes encoding proteins of interest. We propose that our silencing strategy can be used to identify new potential drug targets, knocking down genes encoding different structural proteins and enzymes from a wide variety of metabolic pathways.

  9. RNAi-Mediated Specific Gene Silencing as a Tool for the Discovery of New Drug Targets in Giardia lamblia; Evaluation Using the NADH Oxidase Gene

    Science.gov (United States)

    Marcial-Quino, Jaime; Rufino-González, Yadira; Sierra-Palacios, Edgar; Vanoye-Carlo, America; González-Valdez, Abigail; Torres-Arroyo, Angélica; Oria-Hernández, Jesús; Reyes-Vivas, Horacio

    2017-01-01

    The microaerophilic protozoan Giardia lamblia is the agent causing giardiasis, an intestinal parasitosis of worldwide distribution. Different pharmacotherapies have been employed against giardiasis; however, side effects in the host and reports of drug resistant strains generate the need to develop new strategies that identify novel biological targets for drug design. To support this requirement, we have designed and evaluated a vector containing a cassette for the synthesis of double-stranded RNA (dsRNA), which can silence expression of a target gene through the RNA interference (RNAi) pathway. Small silencing RNAs were detected and quantified in transformants expressing dsRNA by a stem-loop RT-qPCR approach. The results showed that, in transformants expressing dsRNA of 100–200 base pairs, the level of NADHox mRNA was reduced by around 30%, concomitant with a decrease in enzyme activity and a reduction in the number of trophozoites with respect to the wild type strain, indicating that NADHox is indeed an important enzyme for Giardia viability. These results suggest that it is possible to induce the G. lamblia RNAi machinery for attenuating the expression of genes encoding proteins of interest. We propose that our silencing strategy can be used to identify new potential drug targets, knocking down genes encoding different structural proteins and enzymes from a wide variety of metabolic pathways. PMID:29099754

  10. Muscarinic receptors as model targets and antitargets for structure-based ligand discovery.

    Science.gov (United States)

    Kruse, Andrew C; Weiss, Dahlia R; Rossi, Mario; Hu, Jianxin; Hu, Kelly; Eitel, Katrin; Gmeiner, Peter; Wess, Jürgen; Kobilka, Brian K; Shoichet, Brian K

    2013-10-01

    G protein-coupled receptors (GPCRs) regulate virtually all aspects of human physiology and represent an important class of therapeutic drug targets. Many GPCR-targeted drugs resemble endogenous agonists, often resulting in poor selectivity among receptor subtypes and restricted pharmacologic profiles. The muscarinic acetylcholine receptor family exemplifies these problems; thousands of ligands are known, but few are receptor subtype-selective and nearly all are cationic in nature. Using structure-based docking against the M2 and M3 muscarinic receptors, we screened 3.1 million molecules for ligands with new physical properties, chemotypes, and receptor subtype selectivities. Of 19 docking-prioritized molecules tested against the M2 subtype, 11 had substantial activity and 8 represented new chemotypes. Intriguingly, two were uncharged ligands with low micromolar to high nanomolar Ki values, an observation with few precedents among aminergic GPCRs. To exploit a single amino-acid substitution among the binding pockets between the M2 and M3 receptors, we selected molecules predicted by docking to bind to the M3 but not the M2 receptor. Of 16 molecules tested, 8 bound to the M3 receptor. Whereas selectivity remained modest for most of these, one was a partial agonist at the M3 receptor without measurable M2 agonism. Consistent with this activity, this compound stimulated insulin release from a mouse β-cell line. These results support the ability of structure-based discovery to identify new ligands with unexplored chemotypes and physical properties, leading to new biologic functions, even in an area as heavily explored as muscarinic pharmacology.

  11. Pars distalis vasculature: Discovery Shuttle STS-29 rats compared to ground-based antiorthostatic rats.

    Science.gov (United States)

    Pattison, A; Pattison, T; Schechter, J

    1991-11-01

    The anterior pituitary glands of male, adult Long Evans rats carried 5 days in the Space Shuttle Discovery (STS-29) have been compared with two groups of ground-based controls. All of the animals were part of a study (SE82-08) into the effects of gravity versus a microgravity environment on fracture healing. All had sustained a right, mid-shaft fibular osteotomy. The duration of the study was 10 days, and animals in all groups were weight bearing for the 5 days prior to shuttle lift off. The three experimental groups consisted of four rats each: flight (F) and two ground-based control groups, weight bearing (WB) and suspended (S). The suspension group was in a Holton/Sweeney head-down suspension apparatus (antiorthostatic) for the final 5 days of the study. The anterior pituitary glands of F and WB rats were essentially identical. The vasculature and parenchymal cells appeared unaffected in both instances. However, the anterior pituitary glands of S rats were dramatically altered. The vasculature was widely expanded with proteinaceous deposition covering the lumenal endothelial surfaces, and entrapping numerous platelets and aggregates of red blood cells. Parenchymal cells were highly vacuolated, occasionally with membranous vacuoles, but most often revealing large, clear cytoplasmic zones unlined by any membranes. Whereas profiles of exocytosis were numerous in F rats, and present in WB rats, they were essentially absent in S rats. These results indicate that weightlessness over a 5-day flight period does not influence the structural integrity of the anterior pituitary gland and may in fact promote secretory granule release. However, the head-down tilt model, frequently used to study fracture repair under conditions that mimic weightlessness, has a profound impact on the vasculature of the anterior pituitary gland which then affects the structural and functional characteristics of the parenchymal cells.

  12. Early detection of pharmacovigilance signals with automated methods based on false discovery rates: a comparative study.

    Science.gov (United States)

    Ahmed, Ismaïl; Thiessard, Frantz; Miremont-Salamé, Ghada; Haramburu, Françoise; Kreft-Jais, Carmen; Bégaud, Bernard; Tubert-Bitter, Pascale

    2012-06-01

    Improving the detection of drug safety signals has led several pharmacovigilance regulatory agencies to incorporate automated quantitative methods into their spontaneous reporting management systems. The three largest worldwide pharmacovigilance databases are routinely screened by the lower bound of the 95% confidence interval of proportional reporting ratio (PRR₀₂.₅), the 2.5% quantile of the Information Component (IC₀₂.₅) or the 5% quantile of the Gamma Poisson Shrinker (GPS₀₅). More recently, Bayesian and non-Bayesian False Discovery Rate (FDR)-based methods were proposed that address the arbitrariness of thresholds and allow for a built-in estimate of the FDR. These methods were also shown through simulation studies to be interesting alternatives to the currently used methods. The objective of this work was twofold. Based on an extensive retrospective study, we compared PRR₀₂.₅, GPS₀₅ and IC₀₂.₅ with two FDR-based methods derived from the Fisher's exact test and the GPS model (GPS(pH0) [posterior probability of the null hypothesis H₀ calculated from the Gamma Poisson Shrinker model]). Secondly, restricting the analysis to GPS(pH0), we aimed to evaluate the added value of using automated signal detection tools compared with 'traditional' methods, i.e. non-automated surveillance operated by pharmacovigilance experts. The analysis was performed sequentially, i.e. every month, and retrospectively on the whole French pharmacovigilance database over the period 1 January 1996-1 July 2002. Evaluation was based on a list of 243 reference signals (RSs) corresponding to investigations launched by the French Pharmacovigilance Technical Committee (PhVTC) during the same period. The comparison of detection methods was made on the basis of the number of RSs detected as well as the time to detection. Results comparing the five automated quantitative methods were in favour of GPS(pH0) in terms of both number of detections of true signals and

  13. Classification of melanomas in situ using knowledge discovery with explained case-based reasoning.

    Science.gov (United States)

    Armengol, Eva

    2011-02-01

    Early diagnosis of melanoma is based on the ABCD rule, which considers asymmetry, border irregularity, color variegation, and a diameter larger than 5 mm as the characteristic features of melanomas. When a skin lesion presents these features, it is excised as a preventive measure. Using a non-invasive technique called dermoscopy, dermatologists can give a more accurate evaluation of skin lesions, and can therefore avoid the excision of lesions that are benign. However, dermatologists need to achieve a good dermatoscopic classification of lesions prior to extraction. In this paper we propose a procedure called LazyCL to support dermatologists in assessing the classification of skin lesions. Our goal is to use LazyCL for generating a domain theory to classify melanomas in situ. To generate a domain theory, the LazyCL procedure uses a combination of two artificial intelligence techniques: case-based reasoning and clustering. First, LazyCL randomly creates clusters and then applies to them a lazy learning method called lazy induction of descriptions (LID) with leave-one-out. By means of LID, LazyCL collects explanations of why the cases in the database should belong to a class. The analysis of relationships among explanations then produces an understandable clustering of the dataset. After a process of eliminating redundancies and merging clusters, the set of explanations is reduced to a subset describing classes that are "almost" discriminant. The remaining explanations form a preliminary domain theory that is the basis on which experts can perform knowledge discovery. We performed two kinds of experiments. The first consisted of using LazyCL on a database containing the descriptions of 76 melanomas. The domain theory obtained from these experiments was compared with previous experiments performed using a different clustering method called self-organizing maps (SOM). Results of both methods, LazyCL and SOM, were similar. 
The second kind of experiment consisted of using Lazy

  14. Topology Discovery Using Cisco Discovery Protocol

    OpenAIRE

    Rodriguez, Sergio R.

    2009-01-01

    In this paper we address the problem of discovering network topology in proprietary networks. Namely, we investigate topology discovery in Cisco-based networks. Cisco devices run Cisco Discovery Protocol (CDP) which holds information about these devices. We first compare properties of topologies that can be obtained from networks deploying CDP versus Spanning Tree Protocol (STP) and Management Information Base (MIB) Forwarding Database (FDB). Then we describe a method of discovering topology ...

  15. In-depth cDNA library sequencing provides quantitative gene expression profiling in cancer biomarker discovery.

    Science.gov (United States)

    Yang, Wanling; Ying, Dingge; Lau, Yu-Lung

    2009-06-01

    Quantitative gene expression analysis plays an important role in identifying differentially expressed genes in various pathological states, gene expression regulation and co-regulation, shedding light on gene functions. Although microarray is widely used as a powerful tool in this regard, it is suboptimal quantitatively and unable to detect unknown gene variants. Here we demonstrated effective detection of differential expression and co-regulation of certain genes by expressed sequence tag analysis using a selected subset of cDNA libraries. We discussed the issues of sequencing depth and library preparation, and propose that increased sequencing depth and improved preparation procedures may allow detection of many expression features for less abundant gene variants. With the reduction of sequencing cost and the emerging of new generation sequencing technology, in-depth sequencing of cDNA pools or libraries may represent a better and powerful tool in gene expression profiling and cancer biomarker detection. We also propose using sequence-specific subtraction to remove hundreds of the most abundant housekeeping genes to increase sequencing depth without affecting relative expression ratio of other genes, as transcripts from as few as 300 most abundantly expressed genes constitute about 20% of the total transcriptome. In-depth sequencing also represents a unique advantage of detecting unknown forms of transcripts, such as alternative splicing variants, fusion genes, and regulatory RNAs, as well as detecting mutations and polymorphisms that may play important roles in disease pathogenesis.
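
    The depth argument in this abstract can be made concrete with a short calculation. The sketch below is illustrative only and is not taken from the paper: it assumes a fixed total read count and that reads freed by subtraction are redistributed proportionally among the remaining transcripts, and the function name and the toy gene counts are hypothetical.

```python
def effective_depth_gain(removed_fraction: float) -> float:
    """Fold increase in reads per remaining transcript after subtraction.

    Assumption: the total number of reads stays fixed and the reads that
    would have gone to the removed transcripts are spread proportionally
    over everything that remains.
    """
    if not 0.0 <= removed_fraction < 1.0:
        raise ValueError("removed_fraction must be in [0, 1)")
    return 1.0 / (1.0 - removed_fraction)


# The abstract states that about 300 abundant genes make up about 20%
# of the transcriptome; removing them gives a 1 / 0.8 = 1.25-fold gain.
gain = effective_depth_gain(0.20)

# Hypothetical read counts for two low-abundance genes before subtraction.
before = {"geneA": 40.0, "geneB": 10.0}
after = {gene: count * gain for gene, count in before.items()}

# Both genes scale by the same factor, so their ratio is unchanged.
ratio_before = before["geneA"] / before["geneB"]
ratio_after = after["geneA"] / after["geneB"]
print(f"gain = {gain:.2f}x, ratio before = {ratio_before}, after = {ratio_after}")
```

    Under these assumptions the gain is modest for a 20% subtraction; it grows quickly only as the removed fraction approaches a large share of the library.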

  16. An agent-based peer-to-peer architecture for semantic discovery of manufacturing services across virtual enterprises

    Science.gov (United States)

    Zhang, Wenyu; Zhang, Shuai; Cai, Ming; Jian, Wu

    2015-04-01

    With the development of the virtual enterprise (VE) paradigm, the usage of service-oriented architecture (SOA) is increasingly being considered for facilitating the integration and utilisation of distributed manufacturing resources. However, due to the heterogeneous nature among VEs, the dynamic nature of a VE and the autonomous nature of each VE member, the lack of both a sophisticated coordination mechanism in the popular centralised infrastructure and semantic expressivity in the existing SOA standards makes the current centralised, syntactic service discovery method undesirable. This motivates the proposed agent-based peer-to-peer (P2P) architecture for semantic discovery of manufacturing services across VEs. Multi-agent technology provides autonomous and flexible problem-solving capabilities in dynamic and adaptive VE environments. A peer-to-peer overlay provides highly scalable coupling across decentralised VEs, each of which acts as a peer composed of multiple agents dealing with manufacturing services. The proposed architecture utilises a novel, efficient, two-stage search strategy - semantic peer discovery and semantic service discovery - to handle the complex searches of manufacturing services across VEs through fast peer filtering. The operation and experimental evaluation of the prototype system are presented to validate the implementation of the proposed approach.

  17. Common minor histocompatibility antigen discovery based upon patient clinical outcomes and genomic data.

    Directory of Open Access Journals (Sweden)

    Paul M Armistead

    Full Text Available Minor histocompatibility antigens (mHA) mediate much of the graft vs. leukemia (GvL) effect and graft vs. host disease (GvHD) in patients who undergo allogeneic stem cell transplantation (SCT). Therapeutic decision making and treatments based upon mHAs will require the evaluation of multiple candidate mHAs and the selection of those with the potential to have the greatest impact on clinical outcomes. We hypothesized that common, immunodominant mHAs, which are presented by HLA-A, B, and C molecules, can mediate clinically significant GvL and/or GvHD, and that these mHAs can be identified through association of genomic data with clinical outcomes. Because most mHAs result from donor/recipient cSNP disparities, we genotyped 57 myeloid leukemia patients and their donors at 13,917 cSNPs. We correlated the frequency of genetically predicted mHA disparities with clinical evidence of an immune response and then computationally screened all peptides mapping to the highly associated cSNPs for their ability to bind to HLA molecules. As proof-of-concept, we analyzed one predicted antigen, T4A, whose mHA mismatch trended towards improved overall and disease free survival in our cohort. T4A mHA mismatches occurred at the maximum theoretical frequency for any given SCT. T4A-specific CD8+ T lymphocytes (CTLs) were detected in 3 of 4 evaluable post-transplant patients predicted to have a T4A mismatch. Our method is the first to combine clinical outcomes data with genomics and bioinformatics methods to predict and confirm a mHA. Refinement of this method should enable the discovery of clinically relevant mHAs in the majority of transplant patients and possibly lead to novel immunotherapeutics.

  18. LOBSTAHS: An Adduct-Based Lipidomics Strategy for Discovery and Identification of Oxidative Stress Biomarkers.

    Science.gov (United States)

    Collins, James R; Edwards, Bethanie R; Fredricks, Helen F; Van Mooy, Benjamin A S

    2016-07-19

    Discovery and identification of molecular biomarkers in large LC/MS data sets requires significant automation without loss of accuracy in the compound screening and annotation process. Here, we describe a lipidomics workflow and open-source software package for high-throughput annotation and putative identification of lipid, oxidized lipid, and oxylipin biomarkers in high-mass-accuracy HPLC-MS data. Lipid and oxylipin biomarker screening through adduct hierarchy sequences, or LOBSTAHS, uses orthogonal screening criteria based on adduct ion formation patterns and other properties to identify thousands of compounds while providing the user with a confidence score for each assignment. Assignments are made from one of two customizable databases; the default databases contain 14 068 unique entries. To demonstrate the software's functionality, we screened more than 340 000 mass spectral features from an experiment in which hydrogen peroxide was used to induce oxidative stress in the marine diatom Phaeodactylum tricornutum. LOBSTAHS putatively identified 1969 unique parent compounds in 21 869 features that survived the multistage screening process. While P. tricornutum maintained more than 92% of its core lipidome under oxidative stress, patterns in biomarker distribution and abundance indicated remodeling was both subtle and pervasive. Treatment with 150 μM H2O2 promoted statistically significant carbon-chain elongation across lipid classes, with the strongest elongation accompanying oxidation in moieties of monogalactosyldiacylglycerol, a lipid typically localized to the chloroplast. Oxidative stress also induced a pronounced reallocation of lipidome peak area to triacylglycerols. LOBSTAHS can be used with environmental or experimental data from a variety of systems and is freely available at https://github.com/vanmooylipidomics/LOBSTAHS .

  19. A two-genome microarray for the rice pathogens Xanthomonas oryzae pv. oryzae and X. oryzae pv. oryzicola and its use in the discovery of a difference in their regulation of hrp genes

    Directory of Open Access Journals (Sweden)

    Lin Ye

    2008-06-01

    Full Text Available BACKGROUND: Xanthomonas oryzae pv. oryzae (Xoo) and X. oryzae pv. oryzicola (Xoc) are bacterial pathogens of the worldwide staple and grass model, rice. Xoo and Xoc are closely related but Xoo invades rice vascular tissue to cause bacterial leaf blight, a serious disease of rice in many parts of the world, and Xoc colonizes the mesophyll parenchyma to cause bacterial leaf streak, a disease of emerging importance. Both pathogens depend on hrp genes for type III secretion to infect their host. We constructed a 50–70 mer oligonucleotide microarray based on available genome data for Xoo and Xoc and compared gene expression in Xoo strain PXO99A and Xoc strain BLS256 grown in the rich medium PSB vs. XOM2, a minimal medium previously reported to induce hrp genes in Xoo strain T7174. RESULTS: Three biological replicates of the microarray experiment to compare global gene expression in representative strains of Xoo and Xoc grown in PSB vs. XOM2 were carried out. The non-specific error rate and the correlation coefficients across biological replicates and among duplicate spots revealed that the microarray data were robust. 247 genes of Xoo and 39 genes of Xoc were differentially expressed in the two media with a false discovery rate of 5% and with a minimum fold-change of 1.75. Semi-quantitative RT-PCR assays confirmed differential expression of each of the 16 genes selected for validation for each of Xoo and Xoc. The differentially expressed genes represent 17 functional categories. CONCLUSION: We describe here the construction and validation of a two-genome microarray for the two pathovars of X. oryzae. Microarray analysis revealed that, using representative strains, a greater number of Xoo genes than Xoc genes are differentially expressed in XOM2 relative to PSB, and that these include hrp genes and other genes important in interactions with rice. An exception was the rax genes, which are required for production of the host resistance elicitor AvrXa21

  20. Multiobjective differential evolution-based multifactor dimensionality reduction for detecting gene-gene interactions.

    Science.gov (United States)

    Yang, Cheng-Hong; Chuang, Li-Yeh; Lin, Yu-Da

    2017-10-09

    Epistasis within disease-related genes (gene-gene interactions) was determined through contingency table measures based on multifactor dimensionality reduction (MDR) using single-nucleotide polymorphisms (SNPs). Most MDR-based methods use the single contingency table measure to detect gene-gene interactions; however, some gene-gene interactions may require identification through multiple contingency table measures. In this study, a multiobjective differential evolution method (called MODEMDR) was proposed to merge the various contingency table measures based on MDR to detect significant gene-gene interactions. Two contingency table measures, namely the correct classification rate and normalized mutual information, were selected to design the fitness functions in MODEMDR. The characteristics of multiobjective optimization enable MODEMDR to use multiple measures to efficiently and synchronously detect significant gene-gene interactions within a reasonable time frame. Epistatic models with and without marginal effects under various parameter settings (heritability and minor allele frequencies) were used to assess existing methods by comparing the detection success rates of gene-gene interactions. The results of the simulation datasets show that MODEMDR is superior to existing methods. Moreover, a large dataset obtained from the Wellcome Trust Case Control Consortium was used to assess MODEMDR. MODEMDR exhibited efficiency in identifying significant gene-gene interactions in genome-wide association studies.
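
    The two contingency table measures named in this abstract, and the idea of comparing candidates on both at once, can be sketched as follows. This is a minimal illustration, not the MODEMDR implementation: it assumes a 2x2 table of predicted high/low risk against case/control status, it normalizes mutual information by the mean of the two marginal entropies (one of several conventions; the abstract does not say which is used), and all function names are hypothetical.

```python
import math


def ccr(tp: int, fp: int, fn: int, tn: int) -> float:
    """Correct classification rate: fraction of samples labelled correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def _entropy(probs) -> float:
    """Shannon entropy in bits, skipping zero-probability cells."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def nmi(tp: int, fp: int, fn: int, tn: int) -> float:
    """Mutual information between prediction and status, scaled to [0, 1].

    Assumed normalization: 2 * I(X;Y) / (H(X) + H(Y)).
    """
    n = tp + fp + fn + tn
    joint = [tp / n, fp / n, fn / n, tn / n]
    predicted = [(tp + fp) / n, (fn + tn) / n]
    actual = [(tp + fn) / n, (fp + tn) / n]
    h_pred, h_act = _entropy(predicted), _entropy(actual)
    mutual_info = h_pred + h_act - _entropy(joint)
    denom = h_pred + h_act
    return 2.0 * mutual_info / denom if denom > 0 else 0.0


def dominates(a, b) -> bool:
    """Pareto dominance for maximization: a is no worse on every
    objective and strictly better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


# Two hypothetical SNP combinations scored on (CCR, NMI).
good = (ccr(40, 10, 10, 40), nmi(40, 10, 10, 40))
poor = (ccr(25, 25, 25, 25), nmi(25, 25, 25, 25))
print(good, poor, dominates(good, poor))
```

    A multiobjective search would keep the candidates that no other candidate dominates, rather than ranking by a single measure.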

  1. Volatility Discovery

    DEFF Research Database (Denmark)

    Dias, Gustavo Fruet; Scherrer, Cristina; Papailias, Fotis

    The price discovery literature investigates how homogenous securities traded on different markets incorporate information into prices. We take this literature one step further and investigate how these markets contribute to stochastic volatility (volatility discovery). We formally show...... that the realized measures from homogenous securities share a fractional stochastic trend, which is a combination of the price and volatility discovery measures. Furthermore, we show that volatility discovery is associated with the way that market participants process information arrival (market sensitivity......). Finally, we compute volatility discovery for 30 actively traded stocks in the U.S. and report that Nyse and Arca dominate Nasdaq....

  2. The Analysis of Multiple Genome Comparisons in Genus Escherichia and Its Application to the Discovery of Uncharacterised Metabolic Genes in Uropathogenic Escherichia coli CFT073

    Directory of Open Access Journals (Sweden)

    William A. Bryant

    2009-01-01

    Full Text Available A survey of a complete gene synteny comparison has been carried out between twenty fully sequenced strains from the genus Escherichia with the aim of finding yet uncharacterised genes implicated in the metabolism of uropathogenic strains of E. coli (UPEC). Several sets of adjacent colinear genes have been identified which are present in all four UPEC included in this study (CFT073, F11, UTI89, and 536), annotated with putative metabolic functions, but are not found in any other strains considered. An operon closely homologous to that encoding the L-sorbose degradation pathway in Klebsiella pneumoniae has been identified in E. coli CFT073; this operon is present in all of the UPEC considered, but only in 7 of the other 16 strains. The operon's function has been confirmed by cloning the genes into E. coli DH5α and testing for growth on L-sorbose. The functional genomic approach combining in silico and in vitro work presented here can be used as a basis for the discovery of other uncharacterised genes contributing to bacterial survival in specific environments.

  3. Genome-Based Studies of Marine Microorganisms to Maximize the Diversity of Natural Products Discovery for Medical Treatments

    Directory of Open Access Journals (Sweden)

    Xin-Qing Zhao

    2011-01-01

    Full Text Available Marine microorganisms are a rich source of natural products, which play important roles in the pharmaceutical industry. Over the past decade, genome-based studies of marine microorganisms have unveiled the tremendous diversity of the producers of natural products and have also made it more efficient to harness the strain diversity, chemical diversity, and genetic diversity of marine microorganisms for the rapid discovery and generation of new natural products. In the meantime, genomic information retrieved from marine symbiotic microorganisms can also be employed for the discovery of new medical molecules from yet-unculturable microorganisms. In this paper, the recent progress in the genomic research of marine microorganisms is reviewed; new tools of genome mining as well as advances in the activation of orphan pathways and metagenomic studies are summarized. Genome-based research of marine microorganisms will maximize the biodiscovery process and solve the problems of supply and sustainability of drug molecules for medical treatments.

  4. Energy-Efficient Cluster-Based Service Discovery in Wireless Sensor Networks

    NARCIS (Netherlands)

    Marin Perianu, Raluca; Scholten, Johan; Havinga, Paul J.M.; Hartel, Pieter H.

    We propose an energy-efficient service discovery protocol for wireless sensor networks. Our solution exploits a cluster overlay, where the clusterhead nodes form a distributed service registry. A service lookup results in visiting only the clusterhead nodes. We aim for minimizing the communication

  5. Energy-Efficient Cluster-Based Service Discovery in Wireless Sensor Networks

    NARCIS (Netherlands)

    Marin Perianu, Raluca; Scholten, Johan; Havinga, Paul J.M.; Hartel, Pieter H.

    2006-01-01

    We propose an energy-efficient service discovery protocol for wireless sensor networks. Our solution exploits a cluster overlay, where the clusterhead nodes form a distributed service registry. A service lookup results in visiting only the clusterhead nodes. We aim for minimizing the communication

  6. Towards a goal-based service framework for dynamic service discovery and composition

    NARCIS (Netherlands)

    Bonino da Silva Santos, L.O.; Goncalves da Silva, Eduardo; Ferreira Pires, Luis; van Sinderen, Marten J.

    2009-01-01

    Service-Oriented Computing allows new applications to be developed by using and/or combining services offered by different providers. Service discovery and composition are performed aiming to comply with the client’s request in terms of functionality and expected outcome. In this paper we present a

  7. Optimizing Neighbor Discovery for Ad hoc Networks based on the Bluetooth PAN Profile

    DEFF Research Database (Denmark)

    Kuijpers, Gerben; Nielsen, Thomas Toftegaard; Prasad, Ramjee

    2002-01-01

    . This paper introduces a neighbor discovery mechanism that utilizes the resources in the Bluetooth PAN profile more efficiently. The performance of the new mechanism is investigated using an IPv6 network simulator and compared with emulated broadcasting. It is shown that the signaling overhead can...

  8. Engineering Application Way of Faults Knowledge Discovery Based on Rough Set Theory

    International Nuclear Information System (INIS)

    Zhao Rongzhen; Deng Linfeng; Li Chao

    2011-01-01

    We investigate the use of Rough Set Theory (RST) as a tool for solving the knowledge-acquisition problem in intelligent decision-making technology for the mechanical industry, and explore how knowledge discovery can be realized in engineering applications. A case study in which knowledge rules are extracted from a concise data table yields some important observations. Knowledge discovery for tasks such as mechanical fault diagnosis is a complicated systems-engineering project. The first and most important task is to record the fault knowledge in a data table. The data must be derived from the plant site and should be as concise as possible. Only on the basis of fault-knowledge data obtained in this way can RST methods and algorithms be applied to process the data and extract knowledge rules from them. The conclusion is that fault knowledge discovery in this way is a gradual, upward process, and that developing advanced fault diagnosis technology in this way is a large-scale, long-term knowledge-engineering project, every step of which should be designed carefully according to the tool's requirements. This is the basic guarantee that the knowledge rules obtained have engineering value and that the studies have scientific significance. Accordingly, a general framework is designed for engineering applications to guide the development of fault knowledge discovery technology.

  9. Fragment-Based Discovery of 7-Azabenzimidazoles as Potent, Highly Selective, and Orally Active CDK4/6 Inhibitors

    Energy Technology Data Exchange (ETDEWEB)

    Cho, Young Shin; Angove, Hayley; Brain, Christopher; Chen, Christine Hiu-Tung; Cheng, Hong; Cheng, Robert; Chopra, Rajiv; Chung, Kristy; Congreve, Miles; Dagostin, Claudio; Davis, Deborah J.; Feltell, Ruth; Giraldes, John; Hiscock, Steven D.; Kim, Sunkyu; Kovats, Steven; Lagu, Bharat; Lewry, Kim; Loo, Alice; Lu, Yipin; Luzzio, Michael; Maniara, Wiesia; McMenamin, Rachel; Mortenson, Paul N.; Benning, Rajdeep; O' Reilly, Marc; Rees, David C.; Shen, Junqing; Smith, Troy; Wang, Yaping; Williams, Glyn; Woolford, Alison J. -A.; Wrona, Wojciech; Xu, Mei; Yang, Fan; Howard, Steven

    2012-06-14

    Herein, we describe the discovery of potent and highly selective inhibitors of both CDK4 and CDK6 via structure-guided optimization of a fragment-based screening hit. CDK6 X-ray crystallography and pharmacokinetic data steered efforts in identifying compound 6, which showed >1000-fold selectivity for CDK4 over CDKs 1 and 2 in an enzymatic assay. Furthermore, 6 demonstrated in vivo inhibition of pRb-phosphorylation and oral efficacy in a Jeko-1 mouse xenograft model.

  10. Use of arbitrary DNA primers, polyacrylamide gel electrophoresis and silver staining for identity testing, gene discovery and analysis of gene expression

    International Nuclear Information System (INIS)

    Gresshoff, P.

    1998-01-01

    To understand chemically-induced genomic differences in soybean mutants differing in their ability to enter the nitrogen-fixing symbiosis involving Bradyrhizobium japonicum, molecular techniques were developed to aid map-based, or positional, cloning. DNA marker technology involving single arbitrary primers was used to enrich regional RFLP linkage data. Molecular techniques, including two-dimensional pulsed-field gel electrophoresis, were developed to establish the first physical mapping in soybean, leading to the conclusion that in the region of marker pA-36 on linkage group H, 1 cM equals about 500 cM. High molecular weight DNA was isolated and cloned into yeast or bacterial artificial chromosomes (YACs/BACs). YACs were used to analyze soybean genome structure, revealing that over half of the genome contains repetitive DNA. Genetic and molecular tools are now available to facilitate the isolation of plant genes directly involved in symbiosis. The further characterization of these genes, along with the determination of the mechanisms that lead to the mutation, will be of value to other plants and induced mutation research. (author)

  11. Bond-based linear indices in QSAR: computational discovery of novel anti-trichomonal compounds

    Science.gov (United States)

    Marrero-Ponce, Yovani; Meneses-Marcel, Alfredo; Rivera-Borroto, Oscar M.; García-Domenech, Ramón; De Julián-Ortiz, Jesus Vicente; Montero, Alina; Escario, José Antonio; Barrio, Alicia Gómez; Pereira, David Montero; Nogal, Juan José; Grau, Ricardo; Torrens, Francisco; Vogel, Christian; Arán, Vicente J.

    2008-08-01

    Trichomonas vaginalis (Tv) is the causative agent of the most common non-viral sexually transmitted disease in women and men worldwide. Since 1959, metronidazole (MTZ) has been the drug of choice in the systemic treatment of trichomoniasis. However, resistance to MTZ in some patients and the great cost associated with the development of new trichomonacidals make it necessary to develop computational methods that shorten the drug discovery pipeline. Toward this end, bond-based linear indices, new TOMOCOMD-CARDD molecular descriptors, and linear discriminant analysis were used to discover novel trichomonacidal chemicals. The obtained models, using non-stochastic and stochastic indices, are able to classify correctly 89.01% (87.50%) and 82.42% (84.38%) of the chemicals in the training (test) sets, respectively. These results validate the models for their use in ligand-based virtual screening. In addition, they show large Matthews' correlation coefficients (C) of 0.78 (0.71) and 0.65 (0.65) for the training (test) sets, respectively. The result of predictions on the 10% full-out cross-validation test also evidences the robustness of the obtained models. Later, both models are applied to the virtual screening of 12 compounds already tested against Tv. As a result, they correctly classify 10 out of 12 (83.33%) and 9 out of 12 (75.00%) of the chemicals, respectively, which is the most important criterion for validating the models. In addition, these classification functions are applied to a library of seven chemicals in order to find novel antitrichomonal agents. These compounds are synthesized and tested for in vitro activity against Tv. Experimental observations agreed closely with theoretical predictions, with 85.71% (6 out of 7) of the chemicals correctly classified. Moreover, out of the seven compounds that are screened, synthesized and biologically assayed, six compounds (VA7-34, VA7-35, VA7-37, VA7-38, VA7-68, VA7-70) show
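
    The classification accuracy and Matthews correlation coefficient quoted in this abstract are both functions of a confusion matrix. A minimal Python sketch of the two formulas follows; the counts are hypothetical, since the abstract does not report the class breakdown, and the function names are illustrative only.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of all compounds classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical counts, not taken from the paper.
tp, tn, fp, fn = 40, 41, 5, 5
print(f"accuracy = {accuracy(tp, tn, fp, fn):.2%}")
print(f"MCC = {matthews_cc(tp, tn, fp, fn):.2f}")
```

    Unlike plain accuracy, the coefficient uses all four cells of the matrix, so it stays informative when active and inactive compounds are unevenly represented.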

  12. SPSNet: subpopulation-sensitive network-based analysis of heterogeneous gene expression data.

    Science.gov (United States)

    Belorkar, Abha; Vadigepalli, Rajanikanth; Wong, Limsoon

    2018-03-19

    Transcriptomic datasets often contain undeclared heterogeneity arising from biological variation such as diversity of disease subtypes, treatment subgroups, time-series gene expression, nested experimental conditions, as well as technical variation due to batch effects, platform differences in integrated meta-analyses, etc. However, current analysis approaches are primarily designed to handle comparisons between experimental conditions represented by homogeneous samples, thus precluding the discovery of underlying subphenotypes. Unsupervised methods for subtype identification are typically based on individual gene level analysis, which often results in irreproducible gene signatures for potential subtypes. Emerging methods to study heterogeneity have been largely developed in the context of single-cell datasets containing hundreds to thousands of samples, limiting their use to select contexts. We present a novel analysis method, SPSNet, which identifies subtype-specific gene expression signatures based on the activity of subnetworks in biological pathways. SPSNet identifies the gene subnetworks capturing the diversity of underlying biological mechanisms, indicating potential sample subphenotypes. In the presence of extrinsic or non-biological heterogeneity (e.g. batch effects), SPSNet identifies subnetworks that are particularly affected by such variation, thus helping eliminate factors irrelevant to the biology of the phenotypes under study. Using multiple publicly available datasets, we illustrate that SPSNet is able to consistently uncover patterns within gene expression data that correspond to meaningful heterogeneity of various origins. We also demonstrate the performance of SPSNet as a sensitive and reliable tool for understanding the structure and nature of such heterogeneity.

  13. Discovery and identification of quality markers of Chinese medicine based on pharmacokinetic analysis.

    Science.gov (United States)

    He, Jun; Feng, Xinchi; Wang, Kai; Liu, Changxiao; Qiu, Feng

    2018-02-28

    Quality control of Chinese medicine (CM) is an effective measure to ensure the safety and efficacy of CM in clinical practice, and it is also a key factor restricting the modernization of CM. Various chemical components exist in CM, and the determination of several chemical components is at present the main approach to quality control for the vast majority of CMs. However, many of the components determined lack both specificity and biological activity, which greatly reduces the practical value of CM quality standards. Professor Changxiao Liu proposed the "quality marker" (Q-marker) concept to ensure the standardization and rationalization of the quality control of CM. CMs are taken orally in most cases and can be extensively metabolized in vivo. Both prototype components and their metabolites could be the actual therapeutic material basis. Pharmacokinetic studies could help elucidate the actual therapeutic material basis, which is closely related to the identification of Q-markers. Therefore, a new Q-marker strategy based on the pharmacokinetic analysis of CM was proposed, in the hope of providing some ideas for the discovery and identification of Q-markers. This review demonstrates the relationship between pharmacokinetic studies and the identification of Q-markers. Starting from the pharmacokinetic analysis, the prototype active components and potential prodrugs in CM were first traced in reverse, and the therapeutic material basis was identified as Q-markers. Then, modern analytical techniques and methods were applied to obtain comprehensive quality control for these constituents. Several CMs, including Ginkgo biloba, ginseng, Periplocae Cortex, Mori Cortex, Bupleuri Radix and Scutellariae Radix, were listed as examples to clarify how the new strategy could be applied. Pharmacokinetic studies play an important role in the elucidation of the therapeutic material basis of CM

  14. Rapid Countermeasure Discovery against Francisella tularensis Based on a Metabolic Network Reconstruction

    Science.gov (United States)

    Chaudhury, Sidhartha; Abdulhameed, Mohamed Diwan M.; Singh, Narender; Tawa, Gregory J.; D’haeseleer, Patrik M.; Zemla, Adam T.; Navid, Ali; Zhou, Carol E.; Franklin, Matthew C.; Cheung, Jonah; Rudolph, Michael J.; Love, James; Graf, John F.; Rozak, David A.; Dankmeyer, Jennifer L.; Amemiya, Kei; Daefler, Simon; Wallqvist, Anders

    2013-01-01

    In the future, we may be faced with the need to provide treatment for an emergent biological threat against which existing vaccines and drugs have limited efficacy or availability. To prepare for this eventuality, our objective was to use a metabolic network-based approach to rapidly identify potential drug targets and prospectively screen and validate novel small-molecule antimicrobials. Our target organism was the fully virulent Francisella tularensis subspecies tularensis Schu S4 strain, a highly infectious intracellular pathogen that is the causative agent of tularemia and is classified as a category A biological agent by the Centers for Disease Control and Prevention. We proceeded with a staggered computational and experimental workflow that used a strain-specific metabolic network model, homology modeling and X-ray crystallography of protein targets, and ligand- and structure-based drug design. Selected compounds were subsequently filtered based on physiological-based pharmacokinetic modeling, and we selected a final set of 40 compounds for experimental validation of antimicrobial activity. We began screening these compounds in whole bacterial cell-based assays in biosafety level 3 facilities in the 20th week of the study and completed the screens within 12 weeks. Six compounds showed significant growth inhibition of F. tularensis, and we determined their respective minimum inhibitory concentrations and mammalian cell cytotoxicities. The most promising compound had a low molecular weight, was non-toxic, and abolished bacterial growth at 13 µM, with putative activity against pantetheine-phosphate adenylyltransferase, an enzyme involved in the biosynthesis of coenzyme A, encoded by gene coaD. The novel antimicrobial compounds identified in this study serve as starting points for lead optimization, animal testing, and drug development against tularemia. Our integrated in silico/in vitro approach had an overall 15% success rate in terms of active versus tested

  15. QTL mapping and candidate gene discovery in potato for resistance to the Verticillium wilt pathogen Verticillium dahliae

    Science.gov (United States)

    Verticillium wilt (VW) of potato (Solanum tuberosum), caused by the fungal pathogens Verticillium dahliae and V. albo-atrum, is a disease of major significance throughout the potato-growing regions of the world. In the past, researchers have focused on the Ve gene, which is a major dominant gene that c...

  16. From mutation identification to therapy: discovery and origins of the first approved gene therapy in the Western world

    NARCIS (Netherlands)

    Kastelein, John J. P.; Ross, Colin J. D.; Hayden, Michael R.

    2013-01-01

    On November 2, 2012, Glybera® (alipogene tiparvovec) became the first human gene therapy to receive long-awaited market approval in the Western world. This important milestone is expected to open the door to additional gene therapies for the treatment of many diseases in the future. The development of

  17. Coupled Transcriptome and Proteome Analysis of Human Lymphotropic Tumor Viruses: Insights on the Detection and Discovery of Viral Genes

    Energy Technology Data Exchange (ETDEWEB)

    Dresang, Lindsay R.; Teuton, Jeremy R.; Feng, Huichen; Jacobs, Jon M.; Camp, David G.; Purvine, Samuel O.; Gritsenko, Marina A.; Li, Zhihua; Smith, Richard D.; Sugden, Bill; Moore, Patrick S.; Chang, Yuan

    2011-12-20

    Kaposi's sarcoma-associated herpesvirus (KSHV) and Epstein-Barr virus (EBV) are related human tumor viruses that cause primary effusion lymphomas (PEL) and Burkitt's lymphomas (BL), respectively. Viral genes expressed in naturally-infected cancer cells contribute to disease pathogenesis; knowing which viral genes are expressed is critical in understanding how these viruses cause cancer. To evaluate the expression of viral genes, we used high-resolution separation and mass spectrometry coupled with custom tiling arrays to align the viral proteomes and transcriptomes of three PEL and two BL cell lines under latent and lytic culture conditions. Results: The majority of viral genes were efficiently detected at the transcript and/or protein level on manipulating the viral life cycle. Overall the correlation of expressed viral proteins and transcripts was highly complementary in both validating and providing orthogonal data with latent/lytic viral gene expression. Our approach also identified novel viral genes in both KSHV and EBV, and extends viral genome annotation. Several previously uncharacterized genes were validated at both transcript and protein levels. Conclusions: This systems biology approach coupling proteome and transcriptome measurements provides a comprehensive view of viral gene expression that could not have been attained using each methodology independently. Detection of viral proteins in combination with viral transcripts is a potentially powerful method for establishing virus-disease relationships.

  18. Developmental gene discovery in a hemimetabolous insect: de novo assembly and annotation of a transcriptome for the cricket Gryllus bimaculatus.

    Directory of Open Access Journals (Sweden)

    Victor Zeng

    Full Text Available Most genomic resources available for insects represent the Holometabola, which are insects that undergo complete metamorphosis like beetles and flies. In contrast, the Hemimetabola (direct developing insects), representing the basal branches of the insect tree, have very few genomic resources. We have therefore created a large and publicly available transcriptome for the hemimetabolous insect Gryllus bimaculatus (cricket), a well-developed laboratory model organism whose potential for functional genetic experiments is currently limited by the absence of genomic resources. cDNA was prepared using mRNA obtained from adult ovaries containing all stages of oogenesis, and from embryo samples on each day of embryogenesis. Using 454 Titanium pyrosequencing, we sequenced over four million raw reads, and assembled them into 21,512 isotigs (predicted transcripts) and 120,805 singletons with an average coverage per base pair of 51.3. We annotated the transcriptome manually for over 400 conserved genes involved in embryonic patterning, gametogenesis, and signaling pathways. BLAST comparison of the transcriptome against the NCBI non-redundant protein database (nr) identified significant similarity to nr sequences for 55.5% of transcriptome sequences, and suggested that the transcriptome may contain 19,874 unique transcripts. For predicted transcripts without significant similarity to known sequences, we assessed their similarity to other orthopteran sequences, and determined that these transcripts contain recognizable protein domains, largely of unknown function. We created a searchable, web-based database to allow public access to all raw, assembled and annotated data. This database is to our knowledge the largest de novo assembled and annotated transcriptome resource available for any hemimetabolous insect. We therefore anticipate that these data will contribute significantly to more effective and higher-throughput deployment of molecular analysis tools in

  19. Developmental Gene Discovery in a Hemimetabolous Insect: De Novo Assembly and Annotation of a Transcriptome for the Cricket Gryllus bimaculatus

    Science.gov (United States)

    Zeng, Victor; Ewen-Campen, Ben; Horch, Hadley W.; Roth, Siegfried; Mito, Taro; Extavour, Cassandra G.

    2013-01-01

    Most genomic resources available for insects represent the Holometabola, which are insects that undergo complete metamorphosis like beetles and flies. In contrast, the Hemimetabola (direct developing insects), representing the basal branches of the insect tree, have very few genomic resources. We have therefore created a large and publicly available transcriptome for the hemimetabolous insect Gryllus bimaculatus (cricket), a well-developed laboratory model organism whose potential for functional genetic experiments is currently limited by the absence of genomic resources. cDNA was prepared using mRNA obtained from adult ovaries containing all stages of oogenesis, and from embryo samples on each day of embryogenesis. Using 454 Titanium pyrosequencing, we sequenced over four million raw reads, and assembled them into 21,512 isotigs (predicted transcripts) and 120,805 singletons with an average coverage per base pair of 51.3. We annotated the transcriptome manually for over 400 conserved genes involved in embryonic patterning, gametogenesis, and signaling pathways. BLAST comparison of the transcriptome against the NCBI non-redundant protein database (nr) identified significant similarity to nr sequences for 55.5% of transcriptome sequences, and suggested that the transcriptome may contain 19,874 unique transcripts. For predicted transcripts without significant similarity to known sequences, we assessed their similarity to other orthopteran sequences, and determined that these transcripts contain recognizable protein domains, largely of unknown function. We created a searchable, web-based database to allow public access to all raw, assembled and annotated data. This database is to our knowledge the largest de novo assembled and annotated transcriptome resource available for any hemimetabolous insect. We therefore anticipate that these data will contribute significantly to more effective and higher-throughput deployment of molecular analysis tools in Gryllus.

  20. A Prerecognition Model for Hot Topic Discovery Based on Microblogging Data

    Science.gov (United States)

    Zhu, Tongyu

    2014-01-01

    Microblogging is widespread because it allows easy and anonymous information sharing on the Internet, which also raises the problem of spreading negative topics or even rumors. Many researchers have focused on how to find and trace emerging topics for analysis. Adopting topic detection and tracking techniques to find hot topics in streamed microblogging data meets obstacles such as clustering the streamed data, defining topic hotness, and discovering emerging hot topics. This paper proposes a novel prerecognition model for hot topic discovery. In this model, the concepts of the topic life cycle, the hot velocity, and the hot acceleration are introduced to calculate the change in topic hotness, with the aim of discovering emerging hot topics before they surge and break out. Our experiments show that this new model helps to discover potential hot topics efficiently and achieves considerable performance. PMID:25254235
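
    One way to read the hot velocity and hot acceleration named in this abstract is as the first and second differences of a topic's hotness over successive time windows. The sketch below assumes that reading; the function names, the sample counts, and the thresholds are hypothetical and are not the paper's definitions.

```python
def differences(series):
    """Return successive differences of a numeric series."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

def is_emerging(hotness, min_velocity=0, min_acceleration=0):
    """Flag a topic whose hotness is rising and whose rise is speeding up."""
    velocity = differences(hotness)
    acceleration = differences(velocity)
    if not acceleration:
        return False  # need at least three time windows
    return velocity[-1] > min_velocity and acceleration[-1] > min_acceleration

# Hypothetical post counts per time window for one topic.
hotness = [3, 4, 6, 11, 20]
print(differences(hotness))               # velocity
print(differences(differences(hotness)))  # acceleration
print(is_emerging(hotness))
```

    Checking the latest acceleration as well as the latest velocity is what lets a rule like this fire while a topic is still small, which is the stated goal of prerecognition.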

  1. A Prerecognition Model for Hot Topic Discovery Based on Microblogging Data

    OpenAIRE

    Zhu, Tongyu; Yu, Jianjun

    2014-01-01

    Microblogging is widespread because it allows easy and anonymous information sharing on the Internet, which also raises the problem of spreading negative topics or even rumors. Many researchers have focused on how to find and trace emerging topics for analysis. Adopting topic detection and tracking techniques to find hot topics in streamed microblogging data meets obstacles such as clustering the streamed data, defining topic hotness, and discovering emerging hot topics. This pap...

  2. Analyzing Properties of Service Discovery Protocols Using an Architecture-Based Approach (Briefing Charts)

    Science.gov (United States)

    2001-12-12

    8 Sample Network Topology Applicable to Jini Entities Lazy Discovery Multicast Group Service Manager (SM) Service User (SU) Service Cache Manager...Manager() 0..1 Contains 1 Contains SERVICE MANAGER discover Network Context() <<not shr>> Cache Manager Discovery() <<OPT>> Announce Service Processing...availability requests Service Manager Service Cache Manager Service User Service Description Service Provider Service Repository Service Cache

  3. Statistical design for biospecimen cohort size in proteomics-based biomarker discovery and verification studies.

    Science.gov (United States)

    Skates, Steven J; Gillette, Michael A; LaBaer, Joshua; Carr, Steven A; Anderson, Leigh; Liebler, Daniel C; Ransohoff, David; Rifai, Nader; Kondratovich, Marina; Težak, Živana; Mansfield, Elizabeth; Oberg, Ann L; Wright, Ian; Barnes, Grady; Gail, Mitchell; Mesri, Mehdi; Kinsinger, Christopher R; Rodriguez, Henry; Boja, Emily S

    2013-12-06

    Protein biomarkers are needed to deepen our understanding of cancer biology and to improve our ability to diagnose, monitor, and treat cancers. Important analytical and clinical hurdles must be overcome to allow the most promising protein biomarker candidates to advance into clinical validation studies. Although contemporary proteomics technologies support the measurement of large numbers of proteins in individual clinical specimens, sample throughput remains comparatively low. This problem is amplified in typical clinical proteomics research studies, which routinely suffer from a lack of proper experimental design, resulting in analysis of too few biospecimens to achieve adequate statistical power at each stage of a biomarker pipeline. To address this critical shortcoming, a joint workshop was held by the National Cancer Institute (NCI), National Heart, Lung, and Blood Institute (NHLBI), and American Association for Clinical Chemistry (AACC) with participation from the U.S. Food and Drug Administration (FDA). An important output from the workshop was a statistical framework for the design of biomarker discovery and verification studies. Herein, we describe the use of quantitative clinical judgments to set statistical criteria for clinical relevance and the development of an approach to calculate biospecimen sample size for proteomic studies in discovery and verification stages prior to clinical validation stage. This represents a first step toward building a consensus on quantitative criteria for statistical design of proteomics biomarker discovery and verification research.
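
    The abstract does not give the workshop's sample-size formulas. As a generic illustration of how a biospecimen count follows from a chosen effect size, significance level, and power, the sketch below uses the standard normal approximation for a two-sided, two-sample comparison of means. It is not the framework described in the paper, and the function name and default values are assumptions.

```python
import math
from statistics import NormalDist

def samples_per_group(effect_size, alpha=0.05, power=0.80):
    """Per-group n for a two-sided, two-sample comparison of means.

    effect_size is the standardized difference (mean difference / SD).
    Uses the normal approximation n = 2 * ((z_alpha + z_beta) / d)^2.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

for d in (0.5, 1.0):
    print(d, samples_per_group(d))
```

    Because the required number grows with the inverse square of the effect size, halving the difference a study must detect roughly quadruples the specimens needed per group.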

  4. A comparison of digital gene expression profiling and methyl DNA immunoprecipitation as methods for gene discovery in honeybee (Apis mellifera) behavioural genomic analyses.

    Directory of Open Access Journals (Sweden)

    Cui Guan

    Full Text Available The honey bee has a well-organized system of division of labour among workers. Workers typically progress through a series of discrete behavioural castes as they age, and this has become an important case study for exploring how dynamic changes in gene expression can influence behaviour. Here we applied both digital gene expression analysis and methyl DNA immunoprecipitation analysis to nurse, forager and reverted nurse bees (nurses that have returned to the nursing state after a period spent foraging) from the same colony in order to compare the outcomes of these different forms of genomic analysis. A total of 874 and 710 significantly differentially expressed genes were identified in forager/nurse and reverted nurse/forager comparisons respectively. Of these, 229 genes exhibited reversed directions of gene expression differences between the forager/nurse and reverted nurse/forager comparisons. Using methyl-DNA immunoprecipitation combined with high-throughput sequencing (MeDIP-seq), we identified 366 and 442 significantly differentially methylated genes in forager/nurse and reverted nurse/forager comparisons respectively. Of these, 165 genes were identified as differentially methylated in both comparisons. However, very few genes were identified as both differentially expressed and differentially methylated in our comparisons of nurses and foragers. These findings confirm that changes in both gene expression and DNA methylation are involved in the nurse and forager behavioural castes, but the different analytical methods reveal quite distinct sets of candidate genes.

  5. Cogena, a novel tool for co-expressed gene-set enrichment analysis, applied to drug repositioning and drug mode of action discovery.

    Science.gov (United States)

    Jia, Zhilong; Liu, Ying; Guan, Naiyang; Bo, Xiaochen; Luo, Zhigang; Barnes, Michael R

    2016-05-27

    Drug repositioning, finding new indications for existing drugs, has gained much recent attention as a potentially efficient and economical strategy for accelerating new therapies into the clinic. Although improvement in the sensitivity of computational drug repositioning methods has identified numerous credible repositioning opportunities, few have progressed. Arguably the "black box" nature of drug action in a new indication is one of the main blocks to progression, highlighting the need for methods that inform on the broader target mechanism in the disease context. We demonstrate that the analysis of co-expressed genes may be a critical first step towards illumination of both disease pathology and mode of drug action. We achieve this using a novel framework, co-expressed gene-set enrichment analysis (cogena), for co-expression analysis of gene expression signatures and gene set enrichment analysis of co-expressed genes. The cogena framework enables simultaneous, pathway-driven disease and drug repositioning analysis. Cogena can be used to illuminate coordinated changes within disease transcriptomes and identify drugs acting mechanistically within this framework. We illustrate this using a psoriatic skin transcriptome as an exemplar, and recover two widely used psoriasis drugs (Methotrexate and Ciclosporin) with distinct modes of action. Cogena out-performs the results of Connectivity Map and NFFinder webservers in similar disease transcriptome analyses. Furthermore, we investigated the literature support for the other top-ranked compounds to treat psoriasis and showed how the outputs of cogena analysis can contribute new insight to support the progression of drugs into the clinic. We have made cogena freely available within Bioconductor or https://github.com/zhilongjia/cogena. In conclusion, by targeting co-expressed genes within disease transcriptomes, cogena offers novel biological insight, which can be effectively harnessed for drug discovery and

  6. Gene and enhancer trap tagging of vascular-expressed genes in poplar trees

    Science.gov (United States)

    Andrew Groover; Joseph R. Fontana; Gayle Dupper; Caiping Ma; Robert Martienssen; Steven Strauss; Richard Meilan

    2004-01-01

    We report a gene discovery system for poplar trees based on gene and enhancer traps. Gene and enhancer trap vectors carrying the β-glucuronidase (GUS) reporter gene were inserted into the poplar genome via Agrobacterium tumefaciens transformation, where they reveal the expression pattern of genes at or near the insertion sites. Because GUS...

  7. HMM-Based Gene Annotation Methods

    Energy Technology Data Exchange (ETDEWEB)

    Haussler, David; Hughey, Richard; Karplus, Keven

    1999-09-20

    Development of new statistical methods and computational tools to identify genes in human genomic DNA, and to provide clues to their functions by identifying features such as transcription factor binding sites, tissue-specific expression and splicing patterns, and remote homologies at the protein level with genes of known function.

  8. Correlation-based linear discriminant classification for gene expression data.

    Science.gov (United States)

    Pan, M; Zhang, J

    2017-01-23

    Microarray gene expression technology provides a systematic approach to patient classification. However, microarray data pose a great computational challenge owing to their large dimensionality, small sample sizes, and potential correlations among genes. A recent study has shown that gene-gene correlations have a positive effect on the accuracy of classification models, in contrast to some previous results. In this study, a recently developed correlation-based classifier, the ensemble of random subspace (RS) Fisher linear discriminants (FLDs), was utilized. The impact of gene-gene correlations on the performance of this classifier and other classifiers was studied using simulated datasets and real datasets. A cross-validation framework was used to evaluate the performance of each classifier using the simulated datasets or real datasets, and misclassification rates (MRs) were computed. Using the simulated data, the average MRs of the correlation-based classifiers decreased as the correlations increased when there were more correlated genes. Using real data, the correlation-based classifiers outperformed the non-correlation-based classifiers, especially when the gene-gene correlations were high. The ensemble RS-FLD classifier is a potential state-of-the-art computational method. The correlation-based ensemble RS-FLD classifier was effective and benefited from gene-gene correlations, particularly when the correlations were high.

  9. De novo assembly, gene annotation, and marker discovery in stored-product pest Liposcelis entomophila (Enderlein) using transcriptome sequences.

    Directory of Open Access Journals (Sweden)

    Dan-Dan Wei

    Full Text Available BACKGROUND: As a major stored-product pest insect, Liposcelis entomophila has developed high levels of resistance to various insecticides in grain storage systems. However, the molecular mechanisms underlying resistance and environmental stress have not been characterized. To date, there is a lack of genomic information for this species. Therefore, studies aimed at profiling the L. entomophila transcriptome would provide a better understanding of the biological functions at the molecular level. METHODOLOGY/PRINCIPAL FINDINGS: We applied Illumina sequencing technology to sequence the transcriptome of L. entomophila. A total of 54,406,328 clean reads were obtained and de novo assembled into 54,220 unigenes, with an average length of 571 bp. Through a similarity search, 33,404 (61.61%) unigenes were matched to known proteins in the NCBI non-redundant (Nr) protein database. These unigenes were further functionally annotated with gene ontology (GO), cluster of orthologous groups of proteins (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. A large number of genes potentially involved in insecticide resistance were manually curated, including 68 putative cytochrome P450 genes, 37 putative glutathione S-transferase (GST) genes, 19 putative carboxyl/cholinesterase (CCE) genes, and 126 other transcripts containing target site sequences or encoding detoxification genes representing eight types of resistance enzymes. Furthermore, to gain insight into the molecular basis of the L. entomophila response to thermal stress, 25 heat shock protein (Hsp) genes were identified. In addition, 1,100 SSRs and 57,757 SNPs were detected and 231 pairs of SSR primers were designed for investigating genetic diversity in the future. CONCLUSIONS/SIGNIFICANCE: We developed a comprehensive transcriptomic database for L. entomophila. These sequences and putative molecular markers would further promote our understanding of the molecular mechanisms underlying

  10. Discovery of genes related to insecticide resistance in Bactrocera dorsalis by functional genomic analysis of a de novo assembled transcriptome.

    Directory of Open Access Journals (Sweden)

    Ju-Chun Hsu

    Full Text Available Insecticide resistance has recently become a critical concern for control of many insect pest species. Genome sequencing and global quantification of gene expression through analysis of the transcriptome can provide useful information relevant to this challenging problem. The oriental fruit fly, Bactrocera dorsalis, is one of the world's most destructive agricultural pests, and recently it has been used as a target for studies of genetic mechanisms related to insecticide resistance. However, prior to this study, the molecular data available for this species were largely limited to genes identified through homology. To provide a broader pool of gene sequences of potential interest with regard to insecticide resistance, this study uses whole transcriptome analysis developed through de novo assembly of short reads generated by next-generation sequencing (NGS). The transcriptome of B. dorsalis was initially constructed using Illumina's Solexa sequencing technology. Qualified reads were assembled into contigs and potential splicing variants (isotigs). A total of 29,067 isotigs have putative homologues in the non-redundant (nr) protein database from NCBI, and 11,073 of these correspond to distinct D. melanogaster proteins in the RefSeq database. Approximately 5,546 isotigs contain coding sequences that are at least 80% complete and appear to represent B. dorsalis genes. We observed a strong correlation between the completeness of the assembled sequences and the expression intensity of the transcripts. The assembled sequences were also used to identify large numbers of genes potentially belonging to families related to insecticide resistance. A total of 90 P450-, 42 GST- and 37 COE-related genes, representing three major enzyme families involved in insecticide metabolism and resistance, were identified. In addition, 36 isotigs were discovered to contain target site sequences related to four classes of resistance genes. Identified sequence motifs were also

  11. HANDS: a tool for genome-wide discovery of subgenome-specific base-identity in polyploids.

    KAUST Repository

    Mithani, Aziz

    2013-09-24

    The analysis of polyploid genomes is problematic because homeologous subgenome sequences are closely related. This relatedness makes it difficult to assign individual sequences to the specific subgenome from which they are derived, and hinders the development of polyploid whole genome assemblies. We here present a next-generation sequencing (NGS)-based approach for assignment of subgenome-specific base-identity at sites containing homeolog-specific polymorphisms (HSPs): 'HSP base Assignment using NGS data through Diploid Similarity' (HANDS). We show that HANDS correctly predicts subgenome-specific base-identity at >90% of assayed HSPs in the hexaploid bread wheat (Triticum aestivum) transcriptome, thus providing a substantial increase in accuracy versus previous methods for homeolog-specific base assignment. We conclude that HANDS enables rapid and accurate genome-wide discovery of homeolog-specific base-identity, a capability having multiple applications in polyploid genomics.

  12. Beyond Discovery

    DEFF Research Database (Denmark)

    Korsgaard, Steffen; Sassmannshausen, Sean Patrick

    2017-01-01

    In this chapter we explore four alternatives to the dominant discovery view of entrepreneurship; the development view, the construction view, the evolutionary view, and the Neo-Austrian view. We outline the main critique points of the discovery presented in these four alternatives, as well as the...

  13. Gene discovery from Jatropha curcas by sequencing of ESTs from normalized and full-length enriched cDNA library from developing seeds

    Directory of Open Access Journals (Sweden)

    Sugantham Priyanka Annabel

    2010-10-01

    Full Text Available Background: Jatropha curcas L. is promoted as an important non-edible biodiesel crop worldwide. Jatropha oil, which is a triacylglycerol, can be directly blended with petro-diesel or transesterified with methanol and used as biodiesel. Genetic improvement in jatropha is needed to increase the seed yield, oil content, drought and pest resistance, and to modify oil composition so that it becomes a technically and economically preferred source for biodiesel production. However, genetic improvement efforts in jatropha could not take advantage of genetic engineering methods due to the lack of cloned genes from this species. To overcome this hurdle, the current gene discovery project was initiated with the objective of isolating as many functional genes as possible from J. curcas by large-scale sequencing of expressed sequence tags (ESTs). Results: A normalized and full-length enriched cDNA library was constructed from developing seeds of J. curcas. The cDNA library contained about 1 × 10^6 clones and the average insert size of the clones was 2.1 kb. In total, 12,084 ESTs were sequenced to an average high-quality read length of 576 bp. Contig analysis revealed 2258 contigs and 4751 singletons. Contig size ranged from 2 to 23 and there were 7333 ESTs in the contigs. This resulted in 7009 unigenes, which were annotated by BLASTX. It showed 3982 unigenes with significant similarity to known genes and 2836 unigenes with significant similarity to genes of unknown, hypothetical and putative proteins. The remaining 191 unigenes, which did not show similarity with any genes in the public database, may encode unique genes. Functional classification revealed unigenes related to a broad range of cellular, molecular and biological functions. Among the 7009 unigenes, 6233 unigenes were identified to be potential full-length genes. Conclusions: The high-quality normalized cDNA library was constructed from developing seeds of J. curcas for the first time and 7009 unigenes coding

  14. Tumour class prediction and discovery by microarray-based DNA methylation analysis

    Science.gov (United States)

    Adorján, Péter; Distler, Jürgen; Lipscher, Evelyne; Model, Fabian; Müller, Jürgen; Pelet, Cécile; Braun, Aron; Florl, Andrea R.; Gütig, David; Grabs, Gabi; Howe, André; Kursar, Mischo; Lesche, Ralf; Leu, Erik; Lewin, André; Maier, Sabine; Müller, Volker; Otto, Thomas; Scholz, Christian; Schulz, Wolfgang A.; Seifert, Hans-Helge; Schwope, Ina; Ziebarth, Heike; Berlin, Kurt; Piepenbrock, Christian; Olek, Alexander

    2002-01-01

    Aberrant DNA methylation of CpG sites is among the earliest and most frequent alterations in cancer. Several studies suggest that aberrant methylation occurs in a tumour type-specific manner. However, large-scale analysis of candidate genes has so far been hampered by the lack of high throughput assays for methylation detection. We have developed the first microarray-based technique which allows genome-wide assessment of selected CpG dinucleotides as well as quantification of methylation at each site. Several hundred CpG sites were screened in 76 samples from four different human tumour types and corresponding healthy controls. Discriminative CpG dinucleotides were identified for different tissue type distinctions and used to predict the tumour class of as yet unknown samples with high accuracy using machine learning techniques. Some CpG dinucleotides correlate with progression to malignancy, whereas others are methylated in a tissue-specific manner independent of malignancy. Our results demonstrate that genome-wide analysis of methylation patterns combined with supervised and unsupervised machine learning techniques constitute a powerful novel tool to classify human cancers. PMID:11861926

  15. A roadmap for natural product discovery based on large-scale genomics and metabolomics

    Science.gov (United States)

    Actinobacteria encode a wealth of natural product biosynthetic gene clusters, whose systematic study is complicated by numerous repetitive motifs. By combining several metrics we developed a method for global classification of these gene clusters into families (GCFs) and analyzed the biosynthetic ca...

  16. Genome-based discovery, structure prediction and functional analysis of cyclic lipopeptide antibiotics in Pseudomonas species

    NARCIS (Netherlands)

    Bruijn, de I.; Kock, de M.J.D.; Meng, Y.; Waard, de P.; Beek, van T.A.; Raaijmakers, J.M.

    2007-01-01

    Analysis of microbial genome sequences have revealed numerous genes involved in antibiotic biosynthesis. In Pseudomonads, several gene clusters encoding non-ribosomal peptide synthetases (NRPSs) were predicted to be involved in the synthesis of cyclic lipopeptide (CLP) antibiotics. Most of these

  17. Variable importance analysis based on rank aggregation with applications in metabolomics for biomarker discovery.

    Science.gov (United States)

    Yun, Yong-Huan; Deng, Bai-Chuan; Cao, Dong-Sheng; Wang, Wei-Ting; Liang, Yi-Zeng

    2016-03-10

    Biomarker discovery is one important goal in metabolomics, which is typically modeled as selecting the most discriminating metabolites for classification and often referred to as variable importance analysis or variable selection. Until now, a number of variable importance analysis methods to discover biomarkers in metabolomics studies have been proposed. However, different methods are most likely to generate different variable ranking results due to their different principles. Each method generates a variable ranking list just as an expert presents an opinion. The problem of inconsistency between different variable ranking methods is often ignored. To address this problem, a simple and ideal solution is that every ranking should be taken into account. In this study, a strategy called rank aggregation was employed. It is an indispensable tool for merging individual ranking lists into a single "super"-list reflective of the overall preference or importance within the population. This "super"-list is regarded as the final ranking for biomarker discovery. Finally, it was used for biomarker discovery and for selecting the best variable subset with the highest predictive classification accuracy. Nine methods were used, including three univariate filtering and six multivariate methods. When applied to two metabolic datasets (Childhood overweight dataset and Tubulointerstitial lesions dataset), the results show that rank aggregation greatly improved performance, with higher prediction accuracy compared with using all variables. Moreover, it is also better than the penalized method, least absolute shrinkage and selection operator (LASSO), with higher prediction accuracy or a smaller number of selected variables, which are more interpretable. Copyright © 2016 Elsevier B.V. All rights reserved.
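
    The merging of several ranking lists into one "super"-list can be illustrated with a minimal sketch. The abstract does not say which aggregation rule the authors used, so the mean-rank (Borda-style) rule below is an assumption chosen for simplicity, and the function name `aggregate_ranks` and the metabolite labels `m1` to `m4` are hypothetical.

```python
def aggregate_ranks(rankings):
    """Merge ranked lists (best variable first) into one list by mean rank.

    Assumption: every list ranks exactly the same set of variables.
    This is a Borda-style rule, not necessarily the rule used in the paper.
    """
    variables = sorted(rankings[0])
    for ranking in rankings:
        if sorted(ranking) != variables:
            raise ValueError("all rankings must cover the same variables")

    # Mean 1-based position of each variable across all lists.
    mean_rank = {
        v: sum(ranking.index(v) + 1 for ranking in rankings) / len(rankings)
        for v in variables
    }
    # Lower mean rank is better; ties are broken alphabetically.
    return sorted(variables, key=lambda v: (mean_rank[v], v))


# Three hypothetical ranking methods that disagree on the order.
example_rankings = [
    ["m1", "m2", "m3", "m4"],
    ["m2", "m1", "m4", "m3"],
    ["m1", "m3", "m2", "m4"],
]
super_list = aggregate_ranks(example_rankings)
```

    Here each input list plays the role of one "expert opinion"; the aggregated list could then feed a subset-selection step such as the one the abstract describes.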

  18. Comprehensive mass spectrometry based biomarker discovery and validation platform as applied to diabetic kidney disease

    Directory of Open Access Journals (Sweden)

    Scott D. Bringans

    2017-03-01

    Full Text Available A protein biomarker discovery workflow was applied to plasma samples from patients at different stages of diabetic kidney disease. The proteomics platform produced a panel of significant plasma biomarkers that were statistically scrutinised against the current gold standard tests on an analysis of 572 patients. Five proteins were significantly associated with diabetic kidney disease defined by albuminuria, renal impairment (eGFR) and chronic kidney disease staging (CKD Stage ≥1, ROC curve of 0.77). The results prove the suitability and efficacy of the process used, and introduce a biomarker panel with the potential to improve diagnosis of diabetic kidney disease.

  19. A Population of Deletion Mutants and an Integrated Mapping and Exome-seq Pipeline for Gene Discovery in Maize

    Science.gov (United States)

    Jia, Shangang; Li, Aixia; Morton, Kyla; Avoles-Kianian, Penny; Kianian, Shahryar F.; Zhang, Chi; Holding, David

    2016-01-01

    To better understand maize endosperm filling and maturation, we used γ-irradiation of the B73 maize reference line to generate mutants with opaque endosperm and reduced kernel fill phenotypes, and created a population of 1788 lines including 39 Mo17 × F2s showing stable, segregating, and viable kernel phenotypes. For molecular characterization of the mutants, we developed a novel functional genomics platform that combined bulked segregant RNA and exome sequencing (BSREx-seq) to map causative mutations and identify candidate genes within mapping intervals. To exemplify the utility of the mutants and provide proof-of-concept for the bioinformatics platform, we present detailed characterization of line 937, an opaque mutant harboring a 6203 bp in-frame deletion covering six exons within the Opaque-1 gene. In addition, we describe mutant line 146, which contains a 4.8 kb intragene deletion within the Sugary-1 gene, and line 916, in which an 8.6 kb deletion knocks out a Cyclin A2 gene. The publicly available algorithm developed in this work improves the identification of causative deletions and their corresponding gaps within mapping peaks. This study demonstrates the utility of γ-irradiation for forward genetics in large nondense genomes such as maize, since deletions often affect single genes. Furthermore, we show how this classical mutagenesis method becomes applicable for functional genomics when combined with state-of-the-art genomics tools. PMID:27261000

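  20. Perbandingan Model Pembelajaran Problem Based Learning Dengan Guided Discovery Learning Terhadap Keaktifan Siswa Kelas X SMA Negeri 1 Ngawi Tahun Pelajaran 2013.2014 [Comparison of the Problem Based Learning Model with Guided Discovery Learning on the Activeness of Grade X Students at SMA Negeri 1 Ngawi, Academic Year 2013/2014]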

    OpenAIRE

    Nurmasni, harlita harlita; Pranoto, Pranoto; Santosa, Slamet

    2017-01-01

    The purpose of this research is to compare the problem based learning model with guided discovery learning in terms of the activity of grade X students at SMA Negeri 1 Ngawi in the academic year 2013/2014. This study was quasi-experimental research. The research design was a post-test only nonequivalent control group design. This research applied the problem based learning model and Guided Discovery Learning in the experimental group. The population of this research was all of X grad...

  1. Identification of genes highly downregulated in pancreatic cancer through a meta-analysis of microarray datasets: implications for discovery of novel tumor-suppressor genes and therapeutic targets.

    Science.gov (United States)

    Goonesekere, Nalin C W; Andersen, Wyatt; Smith, Alex; Wang, Xiaosheng

    2018-02-01

    The lack of specific symptoms at early tumor stages, together with a high biological aggressiveness of the tumor contribute to the high mortality rate for pancreatic cancer (PC), which has a 5-year survival rate of about 7%. Recent failures of targeted therapies inhibiting kinase activity in clinical trials have highlighted the need for new approaches towards combating this deadly disease. In this study, we have identified genes that are significantly downregulated in PC, through a meta-analysis of large number of microarray datasets. We have used qRT-PCR to confirm the downregulation of selected genes in a panel of PC cell lines. This study has yielded several novel candidate tumor-suppressor genes (TSGs) including GNMT, CEL, PLA2G1B and SERPINI2. We highlight the role of GNMT, a methyl transferase associated with the methylation potential of the cell, and CEL, a lipase, as potential therapeutic targets. We have uncovered genetic links to risk factors associated with PC such as smoking and obesity. Genes important for patient survival and prognosis are also discussed, and we confirm the dysregulation of metabolic pathways previously observed in PC. While many of the genes downregulated in our dataset are associated with protein products normally produced by the pancreas for excretion, we have uncovered some genes whose downregulation appear to play a more causal role in PC. These genes will assist in providing a better understanding of the disease etiology of PC, and in the search for new therapeutic targets and biomarkers.

  2. Successes and future outlook for microfluidics-based cardiovascular drug discovery.

    Science.gov (United States)

    Skommer, Joanna; Wlodkowic, Donald

    2015-03-01

    The greatest advantage of using microfluidics as a platform for the assessment of cardiovascular drug action is its ability to finely regulate fluid flow conditions, including flow rate, shear stress and pulsatile flow. At the same time, microfluidics provide means for modifying the vessel geometry (bifurcations, stenoses, complex networks), the type of surface of the vessel walls, and for patterning cells in 3D tissue-like architecture, including generation of lumen walls lined with cells and heart-on-a-chip structures for mimicking ventricular cardiomyocyte physiology. In addition, owing to the small volume of required specimens, microfluidics is ideally suited to clinical situations whereby monitoring of drug dosing or efficacy needs to be coupled with minimal phlebotomy-related drug loss. In this review, the authors highlight potential applications for the currently existing and emerging technologies and offer several suggestions on how to close the development cycle of microfluidic devices for cardiovascular drug discovery. The ultimate goal in microfluidics research for drug discovery is to develop 'human-on-a-chip' systems, whereby several organ cultures, including the vasculature and the heart, can mimic complex interactions between the organs and body systems. This would provide in vivo-like pharmacokinetics and pharmacodynamics for drug ADMET assessment. At present, however, the great variety of available designs does not go hand in hand with their use by the pharmaceutical community.

  3. Discovery of novel class 1 phosphatidylinositide 3-kinases (PI3K) fragment inhibitors through structure-based virtual screening.

    Science.gov (United States)

    Giordanetto, Fabrizio; Kull, Bengt; Dellsén, Anita

    2011-01-15

    The discovery of ligand efficient and lipophilicity efficient fragment inhibitors of class 1 phosphatidylinositide 3-kinases (PI3K) is reported. A fragment version of the AstraZeneca compound bank was docked to a homology model of the PI3K p110β isoform. Interaction-based scoring of the predicted binding poses served to further prioritise the virtual fragment hits. Experimental screening confirmed potency for a total of 18 fragment inhibitors, belonging to five different structural classes. Copyright © 2010 Elsevier Ltd. All rights reserved.
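
    The two efficiency metrics named above have widely used textbook definitions, sketched here for orientation. The abstract does not state which exact formulas or thresholds the authors applied, so the definitions below (ligand efficiency as binding free energy per heavy atom, estimated from pIC50, and lipophilic efficiency as pIC50 minus logP) are assumptions, as are the function names and the example fragment values.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def ligand_efficiency(pic50, heavy_atoms, temperature_k=298.15):
    """Approximate ligand efficiency in kcal/mol per heavy atom.

    Uses -dG ~= RT * ln(10) * pIC50, i.e. IC50 is treated as a stand-in
    for the dissociation constant, which is only an approximation.
    """
    if heavy_atoms <= 0:
        raise ValueError("heavy_atoms must be positive")
    free_energy = R_KCAL_PER_MOL_K * temperature_k * math.log(10) * pic50
    return free_energy / heavy_atoms


def lipophilic_efficiency(pic50, logp):
    """Lipophilic efficiency: potency corrected for lipophilicity."""
    return pic50 - logp


# Hypothetical fragment: IC50 of 10 uM (pIC50 = 5), 13 heavy atoms, logP 1.5.
le = ligand_efficiency(5.0, 13)
lipe = lipophilic_efficiency(5.0, 1.5)
```

    Both metrics reward small, polar fragments that still bind measurably, which is why they are commonly used to prioritise fragment hits.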

  4. Discovery, Annotation, and Functional Analysis of Long Noncoding RNAs Controlling Cell Cycle Gene Expression and Proliferation in Breast Cancer Cells

    Science.gov (United States)

    Sun, Miao; Gadad, Shrikanth S.; Kim, Dae-Seok; Kraus, W. Lee

    2015-01-01

    SUMMARY We describe a computational approach that integrates GRO-seq and RNA-seq data to annotate long noncoding RNAs (lncRNAs), with increased sensitivity for low abundance lncRNAs. We used this approach to characterize the lncRNA transcriptome in MCF-7 human breast cancer cells, including >700 previously unannotated lncRNAs. We then used information about the (1) transcription of lncRNA genes from GRO-seq, (2) steady-state levels of lncRNA transcripts in cell lines and patient samples from RNA-seq, and (3) histone modifications and factor binding at lncRNA gene promoters from ChIP-seq to explore lncRNA gene structure and regulation, as well as lncRNA transcript stability, regulation, and function. Functional analysis of selected lncRNAs with altered expression in breast cancers revealed roles in cell proliferation, regulation of an E2F-dependent cell cycle gene expression program, and estrogen-dependent mitogenic growth. Collectively, our studies demonstrate the use of an integrated genomic and molecular approach to identify and characterize growth-regulating lncRNAs in cancers. PMID:26236012

  5. Transcriptome analysis of the white body of the squid Euprymna tasmanica with emphasis on immune and hematopoietic gene discovery.

    Directory of Open Access Journals (Sweden)

    Karla A Salazar

    Full Text Available In the mutualistic relationship between the squid Euprymna tasmanica and the bioluminescent bacterium Vibrio fischeri, several host factors, including immune-related proteins, are known to interact and respond specifically and exclusively to the presence of the symbiont. In squid and octopus, the white body is considered to be an immune organ mainly due to the fact that blood cells, or hemocytes, are known to be present in high numbers and in different developmental stages. Hence, the white body has been described as the site of hematopoiesis in cephalopods. However, to our knowledge, there are no studies showing any molecular evidence of such functions. In this study, we performed a transcriptomic analysis of white body tissue of the Southern dumpling squid, E. tasmanica. Our primary goal was to gain insights into the functions of this tissue and to test for the presence of gene transcripts associated with hematopoietic and immune processes. Several hematopoiesis genes including CPSF1, GATA 2, TFIID, and FGFR2 were found to be expressed in the white body. In addition, transcripts associated with immune-related signal transduction pathways, such as the toll-like receptor/NF-κB and MAPK pathways, were also found, as well as other immune genes previously identified in E. tasmanica's sister species, E. scolopes. This study is the first to analyze an immune organ within cephalopods, and to provide gene expression data supporting the white body as a hematopoietic tissue.

  6. Large-scale gene discovery in the Septoria tritici blotch fungus Mycosphaerella graminicola with a focus on in planta expression

    NARCIS (Netherlands)

    Kema, G.H.J.; Lee, van der T.A.J.; Mendes, O.; Verstappen, E.C.P.; Klein Lankhorst, R.M.; Sandbrink, H.; Burgt, van der A.; Zwiers, L.H.; Csukai, M.; Waalwijk, C.

    2008-01-01

    The foliar disease septoria tritici blotch, caused by the fungus Mycosphaerella graminicola, is currently the most important wheat disease in Europe. Gene expression was examined under highly different conditions, using 10 expressed sequence tag libraries generated from M. graminicola isolate IPO323

  7. Biomarker Discovery Based on Hybrid Optimization Algorithm and Artificial Neural Networks on Microarray Data for Cancer Classification.

    Science.gov (United States)

    Moteghaed, Niloofar Yousefi; Maghooli, Keivan; Pirhadi, Shiva; Garshasbi, Masoud

    2015-01-01

    Advances in high-throughput microarray-based gene profiling technology have made it possible to monitor the expression values of thousands of genes simultaneously. Detailed examination of changes in the expression levels of genes can help physicians diagnose efficiently, classify tumors and cancer types, and choose effective treatments. The main purpose of this paper is to find genes that can correctly classify groups of cancers using hybrid optimization algorithms. A hybrid particle swarm optimization and genetic algorithm method is used for gene selection, and an artificial neural network (ANN) is adopted as the classifier. In this work, we have improved the ability of the algorithm to solve the classification problem by finding a small group of biomarkers as well as the best parameters of the classifier. The proposed approach is tested on three benchmark gene expression data sets: blood (acute myeloid leukemia, acute lymphoblastic leukemia), colon and breast datasets. We used 10-fold cross-validation to estimate accuracy and a decision tree algorithm to find the relations between the biomarkers from a biological point of view. To test the ability of the trained ANN models to categorize the cancers, we analyzed additional blinded samples that were not previously used for the training procedure. Experimental results show that the proposed method can reduce the dimension of the data set, confirm the most informative gene subset, and improve classification accuracy with the best parameters for each dataset.

  8. Generation of expressed sequence tags under cadmium stress for gene discovery and development of molecular markers in chickpea.

    Science.gov (United States)

    Gaur, Rashmi; Bhatia, Sabhyata; Gupta, Meetu

    2014-07-01

    Chickpea is the world's third most important legume crop and belongs to the Fabaceae family, but it suffers severe yield loss due to various biotic and abiotic stresses. Development of modern genomic tools such as molecular markers and identification of resistance genes associated with these stresses facilitate improvement in chickpea breeding towards abiotic stress tolerance. In this study, 1597 high-quality expressed sequence tags (ESTs) were generated from a cDNA library of variety Pusa 1105 root tissue after cadmium (Cd) treatment. Assembly of ESTs resulted in a total of 914 unigenes, of which putative homology was obtained for 38.8% after BLASTX search. In terms of species distribution, the majority of sequences showed similarity to Medicago truncatula, followed by Glycine max, Vitis vinifera, Populus trichocarpa and Pisum sativum sequences. Functional annotation was assigned using Blast2Go, and the Gene Ontology (GO) terms were categorized into biological process, molecular function and cellular component. Approximately 10.83% of unigenes were assigned at least one GO term. Moreover, in the distribution of transcripts into various biological pathways, 20 of the annotated transcripts were assigned to ten pathways in the KEGG database. A majority of the genes were found to be involved in sulphur and nitrogen metabolism. In the quantitative real-time PCR analysis, five of the transcription factors and three of the transporter genes were found to be highly expressed after Cd treatment. In addition, the utility of the ESTs was demonstrated by exploiting them for the development of 83 genic molecular markers, including EST-simple sequence repeats and intron targeted polymorphism, that would assist in tagging genes related to metal stress in future work.

  9. Mass spectrometry based biomarker discovery, verification, and validation--quality assurance and control of protein biomarker assays.

    Science.gov (United States)

    Parker, Carol E; Borchers, Christoph H

    2014-06-01

    In its early years, mass spectrometry (MS)-based proteomics focused on the cataloging of proteins found in different species or different tissues. By 2005, proteomics was being used for protein quantitation, typically based on "proteotypic" peptides which act as surrogates for the parent proteins. Biomarker discovery is usually done by non-targeted "shotgun" proteomics, using relative quantitation methods to determine protein expression changes that correlate with disease (output given as "up-or-down regulation" or "fold-increases"). MS-based techniques can also perform "absolute" quantitation which is required for clinical applications (output given as protein concentrations). Here we describe the differences between these methods, factors that affect the precision and accuracy of the results, and some examples of recent studies using MS-based proteomics to verify cancer-related biomarkers. Copyright © 2014 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

  10. Climate Discovery: Integrating Research With Exhibit, Public Tours, K-12, and Web-based EPO Resources

    Science.gov (United States)

    Foster, S. Q.; Carbone, L.; Gardiner, L.; Johnson, R.; Russell, R.; Advisory Committee, S.; Ammann, C.; Lu, G.; Richmond, A.; Maute, A.; Haller, D.; Conery, C.; Bintner, G.

    2005-12-01

    The Climate Discovery Exhibit at the National Center for Atmospheric Research (NCAR) Mesa Lab provides an exciting conceptual outline for the integration of several EPO activities with other well-established NCAR educational resources and programs. The exhibit is organized into four topic areas intended to build understanding among NCAR's 80,000 annual visitors, including 10,000 school children, about Earth system processes and scientific methods contributing to a growing body of knowledge about climate and global change. These topics include: 'Sun-Earth Connections,' 'Climate Now,' 'Climate Past,' and 'Climate Future.' Exhibit text, graphics, film and electronic media, and interactives are developed and updated through collaborations between NCAR's climate research scientists and staff in the Office of Education and Outreach (EO) at the University Corporation for Atmospheric Research (UCAR). With funding from NCAR, paleoclimatologists have contributed data and ideas for a new exhibit Teachers' Guide unit about 'Climate Past.' This collection of middle-school level, standards-aligned lessons are intended to help students gain understanding about how scientists use proxy data and direct observations to describe past climates. Two NASA EPO's have funded the development of 'Sun-Earth Connection' lessons, visual media, and tips for scientists and teachers. Integrated with related content and activities from the NASA-funded Windows to the Universe web site, these products have been adapted to form a second unit in the Climate Discovery Teachers' Guide about the Sun's influence on Earth's climate. Other lesson plans, previously developed by on-going efforts of EO staff and NSF's previously-funded Project Learn program are providing content for a third Teachers' Guide unit on 'Climate Now' - the dynamic atmospheric and geological processes that regulate Earth's climate. EO has plans to collaborate with NCAR climatologists and computer modelers in the next year to develop

  11. Discovery of new risk loci for IgA nephropathy implicates genes involved in immunity against intestinal pathogens

    Science.gov (United States)

    Kiryluk, Krzysztof; Li, Yifu; Scolari, Francesco; Sanna-Cherchi, Simone; Choi, Murim; Verbitsky, Miguel; Fasel, David; Lata, Sneh; Prakash, Sindhuri; Shapiro, Samantha; Fischman, Clara; Snyder, Holly J.; Appel, Gerald; Izzi, Claudia; Viola, Battista Fabio; Dallera, Nadia; Vecchio, Lucia Del; Barlassina, Cristina; Salvi, Erika; Bertinetto, Francesca Eleonora; Amoroso, Antonio; Savoldi, Silvana; Rocchietti, Marcella; Amore, Alessandro; Peruzzi, Licia; Coppo, Rosanna; Salvadori, Maurizio; Ravani, Pietro; Magistroni, Riccardo; Ghiggeri, Gian Marco; Caridi, Gianluca; Bodria, Monica; Lugani, Francesca; Allegri, Landino; Delsante, Marco; Maiorana, Mariarosa; Magnano, Andrea; Frasca, Giovanni; Boer, Emanuela; Boscutti, Giuliano; Ponticelli, Claudio; Mignani, Renzo; Marcantoni, Carmelita; Di Landro, Domenico; Santoro, Domenico; Pani, Antonello; Polci, Rosaria; Feriozzi, Sandro; Chicca, Silvana; Galliani, Marco; Gigante, Maddalena; Gesualdo, Loreto; Zamboli, Pasquale; Maixnerová, Dita; Tesar, Vladimir; Eitner, Frank; Rauen, Thomas; Floege, Jürgen; Kovacs, Tibor; Nagy, Judit; Mucha, Krzysztof; Pączek, Leszek; Zaniew, Marcin; Mizerska-Wasiak, Małgorzata; Roszkowska-Blaim, Maria; Pawlaczyk, Krzysztof; Gale, Daniel; Barratt, Jonathan; Thibaudin, Lise; Berthoux, Francois; Canaud, Guillaume; Boland, Anne; Metzger, Marie; Panzer, Ulf; Suzuki, Hitoshi; Goto, Shin; Narita, Ichiei; Caliskan, Yasar; Xie, Jingyuan; Hou, Ping; Chen, Nan; Zhang, Hong; Wyatt, Robert J.; Novak, Jan; Julian, Bruce A.; Feehally, John; Stengel, Benedicte; Cusi, Daniele; Lifton, Richard P.; Gharavi, Ali G.

    2014-01-01

    We performed a genome-wide association study (GWAS) of IgA nephropathy (IgAN), the most common form of glomerulonephritis, with discovery and follow-up in 20,612 individuals of European and East Asian ancestry. We identified six novel genome-wide significant associations, four in ITGAM-ITGAX, VAV3 and CARD9 and two new independent signals at HLA-DQB1 and DEFA. We replicated the nine previously reported signals, including known SNPs in the HLA-DQB1 and DEFA loci. The cumulative burden of risk alleles is strongly associated with age at disease onset. Most loci are either directly associated with risk of inflammatory bowel disease (IBD) or maintenance of the intestinal epithelial barrier and response to mucosal pathogens. The geo-spatial distribution of risk alleles is highly suggestive of multi-locus adaptation and the genetic risk correlates strongly with variation in local pathogens, particularly helminth diversity, suggesting a possible role for host-intestinal pathogen interactions in shaping the genetic landscape of IgAN. PMID:25305756

  12. Network Diffusion-Based Prioritization of Autism Risk Genes Identifies Significantly Connected Gene Modules

    Directory of Open Access Journals (Sweden)

    Ettore Mosca

    2017-09-01

    Full Text Available Autism spectrum disorder (ASD) is marked by a strong genetic heterogeneity, which is underlined by the low overlap between ASD risk gene lists proposed in different studies. In this context, molecular networks can be used to analyze the results of several genome-wide studies in order to underline those network regions harboring genetic variations associated with ASD, the so-called “disease modules.” In this work, we used a recent network diffusion-based approach to jointly analyze multiple ASD risk gene lists. We defined genome-scale prioritizations of human genes in relation to ASD genes from multiple studies, found significantly connected gene modules associated with ASD and predicted genes functionally related to ASD risk genes. Most of them play a role in synapsis and neuronal development and function; many are related to syndromes that can be in comorbidity with ASD and the remaining are involved in epigenetics, cell cycle, cell adhesion and cancer.
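
    The idea of network diffusion-based prioritization can be sketched with a random walk with restart on a toy undirected network. The abstract does not specify the diffusion algorithm or its parameters, so the random-walk-with-restart formulation, the restart probability of 0.3, the function name `diffuse`, and the node labels `g1` to `g5` are all assumptions made for illustration.

```python
def diffuse(adjacency, seeds, restart=0.3, iterations=100):
    """Random walk with restart on an undirected graph.

    adjacency: dict mapping each node to the set of its neighbours
               (must be symmetric, and every node needs a neighbour).
    seeds:     set of nodes where the walk restarts (e.g. known risk genes).
    Returns a dict of node scores that sum to 1.
    """
    nodes = sorted(adjacency)
    degree = {n: len(adjacency[n]) for n in nodes}
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}

    scores = dict(start)
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            # Each neighbour spreads its score evenly over its own edges.
            inflow = sum(scores[m] / degree[m] for m in adjacency[n])
            updated[n] = (1 - restart) * inflow + restart * start[n]
        scores = updated
    return scores


# Toy path network g1 - g2 - g3 - g4 - g5 with a single seed gene g1.
toy_network = {
    "g1": {"g2"},
    "g2": {"g1", "g3"},
    "g3": {"g2", "g4"},
    "g4": {"g3", "g5"},
    "g5": {"g4"},
}
toy_scores = diffuse(toy_network, seeds={"g1"})
ranked_genes = sorted(toy_scores, key=toy_scores.get, reverse=True)
```

    In this toy case the score decays with network distance from the seed, so genes close to several seeds from different studies would rise to the top of a prioritization.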

  13. Comprehensive Phenotyping in Multiple Sclerosis: Discovery Based Proteomics and the Current Understanding of Putative Biomarkers

    Directory of Open Access Journals (Sweden)

    Kevin C. O’Connor

    2006-01-01

    Full Text Available Currently, there is no single test for multiple sclerosis (MS). Diagnosis is confirmed through clinical evaluation, abnormalities revealed by magnetic resonance imaging (MRI), and analysis of cerebrospinal fluid (CSF) chemistry. The early and accurate diagnosis of the disease, monitoring of progression, and gauging of therapeutic intervention are important but elusive elements of patient care. Moreover, a deeper understanding of the disease pathology is needed, including discovery of accurate biomarkers for MS. Herein we review putative biomarkers of MS relating to neurodegeneration and contributions to neuropathology, with particular focus on autoimmunity. In addition, novel assessments of biomarkers not driven by hypotheses are discussed, featuring our application of advanced proteomics and metabolomics for comprehensive phenotyping of CSF and blood. This strategy allows comparison of component expression levels in CSF and serum between MS and control groups. Examination of these preliminary data suggests that several CSF proteins in MS are differentially expressed, and thus, represent putative biomarkers deserving of further evaluation.

  14. A Fluorescence Displacement Assay for Antidepressant Drug Discovery Based on Ligand-Conjugated Quantum Dots

    Energy Technology Data Exchange (ETDEWEB)

    Chang, Jerry [Vanderbilt University]; Tomlinson, Ian [Oak Ridge National Laboratory (ORNL)]; Warnement, Michael [Vanderbilt University]; Iwamoto, Hideki [Vanderbilt University]

    2011-01-01

    The serotonin (5-hydroxytryptamine, 5-HT) transporter (SERT) protein plays a central role in terminating 5-HT neurotransmission and is the most important therapeutic target for the treatment of major depression and anxiety disorders. We report an innovative, versatile, and target-selective quantum dot (QD) labeling approach for SERT in single Xenopus oocytes that can be adopted as a drug-screening platform. Our labeling approach employs a custom-made, QD-tagged indoleamine derivative ligand, IDT318, that is structurally similar to 5-HT and accesses the primary binding site with enhanced human SERT selectivity. Incubating QD-labeled oocytes with paroxetine (Paxil), a high-affinity SERT-specific inhibitor, showed a concentration- and time-dependent decrease in QD fluorescence, demonstrating the utility of our approach for the identification of SERT modulators. Furthermore, with the development of ligands aimed at other pharmacologically relevant targets, our approach may potentially form the basis for a multitarget drug discovery platform.

  15. SpirPep: an in silico digestion-based platform to assist bioactive peptides discovery from a genome-wide database.

    Science.gov (United States)

    Anekthanakul, Krittima; Hongsthong, Apiradee; Senachak, Jittisak; Ruengjitchatchawalya, Marasri

    2018-04-20

    Bioactive peptides, including biological sources-derived peptides with different biological activities, are protein fragments that influence the functions or conditions of organisms, in particular humans and animals. Conventional methods of identifying bioactive peptides are time-consuming and costly. To quicken the processes, several bioinformatics tools have recently been used to facilitate screening of the potential peptides prior to their activity assessment in vitro and/or in vivo. In this study, we developed an efficient computational method, SpirPep, which offers many advantages over the currently available tools. The SpirPep web application tool is a one-stop analysis and visualization facility to assist bioactive peptide discovery. The tool is equipped with 15 customized enzymes and 1-3 miscleavage options, which allows in silico digestion of protein sequences encoded by protein-coding genes from single, multiple, or genome-wide scaling, and then directly classifies the peptides by bioactivity using an in-house database that contains bioactive peptides collected from 13 public databases. With this tool, the resulting peptides are categorized by each selected enzyme, and shown in a tabular format where the peptide sequences can be tracked back to their original proteins. The developed tool and webpages are coded in PHP and HTML with CSS/JavaScript. Moreover, the tool allows protein-peptide alignment visualization by Generic Genome Browser (GBrowse) to display the region and details of the proteins and peptides within each parameter, while considering digestion design for the desirable bioactivity. SpirPep is efficient; it takes less than 20 min to digest 3000 proteins (751,860 amino acids) with 15 enzymes and three miscleavages for each enzyme, and only a few seconds for single enzyme digestion. The tool identified more bioactive peptides than the benchmarked tool; an example of a validated pentapeptide (FLPIL) from LC-MS/MS was demonstrated. The
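
    The in silico digestion step with missed cleavages described in this record can be illustrated with a minimal sketch. The cleavage rule used here (cut after K or R unless followed by P, a common trypsin-like rule) and the function names `cleave` and `digest` are assumptions for illustration only; they are not taken from SpirPep's own enzyme table or code.

```python
def cleave(sequence):
    """Split a protein sequence at every allowed site.

    Assumed rule (illustrative only): cut after K or R unless the next residue is P.
    """
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder
    return fragments


def digest(sequence, max_missed=1, min_length=2):
    """Return peptides allowing up to `max_missed` missed cleavages."""
    fragments = cleave(sequence)
    peptides = []
    for start in range(len(fragments)):
        # Joining neighbouring fragments models sites the enzyme failed to cut.
        for missed in range(max_missed + 1):
            end = start + missed + 1
            if end > len(fragments):
                break
            peptide = "".join(fragments[start:end])
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides


if __name__ == "__main__":
    print(digest("AAKGGRCC", max_missed=1))
```

    Each peptide produced this way could then be looked up in a bioactivity table; that lookup is omitted because the record does not describe its format.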

  16. Consistent discovery of frequent interval-based temporal patterns in chronic patients' data.

    Science.gov (United States)

    Shknevsky, Alexander; Shahar, Yuval; Moskovitch, Robert

    2017-11-01

    Increasingly, frequent temporal patterns discovered in longitudinal patient records are proposed as features for classification and prediction, and as means to cluster patient clinical trajectories. However, to justify that, we must demonstrate that most frequent temporal patterns are indeed consistently discoverable within the records of different patient subsets within similar patient populations. We have developed several measures for the consistency of the discovery of temporal patterns. We focus on time-interval relations patterns (TIRPs) that can be discovered within different subsets of the same patient population. We expect the discovered TIRPs (1) to be frequent in each subset, (2) to preserve their "local" metrics - the absolute frequency of each pattern, measured by a Proportion Test, and (3) to preserve their "global" characteristics - their overall distribution, measured by a Kolmogorov-Smirnov test. We also wanted to examine the effect on consistency, over a variety of settings, of varying the minimal frequency threshold for TIRP discovery, and of using a TIRP-filtering criterion that we previously introduced, the Semantic Adjacency Criterion (SAC). We applied our methodology to three medical domains (oncology, infectious hepatitis, and diabetes). We found that, within the minimal frequency ranges we had examined, 70-95% of the discovered TIRPs were consistently discoverable; 40-48% of them maintained their local frequency. TIRP global distribution similarity varied widely, from 0% to 65%. Increasing the threshold usually increased the percentage of TIRPs that were repeatedly discovered across different patient subsets within the same domain, and the probability of a similar TIRP distribution. Using the SAC principle enhanced, for most minimal support levels, the percentage of repeating TIRPs, their local consistency and their global consistency. The effect of using the SAC was further strengthened as the minimal frequency threshold was raised.
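
    The three consistency checks named in this record (repeated discovery, a proportion test on local frequency, and a Kolmogorov-Smirnov comparison of the global distribution) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the use of a pooled two-proportion z statistic, and the representation of patterns as plain strings are hypothetical and are not taken from the authors' implementation.

```python
import math


def repeat_fraction(patterns_a, patterns_b):
    """Fraction of patterns discovered in subset A that are also discovered in subset B."""
    a, b = set(patterns_a), set(patterns_b)
    return len(a & b) / len(a) if a else 0.0


def two_proportion_z(count_a, n_a, count_b, n_b):
    """Pooled z statistic comparing one pattern's support in two patient subsets."""
    pooled = (count_a + count_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return 0.0 if se == 0 else (count_a / n_a - count_b / n_b) / se


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    def ecdf(sample, x):
        return sum(1 for v in sample if v <= x) / len(sample)

    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(ecdf(sample_a, x) - ecdf(sample_b, x)) for x in points)


if __name__ == "__main__":
    print(repeat_fraction({"p1", "p2", "p3", "p4"}, {"p2", "p3", "p4", "p5"}))
    print(two_proportion_z(40, 100, 55, 100))
    print(ks_statistic([0.2, 0.3, 0.5], [0.25, 0.4, 0.9]))
```

    Turning the z and KS statistics into p-values requires the corresponding reference distributions, which a statistics library would normally supply; only the statistics themselves are computed here.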

  17. Statistical interpretation of machine learning-based feature importance scores for biomarker discovery.

    Science.gov (United States)

    Huynh-Thu, Vân Anh; Saeys, Yvan; Wehenkel, Louis; Geurts, Pierre

    2012-07-01

    Univariate statistical tests are widely used for biomarker discovery in bioinformatics. These procedures are simple, fast and their output is easily interpretable by biologists but they can only identify variables that provide a significant amount of information in isolation from the other variables. As biological processes are expected to involve complex interactions between variables, univariate methods thus potentially miss some informative biomarkers. Variable relevance scores provided by machine learning techniques, however, are potentially able to highlight multivariate interacting effects, but unlike the p-values returned by univariate tests, these relevance scores are usually not statistically interpretable. This lack of interpretability hampers the determination of a relevance threshold for extracting a feature subset from the rankings and also prevents the wide adoption of these methods by practitioners. We evaluated several procedures, both existing and novel, that extract relevant features from rankings derived from machine learning approaches. These procedures replace the relevance scores with measures that can be interpreted in a statistical way, such as p-values, false discovery rates, or family wise error rates, for which it is easier to determine a significance level. Experiments were performed on several artificial problems as well as on real microarray datasets. Although the methods differ in terms of computing times and the tradeoff they achieve in terms of false positives and false negatives, some of them greatly help in the extraction of truly relevant biomarkers and should thus be of great practical interest for biologists and physicians. As a side conclusion, our experiments also clearly highlight that using model performance as a criterion for feature selection is often counter-productive. Python source codes of all tested methods, as well as the MATLAB scripts used for data simulation, can be found in the Supplementary Material.
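
    One generic way to replace a relevance score with a p-value, in the spirit of the procedures this record describes, is a label-permutation test. The sketch below is an assumption-laden illustration, not one of the authors' tested methods: the relevance score is a simple absolute correlation standing in for a machine-learning importance score, and the names `abs_correlation` and `permutation_p_values` are hypothetical.

```python
import math
import random


def abs_correlation(xs, ys):
    """Stand-in relevance score: absolute Pearson correlation between a feature and the labels."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / math.sqrt(vx * vy)


def permutation_p_values(features, labels, score=abs_correlation, n_perm=200, seed=0):
    """Estimate, per feature, how often a shuffled labelling scores at least as high."""
    rng = random.Random(seed)
    observed = [score(column, labels) for column in features]
    exceed = [0] * len(features)
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any real feature-label association
        for j, column in enumerate(features):
            if score(column, shuffled) >= observed[j]:
                exceed[j] += 1
    # Add-one correction keeps the estimate strictly positive.
    return [(1 + e) / (1 + n_perm) for e in exceed]


if __name__ == "__main__":
    labels = [0, 1] * 20
    informative = [2.0 * y + 0.01 * (i % 5) for i, y in enumerate(labels)]
    noise_rng = random.Random(1)
    noise = [noise_rng.random() for _ in labels]
    print(permutation_p_values([informative, noise], labels))
```

    Any other scoring function with the same `(feature_values, labels)` signature can be passed as `score`; for model-based importances the model would have to be refitted on each shuffled labelling, which is what makes such procedures computationally expensive.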

  18. Gene Discovery and Advances in Finger Millet [Eleusine coracana (L.) Gaertn.] Genomics-An Important Nutri-Cereal of Future.

    Science.gov (United States)

    Sood, Salej; Kumar, Anil; Babu, B Kalyana; Gaur, Vikram S; Pandey, Dinesh; Kant, Lakshmi; Pattnayak, Arunava

    2016-01-01

    The rapid strides in molecular marker technologies followed by genomics, and next generation sequencing advancements in three major crops (rice, maize and wheat) of the world have given opportunities for their use in the orphan, but highly valuable future crops, including finger millet [Eleusine coracana (L.) Gaertn.]. Finger millet has many special agronomic and nutritional characteristics, which make it an indispensable crop in arid, semi-arid, hilly and tribal areas of India and Africa. The crop has proven its adaptability in harsh conditions and has shown resilience to climate change. The adaptability traits of finger millet have shown the advantage over major cereal grains under stress conditions, revealing it as a storehouse of important genomic resources for crop improvement. Although new technologies for genomic studies are now available, progress in identifying and tapping these important alleles or genes is lacking. RAPDs were the default choice for genetic diversity studies in the crop until the last decade, but the subsequent development of SSRs and comparative genomics paved the way for the marker assisted selection in finger millet. Resistance gene homologs from NBS-LRR region of finger millet for blast and sequence variants for nutritional traits from other cereals have been developed and used invariably. Population structure analysis studies exhibit 2-4 sub-populations in the finger millet gene pool with separate grouping of Indian and exotic genotypes. Recently, the omics technologies have been efficiently applied to understand the nutritional variation, drought tolerance and gene mining. Progress has also occurred with respect to transgenics development. This review presents the current biotechnological advancements along with research gaps and future perspective of genomic research in finger millet.

  19. Gene Discovery and Advances in Finger millet [Eleusine coracana (L.) Gaertn.] Genomics - An Important Nutri-cereal of Future

    Directory of Open Access Journals (Sweden)

    Salej Sood

    2016-11-01

    The rapid strides in molecular marker technologies followed by genomics, and next generation sequencing advancements in three major crops (rice, maize and wheat) of the world have given opportunities for their use in the orphan, but highly valuable future crops, including finger millet [Eleusine coracana (L.) Gaertn.]. Finger millet has many special agronomic and nutritional characteristics, which make it an indispensable crop in arid, semi-arid, hilly and tribal areas of India and Africa. The crop has proven its adaptability in harsh conditions and has shown resilience to climate change. The adaptability traits of finger millet have shown the advantage over major cereal grains under stress conditions, revealing it as a storehouse of important genomic resources for crop improvement. Although new technologies for genomic studies are now available, progress in identifying and tapping these important alleles or genes is lacking. RAPDs were the default choice for genetic diversity studies in the crop until the last decade, but the subsequent development of SSRs and comparative genomics paved the way for the marker assisted selection in finger millet. Resistance gene homologues from NBS-LRR region of finger millet for blast and sequence variants for nutritional traits from other cereals have been developed and used invariably. Population structure analysis studies exhibit 2-4 sub-populations in the finger millet gene pool with separate grouping of Indian and exotic genotypes. Recently, the omics technologies have been efficiently applied to understand the nutritional variation, drought tolerance and gene mining. Progress has also occurred with respect to transgenics development. This review presents the current biotechnological advancements along with research gaps and future perspective of genomic research in finger millet.

  20. Bayesian hierarchical model for transcriptional module discovery by jointly modeling gene expression and ChIP-chip data

    Directory of Open Access Journals (Sweden)

    Sivaganesan Siva

    2007-08-01

    Background: Transcriptional modules (TM) consist of groups of co-regulated genes and transcription factors (TF) regulating their expression. Two high-throughput (HT) experimental technologies, gene expression microarrays and Chromatin Immuno-Precipitation on Chip (ChIP-chip), are capable of producing data informative about expression regulatory mechanism on a genome scale. The optimal approach to joint modeling of data generated by these two complementary biological assays, with the goal of identifying and characterizing TMs, is an important open problem in computational biomedicine. Results: We developed and validated a novel probabilistic model and related computational procedure for identifying TMs by jointly modeling gene expression and ChIP-chip binding data. We demonstrate an improved functional coherence of the TMs produced by the new method when compared to either analyzing expression or ChIP-chip data separately or to alternative approaches for joint analysis. We also demonstrate the ability of the new algorithm to identify novel regulatory relationships not revealed by ChIP-chip data alone. The new computational procedure can be used in more or less the same way as one would use simple hierarchical clustering without performing any special transformation of data prior to the analysis. The R and C-source code for implementing our algorithm is incorporated within the R package gimmR which is freely available at http://eh3.uc.edu/gimm. Conclusion: Our results indicate that, whenever available, ChIP-chip and expression data should be analyzed within the unified probabilistic modeling framework, which will likely result in improved clusters of co-regulated genes and improved ability to detect meaningful regulatory relationships. Given the good statistical properties and the ease of use, the new computational procedure offers a worthy new tool for reconstructing transcriptional regulatory networks.

  1. Using gene expression databases for classical trait QTL candidate gene discovery in the BXD recombinant inbred genetic reference population: Mouse forebrain weight

    Directory of Open Access Journals (Sweden)

    Zhou Jianhua

    2008-09-01

    Background: Successful strategies for QTL gene identification benefit from combined experimental and bioinformatic approaches. Unique design aspects of the BXD recombinant inbred line mapping panel allow use of archived gene microarray expression data to filter likely from unlikely candidates. This prompted us to propose a simple five-filter protocol for candidate nomination. To filter more likely from less likely candidates, we required candidate genes near to the QTL to have mRNA abundance that correlated with the phenotype among the BXD lines as well as differed between the parental lines C57BL/6J and DBA/2J. We also required verification of mRNA abundance by an independent method, and finally we required either differences in protein levels or confirmed DNA sequence differences. Results: QTL mapping of mouse forebrain weight in 34 BXD RI lines found significant association on chromosomes 1 and 11, with each C57BL/6J allele increasing weight by more than half a standard deviation. The intersection of gene lists that were within ± 10 Mb of the strongest associated location, that had forebrain mRNA abundance correlated with forebrain weight among the BXD, and that had forebrain mRNA abundance differing between C57BL/6J and DBA/2J, produced two candidates, Tnni1 (troponin 1) and Asb3 (ankyrin repeat and SOCS box-containing protein 3). Quantitative RT-PCR confirmed the direction of an increased expression in C57BL/6J genotype over the DBA/2J genotype for both genes, a difference that translated to a 2-fold difference in Asb3 protein. Although Tnni1 protein differences could not be confirmed, a 273 bp indel polymorphism was discovered 1 Kb upstream of the transcription start site. Conclusion: Delivery of well supported candidate genes following a single quantitative trait locus mapping experiment is difficult. However, by combining available gene expression data with QTL mapping, we illustrated a five-filter protocol that nominated Asb3 and

  2. Transcriptome-Based Discovery of Fusarium graminearum Stress Responses to FgHV1 Infection.

    Science.gov (United States)

    Wang, Shuangchao; Zhang, Jingze; Li, Pengfei; Qiu, Dewen; Guo, Lihua

    2016-11-17

    Fusarium graminearum hypovirus 1 (FgHV1), which is phylogenetically related to Cryphonectria hypovirus 1 (CHV1), is a virus in the family Hypoviridae that infects the plant pathogenic fungus F. graminearum. Although hypovirus FgHV1 infection does not attenuate the virulence of the host (hypovirulence), it results in defects in mycelial growth and spore production. We now report that the vertical transmission rate of FgHV1 through asexual spores reached 100%. Using RNA deep sequencing, we performed genome-wide expression analysis to reveal phenotype-related genes with expression changes in response to FgHV1 infection. A total of 378 genes were differentially expressed, suggesting that hypovirus infection causes a significant alteration of fungal gene expression. Nearly two times as many genes were up-regulated as were down-regulated. A differentially expressed gene enrichment analysis identified a number of important pathways. Metabolic processes, the ubiquitination system, and especially cellular redox regulation were the most affected categories in F. graminearum challenged with FgHV1. The p20, encoded by FgHV1, could induce H₂O₂ accumulation and hypersensitive response in Nicotiana benthamiana leaves. Moreover, hypovirus FgHV1 may regulate transcription factors and trigger the RNA silencing pathway in F. graminearum.

  3. Discovery and investigation of anticancer ruthenium-arene Schiff-base complexes via water-promoted combinatorial three-component assembly.

    Science.gov (United States)

    Chow, Mun Juinn; Licona, Cynthia; Yuan Qiang Wong, Daniel; Pastorin, Giorgia; Gaiddon, Christian; Ang, Wee Han

    2014-07-24

    The structural diversity of metal scaffolds makes them a viable alternative to traditional organic scaffolds for drug design. Combinatorial chemistry and multicomponent reactions, coupled with high-throughput screening, are useful techniques in drug discovery, but they are rarely used in metal-based drug design. We report the optimization and validation of a new combinatorial, metal-based, three-component assembly reaction for the synthesis of a library of 442 Ru-arene Schiff-base (RAS) complexes. These RAS complexes were synthesized in a one-pot, on-a-plate format using commercially available starting materials under aqueous conditions. The library was screened for their anticancer activity, and several cytotoxic lead compounds were identified. In particular, [(η6-1,3,5-triisopropylbenzene)RuCl(4-methoxy-N-(2-quinolinylmethylene)aniline)]Cl (4) displayed low micromolar IC50 values in ovarian cancers (A2780, A2780cisR), breast cancer (MCF7), and colorectal cancer (HCT116, SW480). The absence of p53 activation or changes in IC50 value between p53+/+ and p53-/- cells suggests that 4 and possibly the other lead compounds may act independently of the p53 tumor suppressor gene frequently mutated in cancer.

  4. ‘Function-first’ Lead Discovery: Mode of Action Profiling of Natural Product Libraries Using Image-Based Screening

    Science.gov (United States)

    Schulze, Christopher J.; Bray, Walter M.; Woerhmann, Marcos H.; Stuart, Joshua; Lokey, R. Scott; Linington, Roger G.

    2013-01-01

    Summary: Cytological profiling is a high-content image-based screening technology that provides insight into the mode of action (MOA) for test compounds by directly measuring hundreds of phenotypic cellular features. We have extended this recently reported technology to the mechanistic characterization of unknown natural products libraries for the direct prediction of compound MOAs at the primary screening stage. By analyzing a training set of commercial compounds of known mechanism and comparing these profiles to those obtained from natural product library members, we have successfully annotated extracts based on mode of action, dereplicated known compounds based on biological similarity to the training set, and identified and predicted the MOA of a family of new iron siderophores. Coupled with traditional analytical techniques, cytological profiling provides a new avenue for the creation of ‘function-first’ platforms for natural products discovery. PMID:23438757

  5. Segmented and Detailed Visualization of Anatomical Structures based on Augmented Reality for Health Education and Knowledge Discovery

    Directory of Open Access Journals (Sweden)

    Isabel Cristina Siqueira da Silva

    2017-05-01

    The evolution of technology has changed the face of education, especially when combined with appropriate pedagogical bases. This combination has created innovation opportunities in order to add quality to teaching through new perspectives for traditional methods applied in the classroom. In the Health field, particularly, augmented reality and interaction design techniques can assist the teacher in the exposition of theoretical concepts and/or concepts that require training in specific medical procedures. Besides, visualization of and interaction with Health data, from different sources and in different formats, helps to identify hidden patterns or anomalies, increases the flexibility in the search for certain values, allows the comparison of different units to obtain relative differences in quantities, provides human interaction in real time, etc. At this point, it is noted that the use of interactive visualization techniques such as augmented and virtual reality can contribute to the process of knowledge discovery in medical and biomedical databases. This work discusses aspects related to the use of augmented reality and interaction design as tools for teaching anatomy and knowledge discovery, and proposes a case study based on a mobile application that can display targeted anatomical parts in high resolution and with detail of their parts.

  6. Inhibiting NF-κB-inducing kinase (NIK): discovery, structure-based design, synthesis, structure-activity relationship, and co-crystal structures.

    Science.gov (United States)

    Li, Kexue; McGee, Lawrence R; Fisher, Ben; Sudom, Athena; Liu, Jinsong; Rubenstein, Steven M; Anwer, Mohmed K; Cushing, Timothy D; Shin, Youngsook; Ayres, Merrill; Lee, Fei; Eksterowicz, John; Faulder, Paul; Waszkowycz, Bohdan; Plotnikova, Olga; Farrelly, Ellyn; Xiao, Shou-Hua; Chen, Guoqing; Wang, Zhulun

    2013-03-01

    The discovery, structure-based design, synthesis, and optimization of NIK inhibitors are described. Our work began with an HTS hit, imidazopyridinyl pyrimidinamine 1. We utilized homology modeling and conformational analysis to optimize the indole scaffold leading to the discovery of novel and potent conformationally constrained inhibitors such as compounds 25 and 28. Compounds 25 and 31 were co-crystallized with NIK kinase domain to provide structural insights. Copyright © 2013 Elsevier Ltd. All rights reserved.

  7. Discovery of MYH14 as an important and unique deafness gene causing prelingually severe autosomal dominant nonsyndromic hearing loss.

    Science.gov (United States)

    Kim, Bong Jik; Kim, Ah Reum; Han, Jin Hee; Lee, Chung; Oh, Doo Yi; Choi, Byung Yoon

    2017-04-01

    Pathogenic variants of MYH14 are known to be associated (in either a syndromic or nonsyndromic manner) with hearing loss. Interestingly, all reported cases to date of MYH14-related nonsyndromic hearing loss with detailed phenotypes have demonstrated mild-to-moderate progressive hearing loss with postlingual onset. In the present study, targeted resequencing (TRS) of known deafness genes was performed to identify the causative variant in two multiplex families segregating autosomal dominant (AD) inherited hearing loss. TRS uncovered two novel variants of MYH14 (c.A572G:p.Asp191Gly in the myosin head domain and c.C73T:p.Gln25* in exon 2) from two multiplex deafness Korean families. Notably, both probands showed phenotypes of congenital or prelingual severe hearing loss. It is remarkably uncommon to encounter such a severe-to-profound, prelingual, AD hearing loss. Given that the first variant, p.Asp191Gly, was the first documented missense allele discovered in the myosin head domain of this gene related to either congenital or prelingual severe nonsyndromic hearing loss, and also that the second variant, p.Gln25*, leads to a null allele, more severe phenotypes from our probands may have been the result of either genotype-phenotype correlation or genetic backgrounds, or both. In the present study, we report that MYH14 can manifest as nonsyndromic prelingual severe sensorineural hearing loss in an AD fashion in Koreans. The results of the present study suggest that further genetic studies of similar patients should consider MYH14 as a causative gene, and cochlear implantation during infancy or early childhood should be indicated for those patients with certain MYH14 pathogenic variants. Copyright © 2017 John Wiley & Sons, Ltd.

  8. Gene discovery for enzymes involved in limonene modification or utilization by the mountain pine beetle-associated pathogen Grosmannia clavigera.

    Science.gov (United States)

    Wang, Ye; Lim, Lynette; Madilao, Lina; Lah, Ljerka; Bohlmann, Joerg; Breuil, Colette

    2014-08-01

    To successfully colonize and eventually kill pine trees, Grosmannia clavigera (Gs cryptic species), the main fungal pathogen associated with the mountain pine beetle (Dendroctonus ponderosae), has developed multiple mechanisms to overcome host tree chemical defenses, of which terpenoids are a major component. In addition to a monoterpene efflux system mediated by a recently discovered ABC transporter, Gs has genes that are highly induced by monoterpenes and that encode enzymes that modify or utilize monoterpenes [especially (+)-limonene]. We showed that pine-inhabiting Ophiostomale fungi are tolerant to monoterpenes, but only a few, including Gs, are known to utilize monoterpenes as a carbon source. Gas chromatography-mass spectrometry (GC-MS) revealed that Gs can modify (+)-limonene through various oxygenation pathways, producing carvone, p-mentha-2,8-dienol, perillyl alcohol, and isopiperitenol. It can also degrade (+)-limonene through the C-1-oxygenated pathway, producing limonene-1,2-diol as the most abundant intermediate. Transcriptome sequencing (RNA-seq) data indicated that Gs may utilize limonene 1,2-diol through beta-oxidation and then valine and tricarboxylic acid (TCA) metabolic pathways. The data also suggested that at least two gene clusters, located in genome contigs 108 and 161, were highly induced by monoterpenes and may be involved in monoterpene degradation processes. Further, gene knockouts indicated that limonene degradation required two distinct Baeyer-Villiger monooxygenases (BVMOs), an epoxide hydrolase and an enoyl coenzyme A (enoyl-CoA) hydratase. Our work provides information on enzyme-mediated limonene utilization or modification and a more comprehensive understanding of the interaction between an economically important fungal pathogen and its host's defense chemicals.

  9. Low-coverage, whole-genome sequencing of Artocarpus camansi (Moraceae) for phylogenetic marker development and gene discovery.

    Science.gov (United States)

    Gardner, Elliot M; Johnson, Matthew G; Ragone, Diane; Wickett, Norman J; Zerega, Nyree J C

    2016-07-01

    We used moderately low-coverage (17×) whole-genome sequencing of Artocarpus camansi (Moraceae) to develop genomic resources for Artocarpus and Moraceae. A de novo assembly of Illumina short reads (251,378,536 pairs, 2 × 100 bp) accounted for 93% of the predicted genome size. Predicted coding regions were used in a three-way orthology search with published genomes of Morus notabilis and Cannabis sativa. Phylogenetic markers for Moraceae were developed from 333 inferred single-copy exons. Ninety-eight putative MADS-box genes were identified. Analysis of all predicted coding regions resulted in preliminary annotation of 49,089 genes. An analysis of synonymous substitutions for pairs of orthologs (Ks analysis) in M. notabilis and A. camansi strongly suggested a lineage-specific whole-genome duplication in Artocarpus. This study substantially increases the genomic resources available for Artocarpus and Moraceae and demonstrates the value of low-coverage de novo assemblies for nonmodel organisms with moderately large genomes.

  10. Low-coverage, whole-genome sequencing of Artocarpus camansi (Moraceae) for phylogenetic marker development and gene discovery

    Science.gov (United States)

    Gardner, Elliot M.; Johnson, Matthew G.; Ragone, Diane; Wickett, Norman J.; Zerega, Nyree J. C.

    2016-01-01

    Premise of the study: We used moderately low-coverage (17×) whole-genome sequencing of Artocarpus camansi (Moraceae) to develop genomic resources for Artocarpus and Moraceae. Methods and Results: A de novo assembly of Illumina short reads (251,378,536 pairs, 2 × 100 bp) accounted for 93% of the predicted genome size. Predicted coding regions were used in a three-way orthology search with published genomes of Morus notabilis and Cannabis sativa. Phylogenetic markers for Moraceae were developed from 333 inferred single-copy exons. Ninety-eight putative MADS-box genes were identified. Analysis of all predicted coding regions resulted in preliminary annotation of 49,089 genes. An analysis of synonymous substitutions for pairs of orthologs (Ks analysis) in M. notabilis and A. camansi strongly suggested a lineage-specific whole-genome duplication in Artocarpus. Conclusions: This study substantially increases the genomic resources available for Artocarpus and Moraceae and demonstrates the value of low-coverage de novo assemblies for nonmodel organisms with moderately large genomes. PMID:27437173

  11. RNAi-based silencing of genes encoding the vacuolar- ATPase ...

    African Journals Online (AJOL)

    RNAi-based silencing of genes encoding the vacuolar- ATPase subunits a and c in pink bollworm (Pectinophora gossypiella). Ahmed M. A. Mohammed. Abstract. RNA interference is a post- transcriptional gene regulation mechanism that is predominantly found in eukaryotic organisms. RNAi demonstrated a successful ...

  12. DNA-energetics-based analyses suggest additional genes in ...

    Indian Academy of Sciences (India)

    2012-06-25

    Jun 25, 2012 ... [Khandelwal G, Gupta J and Jayaram B 2012 DNA-energetics-based analyses suggest additional genes in prokaryotes. J. Biosci. 37 433–444] DOI ..... illustration for detecting potential new genes in 12 different genomes with varied GC ..... maps and genetic map of DNA double strand. J. Phys. Soc. Jpn.

  13. Antibody validation of immunohistochemistry for biomarker discovery: recommendations of a consortium of academic and pharmaceutical based histopathology researchers.

    Science.gov (United States)

    Howat, William J; Lewis, Arthur; Jones, Phillipa; Kampf, Caroline; Pontén, Fredrik; van der Loos, Chris M; Gray, Neil; Womack, Chris; Warford, Anthony

    2014-11-01

    As biomarker discovery takes centre-stage, the role of immunohistochemistry within that process is increasing. At the same time, the number of antibodies being produced for "research use" continues to rise and it is important that antibodies to be used as biomarkers are validated for specificity and sensitivity before use. This guideline seeks to provide a stepwise approach for the validation of an antibody for immunohistochemical assays, reflecting the views of a consortium of academic and pharmaceutical based histopathology researchers. We propose that antibodies are placed into a tier system, level 1-3, based on evidence of their usage in immunohistochemistry, and that the degree of validation required is proportionate to their place on that tier. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.

  14. Switch in Site of Inhibition: A Strategy for Structure-Based Discovery of Human Topoisomerase IIα Catalytic Inhibitors.

    Science.gov (United States)

    Baviskar, Ashish T; Amrutkar, Suyog M; Trivedi, Neha; Chaudhary, Vikas; Nayak, Anmada; Guchhait, Sankar K; Banerjee, Uttam C; Bharatam, Prasad V; Kundu, Chanakya N

    2015-04-09

    A study of structure-based modulation of known ligands of hTopoIIα, an important enzyme involved in DNA processes, coupled with synthesis and in vitro assays led to the establishment of a strategy of rational switch in mode of inhibition of the enzyme's catalytic cycle. 6-Arylated derivatives of known imidazopyridine ligands were found to be selective inhibitors of hTopoIIα, while not showing TopoI inhibition and DNA binding. Interestingly, while the parent imidazopyridines acted as ATP-competitive inhibitors, arylated derivatives inhibited DNA cleavage similar to merbarone, indicating a switch in mode of inhibition from the ATP-hydrolysis stage to the DNA-cleavage stage of the enzyme's catalytic cycle. The 6-aryl-imidazopyridines were relatively more cytotoxic than etoposide in cancer cells and less toxic to normal cells. Such an unprecedented strategy will encourage research on "choice-based change" in target-specific mode of action for rapid drug discovery.

  15. [Study on the discovery of novel chitinase inhibitors based on natural products].

    Science.gov (United States)

    Hirose, Tomoyasu

    2012-01-01

    Chitin, the second most abundant polysaccharide in nature, is a constituent of fungal cell walls, the exoskeletons of crustaceans and insects and the microfilarial sheaths of parasitic nematodes. Chitin has, so far, not been found in mammals. Accumulation of chitin by organisms is modulated by chitin synthase-mediated biosynthesis and by chitinase-mediated hydrolytic degradation. Thus, chitinases are expected to be specific targets for antifungal, insecticidal and antiparasitic agents. Paradoxically, while chitin does not exist in mammals, human chitinase family members, such as acidic mammalian chitinase, have recently been described, and offer significant potential for the treatment of asthma and other related diseases in humans. This review covers the development of two chitinase inhibitors of natural origin, Argifin and Argadin, isolated from the cultured broth of microorganisms in our laboratory. In particular, the practical total synthesis of these natural products and discovery methods that generate only highly-active compounds using a kinetic target (chitinase)-guided synthesis approach (termed in situ click chemistry) are described.

  16. Aniracetam: its novel therapeutic potential in cerebral dysfunctional disorders based on recent pharmacological discoveries.

    Science.gov (United States)

    Nakamura, Kazuo

    2002-01-01

    Aniracetam is a pyrrolidinone-type cognition enhancer that has been clinically used in the treatment of behavioral and psychological symptoms of dementia following stroke and in Alzheimer's disease. New discoveries in the behavioral pharmacology, biochemistry and pharmacokinetics of aniracetam provided new indications for this drug in the treatment of various CNS disorders or disease states. This article reviews these new findings and describes the effects of aniracetam in various rodent models of mental function impairment or cerebral dysfunction. Also, several metabolites of aniracetam have been reported to affect learning and memory in animals. It is, therefore, conceivable that major metabolites of aniracetam contribute to its pharmacological effects. The animal models used in the pharmacological evaluation of aniracetam included models of hypoattention, hypovigilance-arousal, impulsiveness, hyperactivity, fear and anxiety, depression, impaired rapid-eye movement sleep, disturbed temporal regulation, behavioral performance, and bladder hyperactivity. These are models of clinical disorders or symptoms that may include personality disorders, anxiety, depression, posttraumatic stress disorder, attention-deficit/hyperactivity disorder, autism, negative symptoms of schizophrenia, and sleep disorders. At present, there is no convincing evidence that promising effects of aniracetam in the animal models will guarantee its clinical efficacy. It is conceivable, however, that clinical trials will demonstrate beneficial effects of aniracetam in the above listed disease states. New findings regarding the mechanism of action of aniracetam, its central target sites, and its effects on signal transduction are also discussed in this review article.

  17. Automated Sample Preparation Platform for Mass Spectrometry-Based Plasma Proteomics and Biomarker Discovery

    Directory of Open Access Journals (Sweden)

    Vilém Guryča

    2014-03-01

    Full Text Available The identification of novel biomarkers from human plasma remains a critical need in order to develop and monitor drug therapies for nearly all disease areas. The discovery of novel plasma biomarkers is, however, significantly hampered by the complexity and dynamic range of proteins within plasma, as well as the inherent variability in composition from patient to patient. In addition, it is widely accepted that most soluble plasma biomarkers for diseases such as cancer will be represented by tissue leakage products, circulating in plasma at low levels. It is therefore necessary to find approaches with the requisite level of sensitivity in such a complex biological matrix. Strategies for fractionating the plasma proteome have been suggested, but improvements in sensitivity are often negated by the resultant process variability. Here we describe an approach using multidimensional chromatography and on-line protein derivatization, which allows for higher sensitivity, whilst minimizing the process variability. In order to evaluate this automated process fully, we demonstrate three levels of processing and compare sensitivity, throughput and reproducibility. We demonstrate that high sensitivity analysis of the human plasma proteome is possible down to the low ng/mL or even high pg/mL level with a high degree of technical reproducibility.

  18. An Innovative Cell Microincubator for Drug Discovery Based on 3D Silicon Structures

    Directory of Open Access Journals (Sweden)

    Francesca Aredia

    2016-01-01

    Full Text Available We recently employed three-dimensional (3D) silicon microstructures (SMSs), consisting of arrays of 3 μm-thick silicon walls separated by 50 μm-deep, 5 μm-wide gaps, as microincubators for monitoring the biomechanical properties of tumor cells. They were here applied to investigate the in vitro behavior of HT1080 human fibrosarcoma cells driven to apoptosis by the chemotherapeutic drug Bleomycin. Our results, obtained by fluorescence microscopy, demonstrated that HT1080 cells exhibited a great ability to colonize the narrow gaps. Remarkably, HT1080 cells grown on 3D-SMS, when treated with the DNA damaging agent Bleomycin under conditions leading to apoptosis, tended to shrink, reducing their volume and mimicking the normal behavior of apoptotic cells, and were prone to leave the gaps. Finally, we performed label-free detection of cells adherent to the vertical silicon wall, inside the gap of 3D-SMS, by exploiting optical low coherence reflectometry using infrared, low power radiation. This kind of approach may become a new tool for increasing automation in the drug discovery area. Our results open new perspectives in view of future applications of the 3D-SMS as the core element of a lab-on-a-chip suitable for screening the effect of new molecules potentially able to kill tumor cells.

  19. Potential of Glutamate-Based Drug Discovery for Next Generation Antidepressants

    Directory of Open Access Journals (Sweden)

    Shigeyuki Chaki

    2015-09-01

    Full Text Available Recently, ketamine has been demonstrated to exert rapid-acting antidepressant effects in patients with depression, including those with treatment-resistant depression, and this discovery has been regarded as the most significant advance in drug development for the treatment of depression in over 50 years. To overcome unwanted side effects of ketamine, numerous approaches targeting glutamatergic systems have been vigorously investigated. For example, among agents targeting the NMDA receptor, the efficacies of selective GluN2B receptor antagonists and a low-trapping antagonist, as well as glycine site modulators such as GLYX-13 and sarcosine, have been demonstrated clinically. Moreover, agents acting on metabotropic glutamate receptors, such as mGlu2/3 and mGlu5 receptors, have been proposed as useful approaches to mimicking the antidepressant effects of ketamine. Neural and synaptic mechanisms mediated through the antidepressant effects of ketamine are being delineated, most of which indicate that ketamine improves abnormalities in synaptic transmission and connectivity observed in depressive states via the AMPA receptor and brain-derived neurotrophic factor-dependent mechanisms. Interestingly, some of the above agents may share some neural and synaptic mechanisms with ketamine. These studies should provide important insights for the development of superior pharmacotherapies for depression with more potent and faster onsets of actions.

  20. The use of time-resolved fluorescence in gel-based proteomics for improved biomarker discovery

    Science.gov (United States)

    Sandberg, AnnSofi; Buschmann, Volker; Kapusta, Peter; Erdmann, Rainer; Wheelock, Åsa M.

    2010-02-01

    This paper describes a new platform for quantitative intact proteomics, entitled Cumulative Time-resolved Emission 2-Dimensional Gel Electrophoresis (CuTEDGE). The CuTEDGE technology utilizes differences in fluorescent lifetimes to subtract the confounding background fluorescence during in-gel detection and quantification of proteins, resulting in a drastic improvement in both sensitivity and dynamic range compared to existing technology. The platform is primarily designed for image acquisition in 2-dimensional gel electrophoresis (2-DE), but is also applicable to 1-dimensional gel electrophoresis (1-DE), and proteins electroblotted to membranes. In a set of proof-of-principle measurements, we have evaluated the performance of the novel technology using the MicroTime 100 instrument (PicoQuant GmbH) in conjunction with the CyDye minimal labeling fluorochromes (GE Healthcare, Uppsala, Sweden) to perform differential gel electrophoresis (DIGE) analyses. The results indicate that the CuTEDGE technology provides an improvement in the dynamic range and sensitivity of detection of 3 orders of magnitude as compared to current state-of-the-art image acquisition instrumentation available for 2-DE (Typhoon 9410, GE Healthcare). Given the potential dynamic range of 7-8 orders of magnitude and sensitivities in the attomol range, the described invention represents a technological leap in detection of low abundance cellular proteins, which is desperately needed in the field of biomarker discovery.

  1. Cancer Transcriptome Dataset Analysis: Comparing Methods of Pathway and Gene Regulatory Network-Based Cluster Identification.

    Science.gov (United States)

    Nam, Seungyoon

    2017-04-01

    Cancer transcriptome analysis is one of the leading areas of Big Data science, biomarker, and pharmaceutical discovery, not to forget personalized medicine. Yet, cancer transcriptomics and postgenomic medicine require innovation in bioinformatics as well as comparison of the performance of available algorithms. In this data analytics context, the value of network generation and algorithms has been widely underscored for addressing the salient questions in cancer pathogenesis. Analysis of the cancer transcriptome often results in complicated networks where identification of network modularity remains critical, for example, in delineating the "druggable" molecular targets. Network clustering is useful, but depends on the network topology in and of itself. Notably, the performance of different network-generating tools for network cluster (NC) identification has been little investigated to date. Hence, using gastric cancer (GC) transcriptomic datasets, we compared two algorithms for generating pathway versus gene regulatory network-based NCs, showing that the pathway-based approach better agrees with a reference set of cancer-functional contexts. Finally, by applying pathway-based NC identification to GC transcriptome datasets, we describe cancer NCs that associate with candidate therapeutic targets and biomarkers in GC. These observations collectively inform future research on cancer transcriptomics, drug discovery, and rational development of new analysis tools for optimal harnessing of omics data.

  2. False-Positive Rate Determination of Protein Target Discovery using a Covalent Modification- and Mass Spectrometry-Based Proteomics Platform

    Science.gov (United States)

    Strickland, Erin C.; Geer, M. Ariel; Hong, Jiyong; Fitzgerald, Michael C.

    2014-01-01

    Detection and quantitation of protein-ligand binding interactions is important in many areas of biological research. Stability of proteins from rates of oxidation (SPROX) is an energetics-based technique for identifying the protein targets of ligands in complex biological mixtures. Knowing the false-positive rate of protein target discovery in proteome-wide SPROX experiments is important for the correct interpretation of results. Reported here are the results of a control SPROX experiment in which chemical denaturation data is obtained on the proteins in two samples that originated from the same yeast lysate, as would be done in a typical SPROX experiment except that one sample would be spiked with the test ligand. False-positive rates of 1.2-2.2 % and analysis of the isobaric mass tag (e.g., iTRAQ®) reporter ions used for peptide quantitation. Our results also suggest that technical replicates can be used to effectively eliminate such false positives that result from this random error, as is demonstrated in a SPROX experiment to identify yeast protein targets of the drug, manassantin A. The impact of ion purity in the tandem mass spectral analyses and of background oxidation on the false-positive rate of protein target discovery using SPROX is also discussed.

  3. De novo characterization of the Dialeurodes citri transcriptome: mining genes involved in stress resistance and simple sequence repeats (SSRs) discovery.

    Science.gov (United States)

    Chen, E-H; Wei, D-D; Shen, G-M; Yuan, G-R; Bai, P-P; Wang, J-J

    2014-02-01

    The citrus whitefly, Dialeurodes citri (Ashmead), is one of the three economically important whitefly species that infest citrus plants around the world; however, limited genetic research has been focused on D. citri, partly because of lack of genomic resources. In this study, we performed de novo assembly of a transcriptome using Illumina paired-end sequencing technology (Illumina Inc., San Diego, CA, USA). In total, 36,766 unigenes with a mean length of 497 bp were identified. Of these unigenes, we identified 17,788 matched known proteins in the National Center for Biotechnology Information database, as determined by Blast search, with 5731, 4850 and 14,441 unigenes assigned to clusters of orthologous groups (COG), gene ontology (GO), and SwissProt, respectively. In total, 7507 unigenes were assigned to 308 known pathways. In-depth analysis of the data showed that 117 unigenes were identified as potentially involved in the detoxification of xenobiotics and 67 heat shock protein (Hsp) genes were associated with environmental stress. In addition, these enzymes were searched against the GO and COG database, and the results showed that the three major detoxification enzymes and Hsps were classified into 18 and 3, 6, and 8 annotations, respectively. In addition, 149 simple sequence repeats were detected. The results facilitate the investigation of molecular resistance mechanisms to insecticides and environmental stress, and contribute to molecular marker development. The findings greatly improve our genetic understanding of D. citri, and lay the foundation for future functional genomics studies on this species. © 2013 The Royal Entomological Society.

  4. Cynomolgus monkey testicular cDNAs for discovery of novel human genes in the human genome sequence

    Directory of Open Access Journals (Sweden)

    Terao Keiji

    2002-12-01

    Full Text Available Background: In order to contribute to the establishment of a complete map of transcribed regions of the human genome, we constructed a testicular cDNA library for the cynomolgus monkey, and attempted to find novel transcripts for identification of their human homologues. Results: The full-insert sequences of 512 cDNA clones were determined. Ultimately we found 302 non-redundant cDNAs carrying open reading frames of 300 bp or longer. Among them, 89 cDNAs were found not to be annotated previously in the Ensembl human database. After searching against the Ensembl mouse database, we also found that 69 putative coding sequences have no homologous cDNAs in the annotated human and mouse genome sequences in Ensembl. We subsequently designed a DNA microarray including 396 non-redundant cDNAs (with and without open reading frames) to examine the expression of the full-sequenced genes. With the testicular probe and a mixture of probes of 10 other tissues, 316 of 332 effective spots showed intense hybridized signals and 75 cDNAs were shown to be expressed very highly in the cynomolgus monkey testis, but not ubiquitously. Conclusions: In this report, we determined 302 full-insert sequences of cynomolgus monkey cDNAs with open reading frames long enough to discover novel transcripts as human homologues. Among the 302 cDNA sequences, human homologues of 89 cDNAs have not been predicted in the annotated human genome sequence in Ensembl. Additionally, we identified 75 dominantly expressed genes in testis among the full-sequenced clones by using a DNA microarray. Our cDNA clones and analytical results will be valuable resources for future functional genomic studies.

  5. Hypothesis-based analysis of gene-gene interactions and risk of myocardial infarction.

    Directory of Open Access Journals (Sweden)

    Gavin Lucas

    Full Text Available The genetic loci that have been found by genome-wide association studies to modulate risk of coronary heart disease explain only a fraction of its total variance, and gene-gene interactions have been proposed as a potential source of the remaining heritability. Given the potentially large testing burden, we sought to enrich our search space with real interactions by analyzing variants that may be more likely to interact on the basis of two distinct hypotheses: a biological hypothesis, under which MI risk is modulated by interactions between variants that are known to be relevant for its risk factors; and a statistical hypothesis, under which interacting variants individually show weak marginal association with MI. In a discovery sample of 2,967 cases of early-onset myocardial infarction (MI) and 3,075 controls from the MIGen study, we performed pair-wise SNP interaction testing using a logistic regression framework. Despite having reasonable power to detect interaction effects of plausible magnitudes, we observed no statistically significant evidence of interaction under these hypotheses, and no clear consistency between the top results in our discovery sample and those in a large validation sample of 1,766 cases of coronary heart disease and 2,938 controls from the Wellcome Trust Case-Control Consortium. Our results do not support the existence of strong interaction effects as a common risk factor for MI. Within the scope of the hypotheses we have explored, this study places a modest upper limit on the magnitude that epistatic risk effects are likely to have at the population level (odds ratio for MI risk 1.3-2.0, depending on allele frequency and interaction model).

  6. Simulated JWST/NIRISS Transit Spectroscopy of Anticipated TESS Planets Compared to Select Discoveries from Space-Based and Ground-Based Surveys

    Science.gov (United States)

    Louie, Dana; Deming, Drake; Albert, Loic; Bouma, Luke; Bean, Jacob; Lopez-Morales, Mercedes

    2018-01-01

    The Transiting Exoplanet Survey Satellite (TESS) will embark in 2018 on a 2-year wide-field survey mission of most of the celestial sky, discovering over a thousand super-Earth and sub-Neptune-sized exoplanets potentially suitable for follow-up observations using the James Webb Space Telescope (JWST). Bouma et al. (2017) and Sullivan et al. (2015) used Monte Carlo simulations to predict the properties of the planetary systems that TESS is likely to detect, basing their simulations upon Kepler-derived planet occurrence rates and photometric performance models for the TESS cameras. We employed a JWST Near InfraRed Imager and Slitless Spectrograph (NIRISS) simulation tool to estimate the signal-to-noise (S/N) that JWST/NIRISS will attain in transmission spectroscopy of these anticipated TESS discoveries, and we then compared the S/N for anticipated TESS discoveries to our estimates of S/N for 18 known exoplanets. We analyzed the sensitivity of our results to planetary composition, cloud cover, and presence of an observational noise floor. We find that only a few anticipated TESS discoveries in the terrestrial planet regime will result in better JWST/NIRISS S/N than currently known exoplanets, such as the TRAPPIST-1 planets, GJ1132b, or LHS1140b. However, we emphasize that this outcome is based upon Kepler-derived occurrence rates, and that co-planar compact systems (e.g. TRAPPIST-1) were not included in predicting the anticipated TESS planet yield. Furthermore, our results show that several hundred anticipated TESS discoveries in the super-Earth and sub-Neptune regime will produce S/N higher than currently known exoplanets such as K2-3b or K2-3c. We apply our results to estimate the scope of a JWST follow-up observation program devoted to mapping the transition region between high molecular weight and primordial planetary atmospheres.

  7. SNP discovery by illumina-based transcriptome sequencing of the olive and the genetic characterization of Turkish olive genotypes revealed by AFLP, SSR and SNP markers.

    Directory of Open Access Journals (Sweden)

    Hilal Betul Kaya

    Full Text Available BACKGROUND: The olive tree (Olea europaea L.) is a diploid (2n = 2x = 46) outcrossing species mainly grown in the Mediterranean area, where it is the most important oil-producing crop. Because of its economic, cultural and ecological importance, various DNA markers have been used in the olive to characterize and elucidate homonyms, synonyms and unknown accessions. However, a comprehensive characterization and a full sequence of its transcriptome are unavailable, leading to the importance of an efficient large-scale single nucleotide polymorphism (SNP) discovery in olive. The objectives of this study were (1) to discover olive SNPs using next-generation sequencing and to identify SNP primers for cultivar identification and (2) to characterize 96 olive genotypes originating from different regions of Turkey. METHODOLOGY/PRINCIPAL FINDINGS: Next-generation sequencing technology was used with five distinct olive genotypes and generated cDNA, producing 126,542,413 reads using an Illumina Genome Analyzer IIx. Following quality and size trimming, the high-quality reads were assembled into 22,052 contigs with an average length of 1,321 bases and 45 singletons. The SNPs were filtered and 2,987 high-quality putative SNP primers were identified. The assembled sequences and singletons were subjected to BLAST similarity searches and annotated with a Gene Ontology identifier. To identify the 96 olive genotypes, these SNP primers were applied to the genotypes in combination with amplified fragment length polymorphism (AFLP) and simple sequence repeat (SSR) markers. CONCLUSIONS/SIGNIFICANCE: This study marks the highest number of SNP markers discovered to date from olive genotypes using transcriptome sequencing. The developed SNP markers will provide a useful source for molecular genetic studies, such as genetic diversity and characterization, high density quantitative trait locus (QTL) analysis, association mapping and map-based gene cloning in the olive.
High levels

  8. Nanotechnology-based gene-eluting stents.

    Science.gov (United States)

    Goh, Debbie; Tan, Aaron; Farhatnia, Yasmin; Rajadas, Jayakumar; Alavijeh, Mohammad S; Seifalian, Alexander M

    2013-04-01

    Cardiovascular disease is one of the major causes of death in the world. Coronary stenting in percutaneous coronary intervention (PCI) has revolutionized the field of cardiology. Coronary stenting is seen as a less invasive procedure compared to coronary artery bypass graft (CABG) surgery. Two main types of stents currently exist in the market: bare-metal stents (BMS) and drug-eluting stents (DES). DES were developed in response to problems associated with BMS use, like neointimal hyperplasia leading to restenosis. However, the use of DES engendered other problems as well, like late stent thrombosis (ST), which is a serious and lethal complication. Gene-eluting stents (GES) have recently been proposed as a novel method of circumventing problems seen in BMS and DES. Utilizing nanotechnology, sustained and localized delivery of genes can mitigate problems of restenosis and late ST by accelerating the regenerative capacity of re-endothelialization. Therefore this review seeks to explore the realm of GES as a novel alternative to BMS and DES, and its potential implications in the field of nanotechnology and regenerative medicine.

  9. Comparison of sequencing based CNV discovery methods using monozygotic twin quartets.

    Directory of Open Access Journals (Sweden)

    Marc-André Legault

    Full Text Available The advent of high-throughput sequencing methods brings a considerable number of technical challenges. Among these is the discovery of copy-number variations (CNVs) using whole-genome sequencing data. CNVs are genomic structural variations defined as a variation in