WorldWideScience

Sample records for based gene discovery

  1. Speeding disease gene discovery by sequence based candidate prioritization

    Directory of Open Access Journals (Sweden)

    Porteous David J

    2005-03-01

    Background: Regions of interest identified through genetic linkage studies regularly exceed 30 centimorgans in size and can contain hundreds of genes. Traditionally this number is reduced by matching functional annotation to knowledge of the disease or phenotype in question. However, here we show that disease genes share patterns of sequence-based features that can provide a good basis for automatic prioritization of candidates by machine learning. Results: We examined a variety of sequence-based features and found that for many of them there are significant differences between the sets of genes known to be involved in human hereditary disease and those not known to be involved in disease. We have created an automatic classifier called PROSPECTR based on those features using the alternating decision tree algorithm, which ranks genes in order of likelihood of involvement in disease. On average, PROSPECTR enriches lists for disease genes two-fold 77% of the time, five-fold 37% of the time and twenty-fold 11% of the time. Conclusion: PROSPECTR is a simple and effective way to identify genes involved in Mendelian and oligogenic disorders. It performs markedly better than the single existing sequence-based classifier on novel data. PROSPECTR could save investigators looking at large regions of interest time and effort by prioritizing positional candidate genes for mutation detection and case-control association studies.
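
    The enrichment figures quoted above can be illustrated with a small sketch. It assumes one common reading of "n-fold enrichment" (the known disease gene falls within the top 1/n of the ranked candidate list); the abstract does not spell out its exact definition, and the function names and gene labels below are hypothetical.

```python
def fold_enrichment(ranked_genes, disease_gene):
    """Fold enrichment for one ranked list: list length divided by the
    1-based rank of the known disease gene (assumed definition)."""
    rank = ranked_genes.index(disease_gene) + 1
    return len(ranked_genes) / rank


def fraction_enriched(cases, fold):
    """Fraction of (ranked_list, disease_gene) cases enriched at least `fold`-fold."""
    hits = sum(1 for ranked, gene in cases if fold_enrichment(ranked, gene) >= fold)
    return hits / len(cases)


# Hypothetical region with ten candidate genes, best-ranked first.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9", "g10"]
cases = [(ranked, "g2"), (ranked, "g8")]
print(fold_enrichment(ranked, "g2"))   # rank 2 of 10
print(fraction_enriched(cases, 2))     # share of cases enriched at least two-fold
```

    Under this reading, "two-fold 77% of the time" would mean that in 77% of test regions the true disease gene sat in the top half of the ranked list.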

  2. A Computer-Based Microarray Experiment Design-System for Gene-Regulation Pathway Discovery

    OpenAIRE

    2003-01-01

    This paper reports the methods and evaluation of a computer-based system that recommends microarray experimental design for biologists — causal discovery in Gene Expression data using Expected Value of Experimentation (GEEVE). The GEEVE system uses causal Bayesian networks and generates a decision tree for recommendations.

  3. Gene set-based module discovery in the breast cancer transcriptome

    Directory of Open Access Journals (Sweden)

    Zhang Michael Q

    2009-02-01

    Background: Although microarray-based studies have revealed a global view of gene expression in cancer cells, we still have little knowledge about the regulatory mechanisms underlying the transcriptome. Several computational methods applied to yeast data have recently succeeded in identifying expression modules, which are defined as co-expressed gene sets under common regulatory mechanisms. However, such module discovery methods have not been applied to cancer transcriptome data. Results: In order to decode oncogenic regulatory programs in cancer cells, we developed a novel module discovery method termed EEM by extending a previously reported module discovery method, and applied it to breast cancer expression data. Starting from seed gene sets prepared based on cis-regulatory elements, ChIP-chip data, and gene locus information, EEM identified 10 principal expression modules in breast cancer based on their expression coherence. Moreover, EEM depicted their activity profiles, which predict regulatory programs in each subtype of breast tumors. For example, our analysis revealed that the expression module regulated by the Polycomb repressive complex 2 (PRC2) is downregulated in triple negative breast cancers, suggesting similarity of transcriptional programs between stem cells and aggressive breast cancer cells. We also found that the activity of the PRC2 expression module is negatively correlated with the expression of EZH2, a component of PRC2 which belongs to the E2F expression module. E2F-driven EZH2 overexpression may be responsible for the repression of the PRC2 expression modules in triple negative tumors. Furthermore, our network analysis predicts regulatory circuits in breast cancer cells. Conclusion: These results demonstrate that the gene set-based module discovery approach is a powerful tool to decode regulatory programs in cancer cells.

  4. Weighted gene co-expression based biomarker discovery for psoriasis detection.

    Science.gov (United States)

    Sundarrajan, Sudharsana; Arumugam, Mohanapriya

    2016-11-15

    Psoriasis is a chronic inflammatory disease of the skin with an unknown aetiology. The disease manifests itself as red and silvery scaly plaques distributed over the scalp, lower back and extensor aspects of the limbs. After receiving scant consideration for quite a few years, psoriasis has now become a prominent focus for new drug development. A group of closely connected and differentially co-expressed genes may act in a network and may serve as molecular signatures for an underlying phenotype. Weighted gene coexpression network analysis (WGCNA), a systems biology approach, has been utilized for the identification of new molecular targets for psoriasis. Gene coexpression relationships were investigated in 58 psoriatic lesional samples, resulting in five gene modules clustered on the basis of their gene coexpression patterns. The coexpression pattern was validated using three psoriatic datasets. Ten highly connected and informative genes from each module were selected and termed psoriasis-specific hub signatures. A random forest based binary classifier built using the expression profiles of the signature genes robustly distinguished psoriatic samples from normal samples in the validation set with an accuracy of 0.95 to 1. These signature genes may serve as potential candidates for biomarker discovery leading to new therapeutic targets. WGCNA, a network-based approach, has provided an alternative path to mine key controllers and drivers of psoriasis. The study principle from the current work can be extended to other pathological conditions.
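
    The idea of "highly connected" hub genes in a weighted coexpression network can be sketched as follows. This is a minimal illustration, not the WGCNA package or the authors' pipeline: it assumes the standard unsigned soft-threshold adjacency a_ij = |cor(x_i, x_j)|^beta, and the gene names, expression values and beta = 6 are illustrative assumptions only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-constant inputs)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def connectivity(expr, beta=6):
    """Sum of soft-thresholded adjacencies |r|**beta for every gene.

    expr maps a gene name to its expression values across samples.
    """
    genes = list(expr)
    k = {g: 0.0 for g in genes}
    for i, gi in enumerate(genes):
        for gj in genes[i + 1:]:
            a = abs(pearson(expr[gi], expr[gj])) ** beta
            k[gi] += a
            k[gj] += a
    return k


def top_hubs(expr, n, beta=6):
    """Return the n genes with the highest connectivity."""
    k = connectivity(expr, beta)
    return sorted(k, key=k.get, reverse=True)[:n]


# Hypothetical toy data: A, B and C are tightly coexpressed, D is not.
expr = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8],
    "C": [4, 3, 2, 1],
    "D": [4, 1, 3, 2],
}
print(top_hubs(expr, 3))
```

    Raising the absolute correlation to a power suppresses weak correlations, so connectivity is dominated by strong coexpression partners; in a real analysis hubs would be chosen within each module rather than across the whole network.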

  5. A computer-based microarray experiment design-system for gene-regulation pathway discovery.

    Science.gov (United States)

    Yoo, Changwon; Cooper, Gregory F

    2003-01-01

    This paper reports the methods and evaluation of a computer-based system that recommends microarray experimental design for biologists - causal discovery in Gene Expression data using Expected Value of Experimentation (GEEVE). The GEEVE system uses causal Bayesian networks and generates a decision tree for recommendations. To evaluate the GEEVE system, we first built an expression simulation model based on a gene regulation model assessed by an expert biologist. Using the simulation model, we conducted a controlled study that involved 10 biologists, some of whom used GEEVE and some of whom did not. The results show that biologists who used GEEVE reached correct causal assessments about gene regulation more often than did those biologists who did not use GEEVE.

  6. A genomics based discovery of secondary metabolite biosynthetic gene clusters in Aspergillus ustus.

    Directory of Open Access Journals (Sweden)

    Borui Pi

    Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, and little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that is commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were identified in Aspergillus for the first time and were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but far fewer with other Aspergilli such as A. niger and A. oryzae. These findings should help in understanding the diversity and evolution of SM biosynthesis pathways in the genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in the clinic.

  7. Sleeping Beauty transposon insertional mutagenesis based mouse models for cancer gene discovery

    Science.gov (United States)

    Moriarity, Branden S; Largaespada, David A

    2016-01-01

    Large-scale genomic efforts to study human cancer, such as The Cancer Genome Atlas (TCGA), have identified numerous cancer drivers in a wide variety of tumor types. However, there are limitations to this approach: the mutations and expression or copy number changes that are identified are not always clearly functionally relevant, and only annotated genes and genetic elements are thoroughly queried. The use of complementary, nonbiased, functional approaches to identify drivers of cancer development and progression is ideal to maximize the rate at which cancer discoveries are achieved. One such approach that has been successful is the use of the Sleeping Beauty (SB) transposon-based mutagenesis system in mice. This system uses a conditionally expressed transposase and mutagenic transposon allele to target mutagenesis to somatic cells of a given tissue in mice to cause random mutations leading to tumor development. Analysis of tumors for transposon common insertion sites (CIS) identifies candidate cancer genes specific to that tumor type. While similar screens have been performed in mice with the PiggyBac (PB) transposon and viral approaches, we limit extensive discussion to SB. Here we discuss the basic structure of these screens, screens that have been performed, and methods used to identify CIS. PMID:26051241

  8. Network-based gene prediction for Plasmodium falciparum malaria towards genetics-based drug discovery

    OpenAIRE

    Chen, Yang; Xu, Rong

    2015-01-01

    Background: Malaria is the most deadly parasitic infectious disease. Existing drug treatments have limited efficacy in malaria elimination, and the complex pathogenesis of the disease is not fully understood. Detecting novel malaria-associated genes not only contributes to revealing the disease pathogenesis, but also facilitates discovering new targets for anti-malaria drugs. Methods: In this study, we developed a network-based approach to predict malaria-associated genes. We constructed a cros...

  9. Discovery of time-delayed gene regulatory networks based on temporal gene expression profiling

    Directory of Open Access Journals (Sweden)

    Guo Zheng

    2006-01-01

    Background: It is one of the ultimate goals of modern biological research to fully elucidate the intricate interplays and the regulations of the molecular determinants that propel and characterize the progression of versatile life phenomena, to name a few, cell cycling, developmental biology, aging, and the progressive and recurrent pathogenesis of complex diseases. The vast amount of large-scale and genome-wide time-resolved data is becoming increasingly available, which provides a golden opportunity to unravel the challenging reverse-engineering problem of time-delayed gene regulatory networks. Results: In particular, this methodological paper aims to reconstruct regulatory networks from temporal gene expression data by using delayed correlations between genes, i.e., pairwise overlaps of expression levels shifted in time relative to each other. We have thus developed a novel model-free computational toolbox termed TdGRN (Time-delayed Gene Regulatory Network) to address the underlying regulations of genes that can span any unit(s) of time intervals. This bioinformatics toolbox has provided a unified approach to uncovering time trends of gene regulations through decision analysis of the newly designed time-delayed gene expression matrix. We have applied the proposed method to yeast cell cycling and human HeLa cell cycling and have discovered most of the underlying time-delayed regulations that are supported by multiple lines of experimental evidence and that are remarkably consistent with the current knowledge on phase characteristics for the cell cyclings. Conclusion: We established a usable and powerful model-free approach to dissecting high-order dynamic trends of gene-gene interactions. We have carefully validated the proposed algorithm by applying it to two publicly available cell cycling datasets. In addition to uncovering the time trends of gene regulations for cell cycling, this unified approach can also be used to study the complex
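
    The notion of a delayed correlation, as described above, can be sketched in a few lines. This is only an illustration of correlating one expression profile against a time-shifted copy of another; it is not the TdGRN toolbox, and the function names and time series are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-constant inputs)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def delayed_correlation(regulator, target, lag):
    """Correlate regulator[t] with target[t + lag] over the overlapping time points."""
    if lag == 0:
        return pearson(regulator, target)
    return pearson(regulator[:-lag], target[lag:])


def best_delay(regulator, target, max_lag):
    """Return (lag, correlation) for the lag with the largest absolute correlation."""
    scores = {lag: delayed_correlation(regulator, target, lag)
              for lag in range(max_lag + 1)}
    return max(scores.items(), key=lambda item: abs(item[1]))


# Hypothetical time series: the target repeats the regulator one time point later.
regulator = [1, 3, 2, 5, 4, 7, 6, 9]
target = [0, 1, 3, 2, 5, 4, 7, 6]
lag, r = best_delay(regulator, target, max_lag=3)
print(lag, round(r, 3))
```

    Longer lags leave fewer overlapping time points, so on short series the correlations at large lags become unreliable; a real analysis would need a significance test or a cap on the maximum lag.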

  10. Systems Pharmacology‐Based Discovery of Natural Products for Precision Oncology Through Targeting Cancer Mutated Genes

    Science.gov (United States)

    Fang, J; Cai, C; Wang, Q; Lin, P

    2017-01-01

    Massive cancer genomics data have facilitated the rapid revolution of a novel oncology drug discovery paradigm through targeting clinically relevant driver genes or mutations for the development of precision oncology. Natural products with polypharmacological profiles have been demonstrated as promising agents for the development of novel cancer therapies. In this study, we developed an integrated systems pharmacology framework that facilitated identifying potential natural products that target mutated genes across 15 cancer types or subtypes in the realm of precision medicine. High performance was achieved for our systems pharmacology framework. In case studies, we computationally identified novel anticancer indications for several US Food and Drug Administration‐approved or clinically investigational natural products (e.g., resveratrol, quercetin, genistein, and fisetin) through targeting significantly mutated genes in multiple cancer types. In summary, this study provides a powerful tool for the development of molecularly targeted cancer therapies through targeting the clinically actionable alterations by exploiting the systems pharmacology of natural products. PMID:28294568

  11. Independent Gene Discovery and Testing

    Science.gov (United States)

    Palsule, Vrushalee; Coric, Dijana; Delancy, Russell; Dunham, Heather; Melancon, Caleb; Thompson, Dennis; Toms, Jamie; White, Ashley; Shultz, Jeffry

    2010-01-01

    A clear understanding of basic gene structure is critical when teaching molecular genetics, the central dogma and the biological sciences. We sought to create a gene-based teaching project to improve students' understanding of gene structure and to integrate this into a research project that can be implemented by instructors at the secondary level…

  12. SSHscreen and SSHdb, generic software for microarray based gene discovery: application to the stress response in cowpea

    Directory of Open Access Journals (Sweden)

    Oelofse Dean

    2010-04-01

    Background: Suppression subtractive hybridization is a popular technique for gene discovery from non-model organisms without an annotated genome sequence, such as cowpea (Vigna unguiculata (L.) Walp.). We aimed to use this method to enrich for genes expressed during drought stress in a drought tolerant cowpea line. However, current methods were inefficient in screening libraries and managing the sequence data, and thus there was a need to develop software tools to facilitate the process. Results: Forward and reverse cDNA libraries enriched for cowpea drought response genes were screened on microarrays, and the R software package SSHscreen 2.0.1 was developed (i) to normalize the data effectively using spike-in control spot normalization, and (ii) to select clones for sequencing based on the calculation of enrichment ratios with associated statistics. Enrichment ratio 3 values for each clone showed that 62% of the forward library and 34% of the reverse library clones were significantly differentially expressed by drought stress (adjusted p value 88% of the clones in both libraries were derived from rare transcripts in the original tester samples, thus supporting the notion that suppression subtractive hybridization enriches for rare transcripts. A set of 118 clones was chosen for sequencing, and drought-induced cowpea genes were identified, the most interesting encoding a late embryogenesis abundant Lea5 protein, a glutathione S-transferase, a thaumatin, a universal stress protein, and a wound induced protein. A lipid transfer protein and several components of photosynthesis were down-regulated by the drought stress. Reverse transcriptase quantitative PCR confirmed the enrichment ratio values for the selected cowpea genes. SSHdb, a web-accessible database, was developed to manage the clone sequences and combine the SSHscreen data with sequence annotations derived from BLAST and Blast2GO. The self-BLAST function within SSHdb grouped

  13. A new essential protein discovery method based on the integration of protein-protein interaction and gene expression data

    Directory of Open Access Journals (Sweden)

    Li Min

    2012-03-01

    Background: Identification of essential proteins is always a challenging task since it requires experimental approaches that are time-consuming and laborious. With the advances in high throughput technologies, a large number of protein-protein interactions are available, which have produced unprecedented opportunities for detecting proteins' essentialities from the network level. There have been a series of computational approaches proposed for predicting essential proteins based on network topologies. However, the network topology-based centrality measures are very sensitive to the robustness of network. Therefore, a new robust essential protein discovery method would be of great value. Results: In this paper, we propose a new centrality measure, named PeC, based on the integration of protein-protein interaction and gene expression data. The performance of PeC is validated based on the protein-protein interaction network of Saccharomyces cerevisiae. The experimental results show that the predicted precision of PeC clearly exceeds that of the other fifteen previously proposed centrality measures: Degree Centrality (DC), Betweenness Centrality (BC), Closeness Centrality (CC), Subgraph Centrality (SC), Eigenvector Centrality (EC), Information Centrality (IC), Bottle Neck (BN), Density of Maximum Neighborhood Component (DMNC), Local Average Connectivity-based method (LAC), Sum of ECC (SoECC), Range-Limited Centrality (RL), L-index (LI), Leader Rank (LR), Normalized α-Centrality (NC), and Moduland-Centrality (MC). Especially, the improvement of PeC over the classic centrality measures (BC, CC, SC, EC, and BN) is more than 50% when predicting no more than 500 proteins. Conclusions: We demonstrate that the integration of protein-protein interaction network and gene expression data can help improve the precision of predicting essential proteins. The new centrality measure, PeC, is an effective essential protein discovery method.
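
    A centrality that integrates network topology with expression data can be sketched as below. The abstract does not give the PeC formula, so this is an assumption for illustration only: each interaction is weighted by an edge clustering coefficient (triangles through the edge, normalised by the smaller endpoint degree minus one) multiplied by the Pearson correlation of the two expression profiles, and a protein's score is the sum over its interactions. The network, profiles and function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-constant inputs)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def edge_clustering(adj, u, v):
    """Triangles through edge (u, v) divided by the maximum possible number."""
    shared = len(adj[u] & adj[v])
    possible = min(len(adj[u]) - 1, len(adj[v]) - 1)
    return shared / possible if possible > 0 else 0.0


def integrated_centrality(adj, expr, u):
    """Assumed PeC-style score: topology weight times coexpression, summed over neighbours."""
    return sum(edge_clustering(adj, u, v) * pearson(expr[u], expr[v]) for v in adj[u])


# Hypothetical network: triangle A-B-C plus a pendant protein D attached to A.
adj = {"A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
expr = {"A": [1, 2, 3], "B": [2, 4, 6], "C": [2, 3, 4], "D": [1, 2, 3]}
print({p: round(integrated_centrality(adj, expr, p), 3) for p in adj})
```

    In this toy network the pendant protein D scores zero even though it is perfectly coexpressed with A, because its only interaction closes no triangle; that is how a combined measure can discount interactions that lack topological support.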

  14. An ensemble method for gene discovery based on DNA microarray data

    Institute of Scientific and Technical Information of China (English)

    2004-01-01

    The advent of DNA microarray technology has offered the promise of casting new insights onto deciphering secrets of life by monitoring activities of thousands of genes simultaneously. Current analyses of microarray data focus on precise classification of biological types, for example, tumor versus normal tissues. A further scientifically challenging task is to extract disease-relevant genes from the bewildering amounts of raw data, which is one of the most critical themes in the post-genomic era, but it is generally ignored due to the lack of an efficient approach. In this paper, we present a novel ensemble method for gene extraction that can be tailored to fulfill multiple biological tasks including (i) precise classification of biological types; (ii) disease gene mining; and (iii) target-driven gene networking. We also give a numerical application for (i) and (ii) using a public microarray data set and set aside a separate paper to address (iii).

  15. Discovery of molecular associations among aging, stem cells, and cancer based on gene expression profiling

    Institute of Scientific and Technical Information of China (English)

    Xiaosheng Wang

    2013-01-01

    The emergence of a huge volume of "omics" data enables a computational approach to the investigation of the biology of cancer. The cancer informatics approach is a useful supplement to the traditional experimental approach. I reviewed several reports that used a bioinformatics approach to analyze the associations among aging, stem cells, and cancer by microarray gene expression profiling. The high expression of aging- or human embryonic stem cell-related molecules in cancer suggests that certain important mechanisms are commonly underlying aging, stem cells, and cancer. These mechanisms are involved in cell cycle regulation, metabolic process, DNA damage response, apoptosis, p53 signaling pathway, immune/inflammatory response, and other processes, suggesting that cancer is a developmental and evolutional disease that is strongly related to aging. Moreover, these mechanisms demonstrate that the initiation, proliferation, and metastasis of cancer are associated with the deregulation of stem cells. These findings provide insights into the biology of cancer. Certainly, the findings that are obtained by the informatics approach should be justified by experimental validation. This review also noted that next-generation sequencing data provide enriched sources for cancer informatics study.

  16. Discovery of molecular associations among aging, stem cells, and cancer based on gene expression profiling.

    Science.gov (United States)

    Wang, Xiaosheng

    2013-04-01

    The emergence of a huge volume of "omics" data enables a computational approach to the investigation of the biology of cancer. The cancer informatics approach is a useful supplement to the traditional experimental approach. I reviewed several reports that used a bioinformatics approach to analyze the associations among aging, stem cells, and cancer by microarray gene expression profiling. The high expression of aging- or human embryonic stem cell-related molecules in cancer suggests that certain important mechanisms are commonly underlying aging, stem cells, and cancer. These mechanisms are involved in cell cycle regulation, metabolic process, DNA damage response, apoptosis, p53 signaling pathway, immune/inflammatory response, and other processes, suggesting that cancer is a developmental and evolutional disease that is strongly related to aging. Moreover, these mechanisms demonstrate that the initiation, proliferation, and metastasis of cancer are associated with the deregulation of stem cells. These findings provide insights into the biology of cancer. Certainly, the findings that are obtained by the informatics approach should be justified by experimental validation. This review also noted that next-generation sequencing data provide enriched sources for cancer informatics study.

  17. A control study to evaluate a computer-based microarray experiment design recommendation system for gene-regulation pathways discovery.

    Science.gov (United States)

    Yoo, Changwon; Cooper, Gregory F; Schmidt, Martin

    2006-04-01

    The main topic of this paper is evaluating a system that uses the expected value of experimentation for discovering causal pathways in gene expression data. By experimentation we mean both interventions (e.g., a gene knock-out experiment) and observations (e.g., passively observing the expression level of a "wild-type" gene). We introduce a system called GEEVE (causal discovery in Gene Expression data using Expected Value of Experimentation), which implements expected value of experimentation in discovering causal pathways using gene expression data. GEEVE provides the following assistance, which is intended to help biologists in their quest to discover gene-regulation pathways: (i) recommending which experiments to perform (with a focus on "knock-out" experiments) using an expected value of experimentation (EVE) method; (ii) recommending the number of measurements (observational and experimental) to include in the experimental design, again using an EVE method; and (iii) providing a Bayesian analysis that combines prior knowledge with recent microarray experimental results to derive posterior probabilities of gene regulation relationships. In recommending which experiments to perform (and how many times to repeat them), the EVE approach considers the biologist's preferences for which genes to focus the discovery process on. Also, since exact EVE calculations are exponential in time, GEEVE incorporates approximation methods. GEEVE is able to combine data from knock-out experiments with data from wild-type experiments to suggest additional experiments to perform and then to analyze the results of those microarray experiments. It models the possibility that unmeasured (latent) variables may be responsible for some of the statistical associations among the expression levels of the genes under study. To evaluate the GEEVE system, we used a gene expression simulator to generate data from specified models of gene regulation. Using the simulator, we evaluated the GEEVE
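
    The Bayesian step mentioned above, combining prior knowledge with experimental results to obtain posterior probabilities of a regulatory relationship, reduces in its simplest form to Bayes' rule over competing hypotheses. The sketch below is a generic illustration and not the GEEVE implementation, which works with causal Bayesian networks; the hypothesis labels and all probabilities are hypothetical.

```python
def posterior(prior, likelihood):
    """Bayes' rule over a finite set of hypotheses.

    prior maps hypothesis -> P(h); likelihood maps hypothesis -> P(data | h).
    """
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalised.values())
    return {h: p / total for h, p in unnormalised.items()}


# Hypothetical numbers: the biologist's prior belief that gene X regulates gene Y,
# and how probable the observed knock-out result is under each hypothesis.
prior = {"X regulates Y": 0.2, "no regulation": 0.8}
likelihood = {"X regulates Y": 0.9, "no regulation": 0.1}
post = posterior(prior, likelihood)
print({h: round(p, 3) for h, p in post.items()})
```

    An expected-value-of-experimentation method would go one step further and average such posteriors over the possible outcomes of each candidate experiment before any data are collected, in order to rank the experiments.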

  18. Human brain evolution: from gene discovery to phenotype discovery.

    Science.gov (United States)

    Preuss, Todd M

    2012-06-26

    The rise of comparative genomics and related technologies has added important new dimensions to the study of human evolution. Our knowledge of the genes that underwent expression changes or were targets of positive selection in human evolution is rapidly increasing, as is our knowledge of gene duplications, translocations, and deletions. It is now clear that the genetic differences between humans and chimpanzees are far more extensive than previously thought; their genomes are not 98% or 99% identical. Despite the rapid growth in our understanding of the evolution of the human genome, our understanding of the relationship between genetic changes and phenotypic changes is tenuous. This is true even for the most intensively studied gene, FOXP2, which underwent positive selection in the human terminal lineage and is thought to have played an important role in the evolution of human speech and language. In part, the difficulty of connecting genes to phenotypes reflects our generally poor knowledge of human phenotypic specializations, as well as the difficulty of interpreting the consequences of genetic changes in species that are not amenable to invasive research. On the positive side, investigations of FOXP2, along with genomewide surveys of gene-expression changes and selection-driven sequence changes, offer the opportunity for "phenotype discovery," providing clues to human phenotypic specializations that were previously unsuspected. What is more, at least some of the specializations that have been proposed are amenable to testing with noninvasive experimental techniques appropriate for the study of humans and apes.

  19. Gene discovery and molecular marker development, based on high-throughput transcript sequencing of Paspalum dilatatum Poir.

    Directory of Open Access Journals (Sweden)

    Andrea Giordano

    BACKGROUND: Paspalum dilatatum Poir. (common name dallisgrass) is a native grass species of South America, with special relevance to dairy and red meat production. P. dilatatum exhibits higher forage quality than other C4 forage grasses and is tolerant to frost and water stress. This species is predominantly cultivated in an apomictic monoculture, with an inherent high risk that biotic and abiotic stresses could potentially devastate productivity. Therefore, advanced breeding strategies that characterise and use available genetic diversity, or assess germplasm collections effectively are required to deliver advanced cultivars for production systems. However, there are limited genomic resources available for this forage grass species. RESULTS: Transcriptome sequencing using second-generation sequencing platforms has been employed using pooled RNA from different tissues (stems, roots, leaves and inflorescences) at the final reproductive stage of P. dilatatum cultivar Primo. A total of 324,695 sequence reads were obtained, corresponding to c. 102 Mbp. The sequences were assembled, generating 20,169 contigs of a combined length of 9,336,138 nucleotides. The contigs were BLAST analysed against the fully sequenced grass species of Oryza sativa subsp. japonica, Brachypodium distachyon, the closely related Sorghum bicolor and foxtail millet (Setaria italica) genomes as well as against the UniRef 90 protein database allowing a comprehensive gene ontology analysis to be performed. The contigs generated from the transcript sequencing were also analysed for the presence of simple sequence repeats (SSRs). A total of 2,339 SSR motifs were identified within 1,989 contigs and corresponding primer pairs were designed. Empirical validation of a cohort of 96 SSRs was performed, with 34% being polymorphic between sexual and apomictic biotypes. CONCLUSIONS: The development of genetic and genomic resources for P. dilatatum will contribute to gene discovery and expression
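
    Scanning contigs for simple sequence repeats can be sketched with a regular expression. This is a minimal illustration, not the tool used in the study: the motif length range (2 to 6 bp) and the minimum of four tandem copies are assumed thresholds, since the abstract does not state its criteria, and the example sequence is made up.

```python
import re

# A motif of 2-6 bases followed by at least three more copies of itself
# (four tandem copies in total). Both thresholds are illustrative assumptions.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{3,}")


def find_ssrs(sequence):
    """Return (start, motif, copies) for each tandem repeat run in the sequence."""
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


# Hypothetical contig fragment containing five copies of the dinucleotide AC.
contig = "TTG" + "AC" * 5 + "AGGT"
print(find_ssrs(contig))
```

    The lazy quantifier makes the pattern prefer the shortest motif, so a run such as ACACACAC is reported as the dinucleotide AC rather than as ACAC repeated; note that a homopolymer run of eight or more bases would also be reported, as a two-base motif.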

  20. The Genetics of Obsessive-Compulsive Disorder and Tourette Syndrome: An Epidemiological and Pathway-Based Approach for Gene Discovery

    Science.gov (United States)

    Grados, Marco A.

    2010-01-01

    Objective: To provide a contemporary perspective on genetic discovery methods applied to obsessive-compulsive disorder (OCD) and Tourette syndrome (TS). Method: A review of research trends in genetics research in OCD and TS is conducted, with emphasis on novel approaches. Results: Genome-wide association studies (GWAS) are now in progress in OCD…

  1. Integrated analysis of gene expression by association rules discovery

    Directory of Open Access Journals (Sweden)

    Carazo Jose M

    2006-02-01

    Background: Microarray technology is generating huge amounts of data about the expression level of thousands of genes, or even whole genomes, across different experimental conditions. To extract biological knowledge, and to fully understand such datasets, it is essential to include external biological information about genes and gene products in the analysis of expression data. However, most of the current approaches to analyzing microarray datasets are mainly focused on the analysis of experimental data, and external biological information is incorporated as a posterior process. Results: In this study we present a method for the integrative analysis of microarray data based on the Association Rules Discovery data mining technique. The approach integrates gene annotations and expression data to discover intrinsic associations among both data sources based on co-occurrence patterns. We applied the proposed methodology to the analysis of gene expression datasets in which genes were annotated with metabolic pathways, transcriptional regulators and Gene Ontology categories. Automatically extracted associations revealed significant relationships among these gene attributes and expression patterns, many of which are clearly supported by recently reported work. Conclusion: The integration of external biological information and gene expression data can provide insights about the biological processes associated with gene expression programs. In this paper we show that the proposed methodology is able to integrate multiple gene annotations and expression data in the same analytic framework and extract meaningful associations among heterogeneous sources of data. An implementation of the method is included in the Engene software package.
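
    Association rules built on co-occurrence patterns are usually scored by support and confidence. The sketch below shows those two standard measures on a toy gene table in which each gene is described by a set of items mixing annotations with expression states; it is not the Engene implementation, and the item names and data are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions (genes) that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, divided by support of the antecedent."""
    both = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return both / base if base > 0 else 0.0


# Hypothetical genes, each described by an annotation and an expression state.
genes = [
    {"pathway:glycolysis", "expr:up"},
    {"pathway:glycolysis", "expr:up"},
    {"pathway:glycolysis", "expr:down"},
    {"pathway:ribosome", "expr:down"},
]
rule = ({"pathway:glycolysis"}, {"expr:up"})
print(support(genes, rule[0] | rule[1]))
print(round(confidence(genes, *rule), 3))
```

    A rule such as "glycolysis annotation implies up-regulation" is kept only if both values exceed user-chosen thresholds; a full miner such as Apriori enumerates the frequent item sets first and derives rules from them.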

  2. Discovery and identification of candidate sex-related genes based on transcriptome sequencing of Russian sturgeon (Acipenser gueldenstaedtii) gonads.

    Science.gov (United States)

    Chen, Yadong; Xia, Yongtao; Shao, Changwei; Han, Lei; Chen, Xuejie; Yu, Mengjun; Sha, Zhenxia

    2016-07-01

    As the Russian sturgeon (Acipenser gueldenstaedtii) is an important food and is the main source of caviar, it is necessary to discover the genes associated with its sex differentiation. However, the complicated life and maturity cycles of the Russian sturgeon restrict the accurate identification of sex in early development. To generate a first look at specific sex-related genes, we sequenced the transcriptome of gonads in different development stages (1, 2, and 5 yr old stages) with next-generation RNA sequencing. We generated >60 million raw reads, and the filtered reads were assembled into 263,341 contigs, which produced 38,505 unigenes. Genes involved in signal transduction mechanisms were the most abundant, suggesting that development of sturgeon gonads is under control of signal transduction mechanisms. Differentially expressed gene analysis suggests that more genes for protein synthesis, cytochrome c oxidase subunits, and ribosomal proteins were expressed in female gonads than in male. Meanwhile, male gonads expressed more transposable element transposase, reverse transcriptase, and transposase-related genes than female. In total, 342, 782, and 7,845 genes were detected in intersex, male, and female transcriptomes, respectively. The female gonad expressed more genes than the male gonad, and more genes were involved in female gonadal development. Genes (sox9, foxl2) are differentially expressed in different sexes and may be important sex-related genes in Russian sturgeon. Sox9 genes are responsible for the development of male gonads and foxl2 for female gonads.

  3. Mouse models for the discovery of colorectal cancer driver genes.

    Science.gov (United States)

    Clark, Christopher R; Starr, Timothy K

    2016-01-14

    Colorectal cancer (CRC) constitutes a major public health problem as the third most commonly diagnosed and third most lethal malignancy worldwide. The prevalence and the physical accessibility to colorectal tumors have made CRC an ideal model for the study of tumor genetics. Early research efforts using patient derived CRC samples led to the discovery of several highly penetrant mutations (e.g., APC, KRAS, MMR genes) in both hereditary and sporadic CRC tumors. This knowledge has enabled researchers to develop genetically engineered and chemically induced tumor models of CRC, both of which have had a substantial impact on our understanding of the molecular basis of CRC. Despite these advances, the morbidity and mortality of CRC remains a cause for concern and highlight the need to uncover novel genetic drivers of CRC. This review focuses on mouse models of CRC with particular emphasis on a newly developed cancer gene discovery tool, the Sleeping Beauty transposon-based mutagenesis model of CRC.

  4. Maximizing biomarker discovery by minimizing gene signatures

    Directory of Open Access Journals (Sweden)

    Chang Chang

    2011-12-01

    Full Text Available Abstract Background The use of gene signatures can potentially be of considerable value in the field of clinical diagnosis. However, gene signatures defined with different methods can vary considerably even when applied to the same disease and the same endpoint. Previous studies have shown that the correct selection of subsets of genes from microarray data is key for the accurate classification of disease phenotypes, and a number of methods have been proposed for the purpose. However, these methods refine the subsets by only considering each single feature, and they do not confirm the association between the genes identified in each gene signature and the phenotype of the disease. We proposed an innovative new method termed Minimize Feature's Size (MFS) based on multiple level similarity analyses and association between the genes and disease for breast cancer endpoints by comparing classifier models generated from the second phase of MicroArray Quality Control (MAQC-II), trying to develop effective meta-analysis strategies to transform the MAQC-II signatures into a robust and reliable set of biomarkers for clinical applications. Results We analyzed the similarity of the multiple gene signatures in an endpoint and between the two endpoints of breast cancer at probe and gene levels. The results indicate that disease-related genes can be preferably selected as the components of a gene signature, and that the gene signatures for the two endpoints could be interchangeable. The minimized signatures were built at probe level by using MFS for each endpoint. By applying the approach, we generated a much smaller set of gene signatures with similar predictive power compared with the gene signatures from MAQC-II. Conclusions Our results indicate that gene signatures of both large and small sizes could perform equally well in clinical applications. Besides, consistency and biological significances can be detected among different gene signatures, reflecting the

  5. Antibiotic resistance gene discovery in food-producing animals.

    Science.gov (United States)

    Allen, Heather K

    2014-06-01

    Numerous environmental reservoirs contribute to the widespread antibiotic resistance problem in human pathogens. One environmental reservoir of particular importance is the intestinal bacteria of food-producing animals. In this review I examine recent discoveries of antibiotic resistance genes in agricultural animals. Two types of antibiotic resistance gene discoveries will be discussed: the use of classic microbiological and molecular techniques, such as culturing and PCR, to identify known genes not previously reported in animals; and the application of high-throughput technologies, such as metagenomics, to identify novel genes and gene transfer mechanisms. These discoveries confirm that antibiotics should be limited to prudent uses.

  6. Species-independent MicroRNA Gene Discovery

    KAUST Repository

    Kamanu, Timothy K.

    2012-12-01

    MicroRNA (miRNA) are a class of small endogenous non-coding RNA that are mainly negative transcriptional and post-transcriptional regulators in both plants and animals. Recent studies have shown that miRNA are involved in different types of cancer and other incurable diseases such as autism and Alzheimer’s. Functional miRNAs are excised from hairpin-like sequences that are known as miRNA genes. There are about 21,000 known miRNA genes, most of which have been determined using experimental methods. miRNA genes are classified into different groups (miRNA families). This study reports about 19,000 unknown miRNA genes in nine species whereby approximately 15,300 predictions were computationally validated to contain at least one experimentally verified functional miRNA product. The predictions are based on a novel computational strategy which relies on miRNA family groupings and exploits the physics and geometry of miRNA genes to unveil the hidden palindromic signals and symmetries in miRNA gene sequences. Unlike conventional computational miRNA gene discovery methods, the algorithm developed here is species-independent: it allows prediction at higher accuracy and resolution from arbitrary RNA/DNA sequences in any species and thus enables examination of repeat-prone genomic regions which are thought to be non-informative or ’junk’ sequences. The information non-redundancy of uni-directional RNA sequences compared to information redundancy of bi-directional DNA is demonstrated, a fact that is overlooked by most pattern discovery algorithms. A novel method for computing upstream and downstream miRNA gene boundaries based on mathematical/statistical functions is suggested, as well as cutoffs for annotation of miRNA genes in different miRNA families. Another tool is proposed to allow hypotheses generation and visualization of data matrices, intra- and inter-species chromosomal distribution of miRNA genes or miRNA families. Our results indicate that: miRNA and mi

  7. Genome-enabled Discovery of Carbon Sequestration Genes

    Energy Technology Data Exchange (ETDEWEB)

    Tuskan, Gerald A [ORNL; Tschaplinski, Timothy J [ORNL; Kalluri, Udaya C [ORNL; Yin, Tongming [ORNL; Yang, Xiaohan [ORNL; Zhang, Xinye [ORNL; Engle, Nancy L [ORNL; Ranjan, Priya [ORNL; Basu, Manojit M [ORNL; Gunter, Lee E [ORNL; Jawdy, Sara [ORNL; Martin, Madhavi Z [ORNL; Campbell, Alina S [ORNL; DiFazio, Stephen P [ORNL; Davis, John M [University of Florida; Hinchee, Maud [ORNL; Pinnacchio, Christa [U.S. Department of Energy, Joint Genome Institute; Meilan, R [Purdue University; Busov, V. [Michigan Technological University; Strauss, S [Oregon State University

    2009-01-01

    The fate of carbon below ground is likely to be a major factor determining the success of carbon sequestration strategies involving plants. Despite their importance, molecular processes controlling belowground C allocation and partitioning are poorly understood. This project is leveraging the Populus trichocarpa genome sequence to discover genes important to C sequestration in plants and soils. The focus is on the identification of genes that provide key control points for the flow and chemical transformations of carbon in roots, concentrating on genes that control the synthesis of chemical forms of carbon that result in slower turnover rates of soil organic matter (i.e., increased recalcitrance). We propose to enhance carbon allocation and partitioning to roots by 1) modifying the auxin signaling pathway, and the invertase family, which controls sucrose metabolism, and by 2) increasing root proliferation through transgenesis with genes known to control fine root proliferation (e.g., ANT), 3) increasing the production of recalcitrant C metabolites by identifying genes controlling secondary C metabolism by a major mQTL-based gene discovery effort, and 4) increasing aboveground productivity by enhancing drought tolerance to achieve maximum C sequestration. This broad, integrated approach is aimed at ultimately enhancing root biomass as well as root detritus longevity, providing the best prospects for significant enhancement of belowground C sequestration.

  8. Characterization of Capsicum annuum genetic diversity and population structure based on parallel polymorphism discovery with a 30K unigene Pepper GeneChip.

    Directory of Open Access Journals (Sweden)

    Theresa A Hill

    Full Text Available The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines; however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm which was confirmed with statistical support by principal components analysis (PCA) and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and

  9. Characterization of Capsicum annuum genetic diversity and population structure based on parallel polymorphism discovery with a 30K unigene Pepper GeneChip.

    Science.gov (United States)

    Hill, Theresa A; Ashrafi, Hamid; Reyes-Chin-Wo, Sebastian; Yao, JiQiang; Stoffel, Kevin; Truco, Maria-Jose; Kozik, Alexander; Michelmore, Richard W; Van Deynze, Allen

    2013-01-01

    The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines; however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm which was confirmed with statistical support by principal components analysis (PCA) and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and application of genome

  10. DNA Coding Based Knowledge Discovery Algorithm

    Institute of Scientific and Technical Information of China (English)

    LI Ji-yun; GENG Zhao-feng; SHAO Shi-huang

    2002-01-01

    A novel DNA-coding-based knowledge discovery algorithm is proposed, and an example verifying its validity is given. It is shown that this algorithm can efficiently discover new simplified rules from the original rule set.

  11. Bioinformatics Assisted Gene Discovery and Annotation of Human Genome

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    As the sequencing stage of the Human Genome Project nears its end, work has begun on discovering novel genes from genome sequences and annotating their biological functions. This review covers the major bioinformatics tools and technologies currently available for large-scale gene discovery and annotation from human genome sequences. Some ideas about possible future development are also provided.

  12. Indexer Based Dynamic Web Services Discovery

    CERN Document Server

    Bashir, Saba; Javed, M Younus; Khan, Aihab; Khiyal, Malik Sikandar Hayat

    2010-01-01

    Recent advancement in web services plays an important role in business to business and business to consumer interaction. Discovery mechanism is not only used to find a suitable service but also provides collaboration between service providers and consumers by using standard protocols. A static web service discovery mechanism is not only time consuming but requires continuous human interaction. This paper proposed an efficient dynamic web services discovery mechanism that can locate relevant and updated web services from service registries and repositories with timestamp based on indexing value and categorization for faster and efficient discovery of service. The proposed prototype focuses on quality of service issues and introduces concept of local cache, categorization of services, indexing mechanism, CSP (Constraint Satisfaction Problem) solver, aging and usage of translator. Performance of proposed framework is evaluated by implementing the algorithm and correctness of our method is shown. The results of p...

  13. De novo Transcriptome Assembly of Common Wild Rice (Oryza rufipogon Griff.) and Discovery of Drought-Response Genes in Root Tissue Based on Transcriptomic Data.

    Directory of Open Access Journals (Sweden)

    Xin-Jie Tian

    Full Text Available The perennial O. rufipogon (common wild rice), which is considered to be the ancestor of Asian cultivated rice species, contains many useful genetic resources, including drought resistance genes. However, few studies have identified the drought resistance and tissue-specific genes in common wild rice. In this study, transcriptome sequencing libraries were constructed, including drought-treated roots (DR) and control leaves (CL) and roots (CR). Using Illumina sequencing technology, we generated 16.75 million bases of high-quality sequence data for common wild rice and conducted de novo assembly and annotation of genes without prior genome information. These reads were assembled into 119,332 unigenes with an average length of 715 bp. A total of 88,813 distinct sequences (74.42% of unigenes) significantly matched known genes in the NCBI NT database. Differentially expressed gene (DEG) analysis showed that 3617 genes were up-regulated and 4171 genes were down-regulated in the CR library compared with the CL library. Among the DEGs, 535 genes were expressed in roots but not in shoots. A similar comparison between the DR and CR libraries showed that 1393 genes were up-regulated and 315 genes were down-regulated in the DR library compared with the CR library. Finally, 37 genes that were specifically expressed in roots were screened after comparing the DEGs identified in the above-described analyses. This study provides a transcriptome sequence resource for common wild rice plants and establishes a digital gene expression profile of wild rice plants under drought conditions using the assembled transcriptome data as a reference. Several tissue-specific and drought-stress-related candidate genes were identified, representing a fully characterized transcriptome and providing a valuable resource for genetic and genomic studies in plants.

  14. Rice mutant resources for gene discovery

    NARCIS (Netherlands)

    Hirochika, H.; Guiderdoni, E.; An, G.; Hsing, Y.I.; Eun, M.Y.; Han, C.D.; Upadhyaya, N.; Ramachandran, S.; Zhang, Q.F.; Pereira, A.B.; Sundaresan, V.; Leung, H.

    2004-01-01

    With the completion of genomic sequencing of rice, rice has been firmly established as a model organism for both basic and applied research. The next challenge is to uncover the functions of genes predicted by sequence analysis. Considering the amount of effort and the diversity of disciplines requi

  15. Psychiatric gene discoveries shape evidence on ADHD's biology

    NARCIS (Netherlands)

    Thapar, A.; Martin, J.; Mick, E.; Arias Vasquez, A.; Langley, K.; Scherer, S.W.; Schachar, R.; Crosbie, J.; Williams, N.; Franke, B.; Elia, J.; Glessner, J.; Hakonarson, H.; Owen, M.J.; Faraone, S.V; O'Donovan, M.C.; Holmans, P.

    2016-01-01

    A strong motivation for undertaking psychiatric gene discovery studies is to provide novel insights into unknown biology. Although attention-deficit hyperactivity disorder (ADHD) is highly heritable, and large, rare copy number variants (CNVs) contribute to risk, little is known about its pathogenes

  16. Risk genes for schizophrenia: translational opportunities for drug discovery.

    Science.gov (United States)

    Winchester, Catherine L; Pratt, Judith A; Morris, Brian J

    2014-07-01

    Despite intensive research over many years, the treatment of schizophrenia remains a major health issue. Current and emerging treatments for schizophrenia are based upon the classical dopamine and glutamate hypotheses of disease. Existing first and second generation antipsychotic drugs based upon the dopamine hypothesis are limited by their inability to treat all symptom domains and their undesirable side effect profiles. Third generation drugs based upon the glutamate hypothesis of disease are currently under evaluation but are more likely to be used as add-on treatments. Hence there is a large unmet clinical need. A major challenge in neuropsychiatric disease research is the relatively limited knowledge of disease mechanisms. However, as our understanding of the genetic causes of the disease evolves, novel strategies for the development of improved therapeutic agents will become apparent. In this review we consider the current status of knowledge of the genetic basis of schizophrenia, including methods for identifying genetic variants associated with the disorder and how they impact on gene function. Although the genetic architecture of schizophrenia is complex, some targets amenable to pharmacological intervention can be discerned. We conclude that many challenges lie ahead but that the stratification of patients according to biobehavioural constructs that cross existing disease classifications, but share common genetic and neurobiological bases, offers opportunities for new approaches to effective drug discovery.

  17. Discovery of pinoresinol reductase genes in sphingomonads.

    Science.gov (United States)

    Fukuhara, Y; Kamimura, N; Nakajima, M; Hishiyama, S; Hara, H; Kasai, D; Tsuji, Y; Narita-Yamada, S; Nakamura, S; Katano, Y; Fujita, N; Katayama, Y; Fukuda, M; Kajita, S; Masai, E

    2013-01-10

    Bacterial genes for the degradation of major dilignols produced in lignifying xylem are expected to be useful tools for the structural modification of lignin in plants. For this purpose, we isolated pinZ involved in the conversion of pinoresinol from Sphingobium sp. strain SYK-6. pinZ showed 43-77% identity at amino acid level with bacterial NmrA-like proteins of unknown function, a subgroup of atypical short chain dehydrogenases/reductases, but revealed only 15-21% identity with plant pinoresinol/lariciresinol reductases. PinZ completely converted racemic pinoresinol to lariciresinol, showing a specific activity of 46±3 U/mg in the presence of NADPH at 30°C. In contrast, the activity for lariciresinol was negligible. This substrate preference is similar to a pinoresinol reductase, AtPrR1, of Arabidopsis thaliana; however, the specific activity of PinZ toward (±)-pinoresinol was significantly higher than that of AtPrR1. The role of pinZ and a pinZ ortholog of Novosphingobium aromaticivorans DSM 12444 were also characterized.

  18. Ontology Based Qos Driven Web Service Discovery

    Directory of Open Access Journals (Sweden)

    R Suganyakala

    2011-07-01

    Full Text Available Web services have become a prominent means of implementing business process functionality. As the number of similar web services increases, one of the essential challenges is to discover the web services most relevant to a user's specification. The relevancy of web service discovery can be improved by augmenting semantics through expressive formats like OWL. QoS-based service selection will play a significant role in meeting non-functional user requirements. Hence QoS and semantics have been used as finer search constraints to discover the most relevant service. In this paper, we describe a QoS framework for ontology-based web service discovery. The QoS factors taken into consideration are execution time, response time, throughput, scalability, reputation, accessibility and availability. The behavior of each web service at various instances is observed over a period of time and their QoS-based performance is analyzed.
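
    One way to picture QoS-driven selection over the seven factors listed above is a weighted score over normalized measurements. The sketch below is an assumption, not the framework described in the paper: the service names, the observed values, the min-max normalization, and the equal default weights are all invented for illustration, and the candidates are assumed to have already passed the semantic (ontology-based) match.

```python
# Hypothetical QoS observations for three candidate services.
# All values are invented; times are in milliseconds.
services = {
    "svcA": {"execution_time": 120, "response_time": 200, "throughput": 50,
             "scalability": 0.7, "reputation": 4.0,
             "accessibility": 0.95, "availability": 0.99},
    "svcB": {"execution_time": 80, "response_time": 150, "throughput": 80,
             "scalability": 0.9, "reputation": 4.5,
             "accessibility": 0.99, "availability": 0.999},
    "svcC": {"execution_time": 200, "response_time": 400, "throughput": 30,
             "scalability": 0.5, "reputation": 3.0,
             "accessibility": 0.90, "availability": 0.95},
}

# For these factors a smaller measurement is the better one.
LOWER_IS_BETTER = {"execution_time", "response_time"}


def normalize(values, lower_is_better):
    """Min-max scale values to [0, 1] so that 1 is always the best."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]  # no spread: treat all as equal
    if lower_is_better:
        return [(hi - v) / (hi - lo) for v in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_services(candidates, weights=None):
    """Return (name, score) pairs sorted from best to worst."""
    names = list(candidates)
    factors = list(next(iter(candidates.values())))
    if weights is None:
        # Assumption: equal weight per factor unless the caller says otherwise.
        weights = {f: 1.0 / len(factors) for f in factors}
    scores = dict.fromkeys(names, 0.0)
    for factor in factors:
        column = [candidates[n][factor] for n in names]
        scaled = normalize(column, factor in LOWER_IS_BETTER)
        for name, value in zip(names, scaled):
            scores[name] += weights[factor] * value
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


ranking = rank_services(services)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

    A caller with stricter latency requirements could pass a `weights` dictionary that puts more mass on `response_time`; the ranking logic itself stays unchanged.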

  19. Beegle: from literature mining to disease-gene discovery.

    Science.gov (United States)

    ElShal, Sarah; Tranchevent, Léon-Charles; Sifrim, Alejandro; Ardeshirdavani, Amin; Davis, Jesse; Moreau, Yves

    2016-01-29

    Disease-gene identification is a challenging process that has multiple applications within functional genomics and personalized medicine. Typically, this process involves both finding genes known to be associated with the disease (through literature search) and carrying out preliminary experiments or screens (e.g. linkage or association studies, copy number analyses, expression profiling) to determine a set of promising candidates for experimental validation. This requires extensive time and monetary resources. We describe Beegle, an online search and discovery engine that attempts to simplify this process by automating the typical approaches. It starts by mining the literature to quickly extract a set of genes known to be linked with a given query, then it integrates the learning methodology of Endeavour (a gene prioritization tool) to train a genomic model and rank a set of candidate genes to generate novel hypotheses. In a realistic evaluation setup, Beegle has an average recall of 84% in the top 100 returned genes as a search engine, which improves the discovery engine by 12.6% in the top 5% prioritized genes. Beegle is publicly available at http://beegle.esat.kuleuven.be/.
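
    The "recall in the top 100 returned genes" figure is a recall-at-k measure. A minimal sketch of how such a number can be computed follows; the gene identifiers are placeholders and the exact evaluation protocol used for Beegle is not described in this record, so this is an illustration of the metric only.

```python
def recall_at_k(ranked_genes, known_genes, k):
    """Fraction of known disease genes found among the top-k ranked genes."""
    known = set(known_genes)
    if not known:
        raise ValueError("known_genes must not be empty")
    top_k = set(ranked_genes[:k])
    return len(known & top_k) / len(known)


# Placeholder identifiers, not real gene symbols.
ranked = ["G7", "G2", "G9", "G1", "G5", "G3", "G8", "G4"]
known = {"G1", "G2", "G3", "G6"}

print(recall_at_k(ranked, known, 3))  # only G2 is in the top 3 -> 0.25
print(recall_at_k(ranked, known, 6))  # G2, G1 and G3 are in the top 6 -> 0.75
```

    Averaging this value over many queries gives an average recall of the kind quoted in the abstract.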

  20. Implementation of BacMam virus gene delivery technology in a drug discovery setting.

    Science.gov (United States)

    Kost, Thomas A; Condreay, J Patrick; Ames, Robert S; Rees, Stephen; Romanos, Michael A

    2007-05-01

    Membrane protein targets constitute a key segment of drug discovery portfolios and significant effort has gone into increasing the speed and efficiency of pursuing these targets. However, issues still exist in routine gene expression and stable cell-based assay development for membrane proteins, which are often multimeric or toxic to host cells. To enhance cell-based assay capabilities, modified baculovirus (BacMam virus) gene delivery technology has been successfully applied to the transient expression of target proteins in mammalian cells. Here, we review the development, full implementation and benefits of this platform-based gene expression technology in support of SAR and HTS assays across GlaxoSmithKline.

  1. Technology development for gene discovery and full-length sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Marcelo Bento Soares

    2004-07-19

    In previous years, with support from the U.S. Department of Energy, we developed methods for construction of normalized and subtracted cDNA libraries, and constructed hundreds of high-quality libraries for production of Expressed Sequence Tags (ESTs). Our clones were made widely available to the scientific community through the IMAGE Consortium, and millions of ESTs were produced from our libraries either by collaborators or by our own sequencing laboratory at the University of Iowa. During this grant period, we focused on (1) the development of a method for preferential cloning of tissue-specific and/or rare transcripts, (2) its utilization to expedite EST-based gene discovery for the NIH Mouse Brain Molecular Anatomy Project, (3) further development and optimization of a method for construction of full-length-enriched cDNA libraries, and (4) modification of a plasmid vector to maximize efficiency of full-length cDNA sequencing by the transposon-mediated approach. It is noteworthy that the technology developed for preferential cloning of rare mRNAs enabled identification of over 2,000 mouse transcripts differentially expressed in the hippocampus. In addition, the method that we optimized for construction of full-length-enriched cDNA libraries was successfully utilized for the production of approximately fifty libraries from the developing mouse nervous system, from which over 2,500 full-ORF-containing cDNAs have been identified and accurately sequenced in their entirety either by our group or by the NIH-Mammalian Gene Collection Program Sequencing Team.

  2. Gene discovery of modular diterpene metabolism in nonmodel systems.

    Science.gov (United States)

    Zerbe, Philipp; Hamberger, Björn; Yuen, Macaire M S; Chiang, Angela; Sandhu, Harpreet K; Madilao, Lina L; Nguyen, Anh; Hamberger, Britta; Bach, Søren Spanner; Bohlmann, Jörg

    2013-06-01

    Plants produce over 10,000 different diterpenes of specialized (secondary) metabolism, and fewer diterpenes of general (primary) metabolism. Specialized diterpenes may have functions in ecological interactions of plants with other organisms and also benefit humanity as pharmaceuticals, fragrances, resins, and other industrial bioproducts. Examples of high-value diterpenes are taxol and forskolin pharmaceuticals or ambroxide fragrances. Yields and purity of diterpenes obtained from natural sources or by chemical synthesis are often insufficient for large-volume or high-end applications. Improvement of agricultural or biotechnological diterpene production requires knowledge of biosynthetic genes and enzymes. However, specialized diterpene pathways are extremely diverse across the plant kingdom, and most specialized diterpenes are taxonomically restricted to a few plant species, genera, or families. Consequently, there is no single reference system to guide gene discovery and rapid annotation of specialized diterpene pathways. Functional diversification of genes and plasticity of enzyme functions of these pathways further complicate correct annotation. To address this challenge, we used a set of 10 different plant species to develop a general strategy for diterpene gene discovery in nonmodel systems. The approach combines metabolite-guided transcriptome resources, custom diterpene synthase (diTPS) and cytochrome P450 reference gene databases, phylogenies, and, as shown for select diTPSs, single and coupled enzyme assays using microbial and plant expression systems. In the 10 species, we identified 46 new diTPS candidates and over 400 putatively terpenoid-related P450s in a resource of nearly 1 million predicted transcripts of diterpene-accumulating tissues. Phylogenetic patterns of lineage-specific blooms of genes guided functional characterization.

  3. Does Discovery-Based Instruction Enhance Learning?

    Science.gov (United States)

    Alfieri, Louis; Brooks, Patricia J.; Aldrich, Naomi J.; Tenenbaum, Harriet R.

    2011-01-01

    Discovery learning approaches to education have recently come under scrutiny (Tobias & Duffy, 2009), with many studies indicating limitations to discovery learning practices. Therefore, 2 meta-analyses were conducted using a sample of 164 studies: The 1st examined the effects of unassisted discovery learning versus explicit instruction, and the…

  4. Graph-Based Methods for Discovery Browsing with Semantic Predications

    DEFF Research Database (Denmark)

    Wilkowski, Bartlomiej; Fiszman, Marcelo; Miller, Christopher M;

    2011-01-01

    We present an extension to literature-based discovery that goes beyond making discoveries to a principled way of navigating through selected aspects of some biomedical domain. The method is a type of "discovery browsing" that guides the user through the research literature on a specified phenomen...

  5. Metagenomics and novel gene discovery: promise and potential for novel therapeutics.

    Science.gov (United States)

    Culligan, Eamonn P; Sleator, Roy D; Marchesi, Julian R; Hill, Colin

    2014-04-01

    Metagenomics provides a means of assessing the total genetic pool of all the microbes in a particular environment, in a culture-independent manner. It has revealed unprecedented diversity in microbial community composition, which is further reflected in the encoded functional diversity of the genomes, a large proportion of which consists of novel genes. Herein, we review both sequence-based and functional metagenomic methods to uncover novel genes and outline some of the associated problems of each type of approach, as well as potential solutions. Furthermore, we discuss the potential for metagenomic biotherapeutic discovery, with a particular focus on the human gut microbiome and finally, we outline how the discovery of novel genes may be used to create bioengineered probiotics.

  6. Inflammatory bowel disease gene discovery. CRADA final report

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-09-09

    The ultimate goal of this project is to identify the human gene(s) responsible for the disorder known as IBD. The work was planned in two phases. The desired products resulting from Phase 1 were BAC clone(s) containing the genetic marker(s) identified by gene/Networks, Inc. as potentially linked to IBD, plasmid subclones of those BAC(s), and new genetic markers developed from these plasmid subclones. The newly developed markers would be genotyped by gene/Networks, Inc. to ascertain evidence for linkage or non-linkage of IBD to this region. If non-linkage was indicated, the project would move to investigation of other candidate chromosomal regions. Where linkage was indicated, the project would move to Phase 2, in which a physical map of the candidate region(s) would be developed. The products of this phase would be contig(s) of BAC clones in the region exhibiting linkage to IBD, as well as plasmid subclones of the BACs and further genetic marker development. There would also be continued genotyping with new polymorphic markers during this phase. It was anticipated that clones identified and developed during these two phases would provide the physical resources for eventual disease gene discovery.

  7. Ribozymes: applications to functional analysis and gene discovery.

    Science.gov (United States)

    Shiota, Maki; Sano, Masayuki; Miyagishi, Makoto; Taira, Kazunari

    2004-08-01

    Ribozymes are catalytic RNA molecules that cleave RNAs with high specificity. Since the discovery of these non-protein enzymes, the rapidly developing field of ribozymes has been of particular interest because of the potential utility of ribozymes as tools for reverse genetics. However, despite extensive efforts, the activity of ribozymes in vivo has not usually been high enough to achieve the desired biological effects. Now, by the use of RNA polymerase III (pol III) promoters, the ribozyme activity in cells has been successfully improved by developing efficient transport systems for the transcripts to the cytoplasm. In addition, it is possible to cleave a specific target RNA in cells by using an allosterically controllable ribozyme or an RNA-protein hybrid ribozyme. These ribozymes are potentially applicable to molecular gene therapy and efficient gene discovery systems. Furthermore, the developed pol III expression system is applicable to the expression of small interfering RNAs (siRNAs). The advantage of such ribozymes over siRNAs is the high specificity of the ribozyme that would not cause interferon responses.

  8. Database systems for knowledge-based discovery.

    Science.gov (United States)

    Jagarlapudi, Sarma A R P; Kishan, K V Radha

    2009-01-01

    Several database systems have been developed to provide valuable information in a structured format to users ranging from the bench chemist to the biologist and from the medical practitioner to the pharmaceutical scientist. The advent of information technology and computational power enhanced the ability to access large volumes of data in the form of a database where one could do compilation, searching, archiving, analysis, and finally knowledge derivation. Although data are of variable types, the tools used for database creation, searching and retrieval are similar. GVK BIO has been developing databases from publicly available scientific literature in specific areas like medicinal chemistry, clinical research, and mechanism-based toxicity so that the structured databases containing vast data could be used in several areas of research. These databases were classified as reference centric or compound centric depending on the way the database systems were designed. Integration of these databases with knowledge derivation tools would enhance the value of these systems toward better drug design and discovery.

  9. Psychiatric gene discoveries shape evidence on ADHD's biology

    Science.gov (United States)

    Thapar, A; Martin, J; Mick, E; Arias Vásquez, A; Langley, K; Scherer, S W; Schachar, R; Crosbie, J; Williams, N; Franke, B; Elia, J; Glessner, J; Hakonarson, H; Owen, M J; Faraone, S V; O'Donovan, M C; Holmans, P

    2016-01-01

    A strong motivation for undertaking psychiatric gene discovery studies is to provide novel insights into unknown biology. Although attention-deficit hyperactivity disorder (ADHD) is highly heritable, and large, rare copy number variants (CNVs) contribute to risk, little is known about its pathogenesis and it remains commonly misunderstood. We assembled and pooled five ADHD and control CNV data sets from the United Kingdom, Ireland, United States of America, Northern Europe and Canada. Our aim was to test for enrichment of neurodevelopmental gene sets, implicated by recent exome-sequencing studies of (a) schizophrenia and (b) autism as a means of testing the hypothesis that common pathogenic mechanisms underlie ADHD and these other neurodevelopmental disorders. We also undertook hypothesis-free testing of all biological pathways. We observed significant enrichment of individual genes previously found to harbour schizophrenia de novo non-synonymous single-nucleotide variants (SNVs; P=5.4 × 10⁻⁴) and targets of the Fragile X mental retardation protein (P=0.0018). No enrichment was observed for activity-regulated cytoskeleton-associated protein (P=0.23) or N-methyl-D-aspartate receptor (P=0.74) post-synaptic signalling gene sets previously implicated in schizophrenia. Enrichment of ADHD CNV hits for genes impacted by autism de novo SNVs (P=0.019 for non-synonymous SNV genes) did not survive Bonferroni correction. Hypothesis-free testing yielded several highly significantly enriched biological pathways, including ion channel pathways. Enrichment findings were robust to multiple testing corrections and to sensitivity analyses that excluded the most significant sample. The findings reveal that CNVs in ADHD converge on biologically meaningful gene clusters, including ones now established as conferring risk of other neurodevelopmental disorders. PMID:26573769
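
    The Bonferroni step mentioned in this abstract can be illustrated with a short worked example. This is only a sketch: the abstract does not state how many tests formed the correction family, so treating the five gene-set P values quoted above as one family of five tests at alpha = 0.05 is an assumption made here for illustration, and the function name is hypothetical.

```python
def bonferroni_survivors(p_values, alpha=0.05):
    """Return the per-test threshold and the labels whose P value passes it.

    Bonferroni correction compares each P value with alpha / m,
    where m is the number of tests in the family.
    """
    m = len(p_values)
    threshold = alpha / m
    survivors = [label for label, p in p_values.items() if p <= threshold]
    return threshold, survivors


# P values quoted in the abstract; grouping them into a single family
# of five tests is an ASSUMPTION, not something the abstract states.
quoted = {
    "schizophrenia de novo SNV genes": 5.4e-4,
    "FMRP targets": 0.0018,
    "ARC gene set": 0.23,
    "NMDAR gene set": 0.74,
    "autism de novo SNV genes": 0.019,
}

threshold, survivors = bonferroni_survivors(quoted)
print(f"per-test threshold: {threshold}")
print("survive correction:", survivors)
```

    Under that assumed family size the per-test threshold is 0.01, so the autism gene set (P=0.019) is nominally significant but does not pass, which matches the behaviour the abstract reports.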

  10. Genome Enabled Discovery of Carbon Sequestration Genes in Poplar

    Energy Technology Data Exchange (ETDEWEB)

    Filichkin, Sergei; Etherington, Elizabeth; Ma, Caiping; Strauss, Steve

    2007-02-22

    The goals of the S.H. Strauss laboratory portion of 'Genome-enabled discovery of carbon sequestration genes in poplar' are (1) to explore the functions of candidate genes using Populus transformation by inserting genes provided by Oak Ridge National Laboratory (ORNL) and the University of Florida (UF) into poplar; (2) to expand the poplar transformation toolkit by developing transformation methods for important genotypes; and (3) to allow induced expression, and efficient gene suppression, in roots and other tissues. As part of the transformation improvement effort, OSU developed transformation protocols for Populus trichocarpa 'Nisqually-1' clone and an early flowering P. alba clone, 6K10. Complete descriptions of the transformation systems were published (Ma et al. 2004, Meilan et al. 2004). Twenty-one 'Nisqually-1' and 622 6K10 transgenic plants were generated. To identify root predominant promoters, a set of three promoters was tested for tissue-specific expression patterns in poplar and in Arabidopsis as a model system. A novel gene, ET304, was identified by analyzing a collection of poplar enhancer trap lines generated at OSU (Filichkin et al. 2006a, 2006b). Other promoters include the pGgMT1 root-predominant promoter from Casuarina glauca and the pAtPIN2 promoter from the Arabidopsis root-specific PIN2 gene. OSU tested two induction systems, alcohol- and estrogen-inducible, in multiple poplar transgenics. Ethanol proved to be the more efficient when tested in tissue culture and greenhouse conditions. Two estrogen-inducible systems were evaluated in transgenic Populus, neither of which functioned reliably in tissue culture conditions. GATEWAY-compatible plant binary vectors were designed to compare the silencing efficiency of homologous (direct) RNAi vs. heterologous (transitive) RNAi inverted repeats. A set of genes was targeted for post transcriptional silencing in the model Arabidopsis system; these include the floral

  11. The Matchmaker Exchange: a platform for rare disease gene discovery.

    Science.gov (United States)

    Philippakis, Anthony A; Azzariti, Danielle R; Beltran, Sergi; Brookes, Anthony J; Brownstein, Catherine A; Brudno, Michael; Brunner, Han G; Buske, Orion J; Carey, Knox; Doll, Cassie; Dumitriu, Sergiu; Dyke, Stephanie O M; den Dunnen, Johan T; Firth, Helen V; Gibbs, Richard A; Girdea, Marta; Gonzalez, Michael; Haendel, Melissa A; Hamosh, Ada; Holm, Ingrid A; Huang, Lijia; Hurles, Matthew E; Hutton, Ben; Krier, Joel B; Misyura, Andriy; Mungall, Christopher J; Paschall, Justin; Paten, Benedict; Robinson, Peter N; Schiettecatte, François; Sobreira, Nara L; Swaminathan, Ganesh J; Taschner, Peter E; Terry, Sharon F; Washington, Nicole L; Züchner, Stephan; Boycott, Kym M; Rehm, Heidi L

    2015-10-01

    There are few better examples of the need for data sharing than in the rare disease community, where patients, physicians, and researchers must search for "the needle in a haystack" to uncover rare, novel causes of disease within the genome. Impeding the pace of discovery has been the existence of many small siloed datasets within individual research or clinical laboratory databases and/or disease-specific organizations, hoping for serendipitous occasions when two distant investigators happen to learn they have a rare phenotype in common and can "match" these cases to build evidence for causality. However, serendipity has never proven to be a reliable or scalable approach in science. As such, the Matchmaker Exchange (MME) was launched to provide a robust and systematic approach to rare disease gene discovery through the creation of a federated network connecting databases of genotypes and rare phenotypes using a common application programming interface (API). The core building blocks of the MME have been defined and assembled. Three MME services have now been connected through the API and are available for community use. Additional databases that support internal matching are anticipated to join the MME network as it continues to grow.

  12. Ontology-based knowledge discovery in pharmacogenomics.

    Science.gov (United States)

    Coulet, Adrien; Smaïl-Tabbone, Malika; Napoli, Amedeo; Devignes, Marie-Dominique

    2011-01-01

    One current challenge in biomedicine is to analyze large amounts of complex biological data for extracting domain knowledge. This work focuses on the use of knowledge-based techniques such as knowledge discovery (KD) and knowledge representation (KR) in pharmacogenomics, where knowledge units represent genotype-phenotype relationships in the context of a given treatment. An objective is to design a knowledge base (KB, here also referred to as an ontology) and then to use it in the KD process itself. A method is proposed for dealing with two main tasks: (1) building a KB from heterogeneous data related to genotype, phenotype, and treatment, and (2) applying KD techniques on knowledge assertions for extracting genotype-phenotype relationships. An application was carried out on a clinical trial concerned with the variability of drug response to montelukast treatment. Genotype-genotype and genotype-phenotype associations were retrieved together with new associations, allowing the extension of the initial KB. This experiment shows the potential of KR and KD processes, especially for designing KB, checking KB consistency, and reasoning for problem solving.

  13. Discovery of the faithfulness gene: a model of transmission and transformation of scientific information.

    Science.gov (United States)

    Green, Eva G T; Clémence, Alain

    2008-09-01

    The purpose of this paper is to study the diffusion and transformation of scientific information in everyday discussions. Based on rumour models and social representations theory, the impact of interpersonal communication and pre-existing beliefs on transmission of the content of a scientific discovery was analysed. In three experiments, a communication chain was simulated to investigate how laypeople make sense of a genetic discovery first published in a scientific outlet, then reported in a mainstream newspaper and finally discussed in groups. Study 1 (N=40) demonstrated a transformation of information when the scientific discovery moved along the communication chain. During successive narratives, scientific expert terminology disappeared while scientific information associated with lay terminology persisted. Moreover, the idea of a discovery of a faithfulness gene emerged. Study 2 (N=70) revealed that transmission of the scientific message varied as a function of attitudes towards genetic explanations of behaviour (pro-genetics vs. anti-genetics). Pro-genetics employed more scientific terminology than anti-genetics. Study 3 (N=75) showed that endorsement of genetic explanations was related to descriptive accounts of the scientific information, whereas rejection of genetic explanations was related to evaluative accounts of the information.

  14. The discovery of the microphthalmia locus and its gene, Mitf.

    Science.gov (United States)

    Arnheiter, Heinz

    2010-12-01

    The history of the discovery of the microphthalmia locus and its gene, now called Mitf, is a testament to the triumph of serendipity. Although the first microphthalmia mutation was discovered among the descendants of a mouse that was irradiated for the purpose of mutagenesis, the mutation most likely was not radiation induced but occurred spontaneously in one of the parents of a later breeding. Although Mitf might eventually have been identified by other molecular genetic techniques, it was first cloned from a chance transgene insertion at the microphthalmia locus. And although Mitf was found to encode a member of a well-known transcription factor family, its analysis might still be in its infancy had Mitf not turned out to be of crucial importance for the physiology and pathology of many distinct organs, including eye, ear, immune system, bone, and skin, and in particular for melanoma. In fact, near seven decades of Mitf research have led to many insights about development, function, degeneration, and malignancies of a number of specific cell types, and it is hoped that these insights will one day lead to therapies benefitting those afflicted with diseases originating in these cell types.

  15. Gene expression, single nucleotide variant and fusion transcript discovery in archival material from breast tumors.

    Directory of Open Access Journals (Sweden)

    Nadine Norton

    Full Text Available Advantages of RNA-Seq over array based platforms are quantitative gene expression and discovery of expressed single nucleotide variants (eSNVs) and fusion transcripts from a single platform, but the sensitivity for each of these characteristics is unknown. We measured gene expression in a set of manually degraded RNAs and in nine pairs of matched fresh-frozen and FFPE RNA isolated from breast tumor, with the hybridization-based NanoString nCounter (226 gene panel) and with whole transcriptome RNA-Seq using RiboZeroGold ScriptSeq V2 library preparation kits. We performed correlation analyses of gene expression between samples and across platforms. We then specifically assessed whole transcriptome expression of lincRNA and discovery of eSNVs and fusion transcripts in the FFPE RNA-Seq data. For gene expression in the manually degraded samples, we observed Pearson correlations of >0.94 and >0.80 with NanoString and ScriptSeq protocols, respectively. Gene expression data for matched fresh-frozen and FFPE samples yielded mean Pearson correlations of 0.874 and 0.783 for NanoString (226 genes) and ScriptSeq whole transcriptome protocols respectively (p < 2 × 10⁻¹⁶). Specifically for lincRNAs, we observed superb Pearson correlation (0.988) between matched fresh-frozen and FFPE pairs. FFPE samples across NanoString and RNA-Seq platforms gave a mean Pearson correlation of 0.838. In FFPE libraries, we detected 53.4% of high confidence SNVs and 24% of high confidence fusion transcripts. Sensitivity of fusion transcript detection was not overcome by an increase in depth of sequencing up to 3-fold (increase from ~56 to ~159 million reads). Both NanoString and ScriptSeq RNA-Seq technologies yield reliable gene expression data for degraded and FFPE material. The high degree of correlation between NanoString and RNA-Seq platforms suggests discovery based whole transcriptome studies from FFPE material will produce reliable expression data. The RiboZeroGold ScriptSeq protocol
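
    The record above compares platforms and sample types by Pearson correlation of gene expression. As a minimal sketch of that statistic (not the authors' pipeline), the function below computes Pearson's r for two expression vectors over the same genes. The expression values are invented for illustration and are not data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the centred values.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical log2 expression values for five genes in a matched
# fresh-frozen / FFPE pair (illustrative numbers only).
fresh_frozen = [5.1, 7.3, 2.2, 9.8, 4.4]
ffpe = [4.8, 7.0, 2.9, 9.1, 4.9]

r = pearson(fresh_frozen, ffpe)
print(f"Pearson r = {r:.3f}")
```

    In a study of this kind the same calculation would be repeated per matched pair and the coefficients averaged, which is how a "mean Pearson correlation" is usually obtained.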

  16. Literature-based knowledge discovery: the state of the art

    CERN Document Server

    Liu, Xiaoyong

    2012-01-01

    The literature-based knowledge discovery method was introduced by Dr. Swanson in 1986, when he hypothesized a connection between Raynaud's phenomenon and dietary fish oil; the field of literature-based discovery (LBD) was born from that work. During the subsequent two decades, LBD research has attracted scientists from information science, computer science, biomedical science, and other fields, and it has become a part of knowledge discovery and text mining. This paper summarizes recent developments in LBD in two parts, methodology research and applied research. Lastly, some open problems are pointed out as future research directions.
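
    Swanson's approach is commonly described as an "ABC" model: if a starting concept A co-occurs in the literature with intermediate concepts B, and those B concepts co-occur with a concept C that is never mentioned together with A, then A-C is a candidate hypothesis. The sketch below illustrates that idea only. The function names and the tiny co-occurrence list are hypothetical, and real systems derive co-occurrences from large bibliographic databases.

```python
from collections import defaultdict


def build_graph(pairs):
    """Build an undirected co-occurrence graph from (term, term) pairs."""
    graph = defaultdict(set)
    for left, right in pairs:
        graph[left].add(right)
        graph[right].add(left)
    return graph


def abc_candidates(graph, start):
    """Map each indirectly linked term C to the B terms that bridge it to start."""
    direct = graph.get(start, set())
    bridges = defaultdict(set)
    for b in direct:
        for c in graph.get(b, set()):
            # Keep only terms that never co-occur directly with the start term.
            if c != start and c not in direct:
                bridges[c].add(b)
    return dict(bridges)


# Toy co-occurrence pairs (illustrative, not mined from any database).
toy_pairs = [
    ("Raynaud's phenomenon", "blood viscosity"),
    ("Raynaud's phenomenon", "platelet aggregation"),
    ("dietary fish oil", "blood viscosity"),
    ("dietary fish oil", "platelet aggregation"),
]

candidates = abc_candidates(build_graph(toy_pairs), "Raynaud's phenomenon")
for term, via in candidates.items():
    print(term, "via", sorted(via))
```

    Ranking the candidates, for example by the number of bridging B terms, is the part where practical LBD systems differ most.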

  17. Abiotic Stress Tolerance: From Gene Discovery in Model Organisms to Crop Improvement

    Institute of Scientific and Technical Information of China (English)

    Ray Bressan; Hans Bohnert; Jian-Kang Zhu

    2009-01-01

    Productive and sustainable agriculture necessitates growing plants in sub-optimal environments with less input of precious resources such as fresh water. For a better understanding and rapid improvement of abiotic stress tolerance, it is important to link physiological and biochemical work to molecular studies in genetically tractable model organisms. With the use of several technologies for the discovery of stress tolerance genes and their appropriate alleles, transgenic approaches to improving stress tolerance in crops remarkably parallel breeding principles with a greatly expanded germplasm base and will succeed eventually.

  18. RNA-Seq analysis and gene discovery of Andrias davidianus using Illumina short read sequencing.

    Directory of Open Access Journals (Sweden)

    Fenggang Li

    Full Text Available The Chinese giant salamander, Andrias davidianus, is an important species in the course of evolution; however, there is insufficient genomic data in public databases for understanding its immunologic mechanisms. High-throughput transcriptome sequencing is necessary to generate an enormous number of transcript sequences from A. davidianus for gene discovery. In this study, we generated more than 40 million reads from samples of spleen and skin tissue using the Illumina paired-end sequencing technology. De novo assembly yielded 87,297 transcripts with a mean length of 734 base pairs (bp). Based on sequence similarity searches against known proteins, 38,916 genes were identified. Gene enrichment analysis determined that 981 transcripts were assigned to the immune system. Tissue-specific expression analysis indicated that 443 transcripts were specifically expressed in the spleen and skin. Among these transcripts, 147 were found to be involved in immune responses and inflammatory reactions, such as fucolectin, β-defensins and lymphotoxin beta. Eight tissue-specific genes were selected for validation using real time reverse transcription quantitative PCR (qRT-PCR). The results showed that these genes were significantly more expressed in spleen and skin than in other tissues, suggesting that these genes have vital roles in the immune response. This work provides a comprehensive genomic sequence resource for A. davidianus and lays the foundation for future research on the immunologic and disease resistance mechanisms of A. davidianus and other amphibians.

  19. Gene discovery for the carcinogenic human liver fluke, Opisthorchis viverrini

    Directory of Open Access Journals (Sweden)

    Gasser Robin B

    2007-06-01

    Full Text Available Abstract Background Cholangiocarcinoma (CCA), cancer of the bile ducts, is associated with chronic infection with the liver fluke, Opisthorchis viverrini. Despite being the only eukaryote that is designated as a 'class I carcinogen' by the International Agency for Research on Cancer, little is known about its genome. Results Approximately 5,000 randomly selected cDNAs from the adult stage of O. viverrini were characterized and accounted for 1,932 contigs, representing ~14% of the entire transcriptome, and, presently, the largest sequence dataset for any species of liver fluke. Twenty percent of contigs were assigned GO classifications. Abundantly represented protein families included those involved in physiological functions that are essential to parasitism, such as anaerobic respiration, reproduction, detoxification, surface maintenance and feeding. GO assignments were well conserved in relation to other parasitic flukes; however, some categories were over-represented in O. viverrini, such as structural and motor proteins. An assessment of evolutionary relationships showed that O. viverrini was more similar to other parasitic (Clonorchis sinensis and Schistosoma japonicum) than to free-living (Schmidtea mediterranea) flatworms, and 105 sequences had close homologues in both parasitic species but not in S. mediterranea. A total of 164 O. viverrini contigs contained ORFs with signal sequences, many of which were platyhelminth-specific. Examples of convergent evolution between host and parasite secreted/membrane proteins were identified as were homologues of vaccine antigens from other helminths. Finally, ORFs representing secreted proteins with known roles in tumorigenesis were identified, and these might play roles in the pathogenesis of O. viverrini-induced CCA. Conclusion This gene discovery effort for O. viverrini should expedite molecular studies of cholangiocarcinogenesis and accelerate research focused on developing new interventions

  20. Africa: the next frontier for human disease gene discovery?

    Science.gov (United States)

    Ramsay, Michèle; Tiemessen, Caroline T; Choudhury, Ananyo; Soodyall, Himla

    2011-10-15

    The populations of Africa harbour the greatest human genetic diversity following an evolutionary history tracing its beginnings on the continent to time before the emergence of Homo sapiens. Signatures of selection are detectable as responses to ancient environments and cultural practices, modulated by more recent events including infectious epidemics, migrations, admixture and, of course, chance. The age of high-throughput biology is not passing Africa by. African-based cohort studies and networks with an African footprint are ideal springboards for disease-related genetic and genomic studies. Initiatives like HapMap, the 1000 Genomes Project, MalariaGEN, the INDEPTH network and Human Heredity and Health in Africa are catalysts to exploring African genetic diversity and its role in the spectrum from health to disease. The challenges are abundant in dissecting biological questions in the light of linguistic, cultural, geographic and political boundaries and their respective roles in shaping health-related profiles. Will studies based on African populations lead to a new wave of discovery of genetic contributors to disease?

  1. Spark, an application based on Serendipitous Knowledge Discovery.

    Science.gov (United States)

    Workman, T Elizabeth; Fiszman, Marcelo; Cairelli, Michael J; Nahl, Diane; Rindflesch, Thomas C

    2016-04-01

    Findings from information-seeking behavior research can inform application development. In this report we provide a system description of Spark, an application based on findings from Serendipitous Knowledge Discovery studies and data structures known as semantic predications. Background information and the previously published IF-SKD model (outlining Serendipitous Knowledge Discovery in online environments) illustrate the potential use of information-seeking behavior in application design. A detailed overview of the Spark system illustrates how methodologies in design and retrieval functionality enable production of semantic predication graphs tailored to evoke Serendipitous Knowledge Discovery in users.

  2. SPARCoC: a new framework for molecular pattern discovery and cancer gene identification.

    Directory of Open Access Journals (Sweden)

    Shiqian Ma

    Full Text Available It is challenging to cluster cancer patients of a certain histopathological type into molecular subtypes of clinical importance and identify gene signatures directly relevant to the subtypes. Current clustering approaches have inherent limitations, which prevent them from gauging the subtle heterogeneity of the molecular subtypes. In this paper we present a new framework: SPARCoC (Sparse-CoClust), which is based on a novel Common-background and Sparse-foreground Decomposition (CSD) model and the Maximum Block Improvement (MBI) co-clustering technique. SPARCoC has clear advantages compared with widely-used alternative approaches: hierarchical clustering (Hclust) and nonnegative matrix factorization (NMF). We apply SPARCoC to the study of lung adenocarcinoma (ADCA), an extremely heterogeneous histological type, and a significant challenge for molecular subtyping. For testing and verification, we use high quality gene expression profiling data of lung ADCA patients, and identify prognostic gene signatures which could cluster patients into subgroups that are significantly different in their overall survival (with p-values < 0.05). Our results are only based on gene expression profiling data analysis, without incorporating any other feature selection or clinical information; we are able to replicate our findings with completely independent datasets. SPARCoC is broadly applicable to large-scale genomic data to empower pattern discovery and cancer gene identification.

  3. TILLING in forage grasses for gene discovery and breeding improvement.

    Science.gov (United States)

    Manzanares, Chloe; Yates, Steven; Ruckle, Michael; Nay, Michelle; Studer, Bruno

    2016-09-25

    Mutation breeding has a long-standing history and in some major crop species, many of the most important cultivars have their origin in germplasm generated by mutation induction. For almost two decades, methods for TILLING (Targeting Induced Local Lesions IN Genomes) have been established in model plant species such as Arabidopsis (Arabidopsis thaliana L.), enabling the functional analysis of genes. Recent advances in mutation detection by second generation sequencing technology have brought its utility to major crop species. However, it has remained difficult to apply similar approaches in forage and turf grasses, mainly due to their outbreeding nature maintained by an efficient self-incompatibility system. Starting with a description of the extent to which traditional mutagenesis methods have contributed to crop yield increase in the past, this review focuses on technological approaches to implement TILLING-based strategies for the improvement of forage grass breeding through forward and reverse genetics. We present first results from TILLING in allogamous forage grasses for traits such as stress tolerance and evaluate prospects for rapid implementation of beneficial alleles to forage grass breeding. In conclusion, large-scale induced mutation resources, used for forward genetic screens, constitute a valuable tool to increase the genetic diversity for breeding and can be generated with relatively small investments in forage grasses. Furthermore, large libraries of sequenced mutations can be readily established, providing enhanced opportunities to discover mutations in genes controlling traits of agricultural importance and to study gene functions by reverse genetics.

  4. Traditional Chinese Medicine-Based Network Pharmacology Could Lead to New Multicompound Drug Discovery

    Directory of Open Access Journals (Sweden)

    Jian Li

    2012-01-01

    Full Text Available Current strategies for drug discovery have reached a bottleneck where the paradigm is generally “one gene, one drug, one disease.” However, using holistic and systemic views, network pharmacology may be the next paradigm in drug discovery. Based on network pharmacology, a combinational drug with two or more compounds could offer beneficial synergistic effects for complex diseases. Interestingly, traditional Chinese medicine (TCM) has been practicing holistic views for over 3,000 years, and its distinguishing feature is the use of herbal formulas to treat diseases based on a unique pattern classification. Though TCM herbal formulas are acknowledged as a great source for drug discovery, no drug discovery strategies compatible with the multidimensional complexities of TCM herbal formulas have been developed. In this paper, we highlight some novel paradigms in TCM-based network pharmacology and new drug discovery. A multiple compound drug can be discovered by merging herbal formula-based pharmacological networks with TCM pattern-based disease molecular networks. Herbal formulas would be a source for multiple compound drug candidates, and the TCM pattern in the disease would be an indication for a new drug.

  5. SECURE SERVICE DISCOVERY BASED ON PROBE PACKET MECHANISM FOR MANETS

    Directory of Open Access Journals (Sweden)

    S. Pariselvam

    2015-03-01

    Full Text Available In MANETs, the service discovery process is always considered crucial, since these networks do not possess a centralized infrastructure for communication. Moreover, different services available through the network necessitate varying categories. Hence, a need arises for devising a secure probe based service discovery mechanism to reduce the complexity in providing the services to the network users. In this paper, we propose a Secure Service Discovery Based on Probe Packet Mechanism (SSDPPM) for identifying the DoS attack in MANETs, which depicts a new approach for estimating the level of trust present in each and every routing path of a mobile ad hoc network by using probe packets. Probing based service discovery mechanisms mainly identify a mobile node’s genuineness using a test packet called a probe, which travels the entire network in order to compute the degree of trust maintained between the mobile nodes and its attributed impact on network performance. The performance of SSDPPM is investigated through a wide range of network related parameters such as packet delivery, throughput, control overhead and total overhead using the ns-2.26 network simulator. On average, SSDPPM improves packet delivery ratio by 23% and throughput by 19% compared with the existing service discovery mechanisms available in the literature.

  6. Marinopyrroles: Unique Drug Discoveries Based on Marine Natural Products.

    Science.gov (United States)

    Li, Rongshi

    2016-01-01

    Natural products provide a successful supply of new chemical entities (NCEs) for drug discovery to treat human diseases. Approximately half of the NCEs are based on natural products and their derivatives. Notably, marine natural products, a largely untapped resource, have contributed to drug discovery and development with eight drugs or cosmeceuticals approved by the U.S. Food and Drug Administration and European Medicines Agency, and ten candidates undergoing clinical trials. Collaborative efforts from drug developers, biologists, organic, medicinal, and natural product chemists have elevated drug discoveries to new levels. These efforts are expected to continue to improve the efficiency of natural product-based drugs. Marinopyrroles are examined here as a case study for potential anticancer and antibiotic agents.

  7. Using concepts in literature-based discovery : Simulating Swanson's Raynaud-fish oil and migraine-magnesium discoveries

    NARCIS (Netherlands)

    Weeber, M; Klein, H; de Jong-van den Berg, LTW; Vos, R

    2001-01-01

    Literature-based discovery has resulted in new knowledge. In the biomedical context, Don R. Swanson has generated several literature-based hypotheses that have been corroborated experimentally and clinically. In this paper, we propose a two-step model of the discovery process in which hypotheses are

  8. Computational method for discovery of estrogen responsive genes

    DEFF Research Database (Denmark)

    Tang, Suisheng; Tan, Sin Lam; Ramadoss, Suresh Kumar;

    2004-01-01

    Estrogen has a profound impact on human physiology and affects numerous genes. The classical estrogen reaction is mediated by its receptors (ERs), which bind to the estrogen response elements (EREs) in target gene's promoter region. Due to tedious and expensive experiments, a limited number...... of human genes are functionally well characterized. It is still unclear how many and which human genes respond to estrogen treatment. We propose a simple, economic, yet effective computational method to predict a subclass of estrogen responsive genes. Our method relies on the similarity of ERE frames...... across different promoters in the human genome. Matching ERE frames of a test set of 60 known estrogen responsive genes to the collection of over 18,000 human promoters, we obtained 604 candidate genes. Evaluating our result by comparison with the published microarray data and literature, we found...

  9. Discovering discovery patterns with Predication-based Semantic Indexing.

    Science.gov (United States)

    Cohen, Trevor; Widdows, Dominic; Schvaneveldt, Roger W; Davies, Peter; Rindflesch, Thomas C

    2012-12-01

    In this paper we utilize methods of hyperdimensional computing to mediate the identification of therapeutically useful connections for the purpose of literature-based discovery. Our approach, named Predication-based Semantic Indexing, is utilized to identify empirically sequences of relationships known as "discovery patterns", such as "drug x INHIBITS substance y, substance y CAUSES disease z" that link pharmaceutical substances to diseases they are known to treat. These sequences are derived from semantic predications extracted from the biomedical literature by the SemRep system, and subsequently utilized to direct the search for known treatments for a held out set of diseases. Rapid and efficient inference is accomplished through the application of geometric operators in PSI space, allowing for both the derivation of discovery patterns from a large set of known TREATS relationships, and the application of these discovered patterns to constrain search for therapeutic relationships at scale. Our results include the rediscovery of discovery patterns that have been constructed manually by other authors in previous research, as well as the discovery of a set of previously unrecognized patterns. The application of these patterns to direct search through PSI space results in better recovery of therapeutic relationships than is accomplished with models based on distributional statistics alone. These results demonstrate the utility of efficient approximate inference in geometric space as a means to identify therapeutic relationships, suggesting a role of these methods in drug repurposing efforts. In addition, the results provide strong support for the utility of the discovery pattern approach pioneered by Hristovski and his colleagues.
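
    The idea of a two-step discovery pattern such as "drug x INHIBITS substance y, substance y CAUSES disease z" can be illustrated with a minimal sketch. It uses plain set lookups over subject-predicate-object triples and is not the hyperdimensional vector method the abstract describes. The function name `apply_pattern` and all entity names are hypothetical placeholders.

```python
from typing import Iterable, Set, Tuple

# A predication is a (subject, predicate, object) triple.
Triple = Tuple[str, str, str]


def apply_pattern(
    triples: Iterable[Triple],
    drug: str,
    pattern: Tuple[str, str] = ("INHIBITS", "CAUSES"),
) -> Set[str]:
    """Follow a two-predicate pattern starting from `drug`.

    Returns every z such that (drug, pattern[0], y) and (y, pattern[1], z)
    are both present for some intermediate y.
    """
    triples = list(triples)  # allow any iterable to be scanned twice
    first, second = pattern
    intermediates = {o for s, p, o in triples if s == drug and p == first}
    return {o for s, p, o in triples if s in intermediates and p == second}


# Placeholder predications, invented purely for illustration.
toy_triples = [
    ("drug_x", "INHIBITS", "substance_y"),
    ("substance_y", "CAUSES", "disease_z"),
    ("drug_x", "INHIBITS", "substance_w"),
    ("substance_q", "CAUSES", "disease_r"),
]
candidates = apply_pattern(toy_triples, "drug_x")
```

    An exhaustive lookup like this does not scale to a large predication store, which is presumably why the abstract relies on approximate geometric inference instead.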

  10. Resource Discovery in Activity-Based Sensor Networks

    DEFF Research Database (Denmark)

    Bucur, Doina; Bardram, Jakob

    This paper proposes a service discovery protocol for sensor networks that is specifically tailored for use in human-centered pervasive environments. It uses the high-level concept of computational activities (as logical bundles of data and resources) to give sensors in Activity-Based Sensor Networ...

  11. Strategic Applications of Gene Expression: From Drug Discovery/Development to Bedside

    OpenAIRE

    Bai, Jane P. F.; Alekseyenko, Alexander V.; Statnikov, Alexander; Wang, I-Ming; Wong, Peggy H.

    2013-01-01

    Gene expression is useful for identifying the molecular signature of a disease and for correlating a pharmacodynamic marker with the dose-dependent cellular responses to exposure of a drug. Gene expression offers utility to guide drug discovery by illustrating engagement of the desired cellular pathways/networks, as well as avoidance of acting on the toxicological pathways. Successful employment of gene-expression signatures in the later stages of drug development depends on their linkage to ...

  12. Structural choice based on knowledge discovery system

    Institute of Scientific and Technical Information of China (English)

    邢方亮; 王光远

    2002-01-01

    Structural choice is a significant decision having an important influence on structural function, social economics, structural reliability and construction cost. A Case Based Reasoning system, with its retrieval part constructed with a KDD subsystem, is put forward to make a decision for a large scale engineering project. A typical CBR system consists of four parts: case representation, case retriever, evaluation, and adaptation. A case library is a set of parameterized excellent and successful structures. For a structural choice, the key point is that the system must be able to detect the pattern classes hidden in the case library and classify the input parameters into classes properly. That is done by using the KDD Data Mining algorithm based on Self-Organizing Feature Maps (SOFM), which makes the whole system more adaptive, self-organizing, self-learning and open.

  13. Validation of Context Based Service Discovery Protocol for Ubiquitous Applications

    Directory of Open Access Journals (Sweden)

    Anandi Giridharan

    2012-11-01

    Full Text Available Service Discovery Protocol (SDP) is important in ubiquitous applications, where a large number of devices and software components collaborate unobtrusively and provide numerous services without user intervention. Existing service discovery schemes use a service matching process in order to offer services of interest to the users. Potentially, the context information of the users and surrounding environment can be used to improve the quality of service matching. We propose a C-IOB (Context-Information, Observation and Belief) based service discovery model, which deals with the above challenges by processing the context information and by formulating beliefs on the basis of observations. With these formulated beliefs, the required services will be provided to the users. In this work, we present an approach for automated validation of the C-IOB based service discovery model in a typical ubiquitous museum environment, where the external behavior of the system can be predicted and compared to a model of expected behavior from the original requirements. Formal specification using an SDL (Specification and Description Language) based system has been used to conduct verification and validation of the system. The purpose of this framework is to provide a formal basis for the performance evaluation and behavioral study of the SDP.

  14. Phylogeny based discovery of regulatory elements

    Directory of Open Access Journals (Sweden)

    Cohen Barak A

    2006-05-01

    Full Text Available Abstract Background Algorithms that locate evolutionarily conserved sequences have become powerful tools for finding functional DNA elements, including transcription factor binding sites; however, most methods do not take advantage of an explicit model for the constrained evolution of functional DNA sequences. Results We developed a probabilistic framework that combines an HKY85 model, which assigns probabilities to different base substitutions between species, and weight matrix models of transcription factor binding sites, which describe the probabilities of observing particular nucleotides at specific positions in the binding site. The method incorporates the phylogenies of the species under consideration and takes into account the position specific variation of transcription factor binding sites. Using our framework we assessed the suitability of alignments of genomic sequences from commonly used species as substrates for comparative genomic approaches to regulatory motif finding. We then applied this technique to Saccharomyces cerevisiae and related species by examining all possible six base pair DNA sequences (hexamers) and identifying sequences that are conserved in a significant number of promoters. By combining similar conserved hexamers we reconstructed known cis-regulatory motifs and made predictions of previously unidentified motifs. We tested one prediction experimentally, finding it to be a regulatory element involved in the transcriptional response to glucose. Conclusion The experimental validation of a regulatory element prediction missed by other large-scale motif finding studies demonstrates that our approach is a useful addition to the current suite of tools for finding regulatory motifs.
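
    The weight matrix component mentioned above can be illustrated with a minimal sketch that scores a candidate site by its log-odds against a background distribution. It covers only single-sequence scoring; the HKY85 substitution model and the phylogenetic part of the framework are not reproduced. The names `log_odds_score` and `best_hit`, the uniform background, and the toy two-column matrix are assumptions made for illustration, and all matrix probabilities are assumed to be strictly positive (for example after adding pseudocounts).

```python
import math
from typing import Dict, List, Tuple

# One column of a weight matrix: probability of each nucleotide at that position.
Column = Dict[str, float]

# Assumed uniform background; a real analysis would estimate this from the genome.
UNIFORM: Dict[str, float] = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def log_odds_score(
    site: str, pwm: List[Column], background: Dict[str, float] = UNIFORM
) -> float:
    """Sum over positions of log2(P(base | matrix column) / P(base | background))."""
    if len(site) != len(pwm):
        raise ValueError("site and weight matrix must have the same length")
    return sum(math.log2(col[base] / background[base]) for base, col in zip(site, pwm))


def best_hit(
    sequence: str, pwm: List[Column], background: Dict[str, float] = UNIFORM
) -> Tuple[int, float]:
    """Slide the matrix along `sequence`; return (start, score) of the best window."""
    width = len(pwm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the weight matrix")
    scored = [
        (log_odds_score(sequence[i : i + width], pwm, background), i)
        for i in range(len(sequence) - width + 1)
    ]
    score, start = max(scored, key=lambda pair: pair[0])
    return start, score


# Toy two-position matrix, invented for illustration only.
toy_pwm = [
    {"A": 0.5, "C": 0.25, "G": 0.125, "T": 0.125},
    {"A": 0.125, "C": 0.125, "G": 0.5, "T": 0.25},
]
hit = best_hit("TTAGC", toy_pwm)
```

    A positive score means the window looks more like the modeled binding site than like background sequence; the framework in the abstract additionally asks whether such sites are conserved across related species.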

  15. A comparative review of estimates of the proportion unchanged genes and the false discovery rate

    Directory of Open Access Journals (Sweden)

    Broberg Per

    2005-08-01

    Full Text Available Abstract Background In the analysis of microarray data one generally produces a vector of p-values that for each gene give the likelihood of obtaining equally strong evidence of change by pure chance. The distribution of these p-values is a mixture of two components corresponding to the changed genes and the unchanged ones. The focus of this article is how to estimate the proportion unchanged and the false discovery rate (FDR) and how to make inferences based on these concepts. Six published methods for estimating the proportion unchanged genes are reviewed, two alternatives are presented, and all are tested on both simulated and real data. All estimates but one make do without any parametric assumptions concerning the distributions of the p-values. Furthermore, the estimation and use of the FDR and the closely related q-value is illustrated with examples. Five published estimates of the FDR and one new are presented and tested. Implementations in R code are available. Results A simulation model based on the distribution of real microarray data plus two real data sets were used to assess the methods. The proposed alternative methods for estimating the proportion unchanged fared very well, and gave evidence of low bias and very low variance. Different methods perform well depending upon whether there are few or many regulated genes. Furthermore, the methods for estimating FDR showed a varying performance, and were sometimes misleading. The new method had a very low error. Conclusion The concept of the q-value or false discovery rate is useful in practical research, despite some theoretical and practical shortcomings. However, it seems possible to challenge the performance of the published methods, and there is likely scope for further developing the estimates of the FDR. The new methods provide the scientist with more options to choose a suitable method for any particular experiment. The article advocates the use of the conjoint information
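
    To make the two quantities concrete, here is a minimal sketch of one widely used nonparametric estimate of the proportion of unchanged genes (a Storey-type estimator with a fixed threshold) together with step-up q-values in the style of Benjamini and Hochberg. These are shown as common examples only; they are not necessarily among the methods reviewed or proposed in the article, and the function names, the threshold `lam = 0.5`, and the toy p-values are assumptions.

```python
from typing import List, Sequence


def estimate_pi0(p_values: Sequence[float], lam: float = 0.5) -> float:
    """Storey-type estimate of the proportion of unchanged genes.

    P-values of unchanged genes are uniform on [0, 1], so the fraction of
    p-values above `lam`, rescaled by 1 / (1 - lam), estimates that proportion.
    """
    m = len(p_values)
    above = sum(1 for p in p_values if p > lam)
    return min(1.0, above / (m * (1.0 - lam)))


def q_values(p_values: Sequence[float], pi0: float = 1.0) -> List[float]:
    """Step-up q-values; pi0 = 1.0 gives Benjamini-Hochberg adjusted p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values are monotone in p.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * p_values[i] * m / rank)
        q[i] = running_min
    return q


# Toy p-values, invented for illustration only.
toy_p = [0.01, 0.04, 0.03, 0.9]
toy_pi0 = estimate_pi0(toy_p)
toy_q = q_values(toy_p)
```

    Passing the estimated proportion as `pi0` scales every q-value down by that factor, which is how an estimate of the proportion unchanged feeds into a less conservative FDR estimate.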

  16. Advances in tau-based drug discovery

    Science.gov (United States)

    Noble, Wendy; Pooler, Amy M.; Hanger, Diane P.

    2011-01-01

    Introduction Tauopathies, including Alzheimer’s disease (AD) and some frontotemporal dementias, are neurodegenerative diseases characterised by pathological lesions comprised of tau protein. There is currently a significant and urgent unmet need for disease-modifying therapies for these conditions and recently attention has turned to tau as a potential target for intervention. Areas covered Increasing evidence has highlighted pathways associated with tau-mediated neurodegeneration as important targets for drug development. Here, the authors review recently published papers in this area and summarise the genetic and pharmacological approaches that have shown efficacy in reducing tau-associated neurodegeneration. These include the use of agents to prevent abnormal tau processing and increase tau clearance, therapies targeting the immune system, and the manipulation of tau pre-mRNA to modify tau isoform expression. Expert opinion Several small molecule tau-based treatments are currently being assessed in clinical trials, the outcomes of which are eagerly awaited. Current evidence suggests that therapies targeting tau are likely, at least in part, to form the basis of an effective and safe treatment for Alzheimer’s disease and related neurodegenerative disorders in which tau deposition is evident. PMID:22003359

  17. GENOME-ENABLED DISCOVERY OF CARBON SEQUESTRATION GENES IN POPLAR

    Energy Technology Data Exchange (ETDEWEB)

    DAVIS J M

    2007-10-11

    Plants utilize carbon by partitioning the reduced carbon obtained through photosynthesis into different compartments and into different chemistries within a cell and subsequently allocating such carbon to sink tissues throughout the plant. Since the phytohormones auxin and cytokinin are known to influence sink strength in tissues such as roots (Skoog & Miller 1957, Nordstrom et al. 2004), we hypothesized that altering the expression of genes that regulate auxin-mediated (e.g., AUX/IAA or ARF transcription factors) or cytokinin-mediated (e.g., RR transcription factors) control of root growth and development would impact carbon allocation and partitioning belowground (Fig. 1 - Renewal Proposal). Specifically, the ARF, AUX/IAA and RR transcription factor gene families mediate the effects of the growth regulators auxin and cytokinin on cell expansion, cell division and differentiation into root primordia. Invertases (IVR), whose transcript abundance is enhanced by both auxin and cytokinin, are critical components of carbon movement and therefore of carbon allocation. Thus, we initiated comparative genomic studies to identify the AUX/IAA, ARF, RR and IVR gene families in the Populus genome that could impact carbon allocation and partitioning. Bioinformatics searches using Arabidopsis gene sequences as queries identified regions with high degrees of sequence similarities in the Populus genome. These Populus sequences formed the basis of our transgenic experiments. Transgenic modification of gene expression involving members of these gene families was hypothesized to have profound effects on carbon allocation and partitioning.

  18. Gene discovery in the horned beetle Onthophagus taurus

    Directory of Open Access Journals (Sweden)

    Yang Youngik

    2010-12-01

    Full Text Available Abstract Background Horned beetles, in particular in the genus Onthophagus, are important models for studies on sexual selection, biological radiations, the origin of novel traits, developmental plasticity, biocontrol, conservation, and forensic biology. Despite their growing prominence as models for studying both basic and applied questions in biology, little genomic or transcriptomic data are available for this genus. We used massively parallel pyrosequencing (Roche 454-FLX platform) to produce a comprehensive EST dataset for the horned beetle Onthophagus taurus. To maximize sequence diversity, we pooled RNA extracted from a normalized library encompassing diverse developmental stages and both sexes. Results We used 454 pyrosequencing to sequence ESTs from all post-embryonic stages of O. taurus. Approximately 1.36 million reads assembled into 50,080 non-redundant sequences encompassing a total of 26.5 Mbp. The non-redundant sequences match over half of the genes in Tribolium castaneum, the most closely related species with a sequenced genome. Analyses of Gene Ontology annotations and biochemical pathways indicate that the O. taurus sequences reflect a wide and representative sampling of biological functions and biochemical processes. An analysis of sequence polymorphisms revealed that SNP frequency was negatively related to overall expression level and the number of tissue types in which a given gene is expressed. The most variable genes were enriched for a limited number of GO annotations whereas the least variable genes were enriched for a wide range of GO terms directly related to fitness. Conclusions This study provides the first large-scale EST database for horned beetles, a much-needed resource for advancing the study of these organisms. Furthermore, we identified instances of gene duplications and alternative splicing, useful for future study of gene regulation, and a large number of SNP markers that could be used in population

  19. Literature mining for the discovery of hidden connections between drugs, genes and diseases.

    Science.gov (United States)

    Frijters, Raoul; van Vugt, Marianne; Smeets, Ruben; van Schaik, René; de Vlieg, Jacob; Alkema, Wynand

    2010-09-23

    The scientific literature represents a rich source for retrieval of knowledge on associations between biomedical concepts such as genes, diseases and cellular processes. A commonly used method to establish relationships between biomedical concepts from literature is co-occurrence. Apart from its use in knowledge retrieval, the co-occurrence method is also well-suited to discover new, hidden relationships between biomedical concepts following a simple ABC-principle, in which A and C have no direct relationship, but are connected via shared B-intermediates. In this paper we describe CoPub Discovery, a tool that mines the literature for new relationships between biomedical concepts. Statistical analysis using ROC curves showed that CoPub Discovery performed well over a wide range of settings and keyword thesauri. We subsequently used CoPub Discovery to search for new relationships between genes, drugs, pathways and diseases. Several of the newly found relationships were validated using independent literature sources. In addition, new predicted relationships between compounds and cell proliferation were validated and confirmed experimentally in an in vitro cell proliferation assay. The results show that CoPub Discovery is able to identify novel associations between genes, drugs, pathways and diseases that have a high probability of being biologically valid. This makes CoPub Discovery a useful tool to unravel the mechanisms behind disease, to find novel drug targets, or to find novel applications for existing drugs.
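
    The ABC-principle described above can be sketched in a few lines: build a co-occurrence map from per-document concept sets, then list every concept C that never co-occurs with A but shares at least one intermediate B with it. This is a minimal sketch, not CoPub Discovery itself; it omits the statistical scoring and thesaurus matching the tool relies on, and the function names and placeholder concepts are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def build_cooccurrence(docs: Iterable[Set[str]]) -> Dict[str, Set[str]]:
    """Map each concept to the set of concepts it shares a document with."""
    neighbors: Dict[str, Set[str]] = defaultdict(set)
    for concepts in docs:
        for a in concepts:
            for b in concepts:
                if a != b:
                    neighbors[a].add(b)
    return neighbors


def hidden_links(neighbors: Dict[str, Set[str]], a: str) -> Dict[str, Set[str]]:
    """Return {C: shared B-intermediates} for concepts C not directly linked to A."""
    direct = neighbors.get(a, set())
    result: Dict[str, Set[str]] = defaultdict(set)
    for b in direct:
        for c in neighbors.get(b, set()):
            if c != a and c not in direct:
                result[c].add(b)
    return dict(result)


# Placeholder documents, each reduced to the set of concepts it mentions.
toy_docs = [
    {"drug_A", "process_B1"},
    {"process_B1", "disease_C"},
    {"drug_A", "process_B2"},
    {"process_B2", "disease_C"},
    {"drug_A", "disease_D"},
]
links = hidden_links(build_cooccurrence(toy_docs), "drug_A")
```

    Here `disease_D` is excluded because it already co-occurs with `drug_A`, while `disease_C` is reported together with the two intermediates that connect it; ranking such candidates is where a real tool would apply its statistics.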

  20. Gene Expression Data Knowledge Discovery using Global and Local Clustering

    CERN Document Server

    H, Swathi

    2010-01-01

    To understand complex biological systems, the research community has produced a huge corpus of gene expression data. A large number of clustering approaches have been proposed for the analysis of gene expression data. However, extracting important biological knowledge remains difficult. To address this task, clustering techniques are used. In this paper, a hybrid Hierarchical k-Means algorithm is used for clustering, and biclustering of gene expression data is also used. To discover both local and global clustering structure, biclustering and clustering algorithms are utilized. A validation technique, Figure of Merit, is used to determine the quality of clustering results. Appropriate knowledge is mined from the clusters by embedding a BLAST similarity search program into the clustering and biclustering process. To discover both local and global clustering structure, biclustering and clustering algorithms are utilized. To determine the quality of clustering results, a validation technique, Figure of Merit, is used. Appropriate ...

  1. Discovery of Novel Gene Elements Associated with Prostate Cancer Progression

    Science.gov (United States)

    2012-10-01

    transcripts more closely, we performed 5’ and 3’ rapid amplification of cDNA ends (RACE) for PCAT-1 and PCAT-14. Interestingly, the PCAT-14 locus...Sequencing Core. RNA-ligase-mediated rapid amplification of cDNA ends (RACE) 5’ and 3’ RACE was performed using the GeneRacer RLM-RACE kit (Invitrogen

  2. Gene Discovery and Functional Analyses in the Model Plant Arabidopsis

    Institute of Scientific and Technical Information of China (English)

    Cai-Ping Feng; John Mundy

    2006-01-01

    The present mini-review describes newer methods and strategies, including transposon and T-DNA insertions, TILLING, Deleteagene, and RNA interference, to functionally analyze genes of interest in the model plant Arabidopsis. The relative advantages and disadvantages of the systems are also discussed.

  3. Gene Discovery and Functional Analyses in the Model Plant Arabidopsis

    DEFF Research Database (Denmark)

    Feng, Cai-ping; Mundy, J.

    2006-01-01

    The present mini-review describes newer methods and strategies, including transposon and T-DNA insertions, TILLING, Deleteagene, and RNA interference, to functionally analyze genes of interest in the model plant Arabidopsis. The relative advantages and disadvantages of the systems are also...

  4. Improving functional modules discovery by enriching interaction networks with gene profiles

    KAUST Repository

    Salem, Saeed

    2013-05-01

    Recent advances in proteomic and transcriptomic technologies have resulted in the accumulation of vast amounts of high-throughput data that span multiple biological processes and characteristics in different organisms. Much of the data come in the form of interaction networks and mRNA expression arrays. An important task in systems biology is functional module discovery, where the goal is to uncover well-connected sub-networks (modules). These discovered modules help to unravel the underlying mechanisms of the observed biological processes. While most of the existing module discovery methods use only the interaction data, in this work we propose CLARM, which discovers biological modules by incorporating gene profile data with protein-protein interaction networks. We demonstrate the effectiveness of CLARM on Yeast and Human interaction datasets, and gene expression and molecular function profiles. Experiments on these real datasets show that the CLARM approach is competitive with well-established functional module discovery methods.

  5. Recent advances in genome-based polyketide discovery.

    Science.gov (United States)

    Helfrich, Eric J N; Reiter, Silke; Piel, Jörn

    2014-10-01

    Polyketides are extraordinarily diverse secondary metabolites of great pharmacological value and with interesting ecological functions. The post-genomics era has led to fundamental changes in natural product research by inverting the workflow of secondary metabolite discovery. As opposed to traditional bioactivity-guided screenings, genome mining is an in silico method to screen and analyze sequenced genomes for natural product biosynthetic gene clusters. Since genes for known compounds can be recognized at the early computational stage, genome mining presents an opportunity for dereplication. This review highlights recent progress in bioinformatics, pathway engineering and chemical analytics to extract the biosynthetic secrets hidden in the genome of both well-known natural product sources as well as previously neglected bacteria.

  6. Web Service Description and Discovery Based on Semantic Model

    Institute of Scientific and Technical Information of China (English)

    YANG Xuemei; XU Lizhen; DONG Yisheng; WANG Yongli

    2006-01-01

    A novel semantic model of Web service description and discovery is proposed in this paper through an extension of the profile model of the Web ontology language for services (OWL-S). Similarity matching of Web services was implemented by computing a weighted sum of the semantic similarity value, based on a specific domain ontology, and a dynamic evaluation of the degree of quality of service (QoS) satisfaction. Experiments show that the provided semantic matching model is efficient.
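
    The weighted summation used for matching can be sketched as follows. The abstract does not give the weights or say how the two component scores are computed, so the default weights, the assumption that both components are already normalized to [0, 1], and every name in the code are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    name: str
    semantic_similarity: float  # assumed precomputed from a domain ontology, in [0, 1]
    qos_satisfaction: float     # assumed degree of QoS satisfaction, in [0, 1]


def match_score(c: Candidate, w_semantic: float = 0.7, w_qos: float = 0.3) -> float:
    """Weighted sum of the two component scores (the weights are illustrative)."""
    return w_semantic * c.semantic_similarity + w_qos * c.qos_satisfaction


def rank(candidates: List[Candidate]) -> List[str]:
    """Service names ordered from best to worst match."""
    return [c.name for c in sorted(candidates, key=match_score, reverse=True)]


# Placeholder services, invented for illustration only.
toy_services = [
    Candidate("service_a", semantic_similarity=0.9, qos_satisfaction=0.2),
    Candidate("service_b", semantic_similarity=0.6, qos_satisfaction=1.0),
]
ranking = rank(toy_services)
```

    With these weights `service_b` ranks first despite its lower semantic similarity, which shows how the QoS term can change the outcome of a purely semantic match.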

  7. Gene discovery using mutagen-induced polymorphisms and deep sequencing: application to plant disease resistance.

    Science.gov (United States)

    Zhu, Ying; Mang, Hyung-gon; Sun, Qi; Qian, Jun; Hipps, Ashley; Hua, Jian

    2012-09-01

    Next-generation sequencing technologies are accelerating gene discovery by combining multiple steps of mapping and cloning used in the traditional map-based approach into one step using DNA sequence polymorphisms existing between two different accessions/strains/backgrounds of the same species. The existing next-generation sequencing method, like the traditional one, requires the use of a segregating population from a cross of a mutant organism in one accession with a wild-type (WT) organism in a different accession. It therefore could potentially be limited by modification of mutant phenotypes in different accessions and/or by the lengthy process required to construct a particular mapping parent in a second accession. Here we present mapping and cloning of an enhancer mutation with next-generation sequencing on bulked segregants in the same accession using sequence polymorphisms induced by a chemical mutagen. This method complements the conventional cloning approach and makes forward genetics more feasible and powerful in molecularly dissecting biological processes in any organism. The pipeline developed in this study can be used to clone causal genes in the background of single mutants or higher-order mutants and in species with or without sequence information on multiple accessions.

  8. Cross-pollination of research findings, although uncommon, may accelerate discovery of human disease genes

    Directory of Open Access Journals (Sweden)

    Duda Marlena

    2012-11-01

    Full Text Available Abstract Background Technological leaps in genome sequencing have resulted in a surge in discovery of human disease genes. These discoveries have led to increased clarity on the molecular pathology of disease and have also demonstrated considerable overlap in the genetic roots of human diseases. In light of this large genetic overlap, we tested whether cross-disease research approaches lead to faster, more impactful discoveries. Methods We leveraged several gene-disease association databases to calculate a Mutual Citation Score (MCS) for 10,853 pairs of genetically related diseases to measure the frequency of cross-citation between research fields. To assess the importance of cooperative research, we computed an Individual Disease Cooperation Score (ICS) and the average publication rate for each disease. Results For all disease pairs with one gene in common, we found that the degree of genetic overlap was a poor predictor of cooperation (r² = 0.3198) and that the vast majority of disease pairs (89.56%) never cited previous discoveries of the same gene in a different disease, irrespective of the level of genetic similarity between the diseases. A fraction (0.25%) of the pairs demonstrated cross-citation in greater than 5% of their published genetic discoveries and 0.037% cross-referenced discoveries more than 10% of the time. We found strong positive correlations between ICS and publication rate (r² = 0.7931), and an even stronger correlation between the publication rate and the number of cross-referenced diseases (r² = 0.8585). These results suggested that cross-disease research may have the potential to yield novel discoveries at a faster pace than singular disease research. Conclusions Our findings suggest that the frequency of cross-disease study is low despite the high level of genetic similarity among many human diseases, and that collaborative methods may accelerate and increase the impact of new genetic discoveries. Until we have a better

  9. Data Mining and Knowledge Discovery via Logic-Based Methods

    CERN Document Server

    Triantaphyllou, Evangelos

    2010-01-01

    There are many approaches to data mining and knowledge discovery (DM&KD), including neural networks, closest neighbor methods, and various statistical methods. This monograph, however, focuses on the development and use of a novel approach, based on mathematical logic, that the author and his research associates have worked on over the last 20 years. The methods presented in the book deal with key DM&KD issues in an intuitive manner and in a natural sequence. Compared to other DM&KD methods, those based on mathematical logic offer a direct and often intuitive approach for extracting easily int

  10. An integrated approach to blood-based cancer diagnosis and biomarker discovery.

    Science.gov (United States)

    Min, Martin Renqiang; Chowdhury, Salim; Qi, Yanjun; Stewart, Alex; Ostroff, Rachel

    2014-01-01

    Disrupted or abnormal biological processes responsible for cancers often quantitatively manifest as disrupted additive and multiplicative interactions of gene/protein expressions correlating with cancer progression. However, the examination of all possible combinatorial interactions between gene features in most case-control studies with limited training data is computationally infeasible. In this paper, we propose a practically feasible data integration approach, QUIRE (QUadratic Interactions among infoRmative fEatures), to identify discriminative complex interactions among informative gene features for cancer diagnosis and biomarker discovery directly based on patient blood samples. QUIRE works in two stages, where it first identifies functionally relevant gene groups for the disease with the help of gene functional annotations and available physical protein interactions, then it explores the combinatorial relationships among the genes from the selected informative groups. Based on our private experimentally generated data from patient blood samples using a novel SOMAmer (Slow Off-rate Modified Aptamer) technology, we apply QUIRE to cancer diagnosis and biomarker discovery for Renal Cell Carcinoma (RCC) and Ovarian Cancer (OVC). To further demonstrate the general applicability of our approach, we also apply QUIRE to a publicly available Colorectal Cancer (CRC) dataset that can be used to prioritize our SOMAmer design. Our experimental results show that QUIRE identifies gene-gene interactions that can better identify the different cancer stages of samples, as compared to other state-of-the-art feature selection methods. A literature survey shows that many of the interactions identified by QUIRE play important roles in the development of cancer.

  11. Pine Gene Discovery Project - Final Report - 08/31/1997 - 02/28/2001

    Energy Technology Data Exchange (ETDEWEB)

    Whetten, R. W.; Sederoff, R. R.; Kinlaw, C.; Retzel, E.

    2001-04-30

    Integration of pines into the large scope of plant biology research depends on study of pines in parallel with study of annual plants, and on availability of research materials from pine to plant biologists interested in comparing pine with annual plant systems. The objectives of the Pine Gene Discovery Project were to obtain 10,000 partial DNA sequences of genes expressed in loblolly pine, to determine which of those pine genes were similar to known genes from other organisms, and to make the DNA sequences and isolated pine genes available to plant researchers to stimulate integration of pines into the wider scope of plant biology research. Those objectives have been completed, and the results are available to the public. Requests for pine genes have been received from a number of laboratories that would otherwise not have included pine in their research, indicating that progress is being made toward the goal of integrating pine research into the larger molecular biology research community.

  12. Melody-based knowledge discovery in musical pieces

    Science.gov (United States)

    Rybnik, Mariusz; Jastrzebska, Agnieszka

    2016-06-01

    The paper focuses on automated knowledge discovery in musical pieces, based on transformations of digital musical notation. Usually a single musical piece is analyzed to discover its structure as well as the traits of separate voices. Melody and rhythm are processed with three proposed operators that serve as meta-data. In this work we focus on melody, so the processed data is labeled using fuzzy labels created for detecting various voice characteristics. A comparative analysis of two musical pieces may also be performed, comparing them in terms of various rhythmic or melodic traits (as a whole or with voice separation).

  13. A Metadata based Knowledge Discovery Methodology for Seeding Translational Research.

    Science.gov (United States)

    Kothari, Cartik R; Payne, Philip R O

    2015-01-01

    In this paper, we present a semantic, metadata based knowledge discovery methodology for identifying teams of researchers from diverse backgrounds who can collaborate on interdisciplinary research projects: projects in areas that have been identified as high-impact areas at The Ohio State University. This methodology involves the semantic annotation of keywords and the postulation of semantic metrics to improve the efficiency of the path exploration algorithm as well as to rank the results. Results indicate that our methodology can discover groups of experts from diverse areas who can collaborate on translational research projects.

  14. TCM-based new drug discovery and development in China.

    Science.gov (United States)

    Wu, Wan-Ying; Hou, Jin-Jun; Long, Hua-Li; Yang, Wen-Zhi; Liang, Jian; Guo, De-An

    2014-04-01

    Over the past 30 years, China has significantly improved the drug development environment by establishing a series of policies for the regulation of new drug approval. The regulatory system for new drug evaluation and registration in China was gradually developed in accordance with international standards. The approval and registration of TCM in China became as strict as those of chemical drugs and biological products. In this review, TCM-based new drug discovery and development are introduced according to the TCM classification of nine categories.

  15. Functional Gene Discovery and Characterization of Genes and Alleles Affecting Wood Biomass Yield and Quality in Populus

    Energy Technology Data Exchange (ETDEWEB)

    Busov, Victor [Michigan Technological Univ., Houghton, MI (United States)]

    2017-02-12

    Adoption of biofuels as an economically and environmentally viable alternative to fossil fuels would require development of specialized bioenergy varieties. A major goal in the breeding of such varieties is the improvement of lignocellulosic biomass yield and quality. These are complex traits, and understanding the underpinning molecular mechanisms can assist and accelerate their improvement. This is particularly important for tree bioenergy crops like poplars (species and hybrids from the genus Populus), for which breeding progress is extremely slow due to long generation cycles. A variety of approaches have already been undertaken to better understand the molecular bases of biomass yield and quality in poplar. An obvious void in these undertakings has been the application of mutagenesis. Mutagenesis has been instrumental in the discovery and characterization of many plant traits, including those that affect biomass yield and quality. In this proposal we use activation tagging to discover genes that can significantly affect biomass-associated traits directly in poplar, a premier bioenergy crop. We screened a population of 5,000 independent poplar activation tagging lines under greenhouse conditions for a battery of biomass yield traits. These same plants were then analyzed for changes in wood chemistry using pyMBMS. As a result of these screens we have identified nearly 800 mutants, which are significantly (P<0.05) different when compared to wild type. Of these, the majority (~700) are affected in one of ten different biomass yield traits and 100 in biomass quality traits (e.g., lignin, S/G ratio and C6/C5 sugars). We successfully recovered the position of the tag in approximately 130 lines, showed activation in nearly half of them and performed recapitulation experiments with 20 genes prioritized by the significance of the phenotype. Recapitulation experiments are still ongoing for many of the genes but the results are encouraging. For example, we have shown successful

  16. From mouse to humans: discovery of the CACNG2 pain susceptibility gene.

    Science.gov (United States)

    Nissenbaum, J

    2012-10-01

    Chronic pain is a major healthcare problem affecting the daily lives of millions, with enormous financial costs. The notorious variability of pain and the lack of efficient pain relief pharmaceuticals pose both genetic and therapeutic challenges. Several genetic approaches aim to break pain phenotypes down into their genetic components. Gene mapping using model organisms for various pain phenotypes has led to the identification of novel genes affecting susceptibility and response to pain stimuli. Translational studies have succeeded in tying those genes to human pain syndromes, thus suggesting new targets for drug discovery. This short review introduces a perspective on pain genetics and the trajectory from pain phenotype to pain gene, involving fine-mapping strategies, bioinformatic analysis and microarray profiling alongside human association analysis. This integrated approach has led to the identification of CACNG2 as a novel neuropathic pain gene affecting pain susceptibility in both mice and humans. It also serves as a prototype for efficient and economic discovery of pain genes. Comparisons to other methods and future directions of pain genetics are also discussed.

  17. Aptamer-based multiplexed proteomic technology for biomarker discovery.

    Directory of Open Access Journals (Sweden)

    Larry Gold

    Full Text Available BACKGROUND: The interrogation of proteomes ("proteomics") in a highly multiplexed and efficient manner remains a coveted and challenging goal in biology and medicine. METHODOLOGY/PRINCIPAL FINDINGS: We present a new aptamer-based proteomic technology for biomarker discovery capable of simultaneously measuring thousands of proteins from small sample volumes (15 µL of serum or plasma). Our current assay measures 813 proteins with low limits of detection (1 pM median), 7 logs of overall dynamic range (~100 fM-1 µM), and 5% median coefficient of variation. This technology is enabled by a new generation of aptamers that contain chemically modified nucleotides, which greatly expand the physicochemical diversity of the large randomized nucleic acid libraries from which the aptamers are selected. Proteins in complex matrices such as plasma are measured with a process that transforms a signature of protein concentrations into a corresponding signature of DNA aptamer concentrations, which is quantified on a DNA microarray. Our assay takes advantage of the dual nature of aptamers as both folded protein-binding entities with defined shapes and unique nucleotide sequences recognizable by specific hybridization probes. To demonstrate the utility of our proteomics biomarker discovery technology, we applied it to a clinical study of chronic kidney disease (CKD). We identified two well known CKD biomarkers as well as an additional 58 potential CKD biomarkers. These results demonstrate the potential utility of our technology to rapidly discover unique protein signatures characteristic of various disease states. CONCLUSIONS/SIGNIFICANCE: We describe a versatile and powerful tool that allows large-scale comparison of proteome profiles among discrete populations. This unbiased and highly multiplexed search engine will enable the discovery of novel biomarkers in a manner that is unencumbered by our incomplete knowledge of biology, thereby helping to advance the next

  18. Syn-lethality: an integrative knowledge base of synthetic lethality towards discovery of selective anticancer therapies.

    Science.gov (United States)

    Li, Xue-juan; Mishra, Shital K; Wu, Min; Zhang, Fan; Zheng, Jie

    2014-01-01

    Synthetic lethality (SL) is a novel strategy for anticancer therapies, whereby mutations of two genes will kill a cell but mutation of a single gene will not. Therefore, a cancer-specific mutation combined with a drug-induced mutation, if they have SL interactions, will selectively kill cancer cells. While numerous SL interactions have been identified in yeast, only a few are known in humans. There is a pressing need to systematically discover and understand SL interactions specific to human cancer. In this paper, we present Syn-Lethality, the first integrative knowledge base of SL that is dedicated to human cancer. It integrates experimentally discovered and verified human SL gene pairs into a network, associated with annotations of gene function, pathway, and molecular mechanisms. It also includes yeast SL genes from high-throughput screenings which are mapped to orthologous human genes. Such an integrative knowledge base, organized as a relational database with a user interface for searching and network visualization, will greatly expedite the discovery of novel anticancer drug targets based on synthetic lethality interactions. The database can be downloaded as a stand-alone Java application.

  19. Knowledge based cluster ensemble for cancer discovery from biomolecular data.

    Science.gov (United States)

    Yu, Zhiwen; Wong, Hau-San; You, Jane; Yang, Qinmin; Liao, Hongying

    2011-06-01

    The adoption of microarray techniques in biological and medical research provides a new way for cancer diagnosis and treatment. In order to perform successful diagnosis and treatment of cancer, discovering and classifying cancer types correctly is essential. Class discovery is one of the most important tasks in cancer classification using biomolecular data. Most of the existing works adopt single clustering algorithms to perform class discovery from biomolecular data. However, single clustering algorithms have limitations, which include a lack of robustness, stability, and accuracy. In this paper, we propose a new cluster ensemble approach called knowledge based cluster ensemble (KCE) which incorporates the prior knowledge of the data sets into the cluster ensemble framework. Specifically, KCE represents the prior knowledge of a data set in the form of pairwise constraints. Then, the spectral clustering algorithm (SC) is adopted to generate a set of clustering solutions. Next, KCE transforms pairwise constraints into confidence factors for these clustering solutions. After that, a consensus matrix is constructed by considering all the clustering solutions and their corresponding confidence factors. The final clustering result is obtained by partitioning the consensus matrix. Compared with single clustering algorithms and conventional cluster ensemble approaches, knowledge based cluster ensemble approaches are more robust, stable and accurate. The experiments on cancer data sets show that: 1) KCE works well on these data sets; 2) KCE not only outperforms most of the state-of-the-art single clustering algorithms, but also outperforms most of the state-of-the-art cluster ensemble approaches.
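
    The middle steps of the pipeline above (pairwise constraints, confidence factors, consensus matrix) can be sketched as follows. This is an illustration under stated assumptions, not the published KCE method: the confidence factor is assumed here to be the fraction of constraints a clustering solution satisfies, the clustering solutions are hypothetical label lists rather than spectral clustering output, and the final partitioning of the consensus matrix is omitted.

```python
def confidence(labels, must_link, cannot_link):
    """Assumed confidence factor: fraction of pairwise constraints satisfied."""
    satisfied = sum(labels[i] == labels[j] for i, j in must_link)
    satisfied += sum(labels[i] != labels[j] for i, j in cannot_link)
    total = len(must_link) + len(cannot_link)
    return satisfied / total if total else 1.0


def consensus_matrix(solutions, weights):
    """Weighted co-membership matrix over all clustering solutions.

    Entry [i][j] is the weight-normalized share of solutions that place
    samples i and j in the same cluster.
    """
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one solution needs a positive weight")
    n = len(solutions[0])
    matrix = [[0.0] * n for _ in range(n)]
    for labels, weight in zip(solutions, weights):
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    matrix[i][j] += weight / total
    return matrix


# Three hypothetical clustering solutions over four samples.
solutions = [[0, 0, 1, 1], [0, 0, 0, 1], [0, 1, 1, 1]]
must_link = [(0, 1)]    # prior knowledge: samples 0 and 1 belong together
cannot_link = [(0, 3)]  # prior knowledge: samples 0 and 3 differ

weights = [confidence(s, must_link, cannot_link) for s in solutions]
consensus = consensus_matrix(solutions, weights)
```

    In this toy example the third solution violates the must-link constraint, so it receives a weight of 0.5 and contributes less to the consensus than the other two.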

  20. Transcriptome profiling for discovery of genes involved in shoot apical meristem and flower development

    Directory of Open Access Journals (Sweden)

    Vikash K. Singh

    2014-12-01

    Full Text Available Flower development is one of the major developmental processes that governs seed setting in angiosperms. However, little is known about the molecular mechanisms underlying flower development in legumes. Employing RNA-seq for various stages of flower development and a few vegetative tissues in chickpea, we identified differentially expressed genes in flower tissues/stages in comparison to vegetative tissues, which are related to various biological processes and molecular functions during flower development. Here, we provide details of experimental methods, RNA-seq data (available at Gene Expression Omnibus database under GSE42679) and analysis pipeline published by Singh and colleagues in the Plant Biotechnology Journal (Singh et al., 2013), along with additional analysis for discovery of genes involved in shoot apical meristem (SAM) development. Our data provide a resource for exploring the complex molecular mechanisms underlying SAM and flower development and identification of gene targets for functional and applied genomics in legumes.

  1. Targeting metalloproteins by fragment-based lead discovery.

    Science.gov (United States)

    Johnson, Sherida; Barile, Elisa; Farina, Biancamaria; Purves, Angela; Wei, Jun; Chen, Li-Hsing; Shiryaev, Sergey; Zhang, Ziming; Rodionova, Irina; Agrawal, Arpita; Cohen, Seth M; Osterman, Andrei; Strongin, Alex; Pellecchia, Maurizio

    2011-08-01

    It has been estimated that nearly one-third of functional proteins contain a metal ion. These constitute a wide variety of possible drug targets including metalloproteinases, dehydrogenases, oxidoreductases, hydrolases, deacetylases, or many others in which the metal ion is either of catalytic or of structural nature. Despite the predominant role of a metal ion in so many classes of drug targets, current high-throughput screening techniques do not usually produce viable hits against these proteins, likely due to the lack of proper metal-binding pharmacophores in the current screening libraries. Herein, we describe a novel fragment-based drug discovery approach using a metal-targeting fragment library that is based on a variety of distinct classes of metal-binding groups designed to reliably anchor the fragments at the target's metal ions. We show that the approach can effectively identify novel, potent and selective agents that can be readily developed into metalloprotein-targeted therapeutics.

  2. Exceptional knowledge discovery in databases based on information theory

    Energy Technology Data Exchange (ETDEWEB)

    Suzuki, Einoshin [Yokohama National Univ. (Japan)]; Shimura, Masamichi [Tokyo Inst. of Technology (Japan)]

    1996-12-31

    This paper presents an algorithm for discovering exceptional knowledge from databases. Exceptional knowledge, which is defined as an exception to a general fact, exhibits unexpectedness and is sometimes extremely useful in spite of its obscurity. Previous discovery approaches for this type of knowledge employ either background knowledge or domain-specific criteria for evaluating the possible usefulness, i.e. the interestingness of the knowledge extracted from a database. It has been pointed out, however, that these approaches are prone to overlook useful knowledge. In order to circumvent these difficulties, we propose an information-theoretic approach in which we obtain exceptional knowledge associated with general knowledge in the form of a rule pair using a depth-first search method. The product of the ACEs (Average Compressed Entropies) of the rule pair is introduced as the criterion for evaluating the interestingness of exceptional knowledge. The inefficiency of depth-first search is alleviated by a branch-and-bound method, which exploits the upper-bound for the product of the ACEs. MEPRO, which is a knowledge discovery system based on our approach, has been validated using the benchmark databases in the machine learning community.

  3. Discovery and development of DNA methylation-based biomarkers for lung cancer.

    Science.gov (United States)

    Walter, Kimberly; Holcomb, Thomas; Januario, Tom; Yauch, Robert L; Du, Pan; Bourgon, Richard; Seshagiri, Somasekar; Amler, Lukas C; Hampton, Garret M; S Shames, David

    2014-02-01

    Lung cancer remains the primary cause of cancer-related deaths worldwide. Improved tools for early detection and therapeutic stratification would be expected to increase the survival rate for this disease. Alterations in the molecular pathways that drive lung cancer, which include epigenetic modifications, may provide biomarkers to help address this major unmet clinical need. Epigenetic changes, which are defined as heritable changes in gene expression that do not alter the primary DNA sequence, are one of the hallmarks of cancer, and prevalent in all types of cancer. These modifications represent a rich source of biomarkers that have the potential to be implemented in clinical practice. This perspective describes recent advances in the discovery of epigenetic biomarkers in lung cancer, specifically those that result in the methylation of DNA at CpG sites. We discuss one approach for methylation-based biomarker assay development that describes the discovery at a genome-scale level, which addresses some of the practical considerations for design of assays that can be implemented in the clinic. We emphasize that an integrated technological approach will enable the development of clinically useful DNA methylation-based biomarker assays. While this article focuses on current literature and primary research findings in lung cancer, the principles we describe here apply to the discovery and development of epigenetic biomarkers for other types of cancer.

  4. Parallel Density-Based Clustering for Discovery of Ionospheric Phenomena

    Science.gov (United States)

    Pankratius, V.; Gowanlock, M.; Blair, D. M.

    2015-12-01

    Ionospheric total electron content maps derived from global networks of dual-frequency GPS receivers can reveal a plethora of ionospheric features in real-time and are key to space weather studies and natural hazard monitoring. However, growing data volumes from expanding sensor networks are making manual exploratory studies challenging. As the community is heading towards Big Data ionospheric science, automation and Computer-Aided Discovery become indispensable tools for scientists. One problem of machine learning methods is that they require domain-specific adaptations in order to be effective and useful for scientists. Addressing this problem, our Computer-Aided Discovery approach allows scientists to express various physical models as well as perturbation ranges for parameters. The search space is explored through an automated system and parallel processing of batched workloads, which finds corresponding matches and similarities in empirical data. We discuss density-based clustering as a particular method we employ in this process. Specifically, we adapt Density-Based Spatial Clustering of Applications with Noise (DBSCAN). This algorithm groups geospatial data points based on density. Clusters of points can be of arbitrary shape, and the number of clusters is not predetermined by the algorithm; only two input parameters need to be specified: (1) a distance threshold, (2) a minimum number of points within that threshold. We discuss an implementation of DBSCAN for batched workloads that is amenable to parallelization on manycore architectures such as Intel's Xeon Phi accelerator with 60+ general-purpose cores. This manycore parallelization can cluster large volumes of ionospheric total electron content data quickly. Potential applications for cluster detection include the visualization, tracing, and examination of traveling ionospheric disturbances or other propagating phenomena. Acknowledgments. We acknowledge support from NSF ACI-1442997 (PI V. Pankratius).
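
    The two DBSCAN input parameters named above can be seen in a minimal sequential sketch. It is a textbook-style version, not the batched manycore implementation described in the abstract; the parameter names `eps` and `min_pts` and the sample points are illustrative assumptions, and neighbor search is a brute-force scan for clarity.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE.

    eps is the distance threshold; min_pts is the minimum number of points
    (the point itself included) within eps for a point to count as core.
    """
    labels = [None] * len(points)

    def neighbors(i):
        # Brute-force O(n) scan; a spatial index would replace this at scale.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is core: keep expanding
    return labels


# Two dense groups and one isolated point (hypothetical coordinates).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

    The number of clusters is never passed in: it falls out of the density structure, and the isolated point is reported as noise rather than forced into a cluster.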

  5. Music snippet extraction via melody-based repeated pattern discovery

    Institute of Scientific and Technical Information of China (English)

    XU JiePing; ZHAO Yang; CHEN Zhe; LIU ZiLi

    2009-01-01

    In this paper, we present a complete set of procedures to automatically extract a music snippet, defined as the most representative or the highlighted excerpt of a music clip. We first generate a modified and compact similarity matrix based on selected features and distance metrics, and then several improved techniques for music repeated pattern discovery are utilized because a music snippet is usually a part of the repeated melody, main theme or chorus. During the process, redundant and wrongly detected patterns are discarded, boundaries are corrected using beat information, and final clusters are also further sorted according to the occurrence frequency and energy information. Subsequently, following our methods, we designed a music snippet extraction system which allows users to detect snippets. Experiments performed on the system show the superiority of our proposed approach.

  6. Native Mass Spectrometry in Fragment-Based Drug Discovery

    Directory of Open Access Journals (Sweden)

    Liliana Pedro

    2016-07-01

    Full Text Available The advent of native mass spectrometry (MS) in 1990 led to the development of new mass spectrometry instrumentation and methodologies for the analysis of noncovalent protein–ligand complexes. Native MS has matured to become a fast, simple, highly sensitive and automatable technique with well-established utility for fragment-based drug discovery (FBDD). Native MS has the capability to directly detect weak ligand binding to proteins, to determine stoichiometry, relative or absolute binding affinities and specificities. Native MS can be used to delineate ligand-binding sites, to elucidate mechanisms of cooperativity and to study the thermodynamics of binding. This review highlights key attributes of native MS for FBDD campaigns.

  7. Systematic discovery of unannotated genes in 11 yeast species using a database of orthologous genomic segments

    LENUS (Irish Health Repository)

    OhEigeartaigh, Sean S

    2011-07-26

    Abstract Background In standard BLAST searches, no information other than the sequences of the query and the database entries is considered. However, in situations where two genes from different species have only borderline similarity in a BLAST search, the discovery that the genes are located within a region of conserved gene order (synteny) can provide additional evidence that they are orthologs. Thus, for interpreting borderline search results, it would be useful to know whether the syntenic context of a database hit is similar to that of the query. This principle has often been used in investigations of particular genes or genomic regions, but to our knowledge it has never been implemented systematically. Results We made use of the synteny information contained in the Yeast Gene Order Browser database for 11 yeast species to carry out a systematic search for protein-coding genes that were overlooked in the original annotations of one or more yeast genomes but which are syntenic with their orthologs. Such genes tend to have been overlooked because they are short, highly divergent, or contain introns. The key features of our software - called SearchDOGS - are that the database entries are classified into sets of genomic segments that are already known to be orthologous, and that very weak BLAST hits are retained for further analysis if their genomic location is similar to that of the query. Using SearchDOGS we identified 595 additional protein-coding genes among the 11 yeast species, including two new genes in Saccharomyces cerevisiae. We found additional genes for the mating pheromone a-factor in six species including Kluyveromyces lactis. Conclusions SearchDOGS has proven highly successful for identifying overlooked genes in the yeast genomes. We anticipate that our approach can be adapted for study of further groups of species, such as bacterial genomes. More generally, the concept of doing sequence similarity searches against databases to which external

  8. Gene Discovery of Modular Diterpene Metabolism in Nonmodel Systems

    Science.gov (United States)

    Zerbe, Philipp; Hamberger, Björn; Yuen, Macaire M.S.; Chiang, Angela; Sandhu, Harpreet K.; Madilao, Lina L.; Nguyen, Anh; Hamberger, Britta; Bach, Søren Spanner; Bohlmann, Jörg

    2013-01-01

    Plants produce over 10,000 different diterpenes of specialized (secondary) metabolism, and fewer diterpenes of general (primary) metabolism. Specialized diterpenes may have functions in ecological interactions of plants with other organisms and also benefit humanity as pharmaceuticals, fragrances, resins, and other industrial bioproducts. Examples of high-value diterpenes are taxol and forskolin pharmaceuticals or ambroxide fragrances. Yields and purity of diterpenes obtained from natural sources or by chemical synthesis are often insufficient for large-volume or high-end applications. Improvement of agricultural or biotechnological diterpene production requires knowledge of biosynthetic genes and enzymes. However, specialized diterpene pathways are extremely diverse across the plant kingdom, and most specialized diterpenes are taxonomically restricted to a few plant species, genera, or families. Consequently, there is no single reference system to guide gene discovery and rapid annotation of specialized diterpene pathways. Functional diversification of genes and plasticity of enzyme functions of these pathways further complicate correct annotation. To address this challenge, we used a set of 10 different plant species to develop a general strategy for diterpene gene discovery in nonmodel systems. The approach combines metabolite-guided transcriptome resources, custom diterpene synthase (diTPS) and cytochrome P450 reference gene databases, phylogenies, and, as shown for select diTPSs, single and coupled enzyme assays using microbial and plant expression systems. In the 10 species, we identified 46 new diTPS candidates and over 400 putatively terpenoid-related P450s in a resource of nearly 1 million predicted transcripts of diterpene-accumulating tissues. Phylogenetic patterns of lineage-specific blooms of genes guided functional characterization. PMID:23613273

  9. FORGE Canada Consortium: outcomes of a 2-year national rare-disease gene-discovery project.

    Science.gov (United States)

    Beaulieu, Chandree L; Majewski, Jacek; Schwartzentruber, Jeremy; Samuels, Mark E; Fernandez, Bridget A; Bernier, Francois P; Brudno, Michael; Knoppers, Bartha; Marcadier, Janet; Dyment, David; Adam, Shelin; Bulman, Dennis E; Jones, Steve J M; Avard, Denise; Nguyen, Minh Thu; Rousseau, Francois; Marshall, Christian; Wintle, Richard F; Shen, Yaoqing; Scherer, Stephen W; Friedman, Jan M; Michaud, Jacques L; Boycott, Kym M

    2014-06-01

    Inherited monogenic disease has an enormous impact on the well-being of children and their families. Over half of the children living with one of these conditions are without a molecular diagnosis because of the rarity of the disease, the marked clinical heterogeneity, and the reality that there are thousands of rare diseases for which causative mutations have yet to be identified. It is in this context that in 2010 a Canadian consortium was formed to rapidly identify mutations causing a wide spectrum of pediatric-onset rare diseases by using whole-exome sequencing. The FORGE (Finding of Rare Disease Genes) Canada Consortium brought together clinicians and scientists from 21 genetics centers and three science and technology innovation centers from across Canada. From nation-wide requests for proposals, 264 disorders were selected for study from the 371 submitted; disease-causing variants (including in 67 genes not previously associated with human disease; 41 of these have been genetically or functionally validated, and 26 are currently under study) were identified for 146 disorders over a 2-year period. Here, we present our experience with four strategies employed for gene discovery and discuss FORGE's impact in a number of realms, from clinical diagnostics to the broadening of the phenotypic spectrum of many diseases to the biological insight gained into both disease states and normal human development. Lastly, on the basis of this experience, we discuss the way forward for rare-disease genetic discovery both in Canada and internationally.

  10. A Review of Whole-Exome Sequencing Efforts Toward Hereditary Breast Cancer Susceptibility Gene Discovery.

    Science.gov (United States)

    Chandler, Madison R; Bilgili, Erin P; Merner, Nancy D

    2016-09-01

    Inherited genetic risk factors contribute toward breast cancer (BC) onset. BC risk variants can be divided into three categories of penetrance (high, moderate, and low) that reflect the probability of developing the disease. Traditional BC susceptibility gene discovery approaches that searched for high- and moderate-risk variants in familial BC cases have had limited success; to date, these risk variants explain only ∼30% of familial BC cases. Next-generation sequencing technologies can be used to search for novel high and moderate BC risk variants, and this manuscript reviews 12 familial BC whole-exome sequencing efforts. Study design, filtering strategies, and segregation and validation analyses are discussed. Overall, only a modest number of novel BC risk genes were identified, and 90% and 97% of the exome-sequenced families and cases, respectively, had no BC risk variants reported. It is important to learn from these studies and consider alternate strategies in order to make further advances. The discovery of new BC susceptibility genes is critical for improved risk assessment and to provide insight toward disease mechanisms for the development of more effective therapies.

  11. MAGIC Database and Interfaces: An Integrated Package for Gene Discovery and Expression

    Directory of Open Access Journals (Sweden)

    Lee H. Pratt

    2006-03-01

    Full Text Available The rapidly increasing rate at which biological data is being produced requires a corresponding growth in relational databases and associated tools that can help laboratories contend with that data. With this need in mind, we describe here a Modular Approach to a Genomic, Integrated and Comprehensive (MAGIC) Database. This Oracle 9i database derives from an initial focus in our laboratory on gene discovery via production and analysis of expressed sequence tags (ESTs), and subsequently on gene expression as assessed by both EST clustering and microarrays. The MAGIC Gene Discovery portion of the database focuses on information derived from DNA sequences and on its biological relevance. In addition to MAGIC SEQ-LIMS, which is designed to support activities in the laboratory, it contains several additional subschemas. The latter include MAGIC Admin for database administration, MAGIC Sequence for sequence processing as well as sequence and clone attributes, MAGIC Cluster for the results of EST clustering, MAGIC Polymorphism in support of microsatellite and single-nucleotide-polymorphism discovery, and MAGIC Annotation for electronic annotation by BLAST and BLAT. The MAGIC Microarray portion is a MIAME-compliant database with two components at present. These are MAGIC Array-LIMS, which makes possible remote entry of all information into the database, and MAGIC Array Analysis, which provides data mining and visualization. Because all aspects of interaction with the MAGIC Database are via a web browser, it is ideally suited not only for individual research laboratories but also for core facilities that serve clients at any distance.

  12. Inherited retinal diseases in dogs: advances in gene/mutation discovery.

    Science.gov (United States)

    Miyadera, Keiko

    Inherited retinal diseases (RDs) are vision-threatening conditions affecting humans as well as many domestic animals. Through many years of clinical studies of the domestic dog population, a wide array of RDs has been phenotypically characterized. Extensive effort to map the causative gene and to identify the underlying mutation followed. Through candidate gene studies, linkage analysis, genome-wide association studies, and more recently, by means of next-generation sequencing, as many as 31 mutations in 24 genes have been identified as the underlying cause for canine RDs. Most of these genes have been associated with human RDs, providing opportunities to study their roles in the disease pathogenesis and in normal visual function. The canine model has also contributed to developing new treatments such as gene therapy, which has been clinically applied to human patients. Meanwhile, with increasing knowledge of the molecular architecture of RDs in different subpopulations of dogs, the conventional understanding of RDs as a simple monogenic disease is beginning to change. Emerging evidence of modifiers that alter the disease outcome is complicating the interpretation of DNA tests. In this review, advances in the gene/mutation discovery approaches and the emerging genetic complexity of canine RDs are discussed.

  13. KBERG: KnowledgeBase for Estrogen Responsive Genes

    DEFF Research Database (Denmark)

    Tang, Suisheng; Zhang, Zhuo; Tan, Sin Lam;

    2007-01-01

    Estrogen has a profound impact on human physiology affecting transcription of numerous genes. To decipher functional characteristics of estrogen responsive genes, we developed KnowledgeBase for Estrogen Responsive Genes (KBERG). Genes in KBERG were derived from Estrogen Responsive Gene Database...... (ERGDB) and were analyzed from multiple aspects. We explored the possible transcription regulation mechanism by capturing highly conserved promoter motifs across orthologous genes, using promoter regions that cover the range of [-1200, +500] relative to the transcription start sites. The motif detection...... is based on ab initio discovery of common cis-elements from the orthologous gene cluster from human, mouse and rat, thus reflecting a degree of promoter sequence preservation during evolution. The identified motifs are linked to transcription factor binding sites based on the TRANSFAC database. In addition...

  14. Proxy-Based IPv6 Neighbor Discovery Scheme for Wireless LAN Based Mesh Networks

    Science.gov (United States)

    Lee, Jihoon; Jeon, Seungwoo; Kim, Jaehoon

    Multi-hop Wireless LAN-based mesh network (WMN) provides high capacity and self-configuring capabilities. Due to data forwarding and path selection based on MAC address, WMN requires additional operations to achieve global connectivity using IPv6 address. The neighbor discovery operation over WLAN mesh networks requires repeated all-node broadcasting and this gives rise to a big burden in the entire mesh networks. In this letter, we propose the proxy neighbor discovery scheme for optimized IPv6 communication over WMN to reduce network overhead and communication latency. Using simulation experiments, we show that the control overhead and communication setup latency can be significantly reduced using the proxy-based neighbor discovery mechanism.

  15. Genome-based discovery, structure prediction and functional analysis of cyclic lipopeptide antibiotics in Pseudomonas species.

    Science.gov (United States)

    de Bruijn, Irene; de Kock, Maarten J D; Yang, Meng; de Waard, Pieter; van Beek, Teris A; Raaijmakers, Jos M

    2007-01-01

    Analysis of microbial genome sequences has revealed numerous genes involved in antibiotic biosynthesis. In Pseudomonads, several gene clusters encoding non-ribosomal peptide synthetases (NRPSs) were predicted to be involved in the synthesis of cyclic lipopeptide (CLP) antibiotics. Most of these predictions, however, are untested and the association between genome sequence and biological function of the predicted metabolite is lacking. Here we report the genome-based identification of previously unknown CLP gene clusters in plant pathogenic Pseudomonas syringae strains B728a and DC3000 and in plant beneficial Pseudomonas fluorescens Pf0-1 and SBW25. For P. fluorescens SBW25, a model strain in studying bacterial evolution and adaptation, the structure of the CLP with a predicted 9-amino acid peptide moiety was confirmed by chemical analyses. Mutagenesis confirmed that the three identified NRPS genes are essential for CLP synthesis in strain SBW25. CLP production was shown to play a key role in motility, biofilm formation and in activity of SBW25 against zoospores of Phytophthora infestans. This is the first time that an antimicrobial metabolite has been identified from strain SBW25. The results indicate that genome mining may enable the discovery of unknown gene clusters and traits that are highly relevant in the lifestyle of plant beneficial and plant pathogenic bacteria.

  16. Evaluation of gene association methods for coexpression network construction and biological knowledge discovery.

    Directory of Open Access Journals (Sweden)

    Sapna Kumari

    Full Text Available BACKGROUND: Constructing coexpression networks and performing network analysis using large-scale gene expression data sets is an effective way to uncover new biological knowledge; however, the methods used for gene association in constructing these coexpression networks have not been thoroughly evaluated. Since different methods lead to structurally different coexpression networks and provide different information, selecting the optimal gene association method is critical. METHODS AND RESULTS: In this study, we compared eight gene association methods - Spearman rank correlation, Weighted Rank Correlation, Kendall, Hoeffding's D measure, Theil-Sen, Rank Theil-Sen, Distance Covariance, and Pearson - and focused on their true knowledge discovery rates in associating pathway genes and constructing coordination networks of regulatory genes. We also examined how the different methods respond to microarray data with different properties, and whether the biological processes affect the efficiency of different methods. CONCLUSIONS: We found that the Spearman, Hoeffding and Kendall methods are effective in identifying coexpressed pathway genes, whereas the Theil-Sen, Rank Theil-Sen, Spearman, and Weighted Rank methods perform well in identifying coordinated transcription factors that control the same biological processes and traits. Surprisingly, the widely used Pearson method is generally less efficient, and so is the Distance Covariance method that can find gene pairs of multiple relationships. Some of our analyses clearly show that the Pearson and Distance Covariance methods behave differently from the other six methods. The efficiencies of different methods vary with the data properties to some degree and are largely contingent upon the biological processes, which necessitates a pre-analysis to identify the best performing method for gene association and coexpression network construction.
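
    To illustrate why rank-based association measures can behave differently from Pearson correlation, the sketch below computes Pearson, Spearman and Kendall (tau-a) coefficients for two hypothetical expression profiles that are related monotonically but not linearly. The profiles `gene_a` and `gene_b` and all function names are assumptions made for illustration; this is not the evaluation pipeline used in the study.

```python
from statistics import mean


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)


# Hypothetical expression profiles: gene_b is a cubic function of gene_a.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_b = [v ** 3 for v in gene_a]

print("Pearson :", pearson(gene_a, gene_b))
print("Spearman:", spearman(gene_a, gene_b))
print("Kendall :", kendall_tau_a(gene_a, gene_b))
```

    In this toy case the rank-based coefficients report a perfect monotone association while Pearson falls below 1, which is one way the choice of method can change which gene pairs are linked in a network.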

  17. Next-generation diagnostics and disease-gene discovery with the Exomiser.

    Science.gov (United States)

    Smedley, Damian; Jacobsen, Julius O B; Jäger, Marten; Köhler, Sebastian; Holtgrewe, Manuel; Schubach, Max; Siragusa, Enrico; Zemojtel, Tomasz; Buske, Orion J; Washington, Nicole L; Bone, William P; Haendel, Melissa A; Robinson, Peter N

    2015-12-01

    Exomiser is an application that prioritizes genes and variants in next-generation sequencing (NGS) projects for novel disease-gene discovery or differential diagnostics of Mendelian disease. Exomiser comprises a suite of algorithms for prioritizing exome sequences using random-walk analysis of protein interaction networks, clinical relevance and cross-species phenotype comparisons, as well as a wide range of other computational filters for variant frequency, predicted pathogenicity and pedigree analysis. In this protocol, we provide a detailed explanation of how to install Exomiser and use it to prioritize exome sequences in a number of scenarios. Exomiser requires ∼3 GB of RAM and roughly 15-90 s of computing time on a standard desktop computer to analyze a variant call format (VCF) file. Exomiser is freely available for academic use from http://www.sanger.ac.uk/science/tools/exomiser.

  18. TargetMine, an integrated data warehouse for candidate gene prioritisation and target discovery.

    Directory of Open Access Journals (Sweden)

    Yi-An Chen

    Full Text Available Prioritising candidate genes for further experimental characterisation is a non-trivial challenge in drug discovery and biomedical research in general. An integrated approach that combines results from multiple data types is best suited for optimal target selection. We developed TargetMine, a data warehouse for efficient target prioritisation. TargetMine utilises the InterMine framework, with new data models such as protein-DNA interactions integrated in a novel way. It enables complicated searches that are difficult to perform with existing tools and it also offers integration of custom annotations and in-house experimental data. We proposed an objective protocol for target prioritisation using TargetMine and set up a benchmarking procedure to evaluate its performance. The results show that the protocol can identify known disease-associated genes with high precision and coverage. A demonstration version of TargetMine is available at http://targetmine.nibio.go.jp/.

  19. INTELLIGENT SEARCH ENGINE-BASED UNIVERSAL DESCRIPTION, DISCOVERY AND INTEGRATION FOR WEB SERVICE DISCOVERY

    Directory of Open Access Journals (Sweden)

    Tamilarasi Karuppiah

    2014-01-01

    Full Text Available The Web Services standard has been broadly acknowledged by industry and academic research along with the progress of web technology and e-business. An increasing number of web applications have been bundled as web services that can be published, positioned and invoked across the web. The issues surrounding their publication and innovation grow in importance as web services multiply and become more advanced and mutually dependent. With the intention of discovering web services effectively and within a minimal time period, this study proposes a Universal Description, Discovery and Integration (UDDI) registry with an intelligent search engine. Web services are first published in the UDDI registry, and the published web services are then indexed. To improve the efficiency of discovery, the indexed web services are saved in an index database. A search query is compared with the index database to discover web services, and the discovered web services are returned to the service customer. The way the web services are accessed is stored in a log file, which is then used to provide personalized web services to the user. The proposed system significantly enhances web service discovery through an efficient search capability and is capable of returning the most appropriate web service.

  20. Evolutionary signatures amongst disease genes permit novel methods for gene prioritization and construction of informative gene-based networks.

    Directory of Open Access Journals (Sweden)

    Nolan Priedigkeit

    2015-02-01

    Full Text Available Genes involved in the same function tend to have similar evolutionary histories, in that their rates of evolution covary over time. This coevolutionary signature, termed Evolutionary Rate Covariation (ERC), is calculated using only gene sequences from a set of closely related species and has demonstrated potential as a computational tool for inferring functional relationships between genes. To further define applications of ERC, we first established that roughly 55% of genetic diseases possess an ERC signature between their contributing genes. At a false discovery rate of 5% we report 40 such diseases including cancers, developmental disorders and mitochondrial diseases. Given these coevolutionary signatures between disease genes, we then assessed ERC's ability to prioritize known disease genes out of a list of unrelated candidates. We found that in the presence of an ERC signature, the true disease gene is effectively prioritized to the top 6% of candidates on average. We then apply this strategy to a melanoma-associated region on chromosome 1 and identify MCL1 as a potential causative gene. Furthermore, to gain global insight into disease mechanisms, we used ERC to predict molecular connections between 310 nominally distinct diseases. The resulting "disease map" network associates several diseases with related pathogenic mechanisms and unveils many novel relationships between clinically distinct diseases, such as between Hirschsprung's disease and melanoma. Taken together, these results demonstrate the utility of molecular evolution as a gene discovery platform and show that evolutionary signatures can be used to build informative gene-based networks.

  1. Targeted SNP discovery in Atlantic salmon (Salmo salar) genes using a 3'UTR-primed SNP detection approach

    Directory of Open Access Journals (Sweden)

    Høyheim Bjørn

    2010-12-01

    Full Text Available Abstract Background Single nucleotide polymorphisms (SNPs) represent the most widespread type of DNA variation in vertebrates and may be used as genetic markers for a range of applications. This has led to an increased interest in identification of SNP markers in non-model species and farmed animals. The in silico SNP mining method used for discovery of most known SNPs in Atlantic salmon (Salmo salar) has applied a global (genome-wide) approach. In this study we present a targeted 3'UTR-primed SNP discovery strategy that utilizes sequence data from Salmo salar full length sequenced cDNAs (FLIcs). We compare the efficiency of this new strategy to the in silico SNP mining method when using both methods for targeted SNP discovery. Results The SNP discovery efficiency of the two methods was tested in a set of FLIc target genes. The 3'UTR-primed SNP discovery method detected novel SNPs in 35% of the target genes while the in silico SNP mining method detected novel SNPs in 15% of the target genes. Furthermore, the 3'UTR-primed SNP discovery strategy was the less labor intensive one and revealed a higher success rate than the in silico SNP mining method in the initial amplification step. When testing the methods we discovered 112 novel bi-allelic polymorphisms (type I markers) in 88 salmon genes [dbSNP: ss179319972-179320081, ss250608647-250608648], and three of the SNPs discovered were missense substitutions. Conclusions Full length insert cDNAs (FLIcs) are important genomic resources that have been developed in many farmed animals. The 3'UTR-primed SNP discovery strategy successfully utilized FLIc data to detect novel SNPs in the partially tetraploid Atlantic salmon. This strategy may therefore be useful for targeted SNP discovery in several species, and particularly useful in species that, like salmonids, have duplicated genomes.

  2. High-throughput gene and SNP discovery in Eucalyptus grandis, an uncharacterized genome

    Directory of Open Access Journals (Sweden)

    Pappas Georgios J

    2008-06-01

    Full Text Available Abstract Background Benefits from high-throughput sequencing using 454 pyrosequencing technology may be most apparent for species with high societal or economic value but few genomic resources. Rapid means of gene sequence and SNP discovery using this novel sequencing technology provide a set of baseline tools for genome-level research. However, it is questionable how effectively the sequencing of large numbers of short reads for species with essentially no prior gene sequence information will support contig assemblies and sequence annotation. Results With the purpose of generating the first broad survey of gene sequences in Eucalyptus grandis, the most widely planted hardwood tree species, we used 454 technology to sequence and assemble 148 Mbp of expressed sequences (EST). EST sequences were generated from a normalized cDNA pool comprised of multiple tissues and genotypes, promoting discovery of homologues to almost half of Arabidopsis genes, and a comprehensive survey of allelic variation in the transcriptome. By aligning the sequencing reads from multiple genotypes we detected 23,742 SNPs, 83% of which were validated in a sample. Genome-wide nucleotide diversity was estimated for 2,392 contigs using a modified theta (θ) parameter, adapted for measuring genetic diversity from polymorphisms detected by randomly sequencing a multi-genotype cDNA pool. Diversity estimates in non-synonymous nucleotides were on average 4x smaller than in synonymous, suggesting purifying selection. Non-synonymous to synonymous substitutions (Ka/Ks) among 2,001 contigs averaged 0.30 and was skewed to the right, further supporting that most genes are under purifying selection. Comparison of these estimates among contigs identified major functional classes of genes under purifying and diversifying selection in agreement with previous research. Conclusion In providing an abundance of foundational transcript sequences where limited prior genomic information existed, this

  3. Effector genomics accelerates discovery and functional profiling of potato disease resistance and phytophthora infestans avirulence genes.

    Directory of Open Access Journals (Sweden)

    Vivianne G A A Vleeshouwers

    Full Text Available Potato is the world's fourth largest food crop yet it continues to endure late blight, a devastating disease caused by the Irish famine pathogen Phytophthora infestans. Breeding broad-spectrum disease resistance (R) genes into potato (Solanum tuberosum) is the best strategy for genetically managing late blight but current approaches are slow and inefficient. We used a repertoire of effector genes predicted computationally from the P. infestans genome to accelerate the identification, functional characterization, and cloning of potentially broad-spectrum R genes. An initial set of 54 effectors containing a signal peptide and a RXLR motif was profiled for activation of innate immunity (avirulence or Avr activity) on wild Solanum species and tentative Avr candidates were identified. The RXLR effector family IpiO induced hypersensitive responses (HR) in S. stoloniferum, S. papita and the more distantly related S. bulbocastanum, the source of the R gene Rpi-blb1. Genetic studies with S. stoloniferum showed cosegregation of resistance to P. infestans and response to IpiO. Transient co-expression of IpiO with Rpi-blb1 in a heterologous Nicotiana benthamiana system identified IpiO as Avr-blb1. A candidate gene approach led to the rapid cloning of S. stoloniferum Rpi-sto1 and S. papita Rpi-pta1, which are functionally equivalent to Rpi-blb1. Our findings indicate that effector genomics enables discovery and functional profiling of late blight R genes and Avr genes at an unprecedented rate and promises to accelerate the engineering of late blight resistant potato varieties.

  4. Reconstructing Sessions from Data Discovery and Access Logs to Build a Semantic Knowledge Base for Improving Data Discovery

    Directory of Open Access Journals (Sweden)

    Yongyao Jiang

    2016-04-01

    Full Text Available Big geospatial data are archived and made available through online web discovery and access. However, finding the right data for scientific research and application development is still a challenge. This paper aims to improve data discovery by mining user knowledge from log files. Specifically, this paper focuses on user web session reconstruction as a critical step for extracting usage patterns. However, reconstructing user sessions from raw web logs has always been difficult, as a session identifier tends to be missing in most data portals. To address this problem, we propose two session identification methods: a time-clustering-based method and a time-referrer-based method. We also present the workflow of session reconstruction and discuss the approach of selecting appropriate thresholds for relevant steps in the workflow. The proposed session identification methods and workflow are shown to be able to extract data access patterns for further pattern analyses of user behavior and for improving data discovery through more relevant data ranking, suggestion, and navigation.
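
    As a rough illustration of session reconstruction when logs carry no session identifier, the sketch below groups requests by client IP and starts a new session whenever the gap between consecutive requests exceeds a fixed timeout. This fixed-timeout rule is a common simplification and is not the time-clustering-based or time-referrer-based method proposed in the paper; the record layout `(ip, timestamp, url)`, the 1800-second default and the sample log are all assumptions.

```python
from collections import defaultdict


def reconstruct_sessions(records, timeout=1800):
    """Split (ip, unix_timestamp, url) records into per-IP sessions.

    A new session starts when two consecutive requests from the same IP
    are more than `timeout` seconds apart (hypothetical default).
    """
    hits_by_ip = defaultdict(list)
    for ip, timestamp, url in records:
        hits_by_ip[ip].append((timestamp, url))

    sessions = []
    for ip, hits in hits_by_ip.items():
        hits.sort()  # order each client's requests by time
        current = [hits[0]]
        for previous, hit in zip(hits, hits[1:]):
            if hit[0] - previous[0] > timeout:
                sessions.append((ip, current))
                current = []
            current.append(hit)
        sessions.append((ip, current))
    return sessions


# Hypothetical log excerpt from a data portal.
log = [
    ("10.0.0.1", 0, "/search?q=sst"),
    ("10.0.0.1", 120, "/dataset/42"),
    ("10.0.0.1", 4000, "/search?q=wind"),
    ("10.0.0.2", 50, "/dataset/7"),
]
sessions = reconstruct_sessions(log)
for ip, hits in sessions:
    print(ip, [url for _, url in hits])
```

    A single global timeout is the weak point of this simplification, which is why threshold selection is treated as a separate step in the workflow described above.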

  5. A New Algorithm of Service Discovery Based on DHT for Mobile Application

    Directory of Open Access Journals (Sweden)

    De-gan Zhang

    2011-10-01

    Full Text Available To improve the efficiency and coverage of service discovery, we put forward a new service discovery algorithm for mobile applications based on DHT (Distributed Hash Table) and Small-World Theory. In the traditional DHT discovery algorithm, each node maintains a finger-table that stores information about adjacent nodes. Using Small-World Theory, we propose adding a remote node to the finger-table together with the corresponding remote index. Rather than selecting the remote connection node randomly, we select it by a calculation at the local node, which assures the coverage of service discovery without increasing the length of the finger-table and simplifies finger-table calculation and maintenance. The simulation showed that the algorithm can effectively reduce the path length of service discovery and improve its success rate.

  6. Content-Based Discovery for Web Map Service using Support Vector Machine and User Relevance Feedback

    Science.gov (United States)

    Cheng, Xiaoqiang; Qi, Kunlun; Zheng, Jie; You, Lan; Wu, Huayi

    2016-01-01

    Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of geographic information discovery keeps improving. There are, however, still two problems in Web Map Service (WMS) discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in the metadata make discovery difficult for human users. End-users and computers comprehend WMSs differently, creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining Support Vector Machine (SVM) and user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery. PMID:27861505

  7. Leveraging gene-environment interactions and endotypes for asthma gene discovery.

    Science.gov (United States)

    Bønnelykke, Klaus; Ober, Carole

    2016-03-01

    Asthma is a heterogeneous clinical syndrome that includes subtypes of disease with different underlying causes and disease mechanisms. Asthma is caused by a complex interaction between genes and environmental exposures; early-life exposures in particular play an important role. Asthma is also heritable, and a number of susceptibility variants have been discovered in genome-wide association studies, although the known risk alleles explain only a small proportion of the heritability. In this review, we present evidence supporting the hypothesis that focusing on more specific asthma phenotypes, such as childhood asthma with severe exacerbations, and on relevant exposures that are involved in gene-environment interactions (GEIs), such as rhinovirus infections, will improve detection of asthma genes and our understanding of the underlying mechanisms. We will discuss the challenges of considering GEIs and the advantages of studying responses to asthma-associated exposures in clinical birth cohorts, as well as in cell models of GEIs, to dissect the context-specific nature of genotypic risks, to prioritize variants in genome-wide association studies, and to identify pathways involved in pathogenesis in subgroups of patients. We propose that such approaches, in spite of their many challenges, present great opportunities for better understanding of asthma pathogenesis and heterogeneity and, ultimately, for improving prevention and treatment of disease.

  8. Using heuristics to facilitate experiental learning in a simulation-based discovery learning environment

    NARCIS (Netherlands)

    Veermans, K.H.; Jong, de T.; Joolingen, van W.R.; Mason, L.; Andreuzza, S.; Arfè, B.; Favero, del L.

    2003-01-01

    Learners are often reported to experience difficulties with simulation-based discovery learning. Heuristics for discovery learning (rules of thumb that guide decision-making) can help learners to overcome these difficulties. In addition, the heuristics themselves are open for transfer. One way to in

  9. SAGExplore: a web server for unambiguous tag mapping in serial analysis of gene expression oriented to gene discovery and annotation.

    Science.gov (United States)

    Norambuena, Tomás; Malig, Rodrigo; Melo, Francisco

    2007-07-01

    We describe a web server for the accurate mapping of experimental tags in serial analysis of gene expression (SAGE). The core of the server relies on a database of genomic virtual tags built by a recently described method that attempts to reduce the amount of ambiguous assignments for those tags that are not unique in the genome. The method provides a complete annotation of potential virtual SAGE tags within a genome, along with an estimation of their confidence for experimental observation that ranks tags that present multiple matches in the genome. The output of the server consists of a table in HTML format that contains links to a graphic representation of the results and to some external servers and databases, facilitating the tasks of analysis of gene expression and gene discovery. Also, a table in tab delimited text format is produced, allowing the user to export the results into custom databases and software for further analysis. The current server version provides the most accurate and complete SAGE tag mapping source that is available for the yeast organism. In the near future, this server will also allow the accurate mapping of experimental SAGE-tags from other model organisms such as human, mouse, frog and fly. The server is freely available on the web at: http://dna.bio.puc.cl/SAGExplore.html.

  10. Evolving towards a human-cell based and multiscale approach to drug discovery for CNS disorders

    Directory of Open Access Journals (Sweden)

    Eric Schadt

    2014-12-01

    Full Text Available A disruptive approach to therapeutic discovery and development is required in order to significantly improve the success rate of drug discovery for central nervous system (CNS) disorders. In this review, we first assess the key factors contributing to the frequent clinical failures for novel drugs. Second, we discuss cancer translational research paradigms that addressed key issues in drug discovery and development and have resulted in delivering drugs with significantly improved outcomes for patients. Finally, we discuss two emerging technologies that could improve the success rate of CNS therapies: human induced pluripotent stem cell (hiPSC)-based studies and multiscale biology models. Coincident with advances in cellular technologies that enable the generation of hiPSCs directly from patient blood or skin cells, together with methods to differentiate these hiPSC lines into specific neural cell types relevant to neurological disease, it is also now possible to combine data from large-scale forward genetics and post-mortem global epigenetic and expression studies in order to generate novel predictive models. The application of systems biology approaches to account for the multiscale nature of different data types, from genetic to molecular and cellular to clinical, can lead to new insights into human diseases that are emergent properties of biological networks, not the result of changes to single genes. Such studies have demonstrated the heterogeneity in etiological pathways and the need for studies on model systems that are patient-derived and thereby recapitulate neurological disease pathways with higher fidelity. In the context of two common and presumably representative neurological diseases, the neurodegenerative disease Alzheimer’s Disease (AD), and the psychiatric disorder schizophrenia (SZ), we propose the need for, and exemplify the impact of, a multiscale biology approach that can integrate panomic, clinical, imaging, and literature

  11. Gene Discovery for Synthetic Biology: Exploring the Novel Natural Product Biosynthetic Capacity of Eukaryotic Microalgae.

    Science.gov (United States)

    O'Neill, E C; Saalbach, G; Field, R A

    2016-01-01

    Eukaryotic microalgae are an incredibly diverse group of organisms whose sole unifying feature is their ability to photosynthesize. They are known for producing a range of potent toxins, which can build up during harmful algal blooms causing damage to ecosystems and fisheries. Genome sequencing is lagging behind in these organisms because of their genetic complexity, but transcriptome sequencing is beginning to make up for this deficit. As more sequence data becomes available, it is apparent that eukaryotic microalgae possess a range of complex natural product biosynthesis capabilities. Some of the genes concerned are responsible for the biosynthesis of known toxins, but there are many more for which we do not know the products. Bioinformatic and analytical techniques have been developed for natural product discovery in bacteria and these approaches can be used to extract information about the products synthesized by algae. Recent analyses suggest that eukaryotic microalgae produce many complex natural products that remain to be discovered.

  12. Biomolecular Network-Based Synergistic Drug Combination Discovery

    Directory of Open Access Journals (Sweden)

    Xiangyi Li

    2016-01-01

    Full Text Available Drug combination is a powerful and promising approach for the therapy of complex diseases such as cancer and cardiovascular disease. However, the number of synergistic drug combinations approved by the Food and Drug Administration is very small. To bridge the gap between urgent need and low yield, researchers have constructed various models to identify synergistic drug combinations. Among these models, the biomolecular network-based model stands out because of its ability to reflect and illustrate the relationships among drugs, disease-related genes, therapeutic targets, and disease-specific signaling pathways as a system. In this review, we analyzed and classified models for synergistic drug combination prediction from the past decade according to their respective algorithms. In addition, we collected useful resources including databases and analysis tools for synergistic drug combination prediction. It should provide a quick resource for computational biologists who work with network medicine or synergistic drug combination design.

  13. Wi-Fi Protocol Vulnerability Discovery Based on Fuzzy Testing

    Directory of Open Access Journals (Sweden)

    Kunhua Zhu

    2013-08-01

    Full Text Available To detect whether wireless network equipment contains protocol vulnerabilities, we use a modular approach to design and implement a new fuzz test (fuzzy test) framework suited to Wi-Fi protocol vulnerability discovery. The framework is independent of the transmission medium, generates malformed packets, and carries out attacks on the target system. This paper first describes wireless network protocol vulnerability discovery and fuzz testing, then focuses on the technical scheme of the test framework and the details of its implementation, and analyzes its application. In the experimental stage, fuzz testing was applied to a wireless network gateway; the results show that the framework is well suited to discovering protocol vulnerabilities in wireless network equipment.

  14. Finding Global Optimum for Truth Discovery: Entropy Based Geometric Variance

    OpenAIRE

    Ding, Hu; Gao, Jing; Xu, Jinhui

    2016-01-01

    Truth Discovery is an important problem arising in data analytics related fields such as data mining, databases, and big data. It concerns finding the most trustworthy information from a dataset acquired from a number of unreliable sources. Due to its importance, the problem has been extensively studied in recent years and a number of techniques have already been proposed. However, all of them are of heuristic nature and do not have any quality guarantee. In this paper, we formulate the pro...

  15. Semantic Based Cluster Content Discovery in Description First Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    MUHAMMAD WASEEM KHAN

    2017-01-01

    Full Text Available In the field of data analytics, grouping similar documents in textual data is a serious problem. A lot of work has been done in this field and many algorithms have been proposed. One category of algorithms first groups the documents on the basis of similarity and then assigns meaningful labels to those groups. Description first clustering algorithms belong to the category in which the meaningful description is deduced first and then relevant documents are assigned to that description. LINGO (Label Induction Grouping Algorithm) is a description first clustering algorithm used for the automatic grouping of documents obtained from search results. It uses LSI (Latent Semantic Indexing), an IR (Information Retrieval) technique, for induction of meaningful labels for clusters and VSM (Vector Space Model) for cluster content discovery. In this paper we present LINGO using LSI during both the cluster label induction and cluster content discovery phases. Finally, we compare the results obtained from the algorithm when it uses VSM and when it uses Latent Semantic Analysis during the cluster content discovery phase.
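
    The VSM-based content discovery step can be pictured as follows: once labels exist, each document is compared with each label in term space and assigned to every label it resembles closely enough. The sketch below uses raw term-frequency vectors and cosine similarity; the whitespace tokenization, the 0.5 threshold and the sample labels and documents are assumptions, and the LSI-based variant studied in the paper is not implemented here.

```python
import math
from collections import Counter


def tf_vector(text):
    """Raw term-frequency vector using naive whitespace tokenization."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_documents(labels, documents, threshold=0.5):
    """Map each label to the indices of documents similar enough to it.

    A document may match several labels or none; the threshold is a
    hypothetical value chosen for this example.
    """
    clusters = {label: [] for label in labels}
    for index, document in enumerate(documents):
        doc_vector = tf_vector(document)
        for label in labels:
            if cosine(tf_vector(label), doc_vector) >= threshold:
                clusters[label].append(index)
    return clusters


# Hypothetical cluster labels and search-result snippets.
labels = ["gene discovery", "web service"]
documents = [
    "gene discovery by sequencing",
    "semantic web service discovery",
    "protein folding kinetics",
]
clusters = assign_documents(labels, documents)
print(clusters)
```

    Documents that match no label stay unassigned, which is how description first approaches typically end up with a residual group of uncategorized results.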

  16. Community structure discovery method based on the Gaussian kernel similarity matrix

    Science.gov (United States)

    Guo, Chonghui; Zhao, Haipeng

    2012-03-01

    Community structure discovery in complex networks is a popular issue, and overlapping community structure discovery has become a hot spot of academic research. Based on the Gaussian kernel similarity matrix and spectral bisection, this paper proposes a new community structure discovery method. First, by adjusting the Gaussian kernel parameter to change the scale of similarity, we can find the non-overlapping community structure that corresponds to the largest modularity value. Second, changes in the Gaussian kernel parameter cause unstable nodes to jump off, so with a slight change to the non-overlapping community discovery method, we can find the overlapping community nodes. Finally, synthetic data and the karate club and political books datasets are used to test the proposed method and compare it with other community discovery methods, demonstrating its feasibility and effectiveness.

  17. A Framework for Automatic Web Service Discovery Based on Semantics and NLP Techniques

    Directory of Open Access Journals (Sweden)

    Asma Adala

    2011-01-01

    Full Text Available As a greater number of Web Services are made available today, automatic discovery is recognized as an important task. To promote the automation of service discovery, different semantic languages have been created that allow describing the functionality of services in a machine interpretable form using Semantic Web technologies. The problem is that users do not have intimate knowledge about semantic Web service languages and related toolkits. In this paper, we propose a discovery framework that enables semantic Web service discovery based on keywords written in natural language. We describe a novel approach for automatic discovery of semantic Web services which employs Natural Language Processing techniques to match a user request, expressed in natural language, with a semantic Web service description. Additionally, we present an efficient semantic matching technique to compute the semantic distance between ontological concepts.

  18. Gene discovery in the hamster: a comparative genomics approach for gene annotation by sequencing of hamster testis cDNAs

    Directory of Open Access Journals (Sweden)

    Khan Shafiq A

    2003-06-01

    Full Text Available Abstract Background Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences combined with direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish) genomes. Results 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes and expression of these genes in testis was confirmed by Northern blotting. The genomic locations of each sequence were mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells.

  19. Discovery and Replication of Gene Influences on Brain Structure Using LASSO Regression.

    Science.gov (United States)

    Kohannim, Omid; Hibar, Derrek P; Stein, Jason L; Jahanshad, Neda; Hua, Xue; Rajagopalan, Priya; Toga, Arthur W; Jack, Clifford R; Weiner, Michael W; de Zubicaray, Greig I; McMahon, Katie L; Hansell, Narelle K; Martin, Nicholas G; Wright, Margaret J; Thompson, Paul M

    2012-01-01

    We implemented least absolute shrinkage and selection operator (LASSO) regression to evaluate gene effects in genome-wide association studies (GWAS) of brain images, using an MRI-derived temporal lobe volume measure from 729 subjects scanned as part of the Alzheimer's Disease Neuroimaging Initiative (ADNI). Sparse groups of SNPs in individual genes were selected by LASSO, which identifies efficient sets of variants influencing the data. These SNPs were considered jointly when assessing their association with neuroimaging measures. We discovered 22 genes that passed genome-wide significance for influencing temporal lobe volume. This was a substantially greater number of significant genes compared to those found with standard, univariate GWAS. These top genes are all expressed in the brain and include genes previously related to brain function or neuropsychiatric disorders such as MACROD2, SORCS2, GRIN2B, MAGI2, NPAS3, CLSTN2, GABRG3, NRXN3, PRKAG2, GAS7, RBFOX1, ADARB2, CHD4, and CDH13. The top genes we identified with this method also displayed significant and widespread post hoc effects on voxelwise, tensor-based morphometry (TBM) maps of the temporal lobes. The most significantly associated gene was an autism susceptibility gene known as MACROD2. We were able to successfully replicate the effect of the MACROD2 gene in an independent cohort of 564 young, Australian healthy adult twins and siblings scanned with MRI (mean age: 23.8 ± 2.2 SD years). Our approach powerfully complements univariate techniques in detecting influences of genes on the living brain.
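
As a rough illustration of how LASSO selects a sparse set of variants, the following minimal sketch runs coordinate descent with soft-thresholding on a toy matrix of centred SNP dosages. It is not the authors' pipeline: the data, the penalty `lam`, and the function names are hypothetical, and the gene-wise grouping of SNPs described in the abstract is not modelled.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the LASSO proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the residual excluding j
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n) if norm else 0.0
    return beta


# Hypothetical centred dosages for three SNPs in six subjects (rows = subjects).
X = [
    [-1.0, 1.0, 1.0],
    [-1.0, -1.0, 0.0],
    [0.0, 1.0, -1.0],
    [0.0, -1.0, -1.0],
    [1.0, 1.0, 0.0],
    [1.0, -1.0, 1.0],
]
# Hypothetical centred phenotype driven almost entirely by the first SNP.
y = [-2.1, -1.9, 0.1, -0.1, 2.0, 2.0]

beta = lasso_coordinate_descent(X, y, lam=0.1)
print(beta)
```

In this toy case the penalty sets the coefficients of the two uninformative SNPs to exactly zero and shrinks the coefficient of the informative one slightly below its least-squares value, which is the sparsity property the abstract relies on.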

  20. Discovery of differentially expressed genes in cashmere goat (Capra hircus) hair follicles by RNA sequencing.

    Science.gov (United States)

    Qiao, X; Wu, J H; Wu, R B; Su, R; Li, C; Zhang, Y J; Wang, R J; Zhao, Y H; Fan, Y X; Zhang, W G; Li, J Q

    2016-09-02

    The mammalian hair follicle (HF) is a unique, highly regenerative organ with a distinct developmental cycle. Cashmere goat (Capra hircus) HFs can be divided into two categories based on structure and development time: primary and secondary follicles. To identify differentially expressed genes (DEGs) in the primary and secondary HFs of cashmere goats, the RNA sequencing of six individuals from Arbas, Inner Mongolia, was performed. A total of 617 DEGs were identified; 297 were upregulated while 320 were downregulated. Gene ontology analysis revealed that the main functions of the upregulated genes were electron transport, respiratory electron transport, mitochondrial electron transport, and gene expression. The downregulated genes were mainly involved in cell autophagy, protein complexes, neutrophil aggregation, and bacterial fungal defense reactions. According to the Kyoto Encyclopedia of Genes and Genomes database, these genes are mainly involved in the metabolism of cysteine and methionine, RNA polymerization, and the MAPK signaling pathway, and were enriched in primary follicles. A microRNA-target network revealed that secondary follicles are involved in several important biological processes, such as the synthesis of keratin-associated proteins and enzymes involved in amino acid biosynthesis. In summary, these findings will increase our understanding of the complex molecular mechanisms of HF development and cycling, and provide a basis for the further study of the genes and functions of HF development.

  1. Pattern Discovery using Fuzzy FP-growth Algorithm from Gene Expression Data

    OpenAIRE

    Sabita Barik; Debahuti Mishra; Shruti Mishra; Sandeep Ku. Satapathy; Amiya Ku. Rath; Milu Acharya

    2010-01-01

    Abstract- The goal of microarray experiments is to identify genes that are differentially transcribed with respect to different biological conditions of cell cultures and samples. Hence, methods of data analysis such as clustering, classification and prediction need to be carefully evaluated. In this paper, we have proposed an efficient frequent pattern based clustering to find the genes which form frequent patterns showing similar phenotypes leading to specific symptoms for specific disease...

  2. The Increasing Importance of Gene-Based Analyses.

    Directory of Open Access Journals (Sweden)

    Elizabeth T Cirulli

    2016-04-01

    Full Text Available In recent years, genome and exome sequencing studies have implicated a plethora of new disease genes with rare causal variants. Here, I review 150 exome sequencing studies that claim to have discovered that a disease can be caused by different rare variants in the same gene, and I determine whether their methods followed the current best-practice guidelines in the interpretation of their data. Specifically, I assess whether studies appropriately assess controls for rare variants throughout the entire gene or implicated region as opposed to only investigating the specific rare variants identified in the cases, and I assess whether studies present sufficient co-segregation data for statistically significant linkage. I find that the proportion of studies performing gene-based analyses has increased with time, but that even in 2015 fewer than 40% of the reviewed studies used this method, and only 10% presented statistically significant co-segregation data. Furthermore, I find that the genes reported in these papers are explaining a decreasing proportion of cases as the field moves past most of the low-hanging fruit, with 50% of the genes from studies in 2014 and 2015 having variants in fewer than 5% of cases. As more studies focus on genes explaining relatively few cases, the importance of performing appropriate gene-based analyses is increasing. It is becoming increasingly important for journal editors and reviewers to require stringent gene-based evidence to avoid an avalanche of misleading disease gene discovery papers.

  3. Adeno-associated virus at 50: a golden anniversary of discovery, research, and gene therapy success--a personal perspective.

    Science.gov (United States)

    Hastie, Eric; Samulski, R Jude

    2015-05-01

    Fifty years after the discovery of adeno-associated virus (AAV) and more than 30 years after the first gene transfer experiment was conducted, dozens of gene therapy clinical trials are in progress, one vector is approved for use in Europe, and breakthroughs in virus modification and disease modeling are paving the way for a revolution in the treatment of rare diseases, cancer, as well as HIV. This review will provide a historical perspective on the progression of AAV for gene therapy from discovery to the clinic, focusing on contributions from the Samulski lab regarding basic science and cloning of AAV, optimized large-scale production of vectors, preclinical large animal studies and safety data, vector modifications for improved efficacy, and successful clinical applications.

  4. Crystallographic analysis of TPP riboswitch binding by small-molecule ligands discovered through fragment-based drug discovery approaches.

    Science.gov (United States)

    Warner, Katherine Deigan; Ferré-D'Amaré, Adrian R

    2014-01-01

    Riboswitches are structured mRNA elements that regulate gene expression in response to metabolite or second-messenger binding and are promising targets for drug discovery. Fragment-based drug discovery methods have identified weakly binding small molecule "fragments" that bind a thiamine pyrophosphate (TPP) riboswitch. However, these fragments require substantial chemical elaboration into more potent, drug-like molecules. Structure determination of the fragments bound to the riboswitch is the necessary next step. In this chapter, we describe the methods for co-crystallization and structure determination of fragment-bound TPP riboswitch structures. We focus on considerations for screening crystallization conditions across multiple crystal forms and provide guidance for building the fragment into the refined crystallographic model. These methods are broadly applicable for crystallographic analyses of any small molecules that bind structured RNAs.

  5. Natural and man-made V-gene repertoires for antibody discovery.

    Science.gov (United States)

    Finlay, William J J; Almagro, Juan C

    2012-01-01

    Antibodies are the fastest-growing segment of the biologics market. The success of antibody-based drugs resides in their exquisite specificity, high potency, stability, solubility, safety, and relatively inexpensive manufacturing process in comparison with other biologics. We outline here the structural studies and fundamental principles that define how antibodies interact with diverse targets. We also describe the antibody repertoires and affinity maturation mechanisms of humans, mice, and chickens, plus the use of novel single-domain antibodies in camelids and sharks. These species all utilize diverse evolutionary solutions to generate specific and high affinity antibodies and illustrate the plasticity of natural antibody repertoires. In addition, we discuss the multiple variations of man-made antibody repertoires designed and validated in the last two decades, which have served as tools to explore how the size, diversity, and composition of a repertoire impact the antibody discovery process.

  6. Discovery and validation of information theory-based transcription factor and cofactor binding site motifs.

    Science.gov (United States)

    Lu, Ruipeng; Mucaki, Eliseos J; Rogan, Peter K

    2016-11-28

    Data from ChIP-seq experiments can derive the genome-wide binding specificities of transcription factors (TFs) and other regulatory proteins. We analyzed 765 ENCODE ChIP-seq peak datasets of 207 human TFs with a novel motif discovery pipeline based on recursive, thresholded entropy minimization. This approach, while obviating the need to compensate for skewed nucleotide composition, distinguishes true binding motifs from noise, quantifies the strengths of individual binding sites based on computed affinity and detects adjacent cofactor binding sites that coordinate with the targets of primary, immunoprecipitated TFs. We obtained contiguous and bipartite information theory-based position weight matrices (iPWMs) for 93 sequence-specific TFs, discovered 23 cofactor motifs for 127 TFs and revealed six high-confidence novel motifs. The reliability and accuracy of these iPWMs were determined via four independent validation methods, including the detection of experimentally proven binding sites, explanation of effects of characterized SNPs, comparison with previously published motifs and statistical analyses. We also predict previously unreported TF coregulatory interactions (e.g. TF complexes). These iPWMs constitute a powerful tool for predicting the effects of sequence variants in known binding sites, performing mutation analysis on regulatory SNPs and predicting previously unrecognized binding sites and target genes.
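
To make the information-theoretic view of a binding motif concrete, here is a minimal sketch that computes the per-position information content (in bits) of a set of aligned DNA sites as 2 minus the Shannon entropy of each column. It only illustrates the underlying measure; it is not the recursive, thresholded entropy minimization pipeline from the abstract, it omits any small-sample correction, and the example sites are hypothetical.

```python
import math


def position_information(sites):
    """Return the information content (bits) of each column of aligned DNA sites."""
    length = len(sites[0])
    info = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        entropy = 0.0
        for base in "ACGT":
            freq = column.count(base) / len(column)
            if freq > 0:
                entropy -= freq * math.log2(freq)
        # A fully conserved column carries 2 bits; a uniform column carries 0.
        info.append(2.0 - entropy)
    return info


# Hypothetical aligned binding sites, one per row.
sites = ["TATA", "TATT", "TACA", "TAGA"]
info = position_information(sites)
print(info)
```

Conserved positions score close to 2 bits and variable positions score lower, so summing the column values gives a rough measure of how much a motif stands out from background sequence.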

  7. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    Directory of Open Access Journals (Sweden)

    Landfors Mattias

    2010-10-01

    background correction is preferable, in particular if the gene selection is successful. However, this is an area that needs to be studied further in order to draw any general conclusions. Conclusions The choice of cluster analysis, and in particular gene selection, has a large impact on the ability to cluster individuals correctly based on expression profiles. Normalization has a positive effect, but the relative performance of different normalizations is an area that needs more research. In summary, although clustering, gene selection and normalization are considered standard methods in bioinformatics, our comprehensive analysis shows that selecting the right methods, and the right combinations of methods, is far from trivial and that much is still unexplored in what is considered to be the most basic analysis of genomic data.

  8. Using Phenomic Analysis of Photosynthetic Function for Abiotic Stress Response Gene Discovery

    KAUST Repository

    Rungrat, Tepsuda

    2016-09-09

    Monitoring the photosynthetic performance of plants is a major key to understanding how plants adapt to their growth conditions. Stress tolerance traits have a high genetic complexity as plants are constantly, and unavoidably, exposed to numerous stress factors, which limits their growth rates in the natural environment. Arabidopsis thaliana, with its broad genetic diversity and wide climatic range, has been shown to successfully adapt to stressful conditions to ensure the completion of its life cycle. As a result, A. thaliana has become a robust and renowned plant model system for studying natural variation and conducting gene discovery studies. Genome wide association studies (GWAS) in restructured populations combining natural and recombinant lines is a particularly effective way to identify the genetic basis of complex traits. As most abiotic stresses affect photosynthetic activity, chlorophyll fluorescence measurements are a potential phenotyping technique for monitoring plant performance under stress conditions. This review focuses on the use of chlorophyll fluorescence as a tool to study genetic variation underlying the stress tolerance responses to abiotic stress in A. thaliana.

  9. An Affinity Propagation-Based DNA Motif Discovery Algorithm

    Directory of Open Access Journals (Sweden)

    Chunxiao Sun

    2015-01-01

    Full Text Available The planted (l, d) motif search (PMS) is one of the fundamental problems in bioinformatics, which plays an important role in locating transcription factor binding sites (TFBSs) in DNA sequences. Nowadays, identifying weak motifs and reducing the effect of local optimum are still important but challenging tasks for motif discovery. To solve the tasks, we propose a new algorithm, APMotif, which first applies the Affinity Propagation (AP) clustering in DNA sequences to produce informative and good candidate motifs and then employs Expectation Maximization (EM) refinement to obtain the optimal motifs from the candidate motifs. Experimental results both on simulated data sets and real biological data sets show that APMotif usually outperforms four other widely used algorithms in terms of high prediction accuracy.

  10. Cell-based assays in GPCR drug discovery.

    Science.gov (United States)

    Siehler, Sandra

    2008-04-01

    G protein-coupled receptors (GPCRs) transmit extracellular signals into the intracellular space, and play key roles in the physiological regulation of virtually every cell and tissue. Characteristic for the GPCR superfamily of cell surface receptors are their seven transmembrane-spanning alpha-helices, an extracellular N terminus and intracellular C-terminal tail. Besides transmission of extracellular signals, their activity is modulated by cellular signals in an auto- or transregulatory fashion. The molecular complexity of GPCRs and their regulated signaling networks triggered the interest in academic research groups to explore them further, and their drugability and role in pathophysiology triggers pharmaceutical research towards small molecular weight ligands and therapeutic antibodies. About 30% of marketed drugs target GPCRs, which underlines the importance of this target class. This review describes current and emerging cellular assays for the ligand discovery of GPCRs.

  11. Droplet-based microfluidics: enabling impact on drug discovery.

    Science.gov (United States)

    Dressler, Oliver J; Maceiczyk, Richard M; Chang, Soo-Ik; deMello, Andrew J

    2014-04-01

    Over the past two decades, the application of microengineered systems in the chemical and biological sciences has transformed the way in which high-throughput experimentation is performed. The ability to fabricate complex microfluidic architectures has allowed scientists to create new experimental formats for processing ultra-small analytical volumes in short periods and with high efficiency. The development of such microfluidic systems has been driven by a range of fundamental features that accompany miniaturization. These include the ability to handle small sample volumes, ultra-low fabrication costs, reduced analysis times, enhanced operational flexibility, facile automation, and the ability to integrate functional components within complex analytical schemes. Herein we discuss the impact of microfluidics in the area of high-throughput screening and drug discovery and highlight some of the most pertinent studies in the recent literature.

  12. An Affinity Propagation-Based DNA Motif Discovery Algorithm.

    Science.gov (United States)

    Sun, Chunxiao; Huo, Hongwei; Yu, Qiang; Guo, Haitao; Sun, Zhigang

    2015-01-01

    The planted (l, d) motif search (PMS) is one of the fundamental problems in bioinformatics, which plays an important role in locating transcription factor binding sites (TFBSs) in DNA sequences. Nowadays, identifying weak motifs and reducing the effect of local optimum are still important but challenging tasks for motif discovery. To solve the tasks, we propose a new algorithm, APMotif, which first applies the Affinity Propagation (AP) clustering in DNA sequences to produce informative and good candidate motifs and then employs Expectation Maximization (EM) refinement to obtain the optimal motifs from the candidate motifs. Experimental results both on simulated data sets and real biological data sets show that APMotif usually outperforms four other widely used algorithms in terms of high prediction accuracy.
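
For readers unfamiliar with the planted (l, d) motif problem itself, the sketch below states it as an exhaustive search: find every string of length l that occurs in each input sequence with at most d mismatches. This brute-force enumeration is only a definition-level illustration and is not APMotif, which uses Affinity Propagation and EM refinement; the example sequences are hypothetical.

```python
from itertools import product


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(motif, sequence, d):
    """True if some window of the sequence is within d mismatches of motif."""
    l = len(motif)
    return any(
        hamming(motif, sequence[i:i + l]) <= d
        for i in range(len(sequence) - l + 1)
    )


def planted_motifs(sequences, l, d):
    """Enumerate all 4**l candidates and keep those found in every sequence."""
    candidates = ("".join(p) for p in product("ACGT", repeat=l))
    return [m for m in candidates if all(occurs_within(m, s, d) for s in sequences)]


# Hypothetical sequences, each carrying a copy of ACGT with at most one mismatch.
sequences = ["TTACGTTT", "GGAAGTGG", "CCACGACC"]
motifs = planted_motifs(sequences, l=4, d=1)
print("ACGT" in motifs)
```

The search space grows as 4 to the power l, which is why practical tools rely on clustering or heuristic refinement rather than enumeration.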

  13. DNA microarray-based mutation discovery and genotyping.

    Science.gov (United States)

    Gresham, David

    2011-01-01

    DNA microarrays provide an efficient means of identifying single-nucleotide polymorphisms (SNPs) in DNA samples and characterizing their frequencies in individual and mixed samples. We have studied the parameters that determine the sensitivity of DNA probes to SNPs and found that the melting temperature (Tm) of the probe is the primary determinant of probe sensitivity. An isothermal-melting temperature DNA microarray design, in which the Tm of all probes is tightly distributed, can be implemented by varying the length of DNA probes within a single DNA microarray. I describe guidelines for designing isothermal-melting temperature DNA microarrays and protocols for labeling and hybridizing DNA samples to DNA microarrays for SNP discovery, genotyping, and quantitative determination of allele frequencies in mixed samples.
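
The idea of equalizing melting temperature by varying probe length can be sketched as follows. The estimate used here is the Wallace rule (2 °C per A or T, 4 °C per G or C), a common rough approximation for short oligonucleotides; it is an assumption chosen for illustration and not necessarily the model used by the author. The function names, length limits and target temperature are hypothetical.

```python
def wallace_tm(probe):
    """Approximate melting temperature (deg C) using the Wallace rule."""
    at = sum(probe.count(base) for base in "AT")
    gc = sum(probe.count(base) for base in "GC")
    return 2 * at + 4 * gc


def isothermal_probe(sequence, start, target_tm, min_len=10, max_len=30):
    """Return the shortest probe from `start` whose estimated Tm reaches target_tm."""
    for length in range(min_len, max_len + 1):
        probe = sequence[start:start + length]
        if len(probe) < length:
            break  # ran off the end of the template
        if wallace_tm(probe) >= target_tm:
            return probe
    return None  # no probe within the allowed length range


# Hypothetical AT-rich and GC-rich templates.
at_rich = "AT" * 15
gc_rich = "GC" * 15

probe_at = isothermal_probe(at_rich, 0, target_tm=50)
probe_gc = isothermal_probe(gc_rich, 0, target_tm=50)
print(len(probe_at), len(probe_gc))
```

The AT-rich region needs a much longer probe than the GC-rich region to reach the same estimated Tm, which is the length adjustment the abstract describes.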

  14. Discovery of pyridine-based agrochemicals by using Intermediate Derivatization Methods.

    Science.gov (United States)

    Guan, Ai-Ying; Liu, Chang-Ling; Sun, Xu-Feng; Xie, Yong; Wang, Ming-An

    2016-02-01

    Pyridine-based compounds have been playing a crucial role as agrochemicals or pesticides including fungicides, insecticides/acaricides and herbicides, etc. Since most of the agrochemicals listed in the Pesticide Manual were discovered through screening programs that relied on trial-and-error testing and new agrochemical discovery is not benefiting as much from the in silico new chemical compound identification/discovery techniques used in pharmaceutical research, it has become more important to find new methods to enhance the efficiency of discovering novel lead compounds in the agrochemical field to shorten the time of research phases in order to meet changing market requirements. In this review, we selected 18 representative known agrochemicals containing a pyridine moiety and extrapolate their discovery from the perspective of Intermediate Derivatization Methods in the hope that this approach will have greater appeal to researchers engaged in the discovery of agrochemicals and/or pharmaceuticals.

  15. SHAPE-BASED TIME SERIES SIMILARITY MEASURE AND PATTERN DISCOVERY ALGORITHM

    Institute of Scientific and Technical Information of China (English)

    Zeng Fanzi; Qiu Zhengding; Li Dongsheng; Yue Jianhai

    2005-01-01

    Pattern discovery from time series is of fundamental importance. Most algorithms for pattern discovery in time series capture the values of the time series based on some kind of similarity measure. Because they are affected by scale and baseline, value-based methods cause problems when the objective is to capture the shape. Thus, a similarity measure based on shape, the Sh measure, is proposed, and the properties of this similarity and the corresponding proofs are given. Then a time series shape pattern discovery algorithm based on the Sh measure is put forward. The proposed algorithm terminates in a finite number of iterations with given computational and storage complexity. Finally, experiments on synthetic datasets and sunspot datasets demonstrate that the time series shape pattern algorithm is valid.

  16. A novel approach to the discovery of survival biomarkers in glioblastoma using a joint analysis of DNA methylation and gene expression.

    Science.gov (United States)

    Smith, Ashley A; Huang, Yen-Tsung; Eliot, Melissa; Houseman, E Andres; Marsit, Carmen J; Wiencke, John K; Kelsey, Karl T

    2014-06-01

    Glioblastoma multiforme (GBM) is the most aggressive of all brain tumors, with a median survival of less than 1.5 years. Recently, epigenetic alterations were found to play key roles in both glioma genesis and clinical outcome, demonstrating the need to integrate genetic and epigenetic data in predictive models. To enhance current models through discovery of novel predictive biomarkers, we employed a genome-wide, agnostic strategy to specifically capture both methylation-directed changes in gene expression and alternative associations of DNA methylation with disease survival in glioma. Human GBM-associated DNA methylation, gene expression, IDH1 mutation status, and survival data were obtained from The Cancer Genome Atlas. DNA methylation loci and expression probes were paired by gene, and their subsequent association with survival was determined by applying an accelerated failure time model to previously published alternative and expression-based association equations. Significant associations were seen in 27 unique methylation/expression pairs with expression-based, alternative, and combinatorial associations observed (10, 13, and 4 pairs, respectively). The majority of the predictive DNA methylation loci were located within CpG islands, and all but three of the locus pairs were negatively correlated with survival. This finding suggests that for most loci, methylation/expression pairs are inversely related, consistent with methylation-associated gene regulatory action. Our results indicate that changes in DNA methylation are associated with altered survival outcome through both coordinated changes in gene expression and alternative mechanisms. Furthermore, our approach offers an alternative method of biomarker discovery using a priori gene pairing and precise targeting to identify novel sites for locus-specific therapeutic intervention.

  17. Applications of Fiberoptics-Based Nanosensors to Drug Discovery

    Science.gov (United States)

    Vo-Dinh, Tuan; Scaffidi, Jonathan; Gregas, Molly; Zhang, Yan; Seewaldt, Victoria

    2013-01-01

    Background Fiber-optic nanosensors are fabricated by heating and pulling optical fibers to yield sub-micron diameter tips, and have been used for in vitro analysis of individual living mammalian cells. Immobilization of bioreceptors (e.g., antibodies, peptides, DNA, etc) selective to target analyte molecules of interest provides molecular specificity. Excitation light can be launched into the fiber, and the resulting evanescent field at the tip of the nanofiber can be used to excite target molecules bound to the bioreceptor molecules. The fluorescence or surface-enhanced Raman scattering produced by the analyte molecules is detected using an ultra-sensitive photodetector. Objective This article provides an overview of the development and application of fiber-optic nanosensors for drug discovery. Conclusions The nanosensors provide minimally invasive tools to probe sub-cellular compartments inside single living cells for health effect studies (e.g., detection of benzopyrene adducts) and medical applications (e.g., monitoring of apoptosis in cells treated with anti-cancer drugs). PMID:23496274

  18. Informatics-Based Discovery of Disease-Associated Immune Profiles

    Science.gov (United States)

    Delmas, Amber; Oikonomopoulos, Angelos; Lacey, Precious N.; Fallahi, Mohammad; Hommes, Daniel W.; Sundrud, Mark S.

    2016-01-01

    Advances in flow and mass cytometry are enabling ultra-high resolution immune profiling in mice and humans on an unprecedented scale. However, the resulting high-content datasets challenge traditional views of cytometry data, which are both limited in scope and biased by pre-existing hypotheses. Computational solutions are now emerging (e.g., Citrus, AutoGate, SPADE) that automate cell gating or enable visualization of relative subset abundance within healthy versus diseased mice or humans. Yet these tools require significant computational fluency and fail to show quantitative relationships between discrete immune phenotypes and continuous disease variables. Here we describe a simple informatics platform that uses hierarchical clustering and nearest neighbor algorithms to associate manually gated immune phenotypes with clinical or pre-clinical disease endpoints of interest in a rapid and unbiased manner. Using this approach, we identify discrete immune profiles that correspond with either weight loss or histologic colitis in a T cell transfer model of inflammatory bowel disease (IBD), and show distinct nodes of immune dysregulation in the IBDs, Crohn’s disease and ulcerative colitis. This streamlined informatics approach for cytometry data analysis leverages publicly available software, can be applied to manually or computationally gated cytometry data, is suitable for any clinical or pre-clinical setting, and embraces ultra-high content flow and mass cytometry as a discovery engine. PMID:27669154

  19. Improving pattern discovery and visualization of SAGE data through poisson-based self-adaptive neural networks.

    Science.gov (United States)

    Zheng, Huiru; Wang, Haiying; Azuaje, Francisco

    2008-07-01

    Serial analysis of gene expression (SAGE) allows a detailed, simultaneous analysis of thousands of genes without the need for prior, complete gene sequence information. However, due to its inherent complexity and the lack of complete structural and function knowledge, mining vast collections of SAGE data to extract useful knowledge poses great challenges to traditional analytical techniques. Moreover, SAGE data are characterized by a specific statistical model that has not been incorporated into traditional data analysis techniques. The analysis of SAGE data requires advanced, intelligent computational techniques, which consider the underlying biology and the statistical nature of SAGE data. By addressing the statistical properties demonstrated by SAGE data, this paper presents a new self-adaptive neural network, Poisson-based growing self-organizing map (PGSOM), which implements novel weight adaptation and neuron growing strategies. An empirical study of key dynamic mechanisms of PGSOM is presented. It was tested on three datasets, including synthetic and experimental SAGE data. The results indicate that, in comparison to traditional techniques, the PGSOM offers significant advantages in the context of pattern discovery and visualization in SAGE data. The pattern discovery and visualization platform discussed in this paper can be applied to other problem domains where the data are better approximated by a Poisson distribution.

  20. Kerfdr: a semi-parametric kernel-based approach to local false discovery rate estimation

    Directory of Open Access Journals (Sweden)

    Robin Stephane

    2009-03-01

    Background: The use of current high-throughput genetic, genomic and post-genomic data leads to the simultaneous evaluation of a large number of statistical hypotheses and, at the same time, to the multiple-testing problem. As an alternative to the overly conservative Family-Wise Error Rate (FWER), the False Discovery Rate (FDR) has emerged over the last ten years as more appropriate for handling this problem. However, one drawback of the FDR is that it is tied to a given rejection region for the considered statistics, attributing the same value to those that are close to the boundary and those that are not. As a result, the local FDR has recently been proposed to quantify the specific probability that a given null hypothesis is true. Results: In this context we present a semi-parametric approach based on kernel estimators, which is applied to different high-throughput biological data such as patterns in DNA sequences, gene expression and genome-wide association studies. Conclusion: Compared with existing approaches, the proposed method has the practical advantages of handling complex heterogeneities in the alternative hypothesis, taking into account prior information (from expert judgment or previous studies) by allowing a semi-supervised mode, and dealing with truncated distributions such as those obtained in Monte-Carlo simulations. The method has been implemented and is available through the R package kerfdr via CRAN or at http://stat.genopole.cnrs.fr/software/kerfdr.
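
    As a rough illustration of the local FDR idea (not the kerfdr algorithm itself), the local FDR at a score x can be written as pi0 * f0(x) / f(x), where f0 is the null density, f is the density of all observed scores and pi0 is the proportion of true nulls. The Python sketch below plugs a Gaussian kernel estimate of f into that ratio. The standard normal null, the fixed pi0 = 0.9, the bandwidth and the toy scores are all assumptions made for this example; the package's own estimation procedure is not reproduced here.

```python
from statistics import NormalDist


def kde(points, h):
    """Return a Gaussian kernel density estimate built from `points`.

    `h` is the kernel bandwidth (an assumed, hand-picked value here).
    """
    kernel = NormalDist(0.0, h)
    n = len(points)
    return lambda x: sum(kernel.pdf(x - p) for p in points) / n


def local_fdr(x, f_hat, pi0, null=NormalDist(0.0, 1.0)):
    """Plug-in local FDR: pi0 * f0(x) / f(x), clipped to at most 1."""
    return min(1.0, pi0 * null.pdf(x) / f_hat(x))


# Toy scores (hypothetical): 900 null scores ~ N(0, 1) and 100 alternative
# scores ~ N(4, 1), generated deterministically from evenly spaced quantiles.
std = NormalDist()
null_scores = [std.inv_cdf((i + 0.5) / 900) for i in range(900)]
alt_scores = [NormalDist(4.0, 1.0).inv_cdf((i + 0.5) / 100) for i in range(100)]

f_hat = kde(null_scores + alt_scores, h=0.3)
lfdr_centre = local_fdr(0.0, f_hat, pi0=0.9)  # score typical of the null
lfdr_tail = local_fdr(4.0, f_hat, pi0=0.9)    # score typical of the alternative
```

    In this toy setting a score near 0 receives a local FDR close to 1, while a score near 4 receives a value close to 0. This is the point-wise distinction that a single FDR attached to a whole rejection region cannot make.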

  1. Plant gravitropic signal transduction: A network analysis leads to gene discovery

    Science.gov (United States)

    Wyatt, Sarah

    Gravity plays a fundamental role in plant growth and development. Although a significant body of research has helped define the events of gravity perception, the role of the plant growth regulator auxin, and the mechanisms resulting in the gravity response, the events of signal transduction, those that link the biophysical action of perception to a biochemical signal that results in auxin redistribution, those that regulate the gravitropic effects on plant growth, remain, for the most part, a “black box.” Using a cold effect, dubbed the gravity persistent signal (GPS) response, we developed a mutant screen to specifically identify components of the signal transduction pathway. Cloning of the GPS genes has identified new proteins involved in gravitropic signaling. We have further exploited the GPS response using a multi-faceted approach including gene expression microarrays, proteomics analysis, and bioinformatics analysis and continued mutant analysis to identify additional genes, physiological and biochemical processes. Gene expression data provided the foundation of a regulatory network for gravitropic signaling. Based on these gene expression data and related data sets/information from the literature/repositories, we constructed a gravitropic signaling network for Arabidopsis inflorescence stems. To generate the network, both a dynamic Bayesian network approach and a time-lagged correlation coefficient approach were used. The dynamic Bayesian network added existing information of protein-protein interaction while the time-lagged correlation coefficient allowed incorporation of temporal regulation and thus could incorporate the time-course metric from the data set. Thus the methods complemented each other and provided us with a more comprehensive evaluation of connections. Each method generated a list of possible interactions associated with a statistical significance value. The two networks were then overlaid to generate a more rigorous, intersected
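
    The time-lagged correlation idea mentioned in this record can be sketched in a few lines of Python. This is a generic illustration, assuming a Pearson correlation between a candidate regulator at time t and a target at time t + lag; the function names and the toy time course are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def lagged_correlation(regulator, target, lag):
    """Correlate regulator[t] with target[t + lag] for a lag >= 0."""
    if lag == 0:
        return pearson(regulator, target)
    return pearson(regulator[:-lag], target[lag:])


def best_lag(regulator, target, max_lag):
    """Return the lag in 0..max_lag with the largest absolute correlation."""
    return max(range(max_lag + 1),
               key=lambda k: abs(lagged_correlation(regulator, target, k)))


# Hypothetical expression time course: the target repeats the regulator's
# profile two time points later.
regulator = [0, 1, 3, 2, 5, 4, 6, 3, 1, 0]
target = [0, 0, 0, 1, 3, 2, 5, 4, 6, 3]
lag = best_lag(regulator, target, max_lag=3)
```

    Scanning several lags and keeping the strongest one is only one possible design. A real analysis would also need a significance estimate for each candidate interaction, as the record notes.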

  2. Discovery of genes involved with learning and memory: an experimental synthesis of Hirschian and Benzerian perspectives.

    Science.gov (United States)

    Tully, T

    1996-11-26

    The biological bases of learning and memory are being revealed today with a wide array of molecular approaches, most of which entail the analysis of dysfunction produced by gene disruptions. This perspective derives both from early "genetic dissections" of learning in mutant Drosophila by Seymour Benzer and colleagues and from earlier behavior-genetic analyses of learning in Diptera by Jerry Hirsch and coworkers. Three quantitative-genetic insights derived from these latter studies serve as guiding principles for the former. First, interacting polygenes underlie complex traits. Consequently, learning/memory defects associated with single-gene mutants can be quantified accurately only in equilibrated, heterogeneous genetic backgrounds. Second, complex behavioral responses will be composed of genetically distinct functional components. Thus, genetic dissection of complex traits into specific biobehavioral properties is likely. Finally, disruptions of genes involved with learning/memory are likely to have pleiotropic effects. As a result, task-relevant sensorimotor responses required for normal learning must be assessed carefully to interpret performance in learning/memory experiments. In addition, more specific conclusions will be obtained from reverse-genetic experiments, in which gene disruptions are restricted in time and/or space.

  3. Human transporter database: comprehensive knowledge and discovery tools in the human transporter genes.

    Directory of Open Access Journals (Sweden)

    Adam Y Ye

    Transporters are essential in the homeostatic exchange of endogenous and exogenous substances at the systemic, organ, cellular, and subcellular levels. Gene mutations of transporters are often related to pharmacogenetic traits. Recent developments in high-throughput technologies in genomics, transcriptomics and proteomics allow in-depth studies of transporter genes in normal cellular processes and diverse disease conditions. The flood of high-throughput data has resulted in an urgent need for an updated knowledgebase with curated, organized, and annotated human transporters in an easily accessible form. Using a pipeline that combines automated keyword queries, sequence similarity search and manual curation of transporters, we collected 1,555 non-redundant human transporter genes to develop the Human Transporter Database (HTD) (http://htd.cbi.pku.edu.cn). Based on the extensive annotations, global properties of the transporter genes were illustrated, such as expression patterns and polymorphisms in relation to their ligands. We noted that the human transporters were enriched in many fundamental biological processes such as oxidative phosphorylation and cardiac muscle contraction, and significantly associated with Mendelian and complex diseases such as epilepsy and sudden infant death syndrome. Overall, HTD provides a well-organized interface that helps research communities search detailed molecular and genetic information on transporters for the development of personalized medicine.

  4. Human transporter database: comprehensive knowledge and discovery tools in the human transporter genes.

    Science.gov (United States)

    Ye, Adam Y; Liu, Qing-Rong; Li, Chuan-Yun; Zhao, Min; Qu, Hong

    2014-01-01

    Transporters are essential in the homeostatic exchange of endogenous and exogenous substances at the systemic, organ, cellular, and subcellular levels. Gene mutations of transporters are often related to pharmacogenetic traits. Recent developments in high-throughput technologies in genomics, transcriptomics and proteomics allow in-depth studies of transporter genes in normal cellular processes and diverse disease conditions. The flood of high-throughput data has resulted in an urgent need for an updated knowledgebase with curated, organized, and annotated human transporters in an easily accessible form. Using a pipeline that combines automated keyword queries, sequence similarity search and manual curation of transporters, we collected 1,555 non-redundant human transporter genes to develop the Human Transporter Database (HTD) (http://htd.cbi.pku.edu.cn). Based on the extensive annotations, global properties of the transporter genes were illustrated, such as expression patterns and polymorphisms in relation to their ligands. We noted that the human transporters were enriched in many fundamental biological processes such as oxidative phosphorylation and cardiac muscle contraction, and significantly associated with Mendelian and complex diseases such as epilepsy and sudden infant death syndrome. Overall, HTD provides a well-organized interface that helps research communities search detailed molecular and genetic information on transporters for the development of personalized medicine.

  5. A P2P Service Discovery Strategy Based on Content Catalogues

    Directory of Open Access Journals (Sweden)

    Lican Huang

    2007-08-01

    This paper presents a framework for distributed service discovery based on VIRGO P2P technologies. The services are classified into multi-layer, hierarchical catalogue domains according to their contents. The service providers, which have their own service registries such as UDDIs, register the services they provide and establish a virtual tree in a VIRGO network according to the domain of their service. Service location under the proposed strategy is effective and guaranteed. This paper also discusses the primary implementation of service discovery based on Tomcat/Axis and jUDDI.

  6. Modelling and enhanced molecular dynamics to steer structure-based drug discovery.

    Science.gov (United States)

    Kalyaanamoorthy, Subha; Chen, Yi-Ping Phoebe

    2014-05-01

    The ever-increasing gap between the availability of genome sequences and of protein crystal structures remains one of the significant challenges to modern drug discovery efforts. Knowledge of the structure-dynamics-function relationships of proteins is important for understanding several key aspects of structure-based drug discovery, such as drug-protein interactions, drug binding and unbinding mechanisms and protein-protein interactions. This review presents a brief overview of the different state-of-the-art computational approaches that are applied for protein structure modelling and molecular dynamics simulations of biological systems. We outline how different enhanced sampling molecular dynamics approaches, together with regular molecular dynamics methods, assist in steering structure-based drug discovery processes.

  7. Accelerated Discovery in Photocatalysis using a Mechanism-Based Screening Method.

    Science.gov (United States)

    Hopkinson, Matthew N; Gómez-Suárez, Adrián; Teders, Michael; Sahoo, Basudev; Glorius, Frank

    2016-03-18

    Herein, we report a conceptually novel mechanism-based screening approach to accelerate discovery in photocatalysis. In contrast to most screening methods, which consider reactions as discrete entities, this approach instead focuses on a single constituent mechanistic step of a catalytic reaction. Using luminescence spectroscopy to investigate the key quenching step in photocatalytic reactions, an initial screen of 100 compounds led to the discovery of two promising substrate classes. Moreover, a second, more focused screen provided mechanistic insights useful in developing proof-of-concept reactions. Overall, this fast and straightforward approach both facilitated the discovery and aided the development of new light-promoted reactions and suggests that mechanism-based screening strategies could become useful tools in the hunt for new reactivity.

  8. Web-services-based resource discovery model and service deployment on HealthGrids.

    Science.gov (United States)

    Naseer, Aisha; Stergioulas, Lampros K

    2010-05-01

    HealthGrids represent the next generation of advanced healthcare IT and hold the promise to untangle complex healthcare-data problems by integrating health information systems and healthcare entities. Healthcare could benefit from a new delivery approach using HealthGrids to better meet the biomedical and health-related needs. Specialized services are needed to provide unified discovery of and ubiquitous access to available HealthGrid resources. The different types of services available on HealthGrids are classified into two levels, the operational-level services and the management-level services. This paper takes a fresh approach to address the problems of resource discovery in HealthGrids based on Web services (WS) and WS technologies and proposes a WS-based resource discovery model.

  9. GOrilla: a tool for discovery and visualization of enriched GO terms in ranked gene lists

    Directory of Open Access Journals (Sweden)

    Steinfeld Israel

    2009-02-01

    Background: Since the inception of the GO annotation project, a variety of tools have been developed that support exploring and searching the GO database. In particular, a variety of tools that perform GO enrichment analysis are currently available. Most of these tools require as input a target set of genes and a background set, and seek enrichment in the target set compared to the background set. A few tools also exist that support analyzing ranked lists. The latter typically rely on simulations or on union-bound correction for assigning statistical significance to the results. Results: GOrilla is a web-based application that identifies enriched GO terms in ranked lists of genes, without requiring the user to provide explicit target and background sets. This is particularly useful in the many typical cases where genomic data may be naturally represented as a ranked list of genes (e.g. by level of expression or of differential expression). GOrilla employs a flexible threshold statistical approach to discover GO terms that are significantly enriched at the top of a ranked gene list. Building on a complete theoretical characterization of the underlying distribution, called mHG, GOrilla computes an exact p-value for the observed enrichment, taking threshold multiple testing into account without the need for simulations. This enables rigorous statistical analysis of thousands of genes and thousands of GO terms in the order of seconds. The output of the enrichment analysis is visualized as a hierarchical structure, providing a clear view of the relations between enriched GO terms. Conclusion: GOrilla is an efficient GO analysis tool with unique features that make it a useful addition to the existing repertoire of GO enrichment tools. GOrilla's unique features and advantages over other threshold-free enrichment tools include rigorous statistics, fast running time and an effective graphical representation. GOrilla is publicly available at: http://cbl-gorilla.cs.technion.ac.il
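
    The flexible-threshold idea can be illustrated with a minimal Python sketch of a minimum hypergeometric (mHG) score: for every cutoff n in the ranked list, compute the hypergeometric tail probability of seeing at least the observed number of annotated genes among the top n, and keep the smallest value. The function names and the toy ranking below are hypothetical. The sketch returns only the raw score, which is not a p-value, because the minimum is taken over many cutoffs. The exact p-value computation described in the abstract is not reproduced here.

```python
from math import comb


def hypergeom_tail(N, B, n, b):
    """P(X >= b) when n genes are drawn from N, of which B are annotated."""
    total = comb(N, n)
    hits = sum(comb(B, i) * comb(N - B, n - i)
               for i in range(b, min(n, B) + 1))
    return hits / total


def mhg_score(labels):
    """Scan all cutoffs of a ranked 0/1 list.

    `labels[i]` is 1 if the gene at rank i carries the GO term, else 0.
    Returns (smallest tail probability, cutoff at which it occurs).
    """
    N, B = len(labels), sum(labels)
    best, best_n, b = 1.0, 0, 0
    for n, flag in enumerate(labels, start=1):
        b += flag  # annotated genes seen among the top n
        p = hypergeom_tail(N, B, n, b)
        if p < best:
            best, best_n = p, n
    return best, best_n


# Hypothetical ranked list of 10 genes, 4 of which carry the term.
score, cutoff = mhg_score([1, 1, 1, 0, 1, 0, 0, 0, 0, 0])
```

    For this toy list the minimum is reached at a cutoff of five genes, where all four annotated genes have been seen. No target or background set has to be chosen in advance; the cutoff is found by the scan itself.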

  10. Affinity-Based Screening Technology and HCV Drug Discovery

    Institute of Scientific and Technical Information of China (English)

    LI Bin

    2003-01-01

    NS5A is one of the non-structural gene products encoded by Hepatitis C virus (HCV) and related viruses that are essential for viral replication. The amino acid sequence of NS5A is conserved between different HCV genotypes and the primary amino acid sequence of NS5A is unique to HCV and closely related viruses. Importantly, NS5A is unrelated to any human protein. This indicates that drugs designed to block the actions of NS5A could inhibit the replication of HCV without showing toxic side effects in human host cells, thus making NS5A inhibitors ideal anti-viral drugs. However, there are presently no functional assays for this essential viral protein. Therefore, conventional high throughput screening (HTS) approaches cannot be used to discover antiviral drugs against NS5A.

  11. New construction for expert system based on innovative knowledge discovery technology

    Institute of Scientific and Technical Information of China (English)

    YANG BingRu; SONG Wei; XU ZhangYan

    2007-01-01

    Knowledge acquisition is the bottleneck of expert systems. To solve this problem, KD (D&K), a comprehensive knowledge discovery process model in which a database and a knowledge base cooperate, and related technology are proposed. Based on KD (D&K) and related technology, a new construction of an Expert System based on Knowledge Discovery (ESKD) is then proposed. As the key knowledge acquisition component of ESKD, KD (D&K) is composed of KDD* and KDK*. KDD*, the new process model based on a double-bases cooperating mechanism, and KDK*, the new process model based on a double-basis fusion mechanism, are introduced in turn. The overall framework of ESKD is proposed, and some sub-systems and the dynamic knowledge base system are discussed. Finally, the effectiveness and advantages of ESKD are tested on a real-world agriculture database. We hope that ESKD may be useful for the new generation of expert systems.

  12. Active PDP Discovery for the Policy Based MANET Management

    Science.gov (United States)

    Song, Wang-Cheol; Rehman, Shafqat-Ur; Lee, Kyung-Jin; Lutfiyya, Hanan

    Policy-based Network Management (PBNM) in Mobile Ad-hoc Networks (MANETs) should be efficient and reliable. In this letter, we propose a mechanism for policy-based management in ad hoc networks and discuss methods to discover the Policy Decision Point (PDP), set the management area, and manage the movements of nodes in the PBNM system. Finally, we assess the results through simulations.

  13. Structure-and-mechanism-based design and discovery of therapeutics for cocaine overdose and addiction.

    Science.gov (United States)

    Zheng, Fang; Zhan, Chang-Guo

    2008-03-07

    (-)-Cocaine is a widely abused drug and there is currently no available anti-cocaine therapeutic. Promising agents, such as anti-cocaine catalytic antibodies and high-activity mutants of human butyrylcholinesterase (BChE), for therapeutic treatment of cocaine overdose have been developed through structure-and-mechanism-based design and discovery. In particular, a unique computational design strategy based on the modeling and simulation of the rate-determining transition state has been developed and used to design and discover desirable high-activity mutants of BChE. One of the discovered high-activity mutants of BChE has an approximately 456-fold improved catalytic efficiency against (-)-cocaine. The encouraging outcome of the structure-and-mechanism-based design and discovery effort demonstrates that the unique computational design approach based on transition state modeling and simulation is promising for rational enzyme redesign and drug discovery. The general approach of the structure-and-mechanism-based design and discovery may be used to design high-activity mutants of any enzyme or catalytic antibody.

  14. IMG-ABC: An Atlas of Biosynthetic Gene Clusters to Fuel the Discovery of Novel Secondary Metabolites

    Energy Technology Data Exchange (ETDEWEB)

    Chen, I-Min; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Huang, Jinghua; Reddy, T. B.K.; Cimermancic, Peter; Fischbach, Michael; Ivanova, Natalia; Markowitz, Victor; Kyrpides, Nikos; Pati, Amrita

    2014-10-28

    In the discovery of secondary metabolites (SMs), large-scale analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of relevant computational resources. We present IMG-ABC (https://img.jgi.doe.gov/abc/) -- An Atlas of Biosynthetic gene Clusters within the Integrated Microbial Genomes (IMG) system. IMG-ABC is a rich repository of both validated and predicted biosynthetic clusters (BCs) in cultured isolates, single-cells and metagenomes linked with the SM chemicals they produce and enhanced with focused analysis tools within IMG. The underlying scalable framework enables traversal of phylogenetic dark matter and chemical structure space -- serving as a doorway to a new era in the discovery of novel molecules.

  15. Discovery of technical methanation catalysts based on computational screening

    DEFF Research Database (Denmark)

    Sehested, Jens; Larsen, Kasper Emil; Kustov, Arkadii

    2007-01-01

    Methanation is a classical reaction in heterogeneous catalysis and significant effort has been put into improving the industrially preferred nickel-based catalysts. Recently, a computational screening study showed that nickel-iron alloys should be more active than the pure nickel catalyst...

  16. Microwave-Assisted Esterification: A Discovery-Based Microscale Laboratory Experiment

    Science.gov (United States)

    Reilly, Maureen K.; King, Ryan P.; Wagner, Alexander J.; King, Susan M.

    2014-01-01

    An undergraduate organic chemistry laboratory experiment has been developed that features a discovery-based microscale Fischer esterification utilizing a microwave reactor. Students individually synthesize a unique ester from known sets of alcohols and carboxylic acids. Each student identifies the best reaction conditions given their particular…

  17. Infrared and Raman Spectroscopy: A Discovery-Based Activity for the General Chemistry Curriculum

    Science.gov (United States)

    Borgsmiller, Karen L.; O'Connell, Dylan J.; Klauenberg, Kathryn M.; Wilson, Peter M.; Stromberg, Christopher J.

    2012-01-01

    A discovery-based method is described for incorporating the concepts of IR and Raman spectroscopy into the general chemistry curriculum. Students use three sets of springs to model the properties of single, double, and triple covalent bonds. Then, Gaussian 03W molecular modeling software is used to illustrate the relationship between bond…

  18. Knowledge Discovery Based on Grid

    Institute of Scientific and Technical Information of China (English)

    张丽芳

    2009-01-01

    On the basis of an introduction to knowledge discovery on the grid, the basic principles and components of knowledge discovery on the grid are proposed and a novel framework of knowledge discovery on the grid is designed. Then the process of centralized data mining and distributed data mining based on the architecture is analyzed. Finally, future work is proposed.

  19. Discovery of a 29-gene panel in peripheral blood mononuclear cells for the detection of colorectal cancer and adenomas using high throughput real-time PCR.

    Science.gov (United States)

    Ciarloni, Laura; Hosseinian, Sahar; Monnier-Benoit, Sylvain; Imaizumi, Natsuko; Dorta, Gian; Ruegg, Curzio

    2015-01-01

    Colorectal cancer (CRC) is the second leading cause of cancer-related death in developed countries. Early detection of CRC leads to decreased CRC mortality. A blood-based CRC screening test is highly desirable due to limited invasiveness and high acceptance rate among patients compared to currently used fecal occult blood testing and colonoscopy. Here we describe the discovery and validation of a 29-gene panel in peripheral blood mononuclear cells (PBMC) for the detection of CRC and adenomatous polyps (AP). Blood samples were prospectively collected from a multicenter, case-control clinical study. First, we profiled 93 samples with 667 candidate and 3 reference genes by high throughput real-time PCR (OpenArray system). After analysis, 160 genes were retained and tested again on 51 additional samples. Lowly expressed and unstable genes were discarded, resulting in a final dataset of 144 samples profiled with 140 genes. To define which genes, alone or in combination, had the highest potential to discriminate AP and/or CRC from controls, data were analyzed by a combination of univariate and multivariate methods. A list of 29 potentially discriminant genes was compiled and evaluated for its predictive accuracy by penalized logistic regression and bootstrap. This method discriminated AP >1 cm and CRC from controls with a sensitivity of 59% and 75%, respectively, with 91% specificity. The behavior of the 29-gene panel was validated with a LightCycler 480 real-time PCR platform, commonly adopted by clinical laboratories. In this work we identified a 29-gene panel expressed in PBMC that can be used for developing a novel minimally-invasive test for accurate detection of AP and CRC using a standard real-time PCR platform.

  20. Link-based quantitative methods to identify differentially coexpressed genes and gene Pairs

    Directory of Open Access Journals (Sweden)

    Ye Zhi-Qiang

    2011-08-01

    Background: Differential coexpression analysis (DCEA) is increasingly used for investigating the global transcriptional mechanisms underlying phenotypic changes. Current DCEA methods mostly adopt a gene connectivity-based strategy to estimate differential coexpression, which is characterized by comparing the numbers of gene neighbors in different coexpression networks. Although it simplifies the calculation, this strategy mixes up the identities of different coexpression neighbors of a gene, and fails to differentiate significant differential coexpression changes from trivial ones. In particular, correlation reversal is easily missed although it probably indicates remarkable biological significance. Results: We developed two link-based quantitative methods, DCp and DCe, to identify differentially coexpressed genes and gene pairs (links). Because they uniquely exploit the quantitative coexpression change of each gene pair in the coexpression networks, both methods proved to be superior to currently popular methods in simulation studies. Re-mining a publicly available type 2 diabetes (T2D) expression dataset from the perspective of differential coexpression analysis led to discoveries beyond those from differential expression analysis. Conclusions: This work points out the critical weakness of currently popular DCEA methods, and proposes two link-based DCEA algorithms that will contribute to the development of DCEA and help extend it to a broader spectrum.
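
    The weakness of neighbor counting described above can be shown with a small Python sketch. It is an illustration only, not the DCp or DCe algorithm: the two scoring functions, the 0.8 correlation threshold and the toy correlation values are all assumptions chosen for this example. One gene's correlations with three partners are compared between two conditions, and one of the links reverses sign.

```python
from math import sqrt


def connectivity_change(r_a, r_b, threshold=0.8):
    """Connectivity-style score: difference in the number of neighbors.

    A partner counts as a neighbor when |correlation| >= threshold.
    """
    deg_a = sum(abs(r) >= threshold for r in r_a.values())
    deg_b = sum(abs(r) >= threshold for r in r_b.values())
    return abs(deg_a - deg_b)


def link_change(r_a, r_b, threshold=0.8):
    """Link-style score: root mean square change in correlation, taken
    over the links that pass the threshold in at least one condition."""
    links = [g for g in r_a
             if abs(r_a[g]) >= threshold or abs(r_b[g]) >= threshold]
    if not links:
        return 0.0
    return sqrt(sum((r_a[g] - r_b[g]) ** 2 for g in links) / len(links))


# Hypothetical correlations of one gene with partners g1..g3.
# The link to g1 flips from strongly positive to strongly negative.
cond_a = {"g1": 0.9, "g2": 0.85, "g3": 0.1}
cond_b = {"g1": -0.9, "g2": 0.85, "g3": 0.1}

by_degree = connectivity_change(cond_a, cond_b)  # 2 neighbors in both
by_links = link_change(cond_a, cond_b)           # dominated by the g1 flip
```

    The gene keeps two neighbors in both conditions, so the neighbor-count score reports no change, while the per-link score is large because of the reversed correlation with g1.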

  1. Discovery of germline-related genes in Cephalochordate amphioxus: A genome wide survey using genome annotation and transcriptome data.

    Science.gov (United States)

    Yue, Jia-Xing; Li, Kun-Lung; Yu, Jr-Kai

    2015-12-01

    The generation of germline cells is a critical process in the reproduction of multicellular organisms. Studies in animal models have identified a common repertoire of genes that play essential roles in primordial germ cell (PGC) formation. However, comparative studies also indicate that the timing and regulation of this core genetic program vary considerably in different animals, raising the intriguing questions regarding the evolution of PGC developmental mechanisms in metazoans. Cephalochordates (commonly called amphioxus or lancelets) represent one of the invertebrate chordate groups and can provide important information about the evolution of developmental mechanisms in the chordate lineage. In this study, we used genome and transcriptome data to identify germline-related genes in two distantly related cephalochordate species, Branchiostoma floridae and Asymmetron lucayanum. Branchiostoma and Asymmetron diverged more than 120 MYA, and the most conspicuous difference between them is their gonadal morphology. We used important germline developmental genes in several model animals to search the amphioxus genome and transcriptome dataset for conserved homologs. We also annotated the assembled transcriptome data using Gene Ontology (GO) terms to facilitate the discovery of putative genes associated with germ cell development and reproductive functions in amphioxus. We further confirmed the expression of 14 genes in developing oocytes or mature eggs using whole mount in situ hybridization, suggesting their potential functions in amphioxus germ cell development. The results of this global survey provide a useful resource for testing potential functions of candidate germline-related genes in cephalochordates and for investigating differences in gonad developmental mechanisms between Branchiostoma and Asymmetron species.

  2. Knowledge discovery based on experiential learning corporate culture management

    Science.gov (United States)

    Tu, Kai-Jan

    2014-10-01

    A good corporate culture based on humanistic theory can make an enterprise's management very effective, giving all of its members strong cohesion and centripetal force. With an experiential learning model, the enterprise can establish a corporate culture with an enthusiastic learning spirit, gain the innovation ability needed for positive knowledge growth, and meet fierce global marketing competition. A case study of Trend's corporate culture offers support for the industry knowledge growth rate equation as a contribution to experiential learning corporate culture management.

  3. Data Mining based Software Development Communication Pattern Discovery

    Directory of Open Access Journals (Sweden)

    Gang Zhang

    2010-12-01

    Software development enterprises urgently seek smaller time losses and smoother communication patterns. However, communication is difficult to control and manage and requires technical support, owing to the uncertainty and complex structure of the data that arise in communication. Data mining is a well-established framework that aims at intelligently discovering knowledge and principles hidden in massive amounts of original data. Data mining technology together with shared repositories provides an intelligent way to analyze communication data in a software development environment. We propose a data mining based algorithm to tackle the problem, adopting a co-training style algorithm to discover patterns in the software development environment. Decision trees are trained as base learners, and a majority voting procedure is then launched to determine the labels of unlabeled data. The base learners are then trained again with the newly labeled data, and the iteration stops when a consistent state is reached. Our method is naturally semi-supervised and can improve generalization ability by making use of unlabeled data. Experimental results on a data set gathered from a production environment indicate that the proposed algorithm is effective and outperforms traditional supervised algorithms.

  4. Ligand-based receptor tyrosine kinase partial agonists: New paradigm for cancer drug discovery?

    Science.gov (United States)

    Riese, David J.

    2010-01-01

    Introduction: Receptor tyrosine kinases (RTKs) are validated targets for oncology drug discovery and several RTK antagonists have been approved for the treatment of human malignancies. Nonetheless, the discovery and development of RTK antagonists has lagged behind the discovery and development of agents that target G-protein coupled receptors. In part, this is because it has been difficult to discover analogs of naturally-occurring RTK agonists that function as antagonists. Areas covered: Here we describe ligands of ErbB receptors that function as partial agonists for these receptors, thereby enabling these ligands to antagonize the activity of full agonists for these receptors. We provide insights into the mechanisms by which these ligands function as antagonists. We discuss how information concerning these mechanisms can be translated into screens for novel small molecule- and antibody-based antagonists of ErbB receptors and how such antagonists hold great potential as targeted cancer chemotherapeutics. Expert opinion: While there have been a number of key findings in this field, the identification of the structural basis of ligand functional specificity is still of the greatest importance. While it is true that, with some notable exceptions, peptide hormones and growth factors have not proven to be good platforms for oncology drug discovery, addressing the fundamental issues of antagonistic partial agonists for receptor tyrosine kinases has the potential to steer oncology drug discovery in new directions. Mechanism-based approaches are now emerging to enable the discovery of RTK partial agonists that may antagonize both agonist-dependent and -independent RTK signaling and may hold tremendous promise as targeted cancer chemotherapeutics. PMID:21532939

  5. Kuder-Richardson Coefficient Based Trust Mechanism for Service Discovery in MANETs

    Directory of Open Access Journals (Sweden)

    S. Pariselvam

    2013-10-01

    Service discovery in mobile ad hoc networks (MANETs) is crucial because of the lack of centralized infrastructure. Furthermore, the variety of services available through the network may require different levels of security. Thus, a need arises for formulating and deploying an efficient trust-oriented service discovery mechanism, distributed across every node in the ad hoc scenario, in order to reduce the complexity of providing services to the network users. In this paper, we propose a Kuder-Richardson coefficient based trust mechanism (KRCBM) for service discovery in MANETs. This mechanism works with the aid of a trust value called the Kuder-Richardson coefficient, which quantifies the reliability of the group of nodes participating in the ad hoc environment. The trust model has an inherent ability to designate various protection levels for service discovery. Based on the designated level of a service, secure communication is established. The performance of KRCBM is analyzed through ns-2 simulations using performance metrics such as packet delivery ratio, control overhead and total overhead. The simulation results show that the proposed mechanism performs well compared with the other trust-based mechanisms available in the literature, reducing packet drops to a large extent.
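
    For orientation, the standard Kuder-Richardson formula 20 (KR-20) is k/(k-1) * (1 - sum(p_j * q_j) / var), where k is the number of binary items, p_j is the proportion of 1s for item j, q_j = 1 - p_j, and var is the variance of the row totals. The sketch below computes it with the population variance. How the paper maps node behavior onto items is not stated in the abstract, so treating rows as nodes and columns as binary interaction outcomes is purely an assumption, and `kr20` is a hypothetical name.

```python
def kr20(scores):
    """Kuder-Richardson formula 20 for a matrix of 0/1 scores.

    scores[i][j] is 1 if row i (assumed here: a node) succeeded on
    item j (assumed here: one observed interaction), else 0.
    """
    n = len(scores)
    k = len(scores[0])
    totals = [sum(row) for row in scores]
    mean = sum(totals) / n
    # Population variance of the row totals (divide by n, not n - 1).
    variance = sum((t - mean) ** 2 for t in totals) / n
    if k < 2 or variance == 0:
        raise ValueError("KR-20 needs at least 2 items and non-zero variance")
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in scores) / n
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / variance)
```

    Values close to 1 indicate that the items agree with one another; a trust mechanism could compare such a value against a threshold chosen per protection level, although the thresholds used by KRCBM are not given here.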

  6. Microfluidic-Based Multi-Organ Platforms for Drug Discovery

    Directory of Open Access Journals (Sweden)

    Ahmad Rezaei Kolahchi

    2016-09-01

    Development of predictive multi-organ models before implementing costly clinical trials is central for screening the toxicity, efficacy, and side effects of new therapeutic agents. Despite significant efforts that have been recently made to develop biomimetic in vitro tissue models, the clinical application of such platforms is still far from reality. Recent advances in physiologically-based pharmacokinetic and pharmacodynamic (PBPK-PD) modeling, micro- and nanotechnology, and in silico modeling have enabled single- and multi-organ platforms for investigation of new chemical agents and tissue-tissue interactions. This review provides an overview of the principles of designing microfluidic-based organ-on-chip models for drug testing and highlights the current state of the art in developing predictive multi-organ models for studying the cross-talk of interconnected organs. We further discuss the challenges associated with establishing a predictive body-on-chip (BOC) model, such as scaling, cell types, the common medium, and principles of the study design for characterizing the interaction of drugs with multiple targets.

  7. Online Discovery of Search Objectives for Test-Based Problems.

    Science.gov (United States)

    Liskowski, Paweł; Krawiec, Krzysztof

    2016-03-08

    In test-based problems, commonly approached with competitive coevolutionary algorithms, the fitness of a candidate solution is determined by the outcomes of its interactions with multiple tests. Usually, fitness is a scalar aggregate of interaction outcomes, and as such imposes a complete order on the candidate solutions. However, passing different tests may require unrelated "skills," and candidate solutions may vary with respect to such capabilities. In this study, we provide theoretical evidence that scalar fitness, inherently incapable of capturing such differences, is likely to lead to premature convergence. To mitigate this problem, we propose DISCO, a method that automatically identifies the groups of tests for which the candidate solutions behave similarly and define the above skills. Each such group gives rise to a derived objective, and these objectives together guide the search algorithm in multi-objective fashion. When applied to several well-known test-based problems, the proposed approach significantly outperforms the conventional two-population coevolution. This opens the door to efficient and generic countermeasures to premature convergence for both coevolutionary and evolutionary algorithms applied to problems featuring aggregating fitness functions.
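
    The idea of replacing a scalar fitness with derived objectives can be illustrated with a small sketch. It groups tests whose outcome columns are identical across all candidates and averages the outcomes within each group. This exact-match grouping is a deliberate simplification standing in for whatever grouping procedure DISCO actually uses, and `derive_objectives` is a hypothetical name.

```python
def derive_objectives(interactions):
    """Group tests with identical outcome columns and score each group.

    interactions[i][j] is 1 if candidate i passes test j, else 0.
    Returns (clusters, objectives): clusters is a list of test-index
    lists, and objectives[i][c] is candidate i's mean outcome on cluster c.
    """
    n_tests = len(interactions[0])
    groups = {}
    for j in range(n_tests):
        column = tuple(row[j] for row in interactions)
        groups.setdefault(column, []).append(j)
    clusters = list(groups.values())
    objectives = [
        [sum(row[j] for j in cluster) / len(cluster) for cluster in clusters]
        for row in interactions
    ]
    return clusters, objectives


# Candidates 0 and 1 have the same scalar fitness (2 passed tests each)
# but pass disjoint groups of tests, which the derived objectives expose.
interactions = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 1],
]
clusters, objectives = derive_objectives(interactions)
```

    The resulting objective vectors can then be handed to any multi-objective selection scheme, so that candidates 0 and 1 are treated as incomparable rather than tied.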

  8. Knowledge Discovery for Event Series Decision Based on Rough Set

    Institute of Scientific and Technical Information of China (English)

    ZENG Chuan-hua; PEI Zheng; XU Yang

    2006-01-01

    Making decisions about event series is part of daily life, and discovering knowledge from these decisions is of great significance in the field of control and decision-making. The paper treats an event series as the external form of movements with dynamic attributes, and uses a Markov transition probability matrix to express those attributes statistically. First, the decision table is constructed according to the matrix. Then, by reducing attributes based on rough set theory, the decision table is reduced and the decision rules are acquired. Finally, the decision is made by defining a rule distance and taking the minimum rule distance as the decision principle. An example follows, which shows that the algorithm is feasible and effective for event series decisions.
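
    The first step, expressing an event series as a Markov transition probability matrix, can be sketched as below. This is a generic first-order estimate from transition counts; the paper's own state definitions and the subsequent rough-set reduction are not reproduced, and `transition_matrix` is a hypothetical name.

```python
def transition_matrix(series):
    """Estimate first-order Markov transition probabilities from a sequence.

    Returns (states, matrix), where matrix[a][b] is the observed fraction
    of transitions out of states[a] that go to states[b].
    """
    states = sorted(set(series))
    index = {state: i for i, state in enumerate(states)}
    counts = [[0] * len(states) for _ in states]
    for current, following in zip(series, series[1:]):
        counts[index[current]][index[following]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state seen only at the end of the series has no outgoing
        # transitions; its row is left as all zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return states, matrix
```

    Each row of the matrix could then serve as the condition attributes of one object in a decision table, which is one plausible reading of how the table is built from the matrix.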

  9. SDAA: Towards Service Discovery Anywhere Anytime Mobile Based Application

    Directory of Open Access Journals (Sweden)

    Mehedi Masud

    2016-01-01

    Providing on-demand services based on a customer's current location is an urgent need for many societies and individuals, especially for women, elderly people, single mothers, sick people, etc. Considering the need to provide localized services, this paper proposes a mobile application framework that allows an individual to receive services from neighborhood peers anywhere, anytime. The application allows an individual to find and select reliable service providers near his or her location. The application will give interested individuals an opportunity to use their free time to provide services to the community and earn some extra money. It will benefit many stakeholders, such as elderly people, women at home, and people traveling in an unfamiliar place. A prototype application was developed, and an empirical evaluation was carried out to obtain qualitative measures of users' acceptance of and satisfaction with the application. Users' satisfaction was observed to be high.

  10. Network-based discovery through mechanistic systems biology. Implications for applications--SMEs and drug discovery: where the action is.

    Science.gov (United States)

    Benson, Neil

    2015-08-01

    Phase II attrition remains the most important challenge for drug discovery. Tackling the problem requires improved understanding of the complexity of disease biology. Systems biology approaches to this problem can, in principle, deliver this. This article reviews the reports of the application of mechanistic systems models to drug discovery questions and discusses the added value. Although we are on the journey to the virtual human, the length, path and rate of learning from this remain an open question. Success will be dependent on the will to invest and make the most of the insight generated along the way.

  11. Meiosis-specific gene discovery in plants: RNA-Seq applied to isolated Arabidopsis male meiocytes

    Directory of Open Access Journals (Sweden)

    May Gregory D

    2010-12-01

    Background: Meiosis is a critical process in the reproduction and life cycle of flowering plants in which homologous chromosomes pair, synapse, recombine and segregate. Understanding meiosis will not only advance our knowledge of the mechanisms of genetic recombination, but also has substantial applications in crop improvement. Despite the tremendous progress in the past decade in other model organisms (e.g., Saccharomyces cerevisiae and Drosophila melanogaster), the global identification of meiotic genes in flowering plants has remained a challenge due to the lack of efficient methods to collect pure meiocytes for analyzing the temporal and spatial gene expression patterns during meiosis, and for the sensitive identification and quantitation of novel genes. Results: A high-throughput approach to identify meiosis-specific genes by combining isolated meiocytes, RNA-Seq, bioinformatic and statistical analysis pipelines was developed. By analyzing the studied genes that have a meiosis function, a pipeline for identifying meiosis-specific genes has been defined. More than 1,000 genes that are specifically or preferentially expressed in meiocytes have been identified as candidate meiosis-specific genes. A group of 55 genes that have mitochondrial genome origins and a significant number of transposable element (TE) genes (1,036) were also found to have up-regulated expression levels in meiocytes. Conclusion: These findings advance our understanding of meiotic genes, gene expression and regulation, especially the transcript profiles of MGI genes and TE genes, and provide a framework for functional analysis of genes in meiosis.

  12. FPGA-Based Pulse Parameter Discovery for Positron Emission Tomography.

    Science.gov (United States)

    Haselman, Michael; Hauck, Scott; Lewellen, Thomas K; Miyaoka, Robert S

    2009-10-24

    Modern Field Programmable Gate Arrays (FPGAs) are capable of performing complex digital signal processing algorithms with clock rates well above 100 MHz. This, combined with FPGAs' low cost and ease of use, makes them an ideal technology for a data acquisition system for a positron emission tomography (PET) scanner. The University of Washington is producing a series of high-resolution, small-animal PET scanners that utilize FPGAs as the core of the front-end electronics. For these next-generation scanners, functions that are typically performed in dedicated circuits, or offline, are being migrated to the FPGA. This will not only simplify the electronics, but the features of modern FPGAs can also be utilized to add significant signal processing power to produce higher-resolution images. In this paper we report how we utilize the reconfigurable property of an FPGA to self-calibrate and determine the pulse parameters necessary for some of the pulse processing steps. Specifically, we show how the FPGA can generate a reference pulse based on actual pulse data instead of a model. We also report how other properties of the photodetector pulse (baseline, pulse length, average pulse energy and event triggers) can be determined automatically by the FPGA.

  13. Ataxin1L is a regulator of HSC function highlighting the utility of cross-tissue comparisons for gene discovery.

    Science.gov (United States)

    Kahle, Juliette J; Souroullas, George P; Yu, Peng; Zohren, Fabian; Lee, Yoontae; Shaw, Chad A; Zoghbi, Huda Y; Goodell, Margaret A

    2013-03-01

    Hematopoietic stem cells (HSCs) are rare quiescent cells that continuously replenish the cellular components of the peripheral blood. Observing that the ataxia-associated gene Ataxin-1-like (Atxn1L) was highly expressed in HSCs, we examined its role in HSC function through in vitro and in vivo assays. Mice lacking Atxn1L had greater numbers of HSCs that regenerated the blood more quickly than their wild-type counterparts. Molecular analyses indicated Atxn1L null HSCs had gene expression changes that regulate a program consistent with their higher level of proliferation, suggesting that Atxn1L is a novel regulator of HSC quiescence. To determine if additional brain-associated genes were candidates for hematologic regulation, we examined genes encoding proteins from autism- and ataxia-associated protein-protein interaction networks for their representation in hematopoietic cell populations. The interactomes were found to be highly enriched for proteins encoded by genes specifically expressed in HSCs relative to their differentiated progeny. Our data suggest a heretofore unappreciated similarity between regulatory modules in the brain and HSCs, offering a new strategy for novel gene discovery in both systems.

  14. Ataxin1L is a regulator of HSC function highlighting the utility of cross-tissue comparisons for gene discovery.

    Directory of Open Access Journals (Sweden)

    Juliette J Kahle

    2013-03-01

    Hematopoietic stem cells (HSCs) are rare quiescent cells that continuously replenish the cellular components of the peripheral blood. Observing that the ataxia-associated gene Ataxin-1-like (Atxn1L) was highly expressed in HSCs, we examined its role in HSC function through in vitro and in vivo assays. Mice lacking Atxn1L had greater numbers of HSCs that regenerated the blood more quickly than their wild-type counterparts. Molecular analyses indicated Atxn1L null HSCs had gene expression changes that regulate a program consistent with their higher level of proliferation, suggesting that Atxn1L is a novel regulator of HSC quiescence. To determine if additional brain-associated genes were candidates for hematologic regulation, we examined genes encoding proteins from autism- and ataxia-associated protein-protein interaction networks for their representation in hematopoietic cell populations. The interactomes were found to be highly enriched for proteins encoded by genes specifically expressed in HSCs relative to their differentiated progeny. Our data suggest a heretofore unappreciated similarity between regulatory modules in the brain and HSCs, offering a new strategy for novel gene discovery in both systems.

  15. Gene based therapies for kidney regeneration

    NARCIS (Netherlands)

    Janssen, Manoe J; Arcolino, Fanny O; Schoor, Perry; Kok, Robbert Jan; Mastrobattista, Enrico

    2016-01-01

    In this review we provide an overview of the expanding molecular toolbox that is available for gene based therapies and how these therapies can be used for a large variety of kidney diseases. Gene based therapies range from restoring gene function in genetic kidney diseases to steering complex molec

  16. An Agent-Based Focused Crawling Framework for Topic- and Genre-Related Web Document Discovery

    OpenAIRE

    Pappas, Nikolaos; Katsimpras, Georgios; Stamatatos, Efstathios

    2012-01-01

    The discovery of web documents about certain topics is an important task for web-based applications including web document retrieval, opinion mining and knowledge extraction. In this paper, we propose an agent-based focused crawling framework able to retrieve topic- and genre-related web documents. Starting from a simple topic query, a set of focused crawler agents explore in parallel topic-specific web paths using dynamic seed URLs that belong to certain web genres and are collected from web...

  17. Simulation-based Discovery of Cyclic Peptide Nanotubes

    Science.gov (United States)

    Ruiz Pestana, Luis A.

    Today, there is a growing need for environmentally friendly synthetic membranes with selective transport capabilities to address some of society's most pressing issues, such as carbon dioxide pollution, or access to clean water. While conventional membranes cannot stand up to the challenge, thin nanocomposite membranes, where vertically aligned subnanometer pores (e.g. nanotubes) are embedded in a thin polymeric film, promise to overcome some of the current limitations, namely, achieving a monodisperse distribution of subnanometer size pores, vertical pore alignment across the membrane thickness, and tunability of the pore surface chemistry. Self-assembled cyclic peptide nanotubes (CPNs) are particularly promising as selective nanopores because the pore size can be controlled at the subnanometer level, and because they exhibit high chemical design flexibility and display remarkable mechanical stability. In addition, when conjugated with polymer chains, the cyclic peptides can co-assemble in block copolymer domains to form nanoporous thin films. CPNs are thus well positioned to tackle persistent challenges in molecular separation applications. However, our poor understanding of the physics underlying their remarkable properties prevents the rational design and implementation of CPNs in technologically relevant membranes. In this dissertation, we use a simulation-based approach, in particular molecular dynamics (MD) simulations, to investigate the critical knowledge gaps hindering the implementation of CPNs. Computational mechanical tests show that, despite the weak nature of the stabilizing hydrogen bonds and the small cross section, CPNs display a Young's modulus of approximately 20 GPa and a maximum strength of around 1 GPa, placing them among the strongest proteinaceous materials known. Simulations of the self-assembly process reveal that CPNs grow by self-similar coarsening, contrary to other low-dimensional peptide systems, such as amyloids, that are believed to grow through

  18. Transcriptomics Analysis of Crassostrea hongkongensis for the Discovery of Reproduction-Related Genes.

    Directory of Open Access Journals (Sweden)

    Ying Tong

    The reproductive mechanisms of mollusk species have been interesting targets in biological research because of the diverse reproductive strategies observed in this phylum. These species have also been studied for the development of fishery technologies in molluscan aquaculture. Although the molecular mechanisms underlying the reproductive process have been well studied in animal models, the relevant information from mollusks remains limited, particularly in species of great commercial interest. Crassostrea hongkongensis is the dominant oyster species that is distributed along the coast of the South China Sea and little genomic information on this species is available. Currently, high-throughput sequencing techniques have been widely used for investigating the basis of physiological processes and facilitating the establishment of adequate genetic selection programs. The C. hongkongensis transcriptome included a total of 1,595,855 reads, which were generated by 454 sequencing and were assembled into 41,472 contigs using de novo methods. Contigs were clustered into 33,920 isotigs and further grouped into 22,829 isogroups. Approximately 77.6% of the isogroups were successfully annotated by the Nr database. More than 1,910 genes were identified as being related to reproduction. Some key genes involved in germline development, sex determination and differentiation were identified for the first time in C. hongkongensis (nanos, piwi, ATRX, FoxL2, β-catenin, etc.). Gene expression analysis indicated that vasa, nanos, piwi, ATRX, FoxL2, β-catenin and SRD5A1 were highly or specifically expressed in C. hongkongensis gonads. Additionally, 94,056 single nucleotide polymorphisms (SNPs) and 1,699 simple sequence repeats (SSRs) were compiled. Our study significantly increased C. hongkongensis genomic information based on transcriptomics analysis. The group of reproduction-related genes identified in the present study constitutes a new tool for research on bivalve

  19. Discovery of possible gene relationships through the application of self-organizing maps to DNA microarray databases.

    Directory of Open Access Journals (Sweden)

    Rocio Chavez-Alvarez

    DNA microarrays and cell cycle synchronization experiments have made possible the study of the mechanisms of cell cycle regulation of Saccharomyces cerevisiae by simultaneously monitoring the expression levels of thousands of genes at specific time points. On the other hand, pattern recognition techniques can contribute to the analysis of such massive measurements, providing a model of gene expression level evolution through the cell cycle process. In this paper, we propose the use of one such technique, an unsupervised artificial neural network called a Self-Organizing Map (SOM), which has been successfully applied to processes involving very noisy signals, classifying and organizing them, and assisting in the discovery of behavior patterns without requiring prior knowledge about the process under analysis. As a test bed for the use of SOMs in finding possible relationships among genes and their possible contribution in some biological processes, we selected 282 S. cerevisiae genes that have been shown through biological experiments to have an activity during the cell cycle. The expression level of these genes was analyzed in five of the most cited time series DNA microarray databases used in the study of the cell cycle of this organism. With the use of SOM, it was possible to find clusters of genes with similar behavior in the five databases along two cell cycles. This result suggested that some of these genes might be biologically related or might have a regulatory relationship, as was corroborated by comparing some of the clusters obtained with SOMs against a previously reported regulatory network that was generated using biological knowledge, such as protein-protein interactions, gene expression levels, metabolism dynamics, promoter binding, and modification, regulation and transport of proteins. The methodology described in this paper could be applied to the study of gene relationships of other biological processes in different organisms.

  20. Computational strategies for genome-based natural product discovery and engineering in fungi.

    Science.gov (United States)

    van der Lee, Theo A J; Medema, Marnix H

    2016-04-01

    Fungal natural products possess biological activities that are of great value to medicine, agriculture and manufacturing. Recent metagenomic studies accentuate the vastness of fungal taxonomic diversity, and the accompanying specialized metabolic diversity offers a great and still largely untapped resource for natural product discovery. Although fungal natural products show an impressive variation in chemical structures and biological activities, their biosynthetic pathways share a number of key characteristics. First, genes encoding successive steps of a biosynthetic pathway tend to be located adjacently on the chromosome in biosynthetic gene clusters (BGCs). Second, these BGCs are often located on specific regions of the genome and show a discontinuous distribution among evolutionarily related species and isolates. Third, the same enzyme (super)families are often involved in the production of widely different compounds. Fourth, genes that function in the same pathway are often co-regulated, and therefore co-expressed across various growth conditions. In this mini-review, we describe how these partly interlinked characteristics can be exploited to computationally identify BGCs in fungal genomes and to connect them to their products. Particular attention will be given to novel algorithms to identify unusual classes of BGCs, as well as integrative pan-genomic approaches that use a combination of genomic and metabolomic data for parallelized natural product discovery across multiple strains. Such novel technologies will not only expedite the natural product discovery process, but will also allow the assembly of a high-quality toolbox for the re-design or even de novo design of biosynthetic pathways using synthetic biology approaches.

  1. An integrative data analysis platform for gene set analysis and knowledge discovery in a data warehouse framework.

    Science.gov (United States)

    Chen, Yi-An; Tripathi, Lokesh P; Mizuguchi, Kenji

    2016-01-01

    Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org.

  2. Generalization-based discovery of spatial association rules with linguistic cloud models

    Institute of Scientific and Technical Information of China (English)

    杨斌; 田永青; 朱仲英

    2004-01-01

    Extraction of interesting and general spatial association rules from large spatial databases is an important task in the development of spatial database systems. In this paper, we investigate a generalization-based knowledge discovery mechanism that integrates attribute-oriented induction on nonspatial data with spatial merging and generalization on spatial data. Furthermore, we present linguistic cloud models for knowledge representation and uncertainty handling to enhance current generalization-based methods. With these models, spatial and nonspatial attribute values are well generalized at higher concept levels, allowing discovery of strong spatial association rules. Combining the cloud-model-based generalization method with the Apriori algorithm for mining association rules from a spatial database shows benefits in effectiveness and flexibility.
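
    Apriori-style rule mining rests on two measures, support and confidence, applied here to already-generalized attribute values. The sketch below computes both for a single rule. The predicate strings such as "near_water" are hypothetical placeholders for generalized spatial and nonspatial values; the cloud-model generalization step itself is not shown.

```python
def support(transactions, itemset):
    """Fraction of transactions (sets of predicates) containing the itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) over the transactions."""
    antecedent_support = support(transactions, antecedent)
    if antecedent_support == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / antecedent_support


# Each set describes one spatial object after generalization (made-up data).
transactions = [
    {"near_water", "high_price"},
    {"near_water", "high_price"},
    {"near_water"},
    {"far_from_water"},
]
rule_support = support(transactions, {"near_water", "high_price"})
rule_confidence = confidence(transactions, {"near_water"}, {"high_price"})
```

    A rule is usually called strong when both values exceed user-chosen minimum thresholds; generalizing values to higher concept levels tends to raise support, which is why it helps such rules surface.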

  3. Discovery of sequence motifs related to coexpression of genes using evolutionary computation

    OpenAIRE

    Fogel, Gary B.; Weekes, Dana G.; Varga, Gabor; Dow, Ernst R.; Harlow, Harry B.; Onyia, Jude E.; Su, Chen

    2004-01-01

    Transcription factors are key regulatory elements that control gene expression. Recognition of transcription factor binding site (TFBS) motifs in the upstream region of coexpressed genes is therefore critical towards a true understanding of the regulations of gene expression. The task of discovering eukaryotic TFBSs remains a challenging problem. Here, we demonstrate that evolutionary computation can be used to search for TFBSs in upstream regions of genes known to be coexpressed. Evolutionar...

  4. Discovery and analysis of inflammatory disease-related genes using cDNA microarrays

    OpenAIRE

    1997-01-01

    cDNA microarray technology is used to profile complex diseases and discover novel disease-related genes. In inflammatory disease such as rheumatoid arthritis, expression patterns of diverse cell types contribute to the pathology. We have monitored gene expression in this disease state with a microarray of selected human genes of probable significance in inflammation as well as with genes expressed in peripheral human blood cells. Messenger RNA from cultured macrophages, chondrocyte cell lines...

  5. Update of the Gene Discovery Program in Schistosoma mansoni with the Expressed Sequence Tag Approach

    Directory of Open Access Journals (Sweden)

    Élida ML Rabelo

    1997-09-01

    Continuing the Schistosoma mansoni Genome Project, 363 new templates were sequenced, generating 205 more ESTs corresponding to 91 genes. Seventy-four of these genes (81%) had not previously been described in S. mansoni. Among the newly discovered genes there are several of significant biological interest, such as synaptophysin, NIFs-like and rho-GDP dissociation inhibitor

  6. Discovery by the Epistasis Project of an epistatic interaction between the GSTM3 gene and the HHEX/IDE/KIF11 locus in the risk of Alzheimer's disease

    NARCIS (Netherlands)

    J.M. Bullock (James); C. Medway (Christopher); M. Cortina-Borja (Mario); J.C. Turton (James); J.A. Prince (Jonathan); C.A. Ibrahim-Verbaas (Carla); M. Schuur (Maaike); M.M.B. Breteler (Monique); C.M. van Duijn (Cock); P.G. Kehoe (Patrick); R. Barber (Rachel); E. Coto (Eliecer); V. Alvarez (Victoria); P. Deloukas (Panagiotis); N. Hammond (Naomi); O. Combarros (Onofre); I. Mateo (Ignacio); D.R. Warden (Donald); M.G. Lehmann (Michael); O. Belbin (Olivia); K. Brown (Kristelle); G.K. Wilcock (Gordon); R. Heun (Reinhard); H. Kölsch (Heike); A.D. Smith; D.J. Lehmann (Donald); K. Morgan (Kevin)

    2013-01-01

    Despite recent discoveries in the genetics of sporadic Alzheimer's disease, there remains substantial "hidden heritability." It is thought that some of this missing heritability may be because of gene-gene, i.e., epistatic, interactions. We examined potential epistasis between 110 candi

  7. An integrative approach to species discovery in odonates: from character-based DNA barcoding to ecology.

    Science.gov (United States)

    Damm, Sandra; Schierwater, Bernd; Hadrys, Heike

    2010-09-01

    Modern taxonomy requires an analytical approach incorporating all lines of evidence into decision-making. Such an approach can enhance both species identification and species discovery. The character-based DNA barcode method provides a molecular data set that can be incorporated into classical taxonomic data such that the discovery of new species can be made in an analytical framework that includes multiple sources of data. We here illustrate such a corroborative framework in a dragonfly model system that permits the discovery of two new, but visually cryptic species. In the African dragonfly genus Trithemis, three distinct genetic clusters can be detected which could not be identified by using classical taxonomic characters. In order to test the hypothesis of two new species, DNA barcodes from different sequence markers (ND1 and COI) were combined with morphological, ecological and biogeographic data sets. Phylogenetic analyses and the incorporation of all data sets into a scheme called the taxonomic circle strongly support the hypothesis of two new species. Our case study suggests an analytical approach to modern taxonomy that integrates data sets from different disciplines, thereby increasing the ease and reliability of both species discovery and species assignment.

  8. Discovery of CTCF-sensitive Cis-spliced fusion RNAs between adjacent genes in human prostate cells.

    Directory of Open Access Journals (Sweden)

    Fujun Qin

    2015-02-01

    Genes or their encoded products are not expected to mingle with each other except in some disease situations. In cancer, a frequent mechanism that can produce gene fusions is chromosomal rearrangement. However, recent discoveries of RNA trans-splicing and cis-splicing between adjacent genes (cis-SAGe) support other mechanisms for generating fusion RNAs. In our transcriptome analyses of 28 prostate normal and cancer samples, on average 30% of fusion RNAs are transcripts that contain exons belonging to same-strand neighboring genes. These fusion RNAs may be the products of cis-SAGe, which was previously thought to be rare. To validate this finding and to better understand the phenomenon, we used LNCaP, a prostate cell line, as a model, and identified 16 additional cis-SAGe events by silencing the transcription factor CTCF and performing paired-end RNA sequencing. About half of the fusions are expressed at a significant level compared to their parental genes. Silencing one of the in-frame fusions resulted in reduced cell motility. Most out-of-frame fusions are likely to function as non-coding RNAs. The majority of the 16 fusions are also detected in other prostate cell lines, as well as in the 14 clinical prostate normal and cancer pairs. By studying the features associated with these fusions, we developed a set of rules: (1) the parental genes are same-strand neighboring genes; (2) the distance between the genes is within 30 kb; (3) the 5' genes are actively transcribing; and (4) the chimeras tend to have the second-to-last exon in the 5' genes joined to the second exon in the 3' genes. We then randomly selected 20 neighboring genes in the genome, and detected four fusion events using these rules in prostate cancer and non-cancerous cells. These results suggest that splicing between neighboring gene transcripts is a rather frequent phenomenon, and it is not a feature unique to cancer cells.
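
    Rules 1 to 3 above lend themselves to a simple filter over gene annotations. The sketch below is illustrative only: the `Gene` record, its `expressed` flag (standing in for "actively transcribing") and the function name are hypothetical, overlapping genes are skipped by assumption, and rule 4 (the exon junction pattern) is not modeled because it needs exon-level data.

```python
from dataclasses import dataclass

MAX_GAP = 30_000  # rule 2: intergenic distance within 30 kb


@dataclass
class Gene:
    name: str
    chrom: str
    strand: str      # "+" or "-"
    start: int       # lower genomic coordinate
    end: int         # higher genomic coordinate
    expressed: bool  # assumed proxy for "actively transcribing"


def candidate_pairs(genes):
    """Return (5' gene, 3' gene) name pairs satisfying rules 1 to 3."""
    pairs = []
    ordered = sorted(genes, key=lambda g: (g.chrom, g.start))
    for a, b in zip(ordered, ordered[1:]):
        # Rule 1: neighbors on the same chromosome and the same strand.
        if a.chrom != b.chrom or a.strand != b.strand:
            continue
        gap = b.start - a.end
        if gap < 0 or gap > MAX_GAP:
            continue
        # On the minus strand, the upstream (5') gene has the higher coordinate.
        five_prime, three_prime = (a, b) if a.strand == "+" else (b, a)
        # Rule 3: the 5' gene must be transcribed.
        if five_prime.expressed:
            pairs.append((five_prime.name, three_prime.name))
    return pairs
```

    Such a filter only nominates candidate pairs; confirming a cis-SAGe event would still require reads spanning the fusion junction.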

  9. Discovery and characterization of the first genuine avian leptin gene in the rock dove (Columba livia).

    Science.gov (United States)

    Friedman-Einat, Miriam; Cogburn, Larry A; Yosefi, Sara; Hen, Gideon; Shinder, Dmitry; Shirak, Andrey; Seroussi, Eyal

    2014-09-01

    Leptin, the key regulator of mammalian energy balance, has been at the center of a great controversy in avian biology for the last 15 years since initial reports of a putative leptin gene (LEP) in chickens. Here, we characterize a novel LEP in rock dove (Columba livia) with low similarity of the predicted protein sequence (30% identity, 47% similarity) to the human ortholog. Searching the Sequence-Read-Archive database revealed leptin transcripts, in the dove's liver, with 2 noncoding exons preceding 2 coding exons. This unusual 4-exon structure was validated by sequencing of a GC-rich product (76% GC, 721 bp) amplified from liver RNA by RT-PCR. Sequence alignment of the dove leptin with orthologous leptins indicated that it consists of a leader peptide (21 amino acids; aa) followed by the mature protein (160 aa), which has a putative structure typical of 4-helical-bundle cytokines except that it is 12 aa longer than human leptin. Extra residues (10 aa) were located within the loop between 2 5'-helices, interrupting the amino acid motif that is conserved in tetrapods and considered essential for activation of leptin receptor (LEPR) but not for receptor binding per se. Quantitative RT-PCR of 11 tissues showed highest (P < .05) expression of LEP in the dove's liver, whereas the dove LEPR peaked (P < .01) in the pituitary. Both genes were prominently expressed in the gonads and at lower levels in tissues involved in mammalian leptin signaling (adipose; hypothalamus). A bioassay based on activation of the chicken LEPR in vitro showed leptin activity in the dove's circulation, suggesting that dove LEP encodes an active protein, despite the interrupted loop motif. Providing tools to study energy-balance control at an evolutionary perspective, our original demonstration of leptin signaling in dove predicts a more ancient role of leptin in growth and reproduction in birds, rather than appetite control.

  10. Handling Neighbor Discovery and Rendezvous Consistency with Weighted Quorum-Based Approach

    Directory of Open Access Journals (Sweden)

    Chung-Ming Own

    2015-09-01

    Full Text Available Neighbor discovery and the power of sensors play an important role in the formation of Wireless Sensor Networks (WSNs) and mobile networks. Many asynchronous protocols based on wake-up time scheduling have been proposed to enable energy-saving neighbor discovery among neighboring nodes, especially where clock synchronization is difficult. However, existing research on neighbor-discovery methods falls into two groups: quorum-based protocols and co-primality-based protocols. They differ in how time slots are arranged: the former uses quorums in a matrix, whereas the latter adopts numerical analysis. In our study, we propose the weighted heuristic quorum system (WQS), which is based on the quorum algorithm to eliminate redundant paths of active slots. We demonstrate the specification of our system: fewer active slots are required, the referring rate is balanced, and remaining power is considered particularly when a device maintains rendezvous with discovered neighbors. The evaluation results showed that our proposed method can effectively reschedule the active slots and save the computing time of the network system.

  11. Handling Neighbor Discovery and Rendezvous Consistency with Weighted Quorum-Based Approach.

    Science.gov (United States)

    Own, Chung-Ming; Meng, Zhaopeng; Liu, Kehan

    2015-09-03

    Neighbor discovery and the power of sensors play an important role in the formation of Wireless Sensor Networks (WSNs) and mobile networks. Many asynchronous protocols based on wake-up time scheduling have been proposed to enable energy-saving neighbor discovery among neighboring nodes, especially where clock synchronization is difficult. However, existing research on neighbor-discovery methods falls into two groups: quorum-based protocols and co-primality-based protocols. They differ in how time slots are arranged: the former uses quorums in a matrix, whereas the latter adopts numerical analysis. In our study, we propose the weighted heuristic quorum system (WQS), which is based on the quorum algorithm to eliminate redundant paths of active slots. We demonstrate the specification of our system: fewer active slots are required, the referring rate is balanced, and remaining power is considered particularly when a device maintains rendezvous with discovered neighbors. The evaluation results showed that our proposed method can effectively reschedule the active slots and save the computing time of the network system.
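
    For context on "quorums in the matrix", the classic grid quorum scheme can be sketched as follows. This is the textbook baseline, not the WQS method proposed in the abstract: a cycle of k*k time slots is laid out as a k-by-k grid, and each node stays awake during one full row and one full column. Any two such quorums share at least one slot, because one node's row always crosses the other node's column. The function name and slot numbering are assumptions made for this illustration, and clock drift between nodes is not modeled.

```python
def grid_quorum(k: int, row: int, col: int) -> set[int]:
    """Return the active slots (0 .. k*k-1) for a node that picks `row` and `col`."""
    if not (0 <= row < k and 0 <= col < k):
        raise ValueError("row and col must lie in [0, k)")
    row_slots = {row * k + j for j in range(k)}  # every slot in the chosen row
    col_slots = {i * k + col for i in range(k)}  # every slot in the chosen column
    return row_slots | col_slots                 # 2k - 1 active slots out of k*k
```

    With k = 4, a node is awake for 7 of 16 slots. Reducing that active fraction while keeping the guaranteed overlap is the kind of trade-off that quorum-based protocols aim to improve.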

  12. A One-Bead-One-Catalyst Approach to Aspartic Acid-Based Oxidation Catalyst Discovery

    Science.gov (United States)

    Lichtor, Phillip A.; Miller, Scott J.

    2011-01-01

    We report an approach to the high-throughput screening of asymmetric oxidation catalysts. The strategy is based on application of the one-bead-one-compound library approach, wherein each of our catalyst candidates is based on a peptide scaffold. For this purpose we rely on a recently developed catalytic cycle that employs an acid-peracid shuttle. In order to implement our approach, we developed a compatible linker and demonstrated that the library format is amenable to screening and sequencing of catalysts employing partial Edman degradation and MALDI mass spectrometry analysis. The system was applied to the discovery (and re-discovery) of catalysts for the enantioselective oxidation of a cyclohexene derivative. The system is now poised for application to unprecedented substrate classes for asymmetric oxidation reactions. PMID:21417485

  13. Resource discovery algorithm based on hierarchical model and Conscious search in Grid computing system

    Directory of Open Access Journals (Sweden)

    Nasim Nickbakhsh

    2017-03-01

    Full Text Available The distributed system of Grid subscribes the non-homogenous sources at a vast level in a dynamic manner. The resource discovery method strongly influences the efficiency and quality of the system's functionality. The “Bitmap” model is based on the hierarchical and conscious search model, which allows for less traffic and fewer messages than other methods. The proposed method is based on the hierarchical and conscious search model and enhances the Bitmap method with the objective of reducing traffic, reducing the load of resource management processing, reducing the number of messages generated by resource discovery, and increasing the resource according speed. The proposed method and the Bitmap method are simulated using the Arena tool. The proposed model is abbreviated as RNTL.

  14. Discovery and analysis of pancreatic adenocarcinoma genes using cDNA microarrays

    Institute of Scientific and Technical Information of China (English)

    Gang Jin; Xian-Gui Hu; Kang Ying; Yan Tang; Rui Liu; Yi-Jie Zhang; Zai-Ping Jing; Yi Xie; Yu-Min Mao

    2005-01-01

    AIM: To study the pathogenetic processes and the role of gene expression by microarray analyses in expediting our understanding of the molecular pathophysiology of pancreatic adenocarcinoma, and to identify novel cancer-associated genes. METHODS: Nine histologically defined pancreatic head adenocarcinoma specimens associated with clinical data were studied. Total RNA and mRNA were isolated and labeled by reverse transcription reaction with Cy5 and Cy3 for cDNA probe. The cDNA microarrays that represent a set of 4 096 human genes were hybridized with labeled cDNA probe and screened for molecular profiling analyses. RESULTS: Using this methodology, 184 genes were screened out for differences in gene expression level after nine pairs of hybridizations. Of the 184 genes, 87 were upregulated and 97 downregulated, including 11 novel human genes. In pancreatic adenocarcinoma tissue, several invasion and metastasis related genes showed high expression levels, suggesting that the poor prognosis of pancreatic adenocarcinoma might have a solid molecular biological basis. CONCLUSION: The application of the cDNA microarray technique for analysis of gene expression patterns is a powerful strategy to identify novel cancer-associated genes, and to rapidly explore their role in clinical pancreatic adenocarcinoma. Microarray profiles provide us new insights into the carcinogenesis and invasive process of pancreatic adenocarcinoma. Our results suggest that a highly organized and structured process of tumor invasion exists in the pancreas.

  15. Target-based drug discovery for human African trypanosomiasis: selection of molecular target and chemical matter.

    Science.gov (United States)

    Gilbert, Ian H

    2014-01-01

    Target-based approaches for human African trypanosomiasis (HAT) and related parasites can be a valuable route for drug discovery for these diseases. However, care needs to be taken in selection of both the actual drug target and the chemical matter that is developed. In this article, potential criteria to aid target selection are described. Then the physicochemical properties of typical oral drugs are discussed and compared to those of known anti-parasitics.

  16. Physiologically based pharmacokinetic modeling in drug discovery and development: a pharmaceutical industry perspective.

    Science.gov (United States)

    Jones, H M; Chen, Y; Gibson, C; Heimbach, T; Parrott, N; Peters, S A; Snoeys, J; Upreti, V V; Zheng, M; Hall, S D

    2015-03-01

    The application of physiologically based pharmacokinetic (PBPK) modeling has developed rapidly within the pharmaceutical industry and is becoming an integral part of drug discovery and development. In this study, we provide a cross pharmaceutical industry position on "how PBPK modeling can be applied in industry" focusing on the strategies for application of PBPK at different stages, an associated perspective on the confidence and challenges, as well as guidance on interacting with regulatory agencies and internal best practices.

  17. Research on Hotspot Discovery in Internet Public Opinions Based on Improved K-Means

    OpenAIRE

    Gensheng Wang

    2013-01-01

    Effectively discovering hotspots in Internet public opinion is an active research field, and it plays a key role in helping governments and corporations find useful information in the mass of data on the Internet. An improved K-means algorithm for hotspot discovery in Internet public opinion is presented, based on an analysis of the defects and calculation principle of the original K-means algorithm. First, some new methods are designed to preprocess website texts, sele...

  18. Gene based therapies for kidney regeneration.

    Science.gov (United States)

    Janssen, Manoe J; Arcolino, Fanny O; Schoor, Perry; Kok, Robbert Jan; Mastrobattista, Enrico

    2016-11-05

    In this review we provide an overview of the expanding molecular toolbox that is available for gene based therapies and how these therapies can be used for a large variety of kidney diseases. Gene based therapies range from restoring gene function in genetic kidney diseases to steering complex molecular pathways in chronic kidney disorders, and can provide a treatment or cure for diseases that otherwise may not be targeted. This approach involves the delivery of recombinant DNA sequences harboring therapeutic genes to improve cell function and thereby promote kidney regeneration. Depending on the therapy, the recombinant DNA will express a gene that directly plays a role in the function of the cell (gene addition), that regulates the expression of an endogenous gene (gene regulation), or that even changes the DNA sequence of endogenous genes (gene editing). Some interventions involve permanent changes in the genome whereas others are only temporary and leave no trace. Efficient and safe delivery are important steps for all gene based therapies and also depend on the mode of action of the therapeutic gene. Here we provide examples on how the different methods can be used to treat various diseases, which technologies are now emerging (such as gene repair through CRISPR/Cas9) and what the opportunities, perspectives, potential and the limitations of these therapies are for the treatment of kidney diseases.

  19. Discovery of sequence motifs related to coexpression of genes using evolutionary computation

    Science.gov (United States)

    Fogel, Gary B.; Weekes, Dana G.; Varga, Gabor; Dow, Ernst R.; Harlow, Harry B.; Onyia, Jude E.; Su, Chen

    2004-01-01

    Transcription factors are key regulatory elements that control gene expression. Recognition of transcription factor binding site (TFBS) motifs in the upstream region of coexpressed genes is therefore critical towards a true understanding of the regulations of gene expression. The task of discovering eukaryotic TFBSs remains a challenging problem. Here, we demonstrate that evolutionary computation can be used to search for TFBSs in upstream regions of genes known to be coexpressed. Evolutionary computation was used to search for TFBSs of genes regulated by octamer-binding factor and nuclear factor kappa B. The discovered binding sites included experimentally determined known binding motifs as well as lists of putative, previously unknown TFBSs. We believe that this method to search nucleotide sequence information efficiently for similar motifs will be useful for discovering TFBSs that affect gene regulation. PMID:15266008

  20. Complementary Approaches to Existing Target Based Drug Discovery for Identifying Novel Drug Targets

    Directory of Open Access Journals (Sweden)

    Suhas Vasaikar

    2016-11-01

    Full Text Available In the past decade, it was observed that the relationship between the emerging New Molecular Entities and the quantum of R&D investment has not been favorable. There might be numerous reasons but few studies stress the introduction of the target-based drug discovery approach as one of the factors. Although a number of drugs have been developed with an emphasis on a single protein target, identification of a valid target is complex. The approach focuses on a single in vitro target, which overlooks the complexity of the cell and makes the process of validating drug targets uncertain. Thus, it is imperative to search for alternatives rather than looking at success stories of target-based drug discovery. It would be beneficial if drugs were developed to target multiple components. New approaches like reverse engineering and translational research need to take into account both system-based and target-based approaches. This review evaluates the strengths and limitations of known drug discovery approaches and proposes alternative approaches for increasing efficiency against treatment.

  1. Structure-Based Drug Discovery for Prion Disease Using a Novel Binding Simulation

    Directory of Open Access Journals (Sweden)

    Daisuke Ishibashi

    2016-07-01

    Full Text Available The accumulation of abnormal prion protein (PrPSc) converted from the normal cellular isoform of PrP (PrPC) is assumed to induce pathogenesis in prion diseases. Therefore, drug discovery studies for these diseases have focused on the protein conversion process. We used a structure-based drug discovery algorithm (termed Nagasaki University Docking Engine: NUDE) that ran on an intensive supercomputer with a graphic-processing unit to identify several compounds with anti-prion effects. Among the candidates showing a high-binding score, the compounds exhibited direct interaction with recombinant PrP in vitro, and drastically reduced PrPSc and protein-aggresomes in the prion-infected cells. The fragment molecular orbital calculation showed that the van der Waals interaction played a key role in PrPC binding as the intermolecular interaction mode. Furthermore, PrPSc accumulation and microgliosis were significantly reduced in the brains of treated mice, suggesting that the drug candidates provided protection from prion disease, although further in vivo tests are needed to confirm these findings. This NUDE-based structure-based drug discovery for normal protein structures is likely useful for the development of drugs to treat other conformational disorders, such as Alzheimer's disease.

  2. Fragment Based Strategies for Discovery of Novel HIV-1 Reverse Transcriptase and Integrase Inhibitors.

    Science.gov (United States)

    Latham, Catherine F; La, Jennifer; Tinetti, Ricky N; Chalmers, David K; Tachedjian, Gilda

    2016-01-01

    Human immunodeficiency virus (HIV) remains a global health problem. While combined antiretroviral therapy has been successful in controlling the virus in patients, HIV can develop resistance to drugs used for treatment, rendering available drugs less effective and limiting treatment options. Initiatives to find novel drugs for HIV treatment are ongoing, although traditional drug design approaches often focus on known binding sites for inhibition of established drug targets like reverse transcriptase and integrase. These approaches tend towards generating more inhibitors in the same drug classes already used in the clinic. Lack of diversity in antiretroviral drug classes can result in limited treatment options, as cross-resistance can emerge to a whole drug class in patients treated with only one drug from that class. A fresh approach in the search for new HIV-1 drugs is fragment-based drug discovery (FBDD), a validated strategy for drug discovery based on using smaller libraries of low molecular weight molecules (<300 Da) screened using primarily biophysical assays. FBDD is aimed at not only finding novel drug scaffolds, but also probing the target protein to find new, often allosteric, inhibitory binding sites. Several fragment-based strategies have been successful in identifying novel inhibitory sites or scaffolds for two proven drug targets for HIV-1, reverse transcriptase and integrase. While any FBDD-generated HIV-1 drugs have yet to enter the clinic, recent FBDD initiatives against these two well-characterised HIV-1 targets have reinvigorated antiretroviral drug discovery and the search for novel classes of HIV-1 drugs.

  3. PDMAEMA based gene delivery materials

    Directory of Open Access Journals (Sweden)

    Seema Agarwal

    2012-09-01

    Full Text Available Gene transfection is the transfer of genetic material like DNA into cells. Cationic polymers which form nanocomplexes with DNA, so-called non-viral gene vectors, are a highly promising platform for efficient gene transfection. Despite intensive research efforts and some on-going clinical trials on gene transfection, none of the existing cationic polymer systems are generally acceptable for human gene therapy. Since the process of gene transfection is complex and puts different challenges and demands on the delivery system, there is a strong requirement for the design and development of a multifunctional system in a simple way. This review will discuss recent efforts in design, synthesis, and performance of poly(2-dimethylaminoethyl methacrylate) (PDMAEMA) nanocomplexes with DNA.

  4. Exploring the Transcriptome Landscape of Pomegranate Fruit Peel for Natural Product Biosynthetic Gene and SSR Marker Discovery.

    Science.gov (United States)

    Ono, Nadia Nicole; Britton, Monica Therese; Fass, Joseph Nathaniel; Nicolet, Charles Meyer; Lin, Dawei; Tian, Li

    2011-10-01

    Pomegranate fruit peel is rich in bioactive plant natural products, such as hydrolyzable tannins and anthocyanins. Despite their documented roles in human nutrition and fruit quality, genes involved in natural product biosynthesis have not been cloned from pomegranate and very little sequence information is available on pomegranate in the public domain. Shotgun transcriptome sequencing of pomegranate fruit peel cDNA was performed using RNA-Seq on the Illumina Genome Analyzer platform. Over 100 million raw sequence reads were obtained and assembled into 9,839 transcriptome assemblies (TAs) (>200 bp). Candidate genes for hydrolyzable tannin, anthocyanin, flavonoid, terpenoid and fatty acid biosynthesis and/or regulation were identified. Three lipid transfer proteins were obtained that may contribute to the previously reported IgE reactivity of pomegranate fruit extracts. In addition, 115 SSR markers were identified from the pomegranate fruit peel transcriptome and primers were designed for 77 SSR markers. The pomegranate fruit peel transcriptome set provides a valuable platform for natural product biosynthetic gene and SSR marker discovery in pomegranate. This work also demonstrates that next-generation transcriptome sequencing is an economical and effective approach for investigating natural product biosynthesis, identifying genes controlling important agronomic traits, and discovering molecular markers in non-model specialty crop species.

  5. Exploring the Transcriptome Landscape of Pomegranate Fruit Peel for Natural Product Biosynthetic Gene and SSR Marker Discovery

    Institute of Scientific and Technical Information of China (English)

    Nadia Nicole Ono; Monica Therese Britton; Joseph Nathaniel Fass; Charles Meyer Nicolet; Dawei Lin; Li Tian

    2011-01-01

    Pomegranate fruit peel is rich in bioactive plant natural products, such as hydrolyzable tannins and anthocyanins. Despite their documented roles in human nutrition and fruit quality, genes involved in natural product biosynthesis have not been cloned from pomegranate and very little sequence information is available on pomegranate in the public domain. Shotgun transcriptome sequencing of pomegranate fruit peel cDNA was performed using RNA-Seq on the Illumina Genome Analyzer platform. Over 100 million raw sequence reads were obtained and assembled into 9,839 transcriptome assemblies (TAs) (>200 bp). Candidate genes for hydrolyzable tannin, anthocyanin, flavonoid, terpenoid and fatty acid biosynthesis and/or regulation were identified. Three lipid transfer proteins were obtained that may contribute to the previously reported IgE reactivity of pomegranate fruit extracts. In addition, 115 SSR markers were identified from the pomegranate fruit peel transcriptome and primers were designed for 77 SSR markers. The pomegranate fruit peel transcriptome set provides a valuable platform for natural product biosynthetic gene and SSR marker discovery in pomegranate. This work also demonstrates that next-generation transcriptome sequencing is an economical and effective approach for investigating natural product biosynthesis, identifying genes controlling important agronomic traits, and discovering molecular markers in non-model specialty crop species.

  6. Stability-based comparison of class discovery methods for DNA copy number profiles.

    Directory of Open Access Journals (Sweden)

    Isabel Brito

    Full Text Available MOTIVATION: Array-CGH can be used to determine DNA copy number, imbalances in which are a fundamental factor in the genesis and progression of tumors. The discovery of classes with similar patterns of array-CGH profiles therefore adds to our understanding of cancer and the treatment of patients. Various input data representations for array-CGH, dissimilarity measures between tumor samples and clustering algorithms may be used for this purpose. The choice between procedures is often difficult. An evaluation procedure is therefore required to select the best class discovery method (combination of one input data representation, one dissimilarity measure and one clustering algorithm) for array-CGH. Robustness of the resulting classes is a common requirement, but no stability-based comparison of class discovery methods for array-CGH profiles has ever been reported. RESULTS: We applied several class discovery methods and evaluated the stability of their solutions, with a modified version of Bertoni's [Formula: see text]-based test [1]. Our version relaxes the assumption of independence required by Bertoni's original [Formula: see text]-based test. We conclude that Minimal Regions of alteration (a concept introduced by [2]) for input data representation, sim [3] or agree [4] for dissimilarity measure and the use of average group distance in the clustering algorithm produce the most robust classes of array-CGH profiles. AVAILABILITY: The software is available from http://bioinfo.curie.fr/projects/cgh-clustering. It has also been partly integrated into "Visualization and analysis of array-CGH" (VAMP) [5]. The data sets used are publicly available from ACTuDB [6].

  7. ETS gene fusions in prostate cancer: from discovery to daily clinical practice.

    NARCIS (Netherlands)

    Tomlins, S.A.; Bjartell, A.; Chinnaiyan, A.M.; Jenster, G.; Nam, R.K.; Rubin, M.A.; Schalken, J.A.

    2009-01-01

    CONTEXT: In 2005, fusions between the androgen-regulated transmembrane protease serine 2 gene, TMPRSS2, and E twenty-six (ETS) transcription factors were discovered in prostate cancer. OBJECTIVE: To review advances in our understanding of ETS gene fusions, focusing on challenges affecting translatio

  8. Discovery of mitochondrial chimeric-gene associated with cytoplasmic male sterility of HL-rice

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The mitochondrial genome libraries of the HL-type sterile line (A) and maintainer line (B) have been constructed. The mitochondrial gene atp6 was used to screen the libraries because Southern and Northern blot results differed between the sterile and maintainer lines. Sequencing analysis of positive clones proved that there were two copies of the atp6 gene in the sterile line and only one in the maintainer line. One copy of atp6 in the sterile line was the same as that in the maintainer line; the other showed a different flanking sequence from the 49th nucleotide downstream of the termination codon of the atp6 gene. A new chimeric gene, orfH79, was found in the region. OrfH79 had homology to the mitochondrial genes coxⅡ and orf107, and was specific to HL-sterile cytoplasm.

  9. Correlating overrepresented upstream motifs to gene expression a computational approach to regulatory element discovery in eukaryotes

    CERN Document Server

    Caselle, M; Provero, P

    2002-01-01

    Gene regulation in eukaryotes is mainly effected through transcription factors binding to rather short recognition motifs generally located upstream of the coding region. We present a novel computational method to identify regulatory elements in the upstream region of eukaryotic genes. The genes are grouped in sets sharing an overrepresented short motif in their upstream sequence. For each set, the average expression level from a microarray experiment is determined: if this level is significantly higher or lower than the average taken over the whole genome, then the overrepresented motif shared by the genes in the set is likely to play a role in their regulation. The method was tested by applying it to the genome of Saccharomyces cerevisiae, using the publicly available results of a DNA microarray experiment, in which expression levels for virtually all the genes were measured during the diauxic shift from fermentation to respiration. Several known motifs were correctly identified, and a new candidate regulat...
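
    The comparison described above, a motif-sharing gene set's average expression against the genome-wide average, can be illustrated with a simple z-score. This is a sketch under stated assumptions: the abstract does not say which significance test the authors used, so the z-statistic here is a stand-in, and the function name and the dictionary layout (gene name mapped to expression level) are hypothetical.

```python
import math
import statistics


def motif_set_z_score(expression: dict[str, float], gene_set: set[str]) -> float:
    """Z-score of a gene set's mean expression against the genome-wide mean.

    `expression` maps every measured gene to its expression level;
    `gene_set` holds the genes that share one overrepresented upstream motif.
    """
    values = list(expression.values())
    mu = statistics.fmean(values)      # genome-wide average
    sigma = statistics.pstdev(values)  # genome-wide standard deviation
    members = [expression[g] for g in gene_set if g in expression]
    if not members or sigma == 0:
        raise ValueError("need at least one measured gene and non-zero variance")
    # Standard error of the mean for a sample of len(members) genes.
    return (statistics.fmean(members) - mu) / (sigma / math.sqrt(len(members)))
```

    A large positive or negative score flags a set whose shared motif may be tied to induction or repression. In practice a threshold would need correction for the number of motifs tested.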

  10. Natural product proteomining, a quantitative proteomics platform, allows rapid discovery of biosynthetic gene clusters for different classes of natural products.

    Science.gov (United States)

    Gubbens, Jacob; Zhu, Hua; Girard, Geneviève; Song, Lijiang; Florea, Bogdan I; Aston, Philip; Ichinose, Koji; Filippov, Dmitri V; Choi, Young H; Overkleeft, Herman S; Challis, Gregory L; van Wezel, Gilles P

    2014-06-19

    Information on gene clusters for natural product biosynthesis is accumulating rapidly because of the current boom of available genome sequencing data. However, linking a natural product to a specific gene cluster remains challenging. Here, we present a widely applicable strategy for the identification of gene clusters for specific natural products, which we name natural product proteomining. The method is based on using fluctuating growth conditions that ensure differential biosynthesis of the bioactivity of interest. Subsequent combination of metabolomics and quantitative proteomics establishes correlations between abundance of natural products and concomitant changes in the protein pool, which allows identification of the relevant biosynthetic gene cluster. We used this approach to elucidate gene clusters for different natural products in Bacillus and Streptomyces, including a novel juglomycin-type antibiotic. Natural product proteomining does not require prior knowledge of the gene cluster or secondary metabolite and therefore represents a general strategy for identification of all types of gene clusters.

  11. Discovery of putative capsaicin biosynthetic genes by RNA-Seq and digital gene expression analysis of pepper

    Science.gov (United States)

    Zhang, Zi-Xin; Zhao, Shu-Niu; Liu, Gao-Feng; Huang, Zu-Mei; Cao, Zhen-Mu; Cheng, Shan-Han; Lin, Shi-Sen

    2016-01-01

    The Indian pepper ‘Guijiangwang’ (Capsicum frutescens L.), one of the world’s hottest chili peppers, is rich in capsaicinoids. The accumulation of the alkaloid capsaicin and its analogs in the epidermal cells of the placenta contribute to the pungency of Capsicum fruits. To identify putative genes involved in capsaicin biosynthesis, RNA-Seq was used to analyze the pepper’s expression profiles over five developmental stages. Five cDNA libraries were constructed from the total RNA of placental tissue and sequenced using an Illumina HiSeq 2000. More than 19 million clean reads were obtained from each library, and greater than 50% of the reads were assignable to reference genes. Digital gene expression (DGE) profile analysis using Solexa sequencing was performed at five fruit developmental stages and resulted in the identification of 135 genes of known function; their expression patterns were compared to the capsaicin accumulation pattern. Ten genes of known function were identified as most likely to be involved in regulating capsaicin synthesis. Additionally, 20 new candidate genes were identified related to capsaicin synthesis. We use a combination of RNA-Seq and DGE analyses to contribute to the understanding of the biosynthetic regulatory mechanism(s) of secondary metabolites in a nonmodel plant and to identify candidate enzyme-encoding genes. PMID:27756914

  12. Contributions of computational chemistry and biophysical techniques to fragment-based drug discovery.

    Science.gov (United States)

    Gozalbes, Rafael; Carbajo, Rodrigo J; Pineda-Lucena, Antonio

    2010-01-01

    In the last decade, fragment-based drug discovery (FBDD) has evolved from a novel approach in the search of new hits to a valuable alternative to the high-throughput screening (HTS) campaigns of many pharmaceutical companies. The increasing relevance of FBDD in the drug discovery universe has been concomitant with an implementation of the biophysical techniques used for the detection of weak inhibitors, e.g. NMR, X-ray crystallography or surface plasmon resonance (SPR). At the same time, computational approaches have also been progressively incorporated into the FBDD process and nowadays several computational tools are available. These stretch from the filtering of huge chemical databases in order to build fragment-focused libraries comprising compounds with adequate physicochemical properties, to more evolved models based on different in silico methods such as docking, pharmacophore modelling, QSAR and virtual screening. In this paper we will review the parallel evolution and complementarities of biophysical techniques and computational methods, providing some representative examples of drug discovery success stories by using FBDD.

  13. Enhancing service discovery using cat swarm optimisation based web service clustering

    Directory of Open Access Journals (Sweden)

    Sunaina Kotekar

    2016-09-01

    Full Text Available Web service discovery is a critical task in service oriented application development. Due to extensive proliferation in the number of available services, it is challenging to obtain all the relevant services available for a given task. For the retrieval of most relevant Web services, a user would have to use those service-specific terms that best describe and match the natural language documentation contained within a service description. This process can be time intensive, due to functional diversity of available services in a repository. Domain specific clustering of Web Services based on the similarities of their functionalities would greatly boost the ability of a Web service search engine to retrieve the most relevant service. In this paper, we propose a novel technique to cluster service documents into functionally similar service groups using the Cat Swarm Optimisation Algorithm. We present experimental results that show that the proposed technique was effective and enhanced the process of service discovery.

  14. Fragment-based discovery of potent inhibitors of the anti-apoptotic MCL-1 protein.

    Science.gov (United States)

    Petros, Andrew M; Swann, Steven L; Song, Danying; Swinger, Kerren; Park, Chang; Zhang, Haichao; Wendt, Michael D; Kunzer, Aaron R; Souers, Andrew J; Sun, Chaohong

    2014-03-15

    Apoptosis is regulated by the BCL-2 family of proteins, which is comprised of both pro-death and pro-survival members. Evasion of apoptosis is a hallmark of malignant cells. One way in which cancer cells achieve this evasion is through overexpression of the pro-survival members of the BCL-2 family. Overexpression of MCL-1, a pro-survival protein, has been shown to be a resistance factor for Navitoclax, a potent inhibitor of BCL-2 and BCL-XL. Here we describe the use of fragment screening methods and structural biology to drive the discovery of novel MCL-1 inhibitors from two distinct structural classes. Specifically, cores derived from a biphenyl sulfonamide and salicylic acid were uncovered in an NMR-based fragment screen and elaborated using high throughput analog synthesis. This culminated in the discovery of selective and potent inhibitors of MCL-1 that may serve as promising leads for medicinal chemistry optimization efforts.

  15. Research on Hotspot Discovery in Internet Public Opinions Based on Improved K-Means

    Directory of Open Access Journals (Sweden)

    Gensheng Wang

    2013-01-01

    Full Text Available How to discover hotspot in the Internet public opinions effectively is a hot research field for the researchers related which plays a key role for governments and corporations to find useful information from mass data in the Internet. An improved K-means algorithm for hotspot discovery in internet public opinions is presented based on the analysis of existing defects and calculation principle of original K-means algorithm. First, some new methods are designed to preprocess website texts, select and express the characteristics of website texts, and define the similarity between two website texts, respectively. Second, clustering principle and the method of initial classification centers selection are analyzed and improved in order to overcome the limitations of original K-means algorithm. Finally, the experimental results verify that the improved algorithm can improve the clustering stability and classification accuracy of hotspot discovery in internet public opinions when used in practice.

  16. Research on hotspot discovery in internet public opinions based on improved K-means.

    Science.gov (United States)

    Wang, Gensheng

    2013-01-01

    How to discover hotspot in the Internet public opinions effectively is a hot research field for the researchers related which plays a key role for governments and corporations to find useful information from mass data in the Internet. An improved K-means algorithm for hotspot discovery in internet public opinions is presented based on the analysis of existing defects and calculation principle of original K-means algorithm. First, some new methods are designed to preprocess website texts, select and express the characteristics of website texts, and define the similarity between two website texts, respectively. Second, clustering principle and the method of initial classification centers selection are analyzed and improved in order to overcome the limitations of original K-means algorithm. Finally, the experimental results verify that the improved algorithm can improve the clustering stability and classification accuracy of hotspot discovery in internet public opinions when used in practice.
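
    The steps named in this abstract (text preprocessing, a text-to-text similarity, and a deliberate choice of initial centers) can be illustrated with a minimal K-means sketch. Everything here is an assumption for illustration: the abstract does not say which features, similarity measure, or seeding rule the author used, so this sketch swaps in term-frequency vectors, cosine similarity, and farthest-first seeding. The sample posts are made up.

```python
import math
from collections import Counter


def vectorize(text):
    """Term-frequency vector over lowercase whitespace tokens (assumed preprocessing)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def farthest_first(vectors, k):
    """Assumed seeding rule: start from the first text, then repeatedly
    add the text that is least similar to the centers chosen so far."""
    centers = [vectors[0]]
    while len(centers) < k:
        centers.append(min(vectors, key=lambda v: max(cosine(v, c) for c in centers)))
    return centers


def mean_vector(members):
    """Average term vector of a cluster."""
    total = Counter()
    for member in members:
        total.update(member)
    return Counter({term: count / len(members) for term, count in total.items()})


def kmeans(texts, k, iterations=10):
    """Cluster texts by cosine similarity; returns one cluster label per text."""
    vectors = [vectorize(t) for t in texts]
    centers = farthest_first(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [max(range(k), key=lambda j: cosine(v, centers[j])) for v in vectors]
        for j in range(k):
            members = [v for v, label in zip(vectors, labels) if label == j]
            if members:  # keep the old center if a cluster went empty
                centers[j] = mean_vector(members)
    return labels


# Made-up posts: two about a flood, two about an election.
posts = [
    "flood warning river city",
    "river flood damage city",
    "election vote candidate debate",
    "candidate debate election poll",
]
labels = kmeans(posts, k=2)
```

    Deterministic seeding is used here because the abstract names clustering stability as a goal; random seeding can give different clusters from run to run.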

  17. Discovery of clubroot-resistant genes in Brassica napus by transcriptome sequencing.

    Science.gov (United States)

    Chen, S W; Liu, T; Gao, Y; Zhang, C; Peng, S D; Bai, M B; Li, S J; Xu, L; Zhou, X Y; Lin, L B

    2016-01-01

    Clubroot significantly affects plants of the Brassicaceae family and is one of the main diseases causing serious losses in B. napus yield. Few studies have investigated the clubroot-resistance mechanism in B. napus. Identification of clubroot-resistant genes may be used in clubroot-resistant breeding, as well as to elucidate the molecular mechanism behind B. napus clubroot-resistance. We used three B. napus transcriptome samples to construct a transcriptome sequencing library by using Illumina HiSeq™ 2000 sequencing and bioinformatic analysis. In total, 171 million high-quality reads were obtained, containing 96,149 unigenes of N50-value. We aligned the obtained unigenes with the Nr, Swiss-Prot, clusters of orthologous groups, and gene ontology databases and annotated their functions. In the Kyoto encyclopedia of genes and genomes database, 25,033 unigenes (26.04%) were assigned to 124 pathways. Many genes, including broad-spectrum disease-resistance genes, specific clubroot-resistant genes, and genes related to indole-3-acetic acid (IAA) signal transduction, cytokinin synthesis, and myrosinase synthesis in the Huashuang 3 variety of B. napus were found to be related to clubroot-resistance. The effective clubroot-resistance observed in this variety may be due to the induced increased expression of these disease-resistant genes and strong inhibition of the IAA signal transduction, cytokinin synthesis, and myrosinase synthesis. Unigenes 0048482 and 0061770 shared 94% nucleotide similarity with the Crr1 gene. Furthermore, unigene 0061770 could have originated from an inversion of the Crr1 5'-end sequence.

  18. Gene discovery for the bark beetle-vectored fungal tree pathogen Grosmannia clavigera

    Directory of Open Access Journals (Sweden)

    Robertson Gordon

    2010-10-01

    Full Text Available Abstract Background Grosmannia clavigera is a bark beetle-vectored fungal pathogen of pines that causes wood discoloration and may kill trees by disrupting nutrient and water transport. Trees respond to attacks from beetles and associated fungi by releasing terpenoid and phenolic defense compounds. It is unclear which genes are important for G. clavigera's ability to overcome antifungal pine terpenoids and phenolics. Results We constructed seven cDNA libraries from eight G. clavigera isolates grown under various culture conditions, and Sanger sequenced the 5' and 3' ends of 25,000 cDNA clones, resulting in 44,288 high quality ESTs. The assembled dataset of unique transcripts (unigenes) consists of 6,265 contigs and 2,459 singletons that mapped to 6,467 locations on the G. clavigera reference genome, representing ~70% of the predicted G. clavigera genes. Although only 54% of the unigenes matched characterized proteins at the NCBI database, this dataset extensively covers major metabolic pathways, cellular processes, and genes necessary for response to environmental stimuli and genetic information processing. Furthermore, we identified genes expressed in spores prior to germination, and genes involved in response to treatment with lodgepole pine phloem extract (LPPE). Conclusions We provide a comprehensively annotated EST dataset for G. clavigera that represents a rich resource for gene characterization in this and other ophiostomatoid fungi. Genes expressed in response to LPPE treatment are indicative of fungal oxidative stress response. We identified two clusters of potentially functionally related genes responsive to LPPE treatment. Furthermore, we report a simple method for identifying contig misassemblies in de novo assembled EST collections caused by gene overlap on the genome.

  19. Comparison of seven methods for producing Affymetrix expression scores based on False Discovery Rates in disease profiling data

    Directory of Open Access Journals (Sweden)

    Gruber Stephen B

    2005-02-01

    Full Text Available Abstract Background A critical step in processing oligonucleotide microarray data is combining the information in multiple probes to produce a single number that best captures the expression level of a RNA transcript. Several systematic studies comparing multiple methods for array processing have used tightly controlled calibration data sets as the basis for comparison. Here we compare performances for seven processing methods using two data sets originally collected for disease profiling studies. An emphasis is placed on understanding sensitivity for detecting differentially expressed genes in terms of two key statistical determinants: test statistic variability for non-differentially expressed genes, and test statistic size for truly differentially expressed genes. Results In the two data sets considered here, up to seven-fold variation across the processing methods was found in the number of genes detected at a given false discovery rate (FDR). The best performing methods called up to 90% of the same genes differentially expressed, had less variable test statistics under randomization, and had a greater number of large test statistics in the experimental data. Poor performance of one method was directly tied to a tendency to produce highly variable test statistic values under randomization. Based on an overall measure of performance, two of the seven methods (Dchip and a trimmed mean approach) are superior in the two data sets considered here. Two other methods (MAS5 and GCRMA-EB) are inferior, while results for the other three methods are mixed. Conclusions Choice of processing method has a major impact on differential expression analysis of microarray data. Previously reported performance analyses using tightly controlled calibration data sets are not highly consistent with results reported here using data from human tissue samples. Performance of array processing methods in disease profiling and other realistic biological studies should be

  20. Semantic MEDLINE for discovery browsing: using semantic predications and the literature-based discovery paradigm to elucidate a mechanism for the obesity paradox.

    Science.gov (United States)

    Cairelli, Michael J; Miller, Christopher M; Fiszman, Marcelo; Workman, T Elizabeth; Rindflesch, Thomas C

    2013-01-01

    Applying the principles of literature-based discovery (LBD), we elucidate the paradox that obesity is beneficial in critical care despite contributing to disease generally. Our approach enhances a previous extension to LBD, called "discovery browsing," and is implemented using Semantic MEDLINE, which summarizes the results of a PubMed search into an interactive graph of semantic predications. The methodology allows a user to construct argumentation underpinning an answer to a biomedical question by engaging the user in an iterative process between system output and user knowledge. Components of the Semantic MEDLINE output graph identified as "interesting" by the user both contribute to subsequent searches and are constructed into a logical chain of relationships constituting an explanatory network in answer to the initial question. Based on this methodology we suggest that phthalates leached from plastic in critical care interventions activate PPAR gamma, which is anti-inflammatory and abundant in obese patients.

  1. Discovery and characterization of novel vascular and hematopoietic genes downstream of etsrp in zebrafish.

    Directory of Open Access Journals (Sweden)

    Gustavo A Gomez

    Full Text Available The transcription factor Etsrp is required for vasculogenesis and primitive myelopoiesis in zebrafish. When ectopically expressed, etsrp is sufficient to induce the expression of many vascular and myeloid genes in zebrafish. The mammalian homolog of etsrp, ER71/Etv2, is also essential for vascular and hematopoietic development. To identify genes downstream of etsrp, gain-of-function experiments were performed for etsrp in zebrafish embryos followed by transcription profile analysis by microarray. Subsequent in vivo expression studies resulted in the identification of fourteen genes with blood and/or vascular expression, six of these being completely novel. Regulation of these genes by etsrp was confirmed by ectopic induction in etsrp overexpressing embryos and decreased expression in etsrp deficient embryos. Additional functional analysis of two newly discovered genes, hapln1b and sh3gl3, demonstrates their importance in embryonic vascular development. The results described here identify a group of genes downstream of etsrp likely to be critical for vascular and/or myeloid development.

  2. Genome-wide discovery of Pax7 target genes during development.

    Science.gov (United States)

    White, Robert B; Ziman, Melanie R

    2008-03-14

    Pax7 plays critical roles in development of brain, spinal cord, neural crest, and skeletal muscle. As a sequence-specific DNA-binding transcription factor, any direct functional role played by Pax7 during development is mediated through target gene selection. Thus, we have sought to identify genes targeted by Pax7 during embryonic development using an unbiased chromatin immunoprecipitation (ChIP) cloning assay to isolate cis-regulatory regions bound by Pax7 in vivo. Sequencing and genomic localization of a library of chromatin-DNA fragments bound by Pax7 has identified 34 candidate Pax7 target genes, with occupancy of a selection confirmed with independent chromatin enrichment tests (ChIP-PCR). To assess the capacity of Pax7 to regulate transcription from these loci, we have cloned alternate transcripts of Pax7 (differing significantly in their DNA binding domain) into expression vectors and transfected cultured cells with these constructs, then analyzed target gene expression levels using RT-PCR. We show that Pax7 directly occupies sites within genes encoding transcription factors Gbx1 and Eya4, the neurogenic cytokine receptor ciliary neurotrophic factor receptor, the neuronal potassium channel Kcnk2, and the signal transduction kinase Camk1d in vivo and regulates the transcriptional state of these genes in cultured cells. This analysis gives us greater insight into the direct functional role played by Pax7 during embryonic development.

  3. Thesaurus-based disambiguation of gene symbols

    Directory of Open Access Journals (Sweden)

    Wain Hester M

    2005-06-01

    Full Text Available Abstract Background Massive text mining of the biological literature holds great promise of relating disparate information and discovering new knowledge. However, disambiguation of gene symbols is a major bottleneck. Results We developed a simple thesaurus-based disambiguation algorithm that can operate with very little training data. The thesaurus comprises the information from five human genetic databases and MeSH. The extent of the homonym problem for human gene symbols is shown to be substantial (33% of the genes in our combined thesaurus had one or more ambiguous symbols), not only because one symbol can refer to multiple genes, but also because a gene symbol can have many non-gene meanings. A test set of 52,529 Medline abstracts, containing 690 ambiguous human gene symbols taken from OMIM, was automatically generated. Overall accuracy of the disambiguation algorithm was up to 92.7% on the test set. Conclusion The ambiguity of human gene symbols is substantial, not only because one symbol may denote multiple genes but particularly because many symbols have other, non-gene meanings. The proposed disambiguation approach resolves most ambiguities in our test set with high accuracy, including the important gene/not a gene decisions. The algorithm is fast and scalable, enabling gene-symbol disambiguation in massive text mining applications.
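
    A thesaurus lookup of this kind can be sketched as a context-overlap vote between the candidate senses of a symbol, including a non-gene sense. This is a toy illustration under stated assumptions, not the authors' algorithm: the abstract does not describe the scoring rule, and the `THESAURUS` entries, sense labels, and context words below are hypothetical.

```python
import re

# Hypothetical thesaurus: each ambiguous symbol maps to candidate senses,
# each with a set of context words that suggest that sense.
THESAURUS = {
    "CAT": [
        {"sense": "gene:catalase",
         "context": {"catalase", "enzyme", "peroxide", "oxidative", "expression"}},
        {"sense": "non-gene:animal",
         "context": {"feline", "pet", "veterinary", "animal", "allergy"}},
    ],
}


def tokenize(text):
    """Lowercase alphanumeric tokens of an abstract, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def disambiguate(symbol, abstract, thesaurus=THESAURUS):
    """Return the sense whose context words overlap the abstract most,
    or None if the symbol is unknown or no sense has any supporting word."""
    tokens = tokenize(abstract)
    best_sense, best_score = None, 0
    for entry in thesaurus.get(symbol, []):
        score = len(tokens & entry["context"])
        if score > best_score:
            best_sense, best_score = entry["sense"], score
    return best_sense


gene_hit = disambiguate("CAT", "CAT expression rose under oxidative stress from hydrogen peroxide")
non_gene_hit = disambiguate("CAT", "Exposure to cat allergy in pet owners")
```

    Returning `None` when nothing matches keeps the gene/not-a-gene decision explicit instead of forcing a guess.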

  4. Focusing on shared subpockets - new developments in fragment based drug discovery

    Science.gov (United States)

    Abdelraheem, Eman M. M.; Camacho, Carlos; Dömling, Alexander

    2016-01-01

    Introduction Protein–protein interactions (PPIs) are important targets for understanding fundamental biology and for the development of therapeutic agents. Based on different physicochemical properties, numerous pieces of software (e.g., PocketQuery, Anchor and FTMap) have been reported to find pockets on protein surfaces and have applications in facilitating the design and discovery of small molecular weight compounds which bind to these pockets. Areas covered The authors discuss a pocket-centric method of analyzing protein-protein interaction interfaces, which prioritizes their pockets for small molecule drug discovery, and the importance of multicomponent reaction (MCR) chemistry as starting points for undruggable targets. The authors also provide their perspectives on the field. Expert opinion Only the tight interplay of efficient computational methods capable of screening a large chemical space and fast synthetic chemistry will lead to progress in the rational design of PPI antagonists in the future. Early drug discovery platforms will also benefit from efficient rapid feedback loops from early clinical research back to molecular design and the medicinal chemistry bench. PMID:26296101

  5. A REGISTRY BASED DISCOVERY MECHANISM FOR E-LEARNING WEB SERVICES

    Directory of Open Access Journals (Sweden)

    Demian Antony D’Mello

    2012-10-01

    Full Text Available E-learning is currently taking the shape of a Web Service in various applications, i.e. learners can search for suitable content, book it, pay for it and consume it. This paper shows how the search aspects for e-learning content can technically be combined with the recent standardization efforts that aim at content exchangeability and efficient reuse. A repository for learning object publication and search is proposed that essentially adapts the UDDI framework used in commercial Web Services to the e-learning context. To adopt Web Services technology towards the reusability and aggregation of e-learning services, the conceptual Web Services architecture and its building blocks need to be augmented. The objective of this research is to design a broker-based registry architecture for e-learning Web services which facilitates effective e-learning content/service discovery for consumption or composition. The implementation followed by experimentation showed that the proposed e-learning discovery architecture facilitates effective discovery with moderate performance in terms of overall response.

  6. Fuzzy-Based Knowledge Discovery from Heterogeneous Data in Planting Systems for Elderly LOHAS

    Institute of Scientific and Technical Information of China (English)

    Hung-Chih Hsueh; Jung-Yi Jiang; Jen-Sheng Tsai; Wen-Hao Tsai; Kuan-Rong Lee; Yau-Hwang Kuo

    2015-01-01

    In this paper, we propose a knowledge discovery method based on the fuzzy set theory to help elders with plant cultivation. Initially, the fuzzy sets are constructed by using the feature selection and statistical interval estimation. The min-max inference and the center of gravity defuzzification method are then used to output a candidate pattern set. Finally, a pattern discovery is adopted to obtain the patterns from the candidate set for the cultivation suggestions by considering the frequency weight and user’s experience. In order to demonstrate the performance of our method in planting systems, we conduct a clicks-and-mortar cultivation platform, namely Eden Garden, for the elderly lifestyles of health and sustainability (LOHAS). The experimental results show that the accuracy rate of our knowledge discovery method can reach up to 85%. Moreover, the results of the LOHAS index scale table present that the happiness of the elders is increasing while the elders are using our proposed method.
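
    Min-max inference followed by center of gravity defuzzification can be sketched in a few lines. The variables, membership functions, and rules below are hypothetical (soil moisture and temperature mapped to a watering amount) and are not taken from the paper, which builds its fuzzy sets from feature selection and statistical interval estimation.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def watering_amount(moisture, temperature, steps=200):
    """Two hypothetical rules on a 0..10 output scale:
       IF moisture is dry AND temperature is hot  THEN watering is high
       IF moisture is wet AND temperature is cool THEN watering is low
    AND is min, rule aggregation is max, output is the center of gravity."""
    fire_high = min(tri(moisture, 0, 20, 50), tri(temperature, 20, 35, 45))
    fire_low = min(tri(moisture, 40, 80, 100), tri(temperature, 0, 10, 25))

    numerator = 0.0
    denominator = 0.0
    for i in range(steps + 1):
        y = 10.0 * i / steps
        # Clip each output set at its rule's firing strength, then take the max.
        mu = max(min(fire_high, tri(y, 5, 8, 10)), min(fire_low, tri(y, 0, 2, 5)))
        numerator += y * mu
        denominator += mu
    # No rule fired: there is no output set to defuzzify.
    return numerator / denominator if denominator else None


dry_and_hot = watering_amount(moisture=20, temperature=35)
wet_and_cool = watering_amount(moisture=80, temperature=10)
```

    With only the first rule firing at full strength, the result is the centroid of the "high" triangle, about 7.67; with only the second, it is the centroid of the "low" triangle, about 2.33.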

  7. Symbolic representation based on trend features for knowledge discovery in long time series

    Institute of Scientific and Technical Information of China (English)

    Hong YIN; Shu-qiang YANG; Xiao-qian ZHU; Shao-dong MA; Lu-min ZHANG

    2015-01-01

    The symbolic representation of time series has attracted much research interest recently. The high dimensionality typical of the data is challenging, especially as the time series becomes longer. The wide distribution of sensors collecting more and more data exacerbates the problem. Representing a time series effectively is an essential task for decision-making activities such as classification, prediction, and knowledge discovery. In this paper, we propose a new symbolic representation method for long time series based on trend features, called trend feature symbolic approximation (TFSA). The method uses a two-step mechanism to segment long time series rapidly. Unlike some previous symbolic methods, it focuses on retaining most of the trend features and patterns of the original series. A time series is represented by trend symbols, which are also suitable for use in knowledge discovery, such as association rules mining. TFSA provides the lower bounding guarantee. Experimental results show that, compared with some previous methods, it not only has better segmentation efficiency and classification accuracy, but also is applicable for use in knowledge discovery from time series.
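
    The idea of replacing a numeric series with trend symbols can be shown with a deliberately simplified sketch. This is not TFSA: the abstract does not specify its two-step segmentation or its symbol alphabet, so the fixed-length segments, the endpoint slope, the `flat_threshold` parameter, and the three-letter alphabet (U, F, D) are all assumptions made for illustration.

```python
def trend_symbols(series, segment_length=4, flat_threshold=0.1):
    """Encode a series as one symbol per segment:
    'U' (up), 'D' (down), or 'F' (flat), from the endpoint slope.
    Adjacent segments share their boundary point."""
    symbols = []
    for start in range(0, len(series) - 1, segment_length):
        segment = series[start:start + segment_length + 1]
        slope = (segment[-1] - segment[0]) / (len(segment) - 1)
        if slope > flat_threshold:
            symbols.append("U")
        elif slope < -flat_threshold:
            symbols.append("D")
        else:
            symbols.append("F")
    return "".join(symbols)


# Made-up series: a rise, a plateau, then a fall.
readings = [0, 1, 2, 3, 4, 4, 4, 4, 4, 3, 2, 1, 0]
encoded = trend_symbols(readings)
```

    A symbol string like this is far shorter than the raw series and can feed sequence-mining steps such as the association rule mining the abstract mentions.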

  8. Integration of lyoplate based flow cytometry and computational analysis for standardized immunological biomarker discovery.

    Directory of Open Access Journals (Sweden)

    Federica Villanova

    Full Text Available Discovery of novel immune biomarkers for monitoring of disease prognosis and response to therapy in immune-mediated inflammatory diseases is an important unmet clinical need. Here, we establish a novel framework for immunological biomarker discovery, comparing a conventional (liquid) flow cytometry platform (CFP) and a unique lyoplate-based flow cytometry platform (LFP) in combination with advanced computational data analysis. We demonstrate that LFP had higher sensitivity compared to CFP, with increased detection of cytokines (IFN-γ and IL-10) and activation markers (Foxp3 and CD25). Fluorescent intensity of cells stained with lyophilized antibodies was increased compared to cells stained with liquid antibodies. LFP, using a plate loader, allowed medium-throughput processing of samples with comparable intra- and inter-assay variability between platforms. Automated computational analysis identified novel immunophenotypes that were not detected with manual analysis. Our results establish a new flow cytometry platform for standardized and rapid immunological biomarker discovery with wide application to immune-mediated diseases.

  9. Harvest: an open platform for developing web-based biomedical data discovery and reporting applications.

    Science.gov (United States)

    Pennington, Jeffrey W; Ruth, Byron; Italia, Michael J; Miller, Jeffrey; Wrazien, Stacey; Loutrel, Jennifer G; Crenshaw, E Bryan; White, Peter S

    2014-01-01

    Biomedical researchers share a common challenge of making complex data understandable and accessible as they seek inherent relationships between attributes in disparate data types. Data discovery in this context is limited by a lack of query systems that efficiently show relationships between individual variables, but without the need to navigate underlying data models. We have addressed this need by developing Harvest, an open-source framework of modular components, and using it for the rapid development and deployment of custom data discovery software applications. Harvest incorporates visualizations of highly dimensional data in a web-based interface that promotes rapid exploration and export of any type of biomedical information, without exposing researchers to underlying data models. We evaluated Harvest with two cases: clinical data from pediatric cardiology and demonstration data from the OpenMRS project. Harvest's architecture and public open-source code offer a set of rapid application development tools to build data discovery applications for domain-specific biomedical data repositories. All resources, including the OpenMRS demonstration, can be found at http://harvest.research.chop.edu.

  10. Context-aware computing-based reducing cost of service method in resource discovery and interaction

    Institute of Scientific and Technical Information of China (English)

    TANG Shan-cheng; HOU Yi-bin

    2004-01-01

    Reducing cost of service is an important goal for resource discovery and interaction technologies. The shortcomings of the transhipment-method and the hibernation-method are that they increase the overall cost of service and slow resource discovery, respectively. To overcome these shortcomings, a context-aware computing-based method is developed. This method first analyzes how devices use resource discovery and interaction technologies to identify the types of context related to reducing cost of service, and then chooses effective measures, such as stopping broadcast and hibernating, according to the information supplied by the context, rather than the transhipment-method's simple hibernations. The experimental results indicate that under the worst condition this method overcomes the shortcomings of the transhipment-method, makes the "poor" devices hibernate longer than the hibernation-method does, reducing cost of service more effectively, and discovers resources faster than the hibernation-method; under the best condition it is far better than the hibernation-method in all aspects.

  11. Transcriptome analysis and discovery of genes involved in immune pathways from hepatopancreas of microbial challenged mitten crab Eriocheir sinensis.

    Directory of Open Access Journals (Sweden)

    Xihong Li

    Full Text Available BACKGROUND: The Chinese mitten crab Eriocheir sinensis is an important economic crustacean and has been seriously attacked by various diseases, which requires more and more information for immune relevant genes on genome background. Recently, high-throughput RNA sequencing (RNA-seq) technology provides a powerful and efficient method for transcript analysis and immune gene discovery. METHODS/PRINCIPAL FINDINGS: A cDNA library from hepatopancreas of E. sinensis challenged by a mixture of three pathogen strains (Gram-positive bacteria Micrococcus luteus, Gram-negative bacteria Vibrio alginolyticus and fungi Pichia pastoris; 10^8 cfu·mL^-1) was constructed and randomly sequenced using Illumina technique. In total, 39.76 million clean reads were assembled to 70,300 unigenes. After ruling out short-length and low-quality sequences, 52,074 non-redundant unigenes were compared to public databases for homology searching and 17,617 of them showed high similarity to sequences in NCBI non-redundant protein (Nr) database. For function classification and pathway assignment, 18,734 (36.00%) unigenes were categorized to three Gene Ontology (GO) categories, 12,243 (23.51%) were classified to 25 Clusters of Orthologous Groups (COG), and 8,983 (17.25%) were assigned to six Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Potentially, 24, 14, 47 and 132 unigenes were characterized to be involved in Toll, IMD, JAK-STAT and MAPK pathways, respectively. CONCLUSIONS/SIGNIFICANCE: This is the first systematic transcriptome analysis of components relating to innate immune pathways in E. sinensis. Functional genes and putative pathways identified here will contribute to a better understanding of the immune system and to preventing various diseases in crab.

  12. Network-based drug discovery by integrating systems biology and computational technologies.

    Science.gov (United States)

    Leung, Elaine L; Cao, Zhi-Wei; Jiang, Zhi-Hong; Zhou, Hua; Liu, Liang

    2013-07-01

    Network-based intervention has become a trend in treating systemic diseases, but it relies on regimen optimization and valid multi-target actions of the drugs. The complex multi-component nature of medicinal herbs may make them a valuable resource for network-based multi-target drug discovery because of their potential synergistic treatment effects. Recently, multiple robust systems biology platforms have proven powerful in uncovering molecular mechanisms and connections between drugs and the dynamic networks they target. However, methods for optimizing drug combinations remain insufficient, owing to the lack of tight integration across multiple '-omics' databases. Newly developed algorithm- or network-based computational models can tightly integrate '-omics' databases and optimize combinational regimens in drug development, which encourages the development of medicinal herbs into a new wave of network-based multi-target drugs. However, further integration of medicinal herb databases with multiple systems biology platforms for multi-target drug optimization remains challenging because of uncertainty about the reliability of individual data sets and about the breadth, depth, and degree of standardization of herbal medicine data. Standardizing the methodology and terminology of systems biology and herbal databases would facilitate this integration. Enhancing publicly accessible databases and increasing the number of studies that apply systems biology platforms to herbal medicine would also help. Further integration across various '-omics' platforms and computational tools would accelerate the development of network-based drug discovery and network medicine.

  13. Discovery of Phytophthora infestans genes expressed in planta through mining of cDNA libraries.

    Directory of Open Access Journals (Sweden)

    Roberto Sierra

    Full Text Available BACKGROUND: Phytophthora infestans (Mont.) de Bary causes late blight of potato and tomato, and has a broad host range within the Solanaceae family. Most studies of the Phytophthora-Solanum pathosystem have focused on gene expression in the host and have not analyzed pathogen gene expression in planta. METHODOLOGY/PRINCIPAL FINDINGS: We describe in detail an in silico approach to mine ESTs from inoculated host plants deposited in a database in order to identify particular pathogen sequences associated with disease. We identified candidate effector genes through mining of 22,795 ESTs corresponding to P. infestans cDNA libraries in compatible and incompatible interactions with hosts from the Solanaceae family. CONCLUSIONS/SIGNIFICANCE: We annotated genes of P. infestans expressed in planta associated with late blight using different approaches and assigned putative functions to 373 out of the 501 sequences found in the P. infestans genome draft, including putative secreted proteins, domains associated with pathogenicity and poorly characterized proteins ideal for further experimental studies. Our study provides a methodology for analyzing cDNA libraries and provides an understanding of the plant-oomycete pathosystems that is independent of the host, condition, or type of sample by identifying genes of the pathogen expressed in planta.

  14. rVISTA for Comparative Sequence-Based Discovery of Functional Transcription Factor Binding Sites

    Energy Technology Data Exchange (ETDEWEB)

    Loots, Gabriela G.; Ovcharenko, Ivan; Pachter, Lior; Dubchak, Inna; Rubin, Edward M.

    2002-03-08

    Identifying transcriptional regulatory elements represents a significant challenge in annotating the genomes of higher vertebrates. We have developed a computational tool, rVISTA, for high-throughput discovery of cis-regulatory elements that combines transcription factor binding site prediction and the analysis of inter-species sequence conservation. Here, we illustrate the ability of rVISTA to identify true transcription factor binding sites through the analysis of AP-1 and NFAT binding sites in the 1 Mb well-annotated cytokine gene cluster (Hs5q31; Mm11). Exploiting an orthologous human-mouse data set eliminated 95 percent of the 38,000 binding sites predicted from analysis of the human sequence alone, while still identifying 87 percent of the experimentally verified binding sites in this region.
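
The filtering idea in this abstract, keeping only predicted binding sites that lie in sequence conserved between human and mouse, can be sketched as follows. This is a minimal illustration and not rVISTA's actual implementation: the `Site` record, the `percent_identity` helper, the toy aligned sequences, and the 80 percent identity cutoff are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """A predicted binding site on a pairwise alignment (hypothetical record)."""
    factor: str  # transcription factor name, e.g. "AP-1"
    start: int   # 0-based start column in the alignment
    end: int     # exclusive end column


def percent_identity(human: str, mouse: str, start: int, end: int) -> float:
    """Percent of alignment columns in [start, end) with the same base in both species."""
    columns = list(zip(human[start:end], mouse[start:end]))
    if not columns:
        return 0.0
    matches = sum(1 for h, m in columns if h == m and h != "-")
    return 100.0 * matches / len(columns)


def conserved_sites(sites, human, mouse, min_identity=80.0):
    """Keep only sites whose aligned columns meet the (assumed) identity cutoff."""
    return [s for s in sites
            if percent_identity(human, mouse, s.start, s.end) >= min_identity]


# Toy aligned sequences of equal length (invented for illustration).
human = "TGACTCATTTGGAAAACCGT"
mouse = "TGACTCATCAGTCCTACCGT"
sites = [Site("AP-1", 0, 7), Site("NFAT", 9, 16)]

kept = conserved_sites(sites, human, mouse)
print([s.factor for s in kept])
```

In this toy input the first site is identical in both sequences and survives, while the second lies in a diverged stretch and is discarded, which mirrors how a conservation filter trims predictions made from one genome alone.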

  15. Discovery and identification of candidate genes from the chitinase gene family for Verticillium dahliae resistance in cotton.

    Science.gov (United States)

    Xu, Jun; Xu, Xiaoyang; Tian, Liangliang; Wang, Guilin; Zhang, Xueying; Wang, Xinyu; Guo, Wangzhen

    2016-06-29

    Verticillium dahliae, a destructive and soil-borne fungal pathogen, causes massive losses in cotton yields. However, the resistance mechanism to V. dahliae in cotton is still poorly understood. Accumulating evidence indicates that chitinases are crucial hydrolytic enzymes, which attack fungal pathogens by catalyzing the degradation of the fungal cell wall. Although they form a large gene family, the chitinase genes (Chis) have not to date been systematically analyzed or effectively utilized in cotton. Here, we identified 47, 49, 92, and 116 Chis from four sequenced cotton species, diploid Gossypium raimondii (D5), G. arboreum (A2), tetraploid G. hirsutum acc. TM-1 (AD1), and G. barbadense acc. 3-79 (AD2), respectively. The orthologous genes did not show one-to-one correspondence between the diploid and tetraploid cotton species, implying changes in the number of Chis in different cotton species during the evolution of Gossypium. Phylogenetic classification indicated that these Chis could be classified into six groups, with distinguishable structural characteristics. The expression patterns of Chis varied across organs and tissues and in response to V. dahliae. Silencing of Chi23, Chi32, or Chi47 in cotton significantly impaired the resistance to V. dahliae, suggesting these genes might act as positive regulators in disease resistance to V. dahliae.

  16. Strategies for enhancing the effectiveness of metagenomic-based enzyme discovery in lignocellulolytic microbial communities

    Energy Technology Data Exchange (ETDEWEB)

    DeAngelis, K.M.; Gladden, J.G.; Allgaier, M.; D' haeseleer, P.; Fortney, J.L.; Reddy, A.; Hugenholtz, P.; Singer, S.W.; Vander Gheynst, J.; Silver, W.L.; Simmons, B.; Hazen, T.C.

    2010-03-01

    Producing cellulosic biofuels from plant material has recently emerged as a key U.S. Department of Energy goal. For this technology to be commercially viable on a large scale, it is critical to make production cost efficient by streamlining both the deconstruction of lignocellulosic biomass and fuel production. Many natural ecosystems efficiently degrade lignocellulosic biomass and harbor enzymes that, when identified, could be used to increase the efficiency of commercial biomass deconstruction. However, ecosystems most likely to yield relevant enzymes, such as tropical rain forest soil in Puerto Rico, are often too complex for enzyme discovery using current metagenomic sequencing technologies. One potential strategy to overcome this problem is to selectively cultivate the microbial communities from these complex ecosystems on biomass under defined conditions, generating less complex biomass-degrading microbial populations. To test this premise, we cultivated microbes from Puerto Rican soil or green waste compost under precisely defined conditions in the presence of dried ground switchgrass (Panicum virgatum L.) or lignin, respectively, as the sole carbon source. Phylogenetic profiling of the two feedstock-adapted communities using SSU rRNA gene amplicon pyrosequencing or phylogenetic microarray analysis revealed that the adapted communities were significantly simplified compared to the natural communities from which they were derived. Several members of the lignin-adapted and switchgrass-adapted consortia are related to organisms previously characterized as biomass degraders, while others were from less well-characterized phyla. The decrease in complexity of these communities makes them good candidates for metagenomic sequencing and will likely enable the reconstruction of a greater number of full-length genes, leading to the discovery of novel lignocellulose-degrading enzymes adapted to feedstocks and conditions of interest.

  17. Representing Instructional Material for Scenario-Based Guided-Discovery Courseware

    Energy Technology Data Exchange (ETDEWEB)

    Greitzer, Frank L.; Merrill, M. David; Rice, Douglas M.; Curtis, Darren S.

    2004-12-06

    The focus of this paper is to discuss paradigms for learning that are based on sound principles of human learning and cognition, and to discuss technical challenges that must be overcome in achieving this research goal through instructional system design (ISD) approaches that are cost-effective as well as conformant with today's interactive multimedia instruction standards. Fundamental concepts are to: engage learners to solve real-world problems (progress from simple to complex); relate material to previous experience; demonstrate what is to be learned using interactive, problem-centered activities rather than passive exposure to material; require learners to use their new knowledge to solve problems that demonstrate their knowledge in a relevant applied setting; and guide the learner with feedback and coaching early, then gradually withdraw this support as learning progresses. Many of these principles have been put into practice by employing interactive learning objects as re-usable components of larger, more integrated exercises. A challenge is to make even more extensive use of interactive, scenario-based activities within a guided-discovery framework. Because the design and construction of interactive, scenario-based learning objects and more complex integrated exercises is labor-intensive, this paper explores the use of interactive learning objects and associated representation schema for instructional content to facilitate development of tools for creating scenario-based, guided-discovery courseware.

  18. Using Osteoclast Differentiation as a Model for Gene Discovery in an Undergraduate Cell Biology Laboratory

    Science.gov (United States)

    Birnbaum, Mark J.; Picco, Jenna; Clements, Meghan; Witwicka, Hanna; Yang, Meiheng; Hoey, Margaret T.; Odgren, Paul R.

    2010-01-01

    A key goal of molecular/cell biology/biotechnology is to identify essential genes in virtually every physiological process to uncover basic mechanisms of cell function and to establish potential targets of drug therapy combating human disease. This article describes a semester-long, project-oriented molecular/cellular/biotechnology laboratory…

  19. Transcriptome Analysis and Discovery of Genes Relevant to Development in Bradysia odoriphaga at Three Developmental Stages.

    Directory of Open Access Journals (Sweden)

    Huanhuan Gao

    Full Text Available Bradysia odoriphaga (Diptera: Sciaridae) is the most important pest of Chinese chive (Allium tuberosum) in Asia; however, its molecular genetics are poorly understood. To explore the molecular biological mechanism of development, Illumina sequencing and de novo assembly were performed in third-instar, fourth-instar, and pupal B. odoriphaga. The study resulted in 16.2 Gb of clean data and 47,578 unigenes (≥125 bp) contained in 7,632,430 contigs, 46.21% of which were annotated from the non-redundant protein (NR), Gene Ontology (GO), Clusters of Orthologous Groups (COG), Eukaryotic Orthologous Groups (KOG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. It was found that 19.67% of unigenes mainly matched homologous species, including Aedes aegypti, Culex quinquefasciatus, Ceratitis capitata, and Anopheles gambiae. According to differentially expressed gene (DEG) analysis, 143, 490, and 309 DEGs were annotated as involved in the developmental process in the GO database in the comparisons of third-instar and fourth-instar larvae, third-instar larvae and pupae, and fourth-instar larvae and pupae, respectively. Twenty-five genes were closely related to these processes, including developmental process, reproduction process, reproductive organ development, and programmed cell death (PCD). The unigene information assembled for B. odoriphaga through transcriptome and DEG analyses could provide a detailed genetic basis and regulatory information for elaborating the developmental mechanism from the larval and pre-pupal to the pupal stages of B. odoriphaga.

  20. A Sorghum Mutant Resource as an Efficient Platform for Gene Discovery in Grasses.

    Science.gov (United States)

    Jiao, Yinping; Burke, John; Chopra, Ratan; Burow, Gloria; Chen, Junping; Wang, Bo; Hayes, Chad; Emendack, Yves; Ware, Doreen; Xin, Zhanguo

    2016-07-01

    Sorghum (Sorghum bicolor) is a versatile C4 crop and a model for research in family Poaceae. High-quality genome sequence is available for the elite inbred line BTx623, but functional validation of genes remains challenging due to the limited genomic and germplasm resources available for comprehensive analysis of induced mutations. In this study, we generated 6400 pedigreed M4 mutant pools from EMS-mutagenized BTx623 seeds through single-seed descent. Whole-genome sequencing of 256 phenotyped mutant lines revealed >1.8 million canonical EMS-induced mutations, affecting >95% of genes in the sorghum genome. The vast majority (97.5%) of the induced mutations were distinct from natural variations. To demonstrate the utility of the sequenced sorghum mutant resource, we performed reverse genetics to identify eight genes potentially affecting drought tolerance, three of which had allelic mutations and two of which exhibited exact cosegregation with the phenotype of interest. Our results establish that a large-scale resource of sequenced pedigreed mutants provides an efficient platform for functional validation of genes in sorghum, thereby accelerating sorghum breeding. Moreover, findings made in sorghum could be readily translated to other members of the Poaceae via integrated genomics approaches.
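
As a small illustration of what "canonical EMS-induced mutations" usually refers to, the sketch below classifies single-nucleotide changes, assuming the term means the G-to-A and C-to-T transitions that EMS typically produces. The abstract does not spell out its definition, so that reading, the function names, and the toy variant list are assumptions for the example.

```python
# Assumed definition: EMS typically converts G:C pairs to A:T pairs,
# which appears as G>A on one strand and C>T on the other.
CANONICAL_EMS = {("G", "A"), ("C", "T")}


def is_canonical_ems(ref: str, alt: str) -> bool:
    """True if a reference/alternate base pair is a G>A or C>T change."""
    return (ref.upper(), alt.upper()) in CANONICAL_EMS


def canonical_fraction(variants) -> float:
    """Fraction of (ref, alt) pairs that are canonical EMS changes."""
    if not variants:
        return 0.0
    hits = sum(1 for ref, alt in variants if is_canonical_ems(ref, alt))
    return hits / len(variants)


# Toy variant calls (invented for illustration).
variants = [("G", "A"), ("C", "T"), ("A", "G"), ("G", "T")]
fraction = canonical_fraction(variants)
print(f"{fraction:.2f} of the toy variants are canonical EMS changes")
```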

  1. Discovery and functional prioritization of Parkinson's disease candidate genes from large-scale whole exome sequencing

    NARCIS (Netherlands)

    I. Jansen (Iris); Ye, H. (Hui); Heetveld, S. (Sasja); Lechler, M.C. (Marie C.); Michels, H. (Helen); Seinstra, R.I. (Renée I.); Lubbe, S.J. (Steven J.); Drouet, V. (Valérie); S. Lesage (Suzanne); E. Majounie (Elisa); Gibbs, J.R. (J.Raphael); M.A. Nalls (Michael); M. Ryten (Mina); Botia, J.A. (Juan A.); J. Vandrovcova (Jana); J. Simón-Sánchez (Javier); Castillo-Lizardo, M. (Melissa); P. Rizzu (Patrizia); Blauwendraat, C. (Cornelis); Chouhan, A.K. (Amit K.); Li, Y. (Yarong); Yogi, P. (Puja); N. Amin (Najaf); C.M. van Duijn (Cock); Morris, H.R. (Huw R.); Brice, A. (Alexis); A. Singleton (Andrew); David, D.C. (Della C.); Nollen, E.A. (Ellen A.); A. Jain (Ashok); J.M. Shulman; P. Heutink (Peter); D.G. Hernandez (Dena); S. Arepalli (Sampath); J. Brooks (Janet); Price, R. (Ryan); Nicolas, A. (Aude); S. Chong (Sean); M.R. Cookson (Mark); A. Dillman (Allissa); M. Moore (Matt); B.J. Traynor (Bryan); A. Singleton (Andrew); V. Plagnol (Vincent); Nicholas W Wood,; U.-M. Sheerin (Una-Marie); Jose M Bras,; K. Charlesworth (Kate); M. Gardner (Mac); R. Guerreiro (Rita); D. Trabzuni (Danyah); Hardy, J. (John); M. Sharma; M. Saad (Mohamad); Javier Simón-Sánchez,; C. Schulte (Claudia); J.C. Corvol (Jean-Christophe); Dürr, A. (Alexandra); M. Vidailhet (M.); S. Sveinbjörnsdóttir (Sigurlaug); R.A. Barker (Roger); Caroline H Williams-Gray,; Y. Ben-Shlomo; H.W. Berendse (Henk W.); K.D. van Dijk (Karin); D. Berg (Daniela); K. Brockmann; K.D. Wurster (Kathrin); Mätzler, W. (Walter); Gasser, T. (Thomas); M. Martinez (Maria); R.M.A. de Bie (Rob); A. Biffi (Alessandro); D. Velseboer (Daan); B.R. Bloem (Bastiaan); B. Post (Bart); M. Wickremaratchi (Mirdhu); B. van de Warrenburg (Bart); Z. Bochdanovits (Zoltan); M. von Bonin (Malte); H. Pétursson (Hjörvar); O. Riess (Olaf); D.J. Burn (David); Lubbe, S. (Steven); Cooper, J.M. (J Mark); N.H. McNeill (Nathan); Schapira, A. (Anthony); Lungu, C. (Codrin); Chen, H. (Honglei); Dong, J. (Jing); Chinnery, P.F. (Patrick F.); G. 
Hudson (Gavin); Clarke, C.E. (Carl E.); C. Moorby (Catriona); C. Counsell (Carl); P. Damier (Philippe); J.-F. Dartigues; P. Deloukas (Panagiotis); E. Gray (Emma); T. Edkins (Ted); Hunt, S.E. (Sarah E.); S.C. Potter (Simon); A. Tashakkori-Ghanbaria (Avazeh); G. Deuschl (Günther); D. Lorenz (Delia); D.T. Dexter (David); F. Durif (Frank); J. Evans (Jonathan Mark); Langford, C. (Cordelia); T. Foltynie (Thomas); A.M. Goate (Alison); C. Harris (Clare); J.J. van Hilten (Jacobus); A. Hofman (Albert); J.R. Hollenbeck (John R.); J.L. Holton (Janice); Hu, M. (Michele); X. Huang (Xiaohong); Illig, T. (Thomas); P.V. Jónsson (Pálmi); J.-C. Lambert; S.S. O'Sullivan (Sean); T. Revesz (Tamas); K. Shaw (Karen); A.J. Lees (Andrew); P. Lichtner (Peter); P. Limousin (Patricia); G. Lopez; Escott-Price, V. (Valentina); J. Pearson (Justin); N. Williams (Nigel); E. Mudanohwo (Ese); J.S. Perlmutter (Joel); Pollak, P. (Pierre); F. Rivadeneira Ramirez (Fernando); A.G. Uitterlinden (André); S.J. Sawcer (Stephen); H. Scheffer (Hans); I. Shoulson (Ira); L. Shulman (Lee); Smith, C. (Colin); R. Walker (Robert); C.C.A. Spencer (Chris C.); A. Strange (Amy); H. Stefansson (Hreinn); F. Bettella (Francesco); J-A. Zwart (John-Anker); Stockton, J.D. (Joanna D.); D. Talbot; C.M. Tanner (Carlie); F. Tison (François); S. Winder-Rhodes (Sophie); K.P. Bhatia (Kailash)

    2017-01-01

    Background: Whole-exome sequencing (WES) has been successful in identifying genes that cause familial Parkinson's disease (PD). However, until now this approach has not been deployed to study large cohorts of unrelated participants. To discover rare PD susceptibility variants, we perform

  2. Gene discovery in the threatened elkhorn coral: 454 sequencing of the Acropora palmata transcriptome.

    Directory of Open Access Journals (Sweden)

    Nicholas R Polato

    Full Text Available BACKGROUND: Cnidarians, including corals and anemones, offer unique insights into metazoan evolution because they harbor genetic similarities with vertebrates beyond that found in model invertebrates and retain genes known only from non-metazoans. Cataloging genes expressed in Acropora palmata, a foundation species of reefs in the Caribbean and western Atlantic, will advance our understanding of the genetic basis of ecologically important traits in corals and comes at a time when sequencing efforts in other cnidarians allow for multi-species comparisons. RESULTS: A cDNA library from a sample enriched for symbiont-free larval tissue was sequenced on the 454 GS-FLX platform. Over 960,000 reads were obtained and assembled into 42,630 contigs. Annotation data was acquired for 57% of the assembled sequences. Analysis of the assembled sequences indicated that 83-100% of all A. palmata transcripts were tagged, and provided a rough estimate of the total number of genes expressed in our samples (~18,000-20,000). The coral annotation data contained many of the same molecular components as in the Bilateria, particularly in pathways associated with oxidative stress and DNA damage repair, and provided evidence that homologs of p53, a key player in DNA repair pathways, have experienced selection along the branch separating Cnidaria and Bilateria. Transcriptome-wide screens of paralog groups and transition/transversion ratios highlighted genes including green fluorescent proteins, carbonic anhydrase, and oxidative stress proteins, and functional groups involved in protein and nucleic acid metabolism and the formation of structural molecules. These results provide a starting point for study of adaptive evolution in corals. CONCLUSIONS: Currently available transcriptome data now make comparative studies of the mechanisms underlying coral's evolutionary success possible. Here we identified candidate genes that enable corals to maintain genomic integrity despite
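
The transition/transversion ratio mentioned in this abstract can be computed from a pair of aligned sequences roughly as follows. This is a generic sketch of the standard definition (transitions are A<->G or C<->T, every other substitution is a transversion) and not the authors' screening pipeline; the function names and toy sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ts_tv_counts(seq_a: str, seq_b: str):
    """Count transitions and transversions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip identical columns and anything that is not a plain base (gaps, N).
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1   # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    return transitions, transversions


def ts_tv_ratio(seq_a: str, seq_b: str):
    """Return transitions/transversions, or None when there are no transversions."""
    ts, tv = ts_tv_counts(seq_a, seq_b)
    return ts / tv if tv else None


# Toy aligned pair (invented): A>G and G>A are transitions, A>C is a transversion.
counts = ts_tv_counts("ACGTAC", "GCATCC")
ratio = ts_tv_ratio("ACGTAC", "GCATCC")
print(counts, ratio)
```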

  3. Gene expression and epigenetic discovery screen reveal methylation of SFRP2 in prostate cancer.

    LENUS (Irish Health Repository)

    Perry, Antoinette S

    2013-04-15

    Aberrant activation of Wnts is common in human cancers, including prostate. Hypermethylation associated transcriptional silencing of Wnt antagonist genes SFRPs (Secreted Frizzled-Related Proteins) is a frequent oncogenic event. The significance of this is not known in prostate cancer. The objectives of our study were to (i) profile Wnt signaling related gene expression and (ii) investigate methylation of Wnt antagonist genes in prostate cancer. Using TaqMan Low Density Arrays, we identified 15 Wnt signaling related genes with significantly altered expression in prostate cancer; the majority of which were upregulated in tumors. Notably, histologically benign tissue from men with prostate cancer appeared more similar to tumor (r = 0.76) than to benign prostatic hyperplasia (BPH; r = 0.57, p < 0.001). Overall, the expression profile was highly similar between tumors of high (≥ 7) and low (≤ 6) Gleason scores. Pharmacological demethylation of PC-3 cells with 5-Aza-CdR reactivated 39 genes (≥ 2-fold); 40% of which inhibit Wnt signaling. Methylation frequencies in prostate cancer were 10% (2/20) (SFRP1), 64.86% (48/74) (SFRP2), 0% (0/20) (SFRP4) and 60% (12/20) (SFRP5). SFRP2 methylation was detected at significantly lower frequencies in high-grade prostatic intraepithelial neoplasia (HGPIN; 30%, (6/20), p = 0.0096), tumor adjacent benign areas (8.82%, (7/69), p < 0.0001) and BPH (11.43% (4/35), p < 0.0001). The quantitative level of SFRP2 methylation (normalized index of methylation) was also significantly higher in tumors (116) than in the other samples (HGPIN = 7.45, HB = 0.47, and BPH = 0.12). We show that SFRP2 hypermethylation is a common event in prostate cancer. SFRP2 methylation in combination with other epigenetic markers may be a useful biomarker of prostate cancer.

  4. Mechanism of FCA-based Folksonomy Knowledge Discovery

    Institute of Scientific and Technical Information of China (English)

    张云中

    2012-01-01

    Folksonomy knowledge discovery has become an effective way to solve the semantic problems and user problems in folksonomy. Addressing the mechanism of FCA-based folksonomy knowledge discovery, this paper first summarizes the shortcomings of current research on the topic. Then, on the basis of analyzing the elements of FCA-based folksonomy knowledge discovery and the relationships among them, a spiral evolution model is built to reveal the objective laws of FCA-based folksonomy knowledge discovery. Finally, three core contents of FCA-based folksonomy knowledge discovery are determined: folksonomy user behavior, folksonomy user preferences, and folksonomy semantic relations.

  5. Leveraging a Sturge-Weber Gene Discovery: An Agenda for Future Research.

    Science.gov (United States)

    Comi, Anne M; Sahin, Mustafa; Hammill, Adrienne; Kaplan, Emma H; Juhász, Csaba; North, Paula; Ball, Karen L; Levin, Alex V; Cohen, Bernard; Morris, Jill; Lo, Warren; Roach, E Steve

    2016-05-01

    Sturge-Weber syndrome (SWS) is a vascular neurocutaneous disorder that results from a somatic mosaic mutation in GNAQ, which is also responsible for isolated port-wine birthmarks. Infants with SWS are born with a cutaneous capillary malformation (port-wine birthmark) of the forehead or upper eyelid which can signal an increased risk of brain and/or eye involvement prior to the onset of specific symptoms. This symptom-free interval represents a time when a targeted intervention could help to minimize the neurological and ophthalmologic manifestations of the disorder. This paper summarizes a 2015 SWS workshop in Bethesda, Maryland that was sponsored by the National Institutes of Health. Meeting attendees included a diverse group of clinical and translational researchers with a goal of establishing research priorities for the next few years. The initial portion of the meeting included a thorough review of the recent genetic discovery and what is known of the pathogenesis of SWS. Breakout sessions related to neurology, dermatology, and ophthalmology aimed to establish SWS research priorities in each field. Key priorities for future development include the need for clinical consensus guidelines, further work to develop a clinical trial network, improvement of tissue banking for research purposes, and the need for multiple animal and cell culture models of SWS.

  6. Fragment-based discovery of hepatitis C virus NS5b RNA polymerase inhibitors

    Energy Technology Data Exchange (ETDEWEB)

    Antonysamy, Stephen S.; Aubol, Brandon; Blaney, Jeff; Browner, Michelle F.; Giannetti, Anthony M.; Harris, Seth F.; Hébert, Normand; Hendle, Jörg; Hopkins, Stephanie; Jefferson, Elizabeth; Kissinger, Charles; Leveque, Vincent; Marciano, David; McGee, Ethel; Nájera, Isabel; Nolan, Brian; Tomimoto, Masaki; Torres, Eduardo; Wright, Tobi (SGX); (Roche)

    2009-07-22

    Non-nucleoside inhibitors of HCV NS5b RNA polymerase were discovered by a fragment-based lead discovery approach, beginning with crystallographic fragment screening. The NS5b binding affinity and biochemical activity of fragment hits and inhibitors were determined by surface plasmon resonance (Biacore) and an enzyme inhibition assay, respectively. Crystallographic fragment screening hits with ~1-10 mM binding affinity (K_D) were iteratively optimized to give leads with ~200 nM biochemical activity and low µM cellular activity in a Replicon assay.

  7. Novel Technology for Protein-Protein Interaction-based Targeted Drug Discovery

    Directory of Open Access Journals (Sweden)

    Jung Me Hwang

    2011-12-01

    Full Text Available We have developed a simple but highly efficient in-cell protein-protein interaction (PPI) discovery system based on the translocation properties of protein kinase C- and its C1a domain in live cells. This system allows the visual detection of trimeric and dimeric protein interactions including cytosolic, nuclear, and/or membrane proteins with their cognate ligands. In addition, this system can be used to identify pharmacological small compounds that inhibit specific PPIs. These properties make this PPI system an attractive tool for screening drug candidates and mapping the protein interactome.

  8. Structure Based Discovery of Small Molecules to Regulate the Activity of Human Insulin Degrading Enzyme

    OpenAIRE

    Bilal Çakir; Onur Dağliyan; Ezgi Dağyildiz; İbrahim Bariş; Ibrahim Halil Kavakli; Seda Kizilel; Metin Türkay

    2012-01-01

    Structure Based Discovery of Small Molecules to Regulate the Activity of Human Insulin Degrading Enzyme. Bilal Çakir (1), Onur Dağliyan (1), Ezgi Dağyildiz (1), İbrahim Bariş (1), Ibrahim Halil Kavakli (1,2,*), Seda Kizilel (1,*), Metin Türkay (3,*). 1 Department of Chemical and Biological Engineering, Koç University, Sariyer, Istanbul, Turkey; 2 Department of Molecular Biology and Genetics, Koç University, Sariyer, Istanbul, Turkey; 3 Department of Industrial Engineering, Koç University...

  9. Synthetic lethality-based targets for discovery of new cancer therapeutics.

    Science.gov (United States)

    Weidle, Ulrich H; Maisel, Daniela; Eick, Dirk

    2011-01-01

    Synthetic lethality is based on the incompatibility of cell survival with the loss of function of two or more genes, not with loss of function of a single gene. If targets of synthetic lethality are deregulated or mutated in cancer cells, the strategy of synthetic lethality can result in significant increase of therapeutic efficacy and a favourable therapeutic window. In this review, we discuss synthetic lethality based on deficient DNA repair mechanisms, activating mutations of RAS, loss of function mutations of the tumor suppressor genes p53, Rb and von Hippel-Lindau, and disruption of interactive protein kinase networks in the context of development of new anticancer agents.
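
The definition in this abstract can be restated as a small truth table: a cell tolerates losing either gene alone but not both. The sketch below is only a toy Boolean model of that definition and of the resulting therapeutic window; the two-gene setup and the function names are assumptions for illustration and do not model any specific pathway discussed in the review.

```python
def is_viable(gene_a_functional: bool, gene_b_functional: bool) -> bool:
    """Toy synthetic-lethal pair: the cell dies only when both genes are lost."""
    return gene_a_functional or gene_b_functional


def survives_treatment(gene_a_functional: bool, inhibit_b: bool) -> bool:
    """Model a drug that removes gene B function in every treated cell."""
    gene_b_functional = not inhibit_b
    return is_viable(gene_a_functional, gene_b_functional)


# Truth table for the pair.
for a in (True, False):
    for b in (True, False):
        print(f"A functional={a!s:5} B functional={b!s:5} viable={is_viable(a, b)}")

# Therapeutic window: normal cells keep gene A, tumor cells have lost it.
normal_cell_survives = survives_treatment(gene_a_functional=True, inhibit_b=True)
tumor_cell_survives = survives_treatment(gene_a_functional=False, inhibit_b=True)
print(normal_cell_survives, tumor_cell_survives)
```

Under these assumptions the inhibitor spares cells that still have a functional gene A and kills those that have lost it, which is the selectivity the abstract calls a favourable therapeutic window.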

  10. A Novel Mobile Video Community Discovery Scheme Using Ontology-Based Semantical Interest Capture

    Directory of Open Access Journals (Sweden)

    Ruiling Zhang

    2016-01-01

    Full Text Available Leveraging network virtualization technologies, community-based video systems rely on the measurement of common interests to define and stabilize relationships between community members, which promotes video sharing performance and improves the scalability of the community structure. In this paper, we propose a novel mobile Video Community discovery scheme using Ontology-based Semantical Interest capture (VCOSI). An ontology-based semantical extension approach is proposed, which describes video content and measures video similarity according to video keyword selection methods. To reduce the computational load of video similarity, VCOSI uses a prefix-filtering-based estimation algorithm, which decreases the energy consumption of mobile nodes. VCOSI further proposes a member relationship estimation method to construct scalable and resilient node communities, which promotes the video sharing capacity of video systems with flexible and economical community maintenance. Extensive tests show that VCOSI obtains better performance than other state-of-the-art solutions.
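
To give a feel for how prefix filtering can cut the cost of similarity computation, here is a minimal sketch of the generic technique for Jaccard similarity over keyword sets. It is not the VCOSI algorithm itself, whose details the abstract does not give; the keyword sets, the 0.5 threshold, and all function names are assumptions. The underlying fact is that if two sets have Jaccard similarity of at least t, then their prefixes of length |x| - ceil(t * |x|) + 1, taken under a shared token ordering, must have at least one token in common, so pairs with disjoint prefixes can be skipped without computing the full similarity.

```python
import math


def prefix(tokens, threshold: float):
    """First |x| - ceil(t*|x|) + 1 tokens under a global (here alphabetical) order."""
    ordered = sorted(set(tokens))
    keep = len(ordered) - math.ceil(threshold * len(ordered)) + 1
    return set(ordered[:keep])


def jaccard(a, b) -> float:
    """Exact Jaccard similarity of two non-empty keyword collections."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


def similar_pairs(videos: dict, threshold: float):
    """Return video pairs with Jaccard >= threshold, verifying only unfiltered pairs."""
    names = sorted(videos)
    pairs = []
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            # Cheap filter: disjoint prefixes guarantee similarity below threshold.
            if not (prefix(videos[x], threshold) & prefix(videos[y], threshold)):
                continue
            if jaccard(videos[x], videos[y]) >= threshold:
                pairs.append((x, y))
    return pairs


# Toy keyword sets (invented for illustration).
videos = {
    "v1": ["cat", "funny", "pets", "home"],
    "v2": ["cat", "funny", "pets", "dog"],
    "v3": ["news", "politics", "world", "economy"],
}
result = similar_pairs(videos, 0.5)
print(result)
```

In practice the prefixes would be stored in an inverted index so that candidate pairs are found without the nested loop; the loop is kept here only to keep the sketch short.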

  11. Physics-based gene identification: proof of concept for Plasmodium falciparum.

    Science.gov (United States)

    Yeramian, Edouard; Bonnefoy, Serge; Langsley, Gordon

    2002-01-01

    The ab initio prediction of new genes in eukaryotic genomes represents a difficult task, notably for the identification of complex split genes. A Physics-Based Gene Identification (PBGI) method was formulated recently (Yeramian, Gene, 255, 139-150, 151-168, 2000a,b) to address this problem, taking as a model the Plasmodium falciparum genome. Here, the predictive power of this method is put under experimental test for this genome. The presented results demonstrate the usefulness of the PBGI as a gene-identification tool for P. falciparum, notably for the discovery of new genes with no homology to known genes. Perspectives opened by this new method for other eukaryotic genomes are also mentioned.

  12. An integration of genome-wide association study and gene expression profiling to prioritize the discovery of novel susceptibility Loci for osteoporosis-related traits.

    Science.gov (United States)

    Hsu, Yi-Hsiang; Zillikens, M Carola; Wilson, Scott G; Farber, Charles R; Demissie, Serkalem; Soranzo, Nicole; Bianchi, Estelle N; Grundberg, Elin; Liang, Liming; Richards, J Brent; Estrada, Karol; Zhou, Yanhua; van Nas, Atila; Moffatt, Miriam F; Zhai, Guangju; Hofman, Albert; van Meurs, Joyce B; Pols, Huibert A P; Price, Roger I; Nilsson, Olle; Pastinen, Tomi; Cupples, L Adrienne; Lusis, Aldons J; Schadt, Eric E; Ferrari, Serge; Uitterlinden, André G; Rivadeneira, Fernando; Spector, Timothy D; Karasik, David; Kiel, Douglas P

    2010-06-10

    Osteoporosis is a complex disorder and commonly leads to fractures in elderly persons. Genome-wide association studies (GWAS) have become an unbiased approach to identify variations in the genome that potentially affect health. However, the genetic variants identified so far only explain a small proportion of the heritability for complex traits. Due to the modest genetic effect size and inadequate power, true association signals may not be revealed based on a stringent genome-wide significance threshold. Here, we take advantage of SNP and transcript arrays and integrate GWAS and expression signature profiling relevant to the skeletal system in cellular and animal models to prioritize the discovery of novel candidate genes for osteoporosis-related traits, including bone mineral density (BMD) at the lumbar spine (LS) and femoral neck (FN), as well as geometric indices of the hip (femoral neck-shaft angle, NSA; femoral neck length, NL; and narrow-neck width, NW). A two-stage meta-analysis of GWAS from 7,633 Caucasian women and 3,657 men revealed three novel loci associated with osteoporosis-related traits, including chromosome 1p13.2 (RAP1A, p = 3.6 × 10^-8), 2q11.2 (TBC1D8), and 18q11.2 (OSBPL1A), and confirmed a previously reported region near TNFRSF11B/OPG gene. We also prioritized 16 suggestive genome-wide significant candidate genes based on their potential involvement in skeletal metabolism. Among them, 3 candidate genes were associated with BMD in women. Notably, 2 out of these 3 genes (GPR177, p = 2.6 × 10^-13; SOX6, p = 6.4 × 10^-10) associated with BMD in women have been successfully replicated in a large-scale meta-analysis of BMD, but none of the non-prioritized candidates (associated with BMD) did. Our results support the concept of our prioritization strategy. In the absence of direct biological support for identified genes, we highlighted the efficiency of subsequent functional characterization using publicly available expression profiling relevant to the

  13. An integration of genome-wide association study and gene expression profiling to prioritize the discovery of novel susceptibility Loci for osteoporosis-related traits.

    Directory of Open Access Journals (Sweden)

    Yi-Hsiang Hsu

    2010-06-01

    Full Text Available Osteoporosis is a complex disorder and commonly leads to fractures in elderly persons. Genome-wide association studies (GWAS) have become an unbiased approach to identify variations in the genome that potentially affect health. However, the genetic variants identified so far only explain a small proportion of the heritability for complex traits. Due to the modest genetic effect size and inadequate power, true association signals may not be revealed based on a stringent genome-wide significance threshold. Here, we take advantage of SNP and transcript arrays and integrate GWAS and expression signature profiling relevant to the skeletal system in cellular and animal models to prioritize the discovery of novel candidate genes for osteoporosis-related traits, including bone mineral density (BMD) at the lumbar spine (LS) and femoral neck (FN), as well as geometric indices of the hip (femoral neck-shaft angle, NSA; femoral neck length, NL; and narrow-neck width, NW). A two-stage meta-analysis of GWAS from 7,633 Caucasian women and 3,657 men revealed three novel loci associated with osteoporosis-related traits, including chromosome 1p13.2 (RAP1A, p = 3.6 × 10^-8), 2q11.2 (TBC1D8), and 18q11.2 (OSBPL1A), and confirmed a previously reported region near TNFRSF11B/OPG gene. We also prioritized 16 suggestive genome-wide significant candidate genes based on their potential involvement in skeletal metabolism. Among them, 3 candidate genes were associated with BMD in women. Notably, 2 out of these 3 genes (GPR177, p = 2.6 × 10^-13; SOX6, p = 6.4 × 10^-10) associated with BMD in women have been successfully replicated in a large-scale meta-analysis of BMD, but none of the non-prioritized candidates (associated with BMD) did. Our results support the concept of our prioritization strategy. In the absence of direct biological support for identified genes, we highlighted the efficiency of subsequent functional characterization using publicly available expression profiling relevant

  14. A Performance/Cost Evaluation for a GPU-Based Drug Discovery Application on Volunteer Computing

    Directory of Open Access Journals (Sweden)

    Ginés D. Guerrero

    2014-01-01

    Full Text Available Bioinformatics is an interdisciplinary research field that develops tools for the analysis of large biological databases, and, thus, the use of high performance computing (HPC) platforms is mandatory for the generation of useful biological knowledge. The latest generation of graphics processing units (GPUs) has democratized the use of HPC as they push desktop computers to cluster-level performance. Many applications within this field have been developed to leverage these powerful and low-cost architectures. However, these applications still need to scale to larger GPU-based systems to enable remarkable advances in the fields of healthcare, drug discovery, genome research, etc. The inclusion of GPUs in HPC systems exacerbates power and temperature issues, increasing the total cost of ownership (TCO). This paper explores the benefits of volunteer computing to scale bioinformatics applications as an alternative to owning large local GPU-based infrastructures. We use as a benchmark a GPU-based drug discovery application called BINDSURF, whose computational requirements go beyond a single desktop machine. Volunteer computing is presented as a cheap and valid HPC system for those bioinformatics applications that need to process huge amounts of data and where the response time is not a critical factor.

  15. Morph-X-Select: Morphology-based tissue aptamer selection for ovarian cancer biomarker discovery

    Science.gov (United States)

    Wang, Hongyu; Li, Xin; Volk, David E.; Lokesh, Ganesh L.-R.; Elizondo-Riojas, Miguel-Angel; Li, Li; Nick, Alpa M.; Sood, Anil K.; Rosenblatt, Kevin P.; Gorenstein, David G.

    2016-01-01

    High affinity aptamer-based biomarker discovery has the advantage of simultaneously discovering an aptamer affinity reagent and its target biomarker protein. Here, we demonstrate a morphology-based tissue aptamer selection method that enables us to use tissue sections from individual patients and identify high-affinity aptamers and their associated target proteins in a systematic and accurate way. We created a combinatorial DNA aptamer library that has been modified with thiophosphate substitutions of the phosphate ester backbone at selected 5′dA positions for enhanced nuclease resistance and targeting. Based on morphological assessment, we used image-directed laser microdissection (LMD) to dissect regions of interest bound with the thioaptamer (TA) library and further identified target proteins for the selected TAs. We have successfully identified and characterized the lead candidate TA, V5, as a vimentin-specific sequence that has shown specific binding to tumor vasculature of human ovarian tissue and human microvascular endothelial cells. This new Morph-X-Select method allows us to select high-affinity aptamers and their associated target proteins in a specific and accurate way, and could be used for personalized biomarker discovery to improve medical decision-making and to facilitate the development of targeted therapies to achieve more favorable outcomes. PMID:27839510

  16. Discovery of Nuclear-Encoded Genes for the Neurotoxin Saxitoxin in Dinoflagellates

    OpenAIRE

    Anke Stüken; Orr, Russell J. S.; Ralf Kellmann; Murray, Shauna A.; Neilan, Brett A.; Kjetill S Jakobsen

    2011-01-01

    Saxitoxin is a potent neurotoxin that occurs in aquatic environments worldwide. Ingestion of vector species can lead to paralytic shellfish poisoning, a severe human illness that may lead to paralysis and death. In freshwaters, the toxin is produced by prokaryotic cyanobacteria; in marine waters, it is associated with eukaryotic dinoflagellates. However, several studies suggest that saxitoxin is not produced by dinoflagellates themselves, but by co-cultured bacteria. Here, we show that genes ...

  17. Discovery and functional assessment of gene variants in the vascular endothelial growth factor pathway

    OpenAIRE

    Paré-Brunet, Laia; Glubb, Dylan; Evans, Patrick; Berenguer-Llergo, Antoni; Etheridge, Amy S.; Skol, Andrew D.; Di Rienzo, Anna; Duan, Shiwei; Gamazon, Eric R.; Innocenti, Federico

    2013-01-01

    Angiogenesis is a host-mediated mechanism in disease pathophysiology. The vascular endothelial growth factor (VEGF) pathway is a major determinant of angiogenesis, and a comprehensive annotation of the functional variation in this pathway is essential to understand the genetic basis of angiogenesis-related diseases. We assessed the allelic heterogeneity of gene expression, population specificity of cis expression quantitative trait loci (eQTLs), and eQTL function in luciferase assays in CEU a...

  18. Use of eQTL Analysis for the Discovery of Target Genes Identified by GWAS

    Science.gov (United States)

    2014-04-01

    candidate genes for existing prostate cancer (PC) risk-single nucleotide polymorphisms (SNPs) that could then be followed up in future studies. To accomplish...a radical prostatectomy at Mayo Clinic and were available to investigators through the Prostate Cancer SPORE. Typically, one to three pieces of...916 cases re-examined, 93 cases met the criteria above, but also contained Benign Prostatic Hyperplasia (BPH), seminal vesicle, urethra, or adjacent

  19. Discovery of nuclear-encoded genes for the neurotoxin saxitoxin in dinoflagellates.

    Science.gov (United States)

    Stüken, Anke; Orr, Russell J S; Kellmann, Ralf; Murray, Shauna A; Neilan, Brett A; Jakobsen, Kjetill S

    2011-01-01

    Saxitoxin is a potent neurotoxin that occurs in aquatic environments worldwide. Ingestion of vector species can lead to paralytic shellfish poisoning, a severe human illness that may lead to paralysis and death. In freshwaters, the toxin is produced by prokaryotic cyanobacteria; in marine waters, it is associated with eukaryotic dinoflagellates. However, several studies suggest that saxitoxin is not produced by dinoflagellates themselves, but by co-cultured bacteria. Here, we show that genes required for saxitoxin synthesis are encoded in the nuclear genomes of dinoflagellates. We sequenced >1.2×10(6) mRNA transcripts from the two saxitoxin-producing dinoflagellate strains Alexandrium fundyense CCMP1719 and A. minutum CCMP113 using high-throughput sequencing technology. In addition, we used in silico transcriptome analyses, RACE, qPCR and conventional PCR coupled with Sanger sequencing. These approaches successfully identified genes required for saxitoxin synthesis in the two transcriptomes. We focused on sxtA, the unique starting gene of saxitoxin synthesis, and show that the dinoflagellate transcripts of sxtA have the same domain structure as the cyanobacterial sxtA genes. But, in contrast to the bacterial homologs, the dinoflagellate transcripts are monocistronic, have a higher GC content, occur in multiple copies, and contain typical dinoflagellate spliced-leader sequences and eukaryotic polyA-tails. Further, we investigated 28 saxitoxin-producing and non-producing dinoflagellate strains from six different genera for the presence of genomic sxtA homologs. Our results show very good agreement between the presence of sxtA and saxitoxin synthesis, except in three strains of A. tamarense, for which we amplified sxtA, but did not detect the toxin. Our work opens up possibilities for developing molecular tools to detect saxitoxin-producing dinoflagellates in the environment.

  20. Comparative GO: a web application for comparative gene ontology and gene ontology-based gene selection in bacteria.

    Directory of Open Access Journals (Sweden)

    Mario Fruzangohar

    Full Text Available The primary means of classifying new functions for genes and proteins relies on Gene Ontology (GO), which defines genes/proteins using a controlled vocabulary in terms of their Molecular Function, Biological Process and Cellular Component. The challenge is to present this information to researchers to compare and discover patterns in multiple datasets using visually comprehensible and user-friendly statistical reports. Importantly, while there are many GO resources available for eukaryotes, there are none suitable for simultaneous, graphical and statistical comparison between multiple datasets. In addition, none of them supports comprehensive resources for bacteria. By using Streptococcus pneumoniae as a model, we identified and collected GO resources including genes, proteins, taxonomy and GO relationships from NCBI, UniProt and GO organisations. Then, we designed database tables in a PostgreSQL database server and developed a Java application to extract data from source files and load it into the database automatically. We developed a PHP web application based on the Model-View-Controller architecture, and used a specific data structure as well as current and novel algorithms to estimate GO graph parameters. We designed different navigation and visualization methods on the graphs and integrated these into graphical reports. This tool is particularly significant when comparing GO groups between multiple samples (including those of pathogenic bacteria) from different sources simultaneously. Comparing GO protein distribution among up- or down-regulated genes from different samples can improve understanding of biological pathways and mechanism(s) of infection. It can also aid in the discovery of genes associated with specific function(s) for investigation as novel vaccine or therapeutic targets. http://turing.ersa.edu.au/BacteriaGO.

  1. Discovery of inhibitors of aberrant gene transcription from Libraries of DNA binding molecules: inhibition of LEF-1-mediated gene transcription and oncogenic transformation.

    Science.gov (United States)

    Stover, James S; Shi, Jin; Jin, Wei; Vogt, Peter K; Boger, Dale L

    2009-03-11

    The screening of a >9000 compound library of synthetic DNA binding molecules for selective binding to the consensus sequence of the transcription factor LEF-1 followed by assessment of the candidate compounds in a series of assays that characterized functional activity (disruption of DNA-LEF-1 binding) at the intended target and site (inhibition of intracellular LEF-1-mediated gene transcription) resulting in a desired phenotypic cellular change (inhibit LEF-1-driven cell transformation) provided two lead compounds: lefmycin-1 and lefmycin-2. The sequence of screens defining the approach assures that activity in the final functional assay may be directly related to the inhibition of gene transcription and DNA binding properties of the identified molecules. Central to the implementation of this generalized approach to the discovery of DNA binding small molecule inhibitors of gene transcription was (1) the use of a technically nondemanding fluorescent intercalator displacement (FID) assay for initial assessment of the DNA binding affinity and selectivity of a library of compounds for any sequence of interest, and (2) the technology used to prepare a sufficiently large library of DNA binding compounds.

  2. An Isogenic Human ESC Platform for Functional Evaluation of Genome-wide-Association-Study-Identified Diabetes Genes and Drug Discovery.

    Science.gov (United States)

    Zeng, Hui; Guo, Min; Zhou, Ting; Tan, Lei; Chong, Chi Nok; Zhang, Tuo; Dong, Xue; Xiang, Jenny Zhaoying; Yu, Albert S; Yue, Lixia; Qi, Qibin; Evans, Todd; Graumann, Johannes; Chen, Shuibing

    2016-09-01

    Genome-wide association studies (GWASs) have increased our knowledge of loci associated with a range of human diseases. However, applying such findings to elucidate pathophysiology and promote drug discovery remains challenging. Here, we created isogenic human ESCs (hESCs) with mutations in GWAS-identified susceptibility genes for type 2 diabetes. In pancreatic beta-like cells differentiated from these lines, we found that mutations in CDKAL1, KCNQ1, and KCNJ11 led to impaired glucose secretion in vitro and in vivo, coinciding with defective glucose homeostasis. CDKAL1 mutant insulin+ cells were also hypersensitive to glucolipotoxicity. A high-content chemical screen identified a candidate drug that rescued CDKAL1-specific defects in vitro and in vivo by inhibiting the FOS/JUN pathway. Our approach of a proof-of-principle platform, which uses isogenic hESCs for functional evaluation of GWAS-identified loci and identification of a drug candidate that rescues gene-specific defects, paves the way for precision therapy of metabolic diseases.

  3. Gene function prediction based on the Gene Ontology hierarchical structure.

    Science.gov (United States)

    Cheng, Liangxi; Lin, Hongfei; Hu, Yuncui; Wang, Jian; Yang, Zhihao

    2014-01-01

    The information of the Gene Ontology annotation is helpful in the explanation of life science phenomena, and can provide great support for the research of the biomedical field. The use of the Gene Ontology is gradually affecting the way people store and understand bioinformatic data. To facilitate the prediction of gene functions with the aid of text mining methods and existing resources, we transform it into a multi-label top-down classification problem and develop a method that uses the hierarchical relationships in the Gene Ontology structure to relieve the quantitative imbalance of positive and negative training samples. Meanwhile the method enhances the discriminating ability of classifiers by retaining and highlighting the key training samples. Additionally, the top-down classifier based on a tree structure takes the relationship of target classes into consideration and thus solves the incompatibility between the classification results and the Gene Ontology structure. Our experiment on the Gene Ontology annotation corpus achieves an F-value performance of 50.7% (precision: 52.7%; recall: 48.9%). The experimental results demonstrate that when the size of the training set is small, it can be expanded via topological propagation of associated documents between the parent and child nodes in the tree structure. The top-down classification model applies to the set of texts in an ontology structure or with a hierarchical relationship.
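
    The reported F-value can be cross-checked from the stated precision and recall. The sketch below assumes the F-value is the balanced F-measure (F1, the harmonic mean of precision and recall); the abstract does not say which variant was used, and the function name is illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract above.
f_value = f_measure(0.527, 0.489)
print(f"F-value: {f_value:.1%}")
```

    Under that assumption the result agrees with the 50.7% quoted in the abstract.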

  4. The Analysis of Image Segmentation Hierarchies with a Graph-based Knowledge Discovery System

    Science.gov (United States)

    Tilton, James C.; Cooke, Diane J.; Ketkar, Nikhil; Aksoy, Selim

    2008-01-01

    Currently available pixel-based analysis techniques do not effectively extract the information content from the increasingly available high spatial resolution remotely sensed imagery data. A general consensus is that object-based image analysis (OBIA) is required to effectively analyze this type of data. OBIA is usually a two-stage process: image segmentation followed by an analysis of the segmented objects. We are exploring an approach to OBIA in which hierarchical image segmentations provided by the Recursive Hierarchical Segmentation (RHSEG) software developed at NASA GSFC are analyzed by the Subdue graph-based knowledge discovery system developed by a team at Washington State University. In this paper we discuss our initial approach to representing the RHSEG-produced hierarchical image segmentations in a graphical form understandable by Subdue, and provide results on real and simulated data. We also discuss planned improvements designed to more effectively and completely convey the hierarchical segmentation information to Subdue and to improve processing efficiency.

  5. Functional Analysis and Discovery of Microbial Genes Transforming Metallic and Organic Pollutants: Database and Experimental Tools

    Energy Technology Data Exchange (ETDEWEB)

    Lawrence P. Wackett; Lynda B.M. Ellis

    2004-12-09

    Microbial functional genomics is faced with a burgeoning list of genes which are denoted as unknown or hypothetical for lack of any knowledge about their function. The majority of microbial genes encode enzymes. Enzymes are the catalysts of metabolism: catabolism, anabolism, stress responses, and many other cell functions. A major problem facing microbial functional genomics is proposed here to derive from the breadth of microbial metabolism, much of which remains undiscovered. The breadth of microbial metabolism has been surveyed by the PIs and represented according to reaction types on the University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD): http://umbbd.ahc.umn.edu/search/FuncGrps.html. The database depicts metabolism of 49 chemical functional groups, representing most of current knowledge. Twice that number of chemical groups are proposed here to be metabolized by microbes. Thus, at least 50% of the unique biochemical reactions catalyzed by microbes remain undiscovered. This further suggests that many unknown and hypothetical genes encode functions yet undiscovered. This gap will be partly filled by the current proposal. The UM-BBD will be greatly expanded as a resource for microbial functional genomics. Computational methods will be developed to predict microbial metabolism which is not yet discovered. Moreover, a concentrated effort to discover new microbial metabolism will be conducted. The research will focus on metabolism of direct interest to DOE, dealing with the transformation of metals, metalloids, organometallics and toxic organics. This is precisely the type of metabolism which has been characterized most poorly to date. Moreover, these studies will directly impact functional genomic analysis of DOE-relevant genomes.

  6. The implementation of discovery learning model based on lesson study to increase student's achievement in colloid

    Science.gov (United States)

    Suyanti, Retno Dwi; Purba, Deby Monika

    2017-03-01

    The objective of this research is to determine the increase in student achievement under a discovery learning model based on lesson study. In addition, the research examines the cognitive aspect. The research was conducted in three schools, including SMA N 3 Medan. The population is all the students in SMA N 11 Medan, selected by purposive random sampling. The research instruments are achievement tests that have been validated. The research data were analyzed statistically using MS Excel. The results show that the achievement of students taught with the discovery learning model based on lesson study is higher than that of students taught by the direct instructional method. This can be seen from the average gain and is confirmed by a t-test: the normalized gain in the experimental class of SMA N 11 is (0.74±0.12) and in the control class (0.45±0.12); at significance level α = 0.05, Ha is accepted and Ho is rejected, since tcount > ttable in SMA N 11 (9.81 > 1.66). The improved cognitive aspect across the three schools is C2, where the value for SMA N 11 is 0.84 (high). The lesson study observation sheet from SMA N 11 shows that 92% of students worked together, while 67% were less active in using media.
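
    The abstract reports normalized gains without defining them. A minimal sketch, assuming the commonly used Hake definition g = (post - pre) / (max - pre) and the usual high/medium/low cut-offs at 0.7 and 0.3; the pre-test and post-test scores below are hypothetical and are not taken from the study.

```python
def normalized_gain(pre: float, post: float, max_score: float = 100.0) -> float:
    """Hake-style normalized gain: achieved improvement over possible improvement."""
    if pre >= max_score:
        raise ValueError("pre-test score must be below the maximum score")
    return (post - pre) / (max_score - pre)


def gain_category(g: float) -> str:
    """Conventional labels (assumed here): high >= 0.7, medium >= 0.3, else low."""
    if g >= 0.7:
        return "high"
    if g >= 0.3:
        return "medium"
    return "low"


# Hypothetical class averages, chosen only to illustrate the calculation.
experimental = normalized_gain(pre=30.0, post=82.0)
control = normalized_gain(pre=30.0, post=61.5)
print(f"experimental: {experimental:.2f} ({gain_category(experimental)})")
print(f"control: {control:.2f} ({gain_category(control)})")
```

    With these cut-offs, the 0.84 quoted for SMA N 11 falls in the "high" band, which matches the label given in the abstract.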

  7. Structure-based discovery of selective serotonin 5-HT(1B) receptor ligands.

    Science.gov (United States)

    Rodríguez, David; Brea, José; Loza, María Isabel; Carlsson, Jens

    2014-08-05

    The development of safe and effective drugs relies on the discovery of selective ligands. Serotonin (5-hydroxytryptamine [5-HT]) G protein-coupled receptors are therapeutic targets for CNS disorders but are also associated with adverse drug effects. The determination of crystal structures for the 5-HT1B and 5-HT2B receptors provided an opportunity to identify subtype selective ligands using structure-based methods. From docking screens of 1.3 million compounds, 22 molecules were predicted to be selective for the 5-HT1B receptor over the 5-HT2B subtype, a requirement for safe serotonergic drugs. Nine compounds were experimentally verified as 5-HT1B-selective ligands, with up to 300-fold higher affinities for this subtype. Three of the ligands were agonists of the G protein pathway. Analysis of state-of-the-art homology models of the two 5-HT receptors revealed that the crystal structures were critical for predicting selective ligands. Our results demonstrate that structure-based screening can guide the discovery of ligands with specific selectivity profiles.

  8. Discovery and development of natural product-derived chemotherapeutic agents based on a medicinal chemistry approach.

    Science.gov (United States)

    Lee, Kuo-Hsiung

    2010-03-26

    Medicinal plants have long been an excellent source of pharmaceutical agents. Accordingly, the long-term objectives of the author's research program are to discover and design new chemotherapeutic agents based on plant-derived compound leads by using a medicinal chemistry approach, which is a combination of chemistry and biology. Different examples of promising bioactive natural products and their synthetic analogues, including sesquiterpene lactones, quassinoids, naphthoquinones, phenylquinolones, dithiophenediones, neo-tanshinlactone, tylophorine, suksdorfin, DCK, and DCP, will be presented with respect to their discovery and preclinical development as potential clinical trial candidates. Research approaches include bioactivity- or mechanism of action-directed isolation and characterization of active compounds, rational drug design-based modification and analogue synthesis, and structure-activity relationship and mechanism of action studies. Current clinical trial agents discovered by the Natural Products Research Laboratories, University of North Carolina, include bevirimat (dimethyl succinyl betulinic acid), which is now in phase IIb trials for treating AIDS. Bevirimat is also the first in a new class of HIV drug candidates called "maturation inhibitors". In addition, an etoposide analogue, GL-331, progressed to anticancer phase II clinical trials, and the curcumin analogue JC-9 is in phase II clinical trials for treating acne and in development for trials against prostate cancer. The discovery and development of these clinical trial candidates will also be discussed.

  9. Developing a distributed HTML5-based search engine for geospatial resource discovery

    Science.gov (United States)

    ZHOU, N.; XIA, J.; Nebert, D.; Yang, C.; Gui, Z.; Liu, K.

    2013-12-01

    With the explosive growth of data, Geospatial Cyberinfrastructure (GCI) components are developed to manage geospatial resources, such as data discovery and data publishing. However, the efficiency of geospatial resource discovery is still challenging in that: (1) existing GCIs are usually developed for users of specific domains, so users may have to visit a number of GCIs to find appropriate resources; (2) the complexity of the decentralized network environment usually results in slow response and poor user experience; (3) users who use different browsers and devices may have very different user experiences because of the diversity of front-end platforms (e.g. Silverlight, Flash or HTML). To address these issues, we developed a distributed and HTML5-based search engine. Specifically, (1) the search engine adopts a brokering approach to retrieve geospatial metadata from various and distributed GCIs; (2) the asynchronous record retrieval mode enhances the search performance and user interactivity; (3) the search engine based on HTML5 is able to provide unified access capabilities for users with different devices (e.g. tablet and smartphone).
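
    The brokering and asynchronous retrieval ideas described above can be illustrated with a small sketch. It is not the authors' implementation: the catalog names, latencies and records are hypothetical, and network calls are simulated with asyncio.sleep so the example runs offline.

```python
import asyncio

# Hypothetical catalogs: name -> (simulated latency in seconds, metadata records).
CATALOGS = {
    "catalog_a": (0.03, ["a: land cover 2010", "a: river network"]),
    "catalog_b": (0.01, ["b: land cover 2012"]),
    "catalog_c": (0.02, []),
}


async def query_catalog(name: str, keyword: str) -> list[str]:
    """Simulate one catalog search; the sleep stands in for a network round trip."""
    latency, records = CATALOGS[name]
    await asyncio.sleep(latency)
    return [record for record in records if keyword in record]


async def broker_search(keyword: str) -> list[str]:
    """Query every catalog concurrently and merge records as each reply arrives."""
    tasks = [asyncio.create_task(query_catalog(name, keyword)) for name in CATALOGS]
    merged: list[str] = []
    for finished in asyncio.as_completed(tasks):
        merged.extend(await finished)  # a UI could render these partial results
    return merged


results = asyncio.run(broker_search("land cover"))
print(results)
```

    Because replies are merged in completion order, a slow catalog does not delay records from the faster ones, which is the responsiveness benefit the abstract attributes to asynchronous retrieval.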

  10. Novel Gene Discovery of Crops in China: Status, Challenging, and Perspective

    Institute of Scientific and Technical Information of China (English)

    邱丽娟; 王建康; 万建民; 郭勇; 黎裕; 王晓波; 周国安; 刘章雄; 周时荣; 李新海; 马有志

    2011-01-01

    a gene level and hence for molecular breeding. This paper reviewed progress of novel gene discovery studies in major crops, such as rice, wheat, maize, soybean, cotton, and oilseed rape in China. In the last decade, Chinese scientists have achieved a number of breakthroughs in novel gene identification in crops, including: (1) Various distinctive materials for gene discovery were created, such as core collections of germplasms based on crop genetic diversity, establishment of genetic populations based on genetic resources with favorable traits, assessment of mutants derived from mutagenesis, and so on; (2) Technology and methods of gene discovery were further developed, especially the gene-based integration of various discovery technologies with combination of biometric algorithm improvement of gene/QTLs, and therefore the efficiency of gene discovery was increased; (3) Mapping genes/QTLs related to important agronomic traits of crops has become a common method for genetic studies. A number of genes/QTLs associated with disease and insect resistance, stress tolerance, good quality, nutrient use efficiency and high yield have been mapped, of which more than 500 genes have been positioned on chromosomes precisely by fine mapping; (4) Great progress in cloning and functional analysis of crop genes in China, particularly in rice, has drawn world-wide attention. More than 300 genes have been cloned in the main crops, among which more than 70 genes have been functionally validated in crops. While gene discovery in crops becomes more and more efficient, large-scale and oriented towards utilization in the world, Chinese scientists are also making new findings in this field. However, the quality and quantity of crop gene discovery in China is still far from satisfying the needs for molecular breeding and the overall level of novel gene discovery is still behind top labs/institutions in the world. Gene discovery in different crops has developed unevenly, the number of genes discovered is not

  11. Yeast homologous recombination-based promoter engineering for the activation of silent natural product biosynthetic gene clusters.

    Science.gov (United States)

    Montiel, Daniel; Kang, Hahk-Soo; Chang, Fang-Yuan; Charlop-Powers, Zachary; Brady, Sean F

    2015-07-21

    Large-scale sequencing of prokaryotic (meta)genomic DNA suggests that most bacterial natural product gene clusters are not expressed under common laboratory culture conditions. Silent gene clusters represent a promising resource for natural product discovery and the development of a new generation of therapeutics. Unfortunately, the characterization of molecules encoded by these clusters is hampered owing to our inability to express these gene clusters in the laboratory. To address this bottleneck, we have developed a promoter-engineering platform to transcriptionally activate silent gene clusters in a model heterologous host. Our approach uses yeast homologous recombination, an auxotrophy complementation-based yeast selection system and sequence orthogonal promoter cassettes to exchange all native promoters in silent gene clusters with constitutively active promoters. As part of this platform, we constructed and validated a set of bidirectional promoter cassettes consisting of orthogonal promoter sequences, Streptomyces ribosome binding sites, and yeast selectable marker genes. Using these tools we demonstrate the ability to simultaneously insert multiple promoter cassettes into a gene cluster, thereby expediting the reengineering process. We apply this method to model active and silent gene clusters (rebeccamycin and tetarimycin) and to the silent, cryptic pseudogene-containing, environmental DNA-derived Lzr gene cluster. Complete promoter refactoring and targeted gene exchange in this "dead" cluster led to the discovery of potent indolotryptoline antiproliferative agents, lazarimides A and B. This potentially scalable and cost-effective promoter reengineering platform should streamline the discovery of natural products from silent natural product biosynthetic gene clusters.

  12. Gene discovery in the Antarctic fur seal (Arctocephalus gazella) skin transcriptome.

    Science.gov (United States)

    Hoffman, Joseph I

    2011-07-01

    Next-generation sequencing provides a powerful new approach for developing functional genomic tools for nonmodel species, helping to narrow the gap between studies of model organisms and those of natural populations. Consequently, massively parallel 454 sequencing was used to characterize a normalized cDNA library derived from skin biopsy samples of twelve Antarctic fur seal (Arctocephalus gazella) individuals. Over 412 Mb of sequence data were generated, comprising 1.4 million reads of average length 286 bp. De novo assembly using Newbler 2.3 yielded 156 contigs plus 22,869 isotigs, which in turn clustered into 18,576 isogroups. Almost half of the assembled transcript sequences showed significant similarity to the nr database, revealing a functionally diverse array of genes. Moreover, 97.9% of these mapped to the dog (Canis lupus familiaris) genome, with a strong positive relationship between the number of sequences locating to a given chromosome and the length of that chromosome in the dog indicating a broad genomic distribution. Average depth of coverage was also almost 20-fold, sufficient to detect several thousand putative microsatellite loci and single nucleotide polymorphisms. This study constitutes an important step towards developing genomic resources with which to address consequential questions in pinniped ecology and evolution. It also supports an earlier but smaller study showing that skin tissue can be a rich source of expressed genes, with important implications for studying the genomics not only of marine mammals, but also more generally of species that cannot be destructively sampled.

  13. Gene discovery and transcript analyses in the corn smut pathogen Ustilago maydis: expressed sequence tag and genome sequence comparison

    Directory of Open Access Journals (Sweden)

    Saville Barry J

    2007-09-01

    Full Text Available Abstract Background Ustilago maydis is the basidiomycete fungus responsible for common smut of corn and is a model organism for the study of fungal phytopathogenesis. To aid in the annotation of the genome sequence of this organism, several expressed sequence tag (EST) libraries were generated from a variety of U. maydis cell types. In addition to utility in the context of gene identification and structure annotation, the ESTs were analyzed to identify differentially abundant transcripts and to detect evidence of alternative splicing and anti-sense transcription. Results Four cDNA libraries were constructed using RNA isolated from U. maydis diploid teliospores (U. maydis strains 518 × 521) and haploid cells of strain 521 grown under nutrient rich, carbon starved, and nitrogen starved conditions. Using the genome sequence as a scaffold, the 15,901 ESTs were assembled into 6,101 contiguous expressed sequences (contigs); among these, 5,482 corresponded to predicted genes in the MUMDB (MIPS Ustilago maydis database), while 619 aligned to regions of the genome not yet designated as genes in MUMDB. A comparison of EST abundance identified numerous genes that may be regulated in a cell type or starvation-specific manner. The transcriptional response to nitrogen starvation was assessed using RT-qPCR. The results suggest that there may be cross-talk between the nitrogen and carbon signalling pathways in U. maydis. Bioinformatic analysis identified numerous examples of alternative splicing and anti-sense transcription. While intron retention was the predominant form of alternative splicing in U. maydis, other varieties were also evident (e.g. exon skipping). Selected instances of both alternative splicing and anti-sense transcription were independently confirmed using RT-PCR. Conclusion Through this work: 1) substantial sequence information has been provided for U. maydis genome annotation; 2) new genes were identified through the discovery of 619

  14. Discovery and characterization of a novel CCND1/MRCK gene fusion in mantle cell lymphoma

    Directory of Open Access Journals (Sweden)

    Chioniso Patience Masamha

    2016-03-01

    Full Text Available Abstract The t(11;14) translocation resulting in constitutive cyclin D1 expression is an early event in mantle cell lymphoma (MCL) transformation. Patients with a highly proliferative phenotype produce cyclin D1 transcripts with truncated 3′UTRs that evade miRNA regulation. Here, we report the recurrence of a novel gene fusion in MCL cell lines and MCL patient isolates that consists of the full protein coding region of cyclin D1 (CCND1) and a 3′UTR consisting of sequences from both the CCND1 3′UTR and myotonic dystrophy kinase-related Cdc42-binding kinase's (MRCK) intron one. The resulting CCND1/MRCK mRNA is resistant to CCND1-targeted miRNA regulation, and targeting the MRCK region of the chimeric 3′UTR with siRNA results in decreased CCND1 levels.

  15. Key Object Discovery and Tracking Based on Context-Aware Saliency

    Directory of Open Access Journals (Sweden)

    Geng Zhang

    2013-01-01

    Full Text Available In this paper, we propose an online key object discovery and tracking system based on visual saliency. We formulate the problem as a temporally consistent binary labelling task on a conditional random field and solve it by using a particle filter. We also propose a context‐aware saliency measurement, which can be used to improve the accuracy of any static or dynamic saliency maps. Our refined saliency maps provide clearer indications as to where the key object lies. Based on good saliency cues, we can further segment the key object inside the resulting bounding box, considering the spatial and temporal context. We tested our system extensively on different video clips. The results show that our method has significantly improved the saliency maps and tracks the key object accurately.

  16. A Chord-based resource scheduling approach in drug discovery grid

    Institute of Scientific and Technical Information of China (English)

    Chen Shudong; Zhang Wenju; Zhang Jun; Ma Fanyuan; Shen Jianhua

    2007-01-01

    This paper presents a resource scheduling approach for grid computing environments. Using P2P technology, this novel approach can schedule dynamic grid computing resources efficiently. Grid computing resources in different domains are organized into a structured P2P overlay network. Available resource information is published in the form of grid services. Task requests for computational resources are also presented as grid services. The problem of resource scheduling is thereby translated into service discovery. Unlike central scheduling approaches that collect information on available resources, this Chord-based approach forwards task requests in the overlay network and discovers resources that satisfy these tasks. Using this approach, the computational resources of a grid system can be scheduled dynamically according to the real-time workload on each peer. Furthermore, this approach is applied to DDG, a grid system for drug discovery and design, to evaluate its performance. Experimental results show that the computational resources of a grid system can be managed efficiently, and that the system maintains good load balance and robustness.
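
    The key-to-peer mapping that a Chord overlay relies on can be sketched as follows. This is a minimal illustration under stated assumptions, not the scheduling system described in the abstract: the function names `ring_id` and `successor`, the 16-bit ring size, and the use of SHA-1 are all chosen here for illustration, and real Chord additionally uses finger tables to forward requests in a logarithmic number of hops.

```python
import hashlib
from bisect import bisect_left


def ring_id(key: str, bits: int = 16) -> int:
    """Hash a string onto an identifier ring of size 2**bits (illustrative choice)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


def successor(node_ids, key_id):
    """Return the first node id that is >= key_id, wrapping around the ring.

    In Chord, this node is the one responsible for the key.
    """
    ordered = sorted(node_ids)
    idx = bisect_left(ordered, key_id)
    # If key_id is larger than every node id, wrap to the smallest node id.
    return ordered[idx % len(ordered)]


if __name__ == "__main__":
    # Hypothetical peers and a hypothetical task request name.
    peers = [ring_id(name) for name in ("peer-a", "peer-b", "peer-c")]
    request = ring_id("docking-task-42")
    print("request", request, "is handled by node", successor(peers, request))
```

    In this sketch a task request is hashed onto the same ring as the peers and assigned to the first peer at or after its position, which is the property that lets requests be routed without a central resource index.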

  17. Computational medicinal chemistry in fragment-based drug discovery: what, how and when.

    Science.gov (United States)

    Rabal, Obdulia; Urbano-Cuadrado, Manuel; Oyarzabal, Julen

    2011-01-01

    The use of fragment-based drug discovery (FBDD) has increased in the last decade due to the encouraging results obtained to date. In this scenario, computational approaches, together with experimental information, play an important role to guide and speed up the process. By default, FBDD is generally considered as a constructive approach. However, such additive behavior is not always present, therefore, simple fragment maturation will not always deliver the expected results. In this review, computational approaches utilized in FBDD are reported together with real case studies, where applicability domains are exemplified, in order to analyze them, and then, maximize their performance and reliability. Thus, a proper use of these computational tools can minimize misleading conclusions, keeping the credit on FBDD strategy, as well as achieve higher impact in the drug-discovery process. FBDD goes one step beyond a simple constructive approach. A broad set of computational tools: docking, R group quantitative structure-activity relationship, fragmentation tools, fragments management tools, patents analysis and fragment-hopping, for example, can be utilized in FBDD, providing a clear positive impact if they are utilized in the proper scenario - what, how and when. An initial assessment of additive/non-additive behavior is a critical point to define the most convenient approach for fragments elaboration.

  18. Novel Routing Protocol Based on Periodic Route Discovery for Mobile Adhoc Networks

    Directory of Open Access Journals (Sweden)

    V. Jai Kumar

    2016-12-01

    Full Text Available A group of mobile devices, called nodes, that communicate with each other over multi-hop links without any centralized network is called a mobile ad hoc network (MANET). Military battlefield scenarios, post-disaster rescue efforts, sensor networks, and entrepreneurs in a conference are some examples of mobile ad hoc networks. Since there is no infrastructure in the network, routing must be handled at every node. To improve the lifetime of the network, different routing protocols are considered. In present routing protocols for ad hoc networks, routing is the act of moving information from a source to a destination in an internetwork. A route is selected in the route discovery phase and used until all the packets are sent out. The continuous flow of packets along a selected route leads to route failure. To reduce this problem, we consider PRD-based MMBCR and the percentage of the optimum value for periodic route discovery. In our research we analyze the performance of different routing protocols, such as DSR and MMBCR, to obtain the maximum optimum value using Network Simulator software.

  19. Developing computer-based training programs for basic mammalian histology: Didactic versus discovery-based design

    Science.gov (United States)

    Fabian, Henry Joel

    Educators have long tried to understand what stimulates students to learn. The Swiss psychologist and zoologist, Jean Claude Piaget, suggested that students are stimulated to learn when they attempt to resolve confusion. He reasoned that students try to explain the world with the knowledge they have acquired in life. When they find their own explanations to be inadequate to explain phenomena, students find themselves in a temporary state of confusion. This prompts students to seek more plausible explanations. At this point, students are primed for learning (Piaget 1964). The Piagetian approach described above is called learning by discovery. To promote discovery learning, a teacher must first allow the student to recognize his misconception and then provide a plausible explanation to replace that misconception (Chinn and Brewer 1993). One application of this method is found in the various learning cycles, which have been demonstrated to be effective means for teaching science (Renner and Lawson 1973, Lawson 1986, Marek and Methven 1991, and Glasson & Lalik 1993). In contrast to the learning cycle, tutorial computer programs are generally not designed to correct student misconceptions, but rather follow a passive, didactic method of teaching. In the didactic or expositional method, the student is told about a phenomenon, but is neither encouraged to explore it, nor explain it in his own terms (Schneider and Renner 1980).

  20. Identification and Validation of HCC-specific Gene Transcriptional Signature for Tumor Antigen Discovery.

    Science.gov (United States)

    Petrizzo, Annacarmen; Caruso, Francesca Pia; Tagliamonte, Maria; Tornesello, Maria Lina; Ceccarelli, Michele; Costa, Valerio; Aprile, Marianna; Esposito, Roberta; Ciliberto, Gennaro; Buonaguro, Franco M; Buonaguro, Luigi

    2016-07-08

    A novel two-step bioinformatics strategy was applied for identification of signatures with therapeutic implications in hepatitis-associated HCC. Transcriptional profiles from HBV- and HCV-associated HCC samples were compared with non-tumor liver controls. The resulting HCC-modulated genes were subsequently compared with different non-tumor tissue samples. Two related signatures were identified, namely "HCC-associated" and "HCC-specific". Expression data were validated by RNA-Seq analysis carried out on unrelated HCC samples, and protein expression was confirmed according to The Human Protein Atlas (http://proteinatlas.org/), a public repository of immunohistochemistry data. Among all, aldo-keto reductase family 1 member B10 and IGF2 mRNA-binding protein 3 were found to be strictly HCC-specific, with no expression in 18/20 normal tissues. Target peptides for vaccine design were predicted for both proteins associated with the most prevalent HLA class I and II alleles. The described novel strategy proved feasible for the identification of HCC-specific proteins as high-potential targets for HCC immunotherapy.

  1. The first set of EST resource for gene discovery and marker development in pigeonpea (Cajanus cajan L.)

    Directory of Open Access Journals (Sweden)

    Byregowda Munishamappa

    2010-03-01

    .8% in molecular function. Further, 19 genes were identified as differentially expressed between FW-responsive genotypes and 20 between SMD-responsive genotypes. Generated ESTs were compiled together with 908 ESTs available in the public domain at the time of analysis, and a set of 5,085 unigenes was defined and used for identification of molecular markers in pigeonpea. For instance, 3,583 simple sequence repeat (SSR) motifs were identified in 1,365 unigenes and 383 primer pairs were designed. Assessment of a set of 84 primer pairs on 40 elite pigeonpea lines showed polymorphism with 15 (28.8%) markers, with an average of four alleles per marker and an average polymorphic information content (PIC) value of 0.40. Similarly, in silico mining of 133 contigs with ≥ 5 sequences detected 102 single nucleotide polymorphisms (SNPs) in 37 contigs. As an example, a set of 10 contigs was used for confirming in silico predicted SNPs in a set of four genotypes using wet lab experiments. Occurrence of SNPs was confirmed for all 6 contigs for which scorable and sequenceable amplicons were generated. PCR amplicons were not obtained for 4 contigs. Recognition sites for restriction enzymes were identified for 102 SNPs in 37 contigs, which indicates the possibility of assaying SNPs in 37 genes using the cleaved amplified polymorphic sequences (CAPS) assay. Conclusion The pigeonpea EST dataset generated here provides a transcriptomic resource for gene discovery and development of functional markers associated with biotic stress resistance. Sequence analyses of this dataset have shown conservation of a considerable number of pigeonpea transcripts across the legume and model plant species analysed, as well as some putative pigeonpea-specific genes. Validation of the identified biotic stress responsive genes should provide candidate genes for allele mining as well as candidate markers for molecular breeding.
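
    The polymorphic information content (PIC) reported for the SSR markers can be illustrated with a short calculation. This is a sketch that assumes the commonly used definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2, where p_i are allele frequencies at one marker; the abstract does not state which formula the authors applied, and the function name `pic` and the example frequencies are hypothetical.

```python
def pic(allele_freqs):
    """Polymorphic information content for one marker.

    allele_freqs: allele frequencies at the marker; they must sum to 1.
    Uses PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2 (assumed definition).
    """
    freqs = list(allele_freqs)
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    n = len(freqs)
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return 1.0 - homozygosity - cross_term


if __name__ == "__main__":
    # Hypothetical marker with four alleles at unequal frequencies.
    print(round(pic([0.7, 0.1, 0.1, 0.1]), 3))
```

    A marker with a single allele gives a PIC of 0, and the value rises as alleles become more numerous and more evenly distributed, which is why PIC is used to rank markers by informativeness.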

  2. Pigmentation in sand pear (Pyrus pyrifolia) fruit: biochemical characterization, gene discovery and expression analysis with exocarp pigmentation mutant.

    Science.gov (United States)

    Wang, Yue-zhi; Zhang, Shujun; Dai, Mei-song; Shi, Ze-bin

    2014-05-01

    -membrane transport of lignin, cutin, and suberin precursors suggests that the transport process could also affect the composition of the exocarp and play a role in the regulation of exocarp pigmentation. Results from this study provide a basis for the analysis of the molecular mechanism underlying the sand pear russet/green exocarp mutation, and present a comprehensive list of candidate genes that could be used to further investigate the trait mutation at the molecular level.

  3. Quantum dot-based screening system for discovery of g protein-coupled receptor agonists.

    Science.gov (United States)

    Lee, Junghan; Kwon, Yong-Jun; Choi, Youngseon; Kim, Hi Chul; Kim, Keumhyun; Kim, JinYeop; Park, Sun; Song, Rita

    2012-07-09

    Cellular imaging has emerged as an important tool to unravel biological complexity and to accelerate the drug-discovery process, including cell-based screening, target identification, and mechanism of action studies. Recently, semiconductor nanoparticles known as quantum dots (QDs) have attracted great interest in cellular imaging applications due to their unique photophysical properties such as size, tunable optical property, multiplexing capability, and photostability. Herein, we show that QDs can also be applied to assay development and eventually to high-throughput/content screening (HTS/HCS) for drug discovery. We have synthesized QDs modified with PEG and primary antibodies to be used as fluorescent probes for a cell-based HTS system. The G protein-coupled receptor (GPCR) family is known to be involved in most major diseases. We therefore constructed human osteosarcoma (U2OS) cells that specifically overexpress two types of differently tagged GPCRs: influenza hemagglutinin (HA) peptide-tagged κ-opioid receptors (κ-ORs) and GFP-tagged A3 adenosine receptors (A3AR). In this study, we have demonstrated that 1) anti-HA antibody-conjugated QDs could specifically label HA-tagged κ-ORs, 2) subsequent treatment of QD-tagged GPCR agonists allowed agonist-induced translocation to be monitored in real time, 3) excellent emission spectral properties of QD permitted the simultaneous detection of two GPCRs in one cell, and 4) the robust imaging capabilities of the QD-antibody conjugates could lead to reproducible quantitative data from high-content cellular images. These results suggest that the present QD-based GPCR inhibitor screening system can be a promising platform for further drug screening applications.

  4. The fragile x mental retardation syndrome 20 years after the FMR1 gene discovery: an expanding universe of knowledge.

    Science.gov (United States)

    Rousseau, François; Labelle, Yves; Bussières, Johanne; Lindsay, Carmen

    2011-08-01

    The fragile X mental retardation (FXMR) syndrome is one of the most frequent causes of mental retardation. Affected individuals display a wide range of additional characteristic features including behavioural and physical phenotypes, and the extent to which individuals are affected is highly variable. For these reasons, elucidation of the pathophysiology of this disease has been an important challenge to the scientific community. 1991 marks the year of the discovery of both the FMR1 gene mutations involved in this disease, and of their dynamic nature. Although a mouse model for the disease has been available for 16 years and extensive research has been performed on the FMR1 protein (FMRP), we still understand little about how the disease develops, and no treatment has yet been shown to be effective. In this review, we summarise current knowledge on FXMR with an emphasis on the technical challenges of molecular diagnostics, on its prevalence and dynamics among populations, and on the potential of screening for FMR1 mutations.

  5. De Novo Regulatory Motif Discovery Identifies Significant Motifs in Promoters of Five Classes of Plant Dehydrin Genes.

    Science.gov (United States)

    Zolotarov, Yevgen; Strömvik, Martina

    2015-01-01

    Plants accumulate dehydrins in response to osmotic stresses. Dehydrins are divided into five different classes, which are thought to be regulated in different manners. To better understand differences in transcriptional regulation of the five dehydrin classes, de novo motif discovery was performed on 350 dehydrin promoter sequences from a total of 51 plant genomes. Overrepresented motifs were identified in the promoters of five dehydrin classes. The Kn dehydrin promoters contain motifs linked with meristem specific expression, as well as motifs linked with cold/dehydration and abscisic acid response. KS dehydrin promoters contain a motif with a GATA core. SKn and YnSKn dehydrin promoters contain motifs that match elements connected with cold/dehydration, abscisic acid and light response. YnKn dehydrin promoters contain motifs that match abscisic acid and light response elements, but not cold/dehydration response elements. Conserved promoter motifs are present in the dehydrin classes and across different plant lineages, indicating that dehydrin gene regulation is likely also conserved.

  6. The genetic heterogeneity of colorectal cancer predisposition - guidelines for gene discovery

    NARCIS (Netherlands)

    Hahn, M.M.; Voer, R.M. de; Hoogerbrugge, N.; Ligtenberg, M.J.L.; Kuiper, R.P.; Kessel, A.G. van

    2016-01-01

    BACKGROUND: Colorectal cancer (CRC) is a cumulative term applied to a clinically and genetically heterogeneous group of neoplasms that occur in the bowel. Based on twin studies, up to 45 % of the CRC cases may involve a heritable component. Yet, only in 5-10 % of these cases high-penetrant germline

  7. De novo assembly and discovery of genes that are involved in drought tolerance in Tibetan Sophora moorcroftiana.

    Directory of Open Access Journals (Sweden)

    Huie Li

    Full Text Available Sophora moorcroftiana, a Leguminosae shrub species that is restricted to the arid and semi-arid regions of the Qinghai-Tibet Plateau, is an ecologically important foundation species and exhibits substantial drought tolerance in the Plateau. There are no functional genomics resources in public databases for understanding the molecular mechanism underlying the drought tolerance of S. moorcroftiana. Therefore, we performed a large-scale transcriptome sequencing of this species under drought stress using the Illumina sequencing technology. A total of 62,348,602 clean reads were obtained. The assembly of the clean reads resulted in 146,943 transcripts, including 66,026 unigenes. In the assembled sequences, 1534 transcription factors were identified and classified into 23 different common families, and 9040 SSR loci, from di- to hexa-nucleotides, whose repeat number is greater than five, were presented. In addition, we performed a gene expression profiling analysis upon dehydration treatment. The results indicated significant differences in the gene expression profiles among the control, mild stress and severe stress. In total, 4687, 5648 and 5735 genes were identified from the comparison of mild versus control, severe versus control and severe versus mild stress, respectively. Based on the differentially expressed genes, a Gene Ontology annotation analysis indicated many dehydration-relevant categories, including 'response to water stimulus' and 'response to water deprivation'. Meanwhile, the Kyoto Encyclopedia of Genes and Genomes pathway analysis uncovered some important pathways, such as 'metabolic pathways' and 'plant hormone signal transduction'. In addition, the expression patterns of 25 putative genes that are involved in drought tolerance resulting from quantitative real-time PCR were consistent with their transcript abundance changes as identified by RNA-seq. The globally sequenced genes covered a considerable proportion of the S

  8. De novo assembly and discovery of genes that are involved in drought tolerance in Tibetan Sophora moorcroftiana.

    Science.gov (United States)

    Li, Huie; Yao, Weijie; Fu, Yaru; Li, Shaoke; Guo, Qiqiang

    2015-01-01

    Sophora moorcroftiana, a Leguminosae shrub species that is restricted to the arid and semi-arid regions of the Qinghai-Tibet Plateau, is an ecologically important foundation species and exhibits substantial drought tolerance in the Plateau. There are no functional genomics resources in public databases for understanding the molecular mechanism underlying the drought tolerance of S. moorcroftiana. Therefore, we performed a large-scale transcriptome sequencing of this species under drought stress using the Illumina sequencing technology. A total of 62,348,602 clean reads were obtained. The assembly of the clean reads resulted in 146,943 transcripts, including 66,026 unigenes. In the assembled sequences, 1534 transcription factors were identified and classified into 23 different common families, and 9040 SSR loci, from di- to hexa-nucleotides, whose repeat number is greater than five, were presented. In addition, we performed a gene expression profiling analysis upon dehydration treatment. The results indicated significant differences in the gene expression profiles among the control, mild stress and severe stress. In total, 4687, 5648 and 5735 genes were identified from the comparison of mild versus control, severe versus control and severe versus mild stress, respectively. Based on the differentially expressed genes, a Gene Ontology annotation analysis indicated many dehydration-relevant categories, including 'response to water stimulus' and 'response to water deprivation'. Meanwhile, the Kyoto Encyclopedia of Genes and Genomes pathway analysis uncovered some important pathways, such as 'metabolic pathways' and 'plant hormone signal transduction'. In addition, the expression patterns of 25 putative genes that are involved in drought tolerance resulting from quantitative real-time PCR were consistent with their transcript abundance changes as identified by RNA-seq. The globally sequenced genes covered a considerable proportion of the S. moorcroftiana transcriptome

  9. International Astronomical Search Collaboration: An Online Student-Based Discovery Program in Astronomy (Invited)

    Science.gov (United States)

    Pennypacker, C.; Miller, P.

    2009-12-01

    The past 15 years have seen the development of affordable small telescopes, advanced digital cameras, high speed Internet access, and widely-available image analysis software. With these tools it is possible to provide programs in which students make original astronomical discoveries. High school aged students, and even younger ones, have discovered Main Belt asteroids (MBA), near-Earth objects (NEO), comets, supernovae, and Kuiper Belt objects (KBO). Student-based discovery is truly an innovative way to generate enthusiasm for learning science. The International Astronomical Search Collaboration (IASC = “Isaac”) is an online program where high school and college students make original MBA discoveries and important NEO observations. MBA discoveries are reported to the Minor Planet Center (Harvard) and International Astronomical Union. The NEO observations are included as part of the NASA Near-Earth Object Program (JPL). Provided at no cost to participating schools, IASC is centered at Hardin-Simmons University (Abilene, TX). It is a collaboration of the University, Lawrence Hall of Science (University of California, Berkeley), Astronomical Research Institute (ARI; Charleston, IL), Global Hands-On Universe Association (Portugal), and Astrometrica (Austria). Started in Fall 2006, IASC has reached 135 schools in 14 countries. There are 9 campaigns per year, each with 15 schools and lasting 45 days. Students have discovered 150 MBAs and made > 1,000 NEO observations. One notable discovery was 2009 BD81, discovered by two high school teachers and a graduate student at the Bulgarian Academy of Science. This object, about the size of 3 football fields, crosses Earth’s orbit and poses a serious impact risk. Each night with clear skies and no Moon, the ARI Observatory uses its 24" and 32" prime focus telescopes to take images along the ecliptic. Three images are taken of the same field of view (FOV) over a period of 30 minutes. These are bundled together and placed online at

  10. FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS

    Directory of Open Access Journals (Sweden)

    Breno C. Costa

    2013-11-01

    Full Text Available Nowadays electric utilities have to handle non-technical losses caused by frauds and thefts committed by some of their consumers. To minimize these losses, some methodologies have been created to detect consumers that might be fraudsters. In this context, the use of classification techniques can improve the hit rate of fraud detection and increase financial income. This paper proposes the use of the knowledge-discovery in databases process, based on artificial neural networks, applied to the classification of consumers to be inspected. An experiment performed in a Brazilian electric power distribution company indicated an improvement of over 50% for the proposed approach compared with the previous methods used by that company.

  11. An Automated Microscale Thermophoresis Screening Approach for Fragment-Based Lead Discovery.

    Science.gov (United States)

    Linke, Pawel; Amaning, Kwame; Maschberger, Melanie; Vallee, Francois; Steier, Valerie; Baaske, Philipp; Duhr, Stefan; Breitsprecher, Dennis; Rak, Alexey

    2016-04-01

    Fragment-based lead discovery has proved to be an effective alternative to high-throughput screenings in identifying chemical matter that can be developed into robust lead compounds. The search for optimal combinations of biophysical techniques that can correctly and efficiently identify and quantify binding can be challenging due to the physicochemical properties of fragments. In order to minimize the time and costs of screening, optimal combinations of biophysical techniques with maximal information content, sensitivity, and robustness are needed. Here we describe an approach utilizing automated microscale thermophoresis (MST) affinity screening to identify fragments active against MEK1 kinase. MST identified multiple hits that were confirmed by X-ray crystallography but not detected by orthogonal methods. Furthermore, MST also provided information about ligand-induced aggregation and protein denaturation. The technique delivered a large number of binders while reducing experimentation time and sample consumption, demonstrating the potential of MST to execute and maximize the efficacy of fragment screening campaigns.

  12. Predicting high-throughput screening results with scalable literature-based discovery methods.

    Science.gov (United States)

    Cohen, T; Widdows, D; Stephan, C; Zinner, R; Kim, J; Rindflesch, T; Davies, P

    2014-10-08

    The identification of new therapeutic uses for existing agents has been proposed as a means to mitigate the escalating cost of drug development. A common approach to such repurposing involves screening libraries of agents for activities against cell lines. In silico methods using knowledge from the biomedical literature have been proposed to constrain the costs of screening by identifying agents that are likely to be effective a priori. However, results obtained with these methods are seldom evaluated empirically. Conversely, screening experiments have been criticized for their inability to reveal the biological basis of their results. In this paper, we evaluate the ability of a scalable literature-based approach, discovery-by-analogy, to identify a small number of active agents within a large library screened for activity against prostate cancer cells. The methods used permit retrieval of the knowledge used to infer their predictions, providing a plausible biological basis for predicted activity.

  13. REALIZING THE NEED FOR SIMILARITY BASED REASONING OF CLOUD SERVICE DISCOVERY

    Directory of Open Access Journals (Sweden)

    S. BHAMA

    2011-12-01

    Full Text Available With the growing abundance of information on the web, it becomes the need of the hour to enrich data with semantics that can be understood and processed by machines. Currently, much of the effort in the area of semantics is focused on the representation of semantic data and its reasoning, which is the processing of semantic information associated with that data. This paper aims at realizing the need for similarity based reasoning of cloud service discovery. It forms a basic requirement of a cloud client to discover the most appropriate cloud service from the list of available services published by service providers. Cloud ontology provides a set of concepts, individuals and relationships among them. The similarity among cloud services can be determined from the semantic similarity of concepts and hence the relevant service can be retrieved.

  14. ERBB receptors: from oncogene discovery to basic science to mechanism-based cancer therapeutics.

    Science.gov (United States)

    Arteaga, Carlos L; Engelman, Jeffrey A

    2014-03-17

    ERBB receptors were linked to human cancer pathogenesis approximately three decades ago. Biomedical investigators have since developed substantial understanding of the biology underlying the dependence of cancers on aberrant ERBB receptor signaling. An array of cancer-associated genetic alterations in ERBB receptors has also been identified. These findings have led to the discovery and development of mechanism-based therapies targeting ERBB receptors that have improved outcome for many cancer patients. In this Perspective, we discuss current paradigms of targeting ERBB receptors with cancer therapeutics and our understanding of mechanisms of action and resistance to these drugs. As current strategies still have limitations, we also discuss challenges and opportunities that lie ahead as basic scientists and clinical investigators work toward more breakthroughs.

  15. Fragment-based drug discovery

    Institute of Scientific and Technical Information of China (English)

    东圆珍; 冯军

    2011-01-01

    To improve the efficiency of new drug discovery and research, scientists have been looking for new approaches to drug design and screening. Fragment-based drug discovery (FBDD) provides a new option for drug design. This review describes the process of FBDD, the techniques used, and the main results of current research abroad.

  16. A review of Fuzzy Based QoS Web Service Discovery

    Directory of Open Access Journals (Sweden)

    R.Buvanesvari

    2013-03-01

    Full Text Available Recently, web services have become an important issue for developers. Selecting a specific service is a crucial task. Some approaches develop extensive description and publication mechanisms, while others use syntactic, semantic, and structural reviews of Web service specifications. Finding the most suitable web service in a large collection of web services is crucial for the successful execution of applications. In many cases, the value of a QoS property may not be precisely defined. Recently, fuzzy approaches that can deal with fuzzy constraints have been proposed and are considered dominant in Web services. Fuzzy logic can therefore be applied to represent such imprecise QoS constraints. In this paper, we present an overview that focuses on developing fuzzy-based approaches for Web service discovery. This paper also describes the challenges of applying fuzzy mechanisms to web services, which are summarized and analyzed in order to assess their benefits and limitations.

  17. A Framework for Automatic Web Service Discovery Based on Semantics and NLP Techniques

    OpenAIRE

    Asma Adala; Nabil Tabbane; Sami Tabbane

    2011-01-01

    As a greater number of Web Services are made available today, automatic discovery is recognized as an important task. To promote the automation of service discovery, different semantic languages have been created that allow describing the functionality of services in a machine interpretable form using Semantic Web technologies. The problem is that users do not have intimate knowledge about semantic Web service languages and related toolkits. In this paper, we propose a discovery framework tha...

  18. Knowledge-based discovery for designing CRISPR-CAS systems against invading mobilomes in thermophiles.

    Science.gov (United States)

    Chellapandi, P; Ranjani, J

    2015-09-01

    Clustered regularly interspaced short palindromic repeats (CRISPRs) are direct features of prokaryotic genomes involved in resistance to bacterial viruses and phages. Herein, we identified CRISPR loci together with CRISPR-associated sequence (CAS) genes to reveal their immunity against genome invaders in thermophilic archaea and bacteria. The genomic survey in this study implied that the genomic distribution of CRISPR-CAS systems varied from strain to strain, as determined by the degree of invading mobilomes. Direct repeats were found to be equal to some extent in many thermophiles, but their spacers differed in each strain. Phylogenetic analyses of the CAS superfamily revealed that the genes cmr, csh, csx11, HD domain, and devR belonged to subtypes of the cas gene family. The members of the cas gene family in thermophiles were functionally diverged within closely related genomes and may contribute to the development of several defense strategies. Nevertheless, genome dynamics, geological variation and host defense mechanisms contributed to sharing their molecular functions across the thermophiles. A thermophilic archaeon, Thermococcus gammatolerans, and the thermophilic bacteria Petrotoga mobilis and Thermotoga lettingae showed a superoperon-like arrangement of clustered cas genes, which typically evolved for their defense pathways. A cmr operon with a specific promoter was identified in a thermophilic archaeon, Caldivirga maquilingensis. Overall, we concluded that a knowledge-based genomic survey and phylogeny-based functional assignment suggest how a reliable genetic regulatory circuit could be designed from naturally occurring CRISPR-CAS systems, as acquired defense pathways, for thermophiles in future synthetic biology.

  19. 1-Mb resolution array-based comparative genomic hybridization using a BAC clone set optimized for cancer gene analysis

    NARCIS (Netherlands)

    Greshock, J; Naylor, TL; Margolin, A; Diskin, S; Cleaver, SH; Futreal, PA; deJong, PJ; Zhao, SY; Liebman, M; Weber, BL

    2004-01-01

    Array-based comparative genomic hybridization (aCGH) is a recently developed tool for genome-wide determination of DNA copy number alterations. This technology has tremendous potential for disease-gene discovery in cancer and developmental disorders as well as numerous other applications. However, w

  20. Computational Materials Science and Chemistry: Accelerating Discovery and Innovation through Simulation-Based Engineering and Science

    Energy Technology Data Exchange (ETDEWEB)

    Crabtree, George [Argonne National Lab. (ANL), Argonne, IL (United States); Glotzer, Sharon [University of Michigan; McCurdy, Bill [University of California Davis; Roberto, Jim [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2010-07-26

    This report is based on a SC Workshop on Computational Materials Science and Chemistry for Innovation on July 26-27, 2010, to assess the potential of state-of-the-art computer simulations to accelerate understanding and discovery in materials science and chemistry, with a focus on potential impacts in energy technologies and innovation. The urgent demand for new energy technologies has greatly exceeded the capabilities of today's materials and chemical processes. To convert sunlight to fuel, efficiently store energy, or enable a new generation of energy production and utilization technologies requires the development of new materials and processes of unprecedented functionality and performance. New materials and processes are critical pacing elements for progress in advanced energy systems and virtually all industrial technologies. Over the past two decades, the United States has developed and deployed the world's most powerful collection of tools for the synthesis, processing, characterization, and simulation and modeling of materials and chemical systems at the nanoscale, dimensions of a few atoms to a few hundred atoms across. These tools, which include world-leading x-ray and neutron sources, nanoscale science facilities, and high-performance computers, provide an unprecedented view of the atomic-scale structure and dynamics of materials and the molecular-scale basis of chemical processes. For the first time in history, we are able to synthesize, characterize, and model materials and chemical behavior at the length scale where this behavior is controlled. This ability is transformational for the discovery process and, as a result, confers a significant competitive advantage. Perhaps the most spectacular increase in capability has been demonstrated in high performance computing. Over the past decade, computational power has increased by a factor of a million due to advances in hardware and software. This rate of improvement, which shows no sign of

  1. Effect of Similarity-Based Guided Discovery Learning on Conceptual Performance

    Science.gov (United States)

    Mandrin, Pierre-A; Preckel, Daniel

    2009-01-01

    Analogies are known to foster concept learning, whereas discovery learning is effective for transfer. By combining discovery learning and analogies or similarities of concepts, attractive new arrangements emerge, but do they maintain both concept and transfer effects? Unfortunately, there is a lack of data confirming such combined effectiveness.…

  2. Chemiluminescent detection of sequential DNA hybridizations to high-density, filter-arrayed cDNA libraries: a subtraction method for novel gene discovery.

    Science.gov (United States)

    Guiliano, D; Ganatra, M; Ware, J; Parrot, J; Daub, J; Moran, L; Brennecke, H; Foster, J M; Supali, T; Blaxter, M; Scott, A L; Williams, S A; Slatko, B E

    1999-07-01

    A chemiluminescent approach for sequential DNA hybridizations to high-density filter arrays of cDNAs, using a biotin-based random priming method followed by a streptavidin/alkaline phosphatase/CDP-Star detection protocol, is presented. The method has been applied to the Brugia malayi genome project, wherein cDNA libraries, cosmid and bacterial artificial chromosome (BAC) libraries have been gridded at high density onto nylon filters for subsequent analysis by hybridization. Individual probes and pools of rRNA probes, ribosomal protein probes and expressed sequence tag probes show correct specificity and high signal-to-noise ratios even after ten rounds of hybridization, detection, stripping of the probes from the membranes and rehybridization with additional probe sets. This approach provides a subtraction method that leads to a reduction in redundant DNA sequencing, thus increasing the rate of novel gene discovery. The method is also applicable for detecting target sequences, which are present in one or only a few copies per cell; it has proven useful for physical mapping of BAC and cosmid high-density filter arrays, wherein multiple probes have been hybridized at one time (multiplexed) and subsequently "deplexed" into individual components for specific probe localizations.

  3. Sensor Network-Based and User-Friendly User Location Discovery for Future Smart Homes.

    Science.gov (United States)

    Ahvar, Ehsan; Lee, Gyu Myoung; Han, Son N; Crespi, Noel; Khan, Imran

    2016-06-27

    User location is crucial context information for future smart homes where many location based services will be proposed. This location necessarily means that User Location Discovery (ULD) will play an important role in future smart homes. Concerns about privacy and the need to carry a mobile or a tag device within a smart home currently make conventional ULD systems uncomfortable for users. Future smart homes will need a ULD system to consider these challenges. This paper addresses the design of such a ULD system for context-aware services in future smart homes stressing the following challenges: (i) users' privacy; (ii) device-/tag-free; and (iii) fault tolerance and accuracy. On the other hand, emerging new technologies, such as the Internet of Things, embedded systems, intelligent devices and machine-to-machine communication, are penetrating into our daily life with more and more sensors available for use in our homes. Considering this opportunity, we propose a ULD system that is capitalizing on the prevalence of sensors for the home while satisfying the aforementioned challenges. The proposed sensor network-based and user-friendly ULD system relies on different types of inexpensive sensors, as well as a context broker with a fuzzy-based decision-maker. The context broker receives context information from different types of sensors and evaluates that data using the fuzzy set theory. We demonstrate the performance of the proposed system by illustrating a use case, utilizing both an analytical model and simulation.

  4. Sensor Network-Based and User-Friendly User Location Discovery for Future Smart Homes

    Directory of Open Access Journals (Sweden)

    Ehsan Ahvar

    2016-06-01

    User location is crucial context information for future smart homes where many location based services will be proposed. This location necessarily means that User Location Discovery (ULD) will play an important role in future smart homes. Concerns about privacy and the need to carry a mobile or a tag device within a smart home currently make conventional ULD systems uncomfortable for users. Future smart homes will need a ULD system to consider these challenges. This paper addresses the design of such a ULD system for context-aware services in future smart homes stressing the following challenges: (i) users’ privacy; (ii) device-/tag-free; and (iii) fault tolerance and accuracy. On the other hand, emerging new technologies, such as the Internet of Things, embedded systems, intelligent devices and machine-to-machine communication, are penetrating into our daily life with more and more sensors available for use in our homes. Considering this opportunity, we propose a ULD system that is capitalizing on the prevalence of sensors for the home while satisfying the aforementioned challenges. The proposed sensor network-based and user-friendly ULD system relies on different types of inexpensive sensors, as well as a context broker with a fuzzy-based decision-maker. The context broker receives context information from different types of sensors and evaluates that data using the fuzzy set theory. We demonstrate the performance of the proposed system by illustrating a use case, utilizing both an analytical model and simulation.

  5. Assessment of cardiovascular risk based on a data-driven knowledge discovery approach.

    Science.gov (United States)

    Mendes, D; Paredes, S; Rocha, T; Carvalho, P; Henriques, J; Cabiddu, R; Morais, J

    2015-01-01

    The cardioRisk project addresses the development of personalized risk assessment tools for patients who have been admitted to the hospital with acute myocardial infarction. Although there are models available that assess the short-term risk of death/new events for such patients, these models were established in circumstances that do not take into account the present clinical interventions and, in some cases, the risk factors used by such models are not easily available in clinical practice. The integration of the existing risk tools (applied in the clinician's daily practice) with data-driven knowledge discovery mechanisms based on data routinely collected during hospitalizations, will be a breakthrough in overcoming some of these difficulties. In this context, the development of simple and interpretable models (based on recent datasets), unquestionably will facilitate and will introduce confidence in this integration process. In this work, a simple and interpretable model based on a real dataset is proposed. It consists of a decision tree model structure that uses a reduced set of six binary risk factors. The validation is performed using a recent dataset provided by the Portuguese Society of Cardiology (11113 patients), which originally comprised 77 risk factors. A sensitivity, specificity and accuracy of, respectively, 80.42%, 77.25% and 78.80% were achieved showing the effectiveness of the approach.
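
    The three reported figures are standard confusion-matrix ratios. The sketch below shows how they are computed; the counts are hypothetical and are not the study's data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: patients with an event who were / were not flagged as high risk.
    tn/fp: patients without an event who were / were not flagged as low risk.
    """
    sensitivity = tp / (tp + fn)          # share of true events that were caught
    specificity = tn / (tn + fp)          # share of non-events correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
sens, spec, acc = classification_metrics(tp=80, fn=20, tn=150, fp=50)
```

    Reporting sensitivity and specificity alongside accuracy matters when events are rare, because accuracy alone can look high for a model that misses most events.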

  6. Sensor Network-Based and User-Friendly User Location Discovery for Future Smart Homes

    Science.gov (United States)

    Ahvar, Ehsan; Lee, Gyu Myoung; Han, Son N.; Crespi, Noel; Khan, Imran

    2016-01-01

    User location is crucial context information for future smart homes where many location based services will be proposed. This location necessarily means that User Location Discovery (ULD) will play an important role in future smart homes. Concerns about privacy and the need to carry a mobile or a tag device within a smart home currently make conventional ULD systems uncomfortable for users. Future smart homes will need a ULD system to consider these challenges. This paper addresses the design of such a ULD system for context-aware services in future smart homes stressing the following challenges: (i) users’ privacy; (ii) device-/tag-free; and (iii) fault tolerance and accuracy. On the other hand, emerging new technologies, such as the Internet of Things, embedded systems, intelligent devices and machine-to-machine communication, are penetrating into our daily life with more and more sensors available for use in our homes. Considering this opportunity, we propose a ULD system that is capitalizing on the prevalence of sensors for the home while satisfying the aforementioned challenges. The proposed sensor network-based and user-friendly ULD system relies on different types of inexpensive sensors, as well as a context broker with a fuzzy-based decision-maker. The context broker receives context information from different types of sensors and evaluates that data using the fuzzy set theory. We demonstrate the performance of the proposed system by illustrating a use case, utilizing both an analytical model and simulation. PMID:27355951

  7. Mobile STEMship Discovery Center: K-12 Aerospace-Based Science, Technology, Engineering, and Mathematics (STEM) Mobile Teaching Vehicle

    Science.gov (United States)

    2015-08-03

  8. The Goal Specificity Effect on Strategy Use and Instructional Efficiency during Computer-Based Scientific Discovery Learning

    Science.gov (United States)

    Kunsting, Josef; Wirth, Joachim; Paas, Fred

    2011-01-01

    Using a computer-based scientific discovery learning environment on buoyancy in fluids we investigated the "effects of goal specificity" (nonspecific goals vs. specific goals) for two goal types (problem solving goals vs. learning goals) on "strategy use" and "instructional efficiency". Our empirical findings close an important research gap,…

  9. Prediction of operon-like gene clusters in the Arabidopsis thaliana genome based on co-expression analysis of neighboring genes.

    Science.gov (United States)

    Wada, Masayoshi; Takahashi, Hiroki; Altaf-Ul-Amin, Md; Nakamura, Kensuke; Hirai, Masami Y; Ohta, Daisaku; Kanaya, Shigehiko

    2012-07-15

    Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or neighboring positions within the immediate vicinity of chromosomal regions. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step in revealing the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should contribute to functional annotation efforts and also provide novel insight into the evolutionary aspects of acquiring certain biological functions. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficients of neighboring genes and of randomly selected gene pairs, based on a statistical method that takes the false discovery rate (FDR) into consideration, for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of E… Operon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary metabolic pathways, lipid and fatty-acid metabolism, and the lipid transfer system.
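
    The comparison of neighboring-gene correlations against random pairs under an FDR threshold can be sketched as below. The abstract names only the Pearson coefficient and the FDR; the empirical p-value and the Benjamini-Hochberg step-up procedure used here are assumptions, not necessarily the authors' statistical method.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def empirical_p(observed_r, null_rs):
    """One-sided p-value of a neighboring pair's correlation against
    correlations of randomly selected gene pairs (assumed null model)."""
    hits = sum(1 for r in null_rs if r >= observed_r)
    return (hits + 1) / (len(null_rs) + 1)


def benjamini_hochberg(pvalues, fdr=0.1):
    """Indices of hypotheses rejected by the Benjamini-Hochberg procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank / m * fdr:
            cutoff_rank = rank  # keep the largest rank that passes
    return sorted(order[:cutoff_rank])


# Toy p-values for four neighboring gene pairs (hypothetical).
significant = benjamini_hochberg([0.001, 0.04, 0.03, 0.5], fdr=0.1)
```

    With real data, `pearson` would be evaluated on the expression profiles of each adjacent gene pair across all arrays, and runs of consecutive significant pairs would then be merged into candidate clusters.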

  10. Computational Materials Science and Chemistry: Accelerating Discovery and Innovation through Simulation-Based Engineering and Science

    Energy Technology Data Exchange (ETDEWEB)

    Crabtree, George [Argonne National Lab. (ANL), Argonne, IL (United States); Glotzer, Sharon [University of Michigan; McCurdy, Bill [University of California Davis; Roberto, Jim [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2010-07-26

    This report is based on a SC Workshop on Computational Materials Science and Chemistry for Innovation on July 26-27, 2010, to assess the potential of state-of-the-art computer simulations to accelerate understanding and discovery in materials science and chemistry, with a focus on potential impacts in energy technologies and innovation. The urgent demand for new energy technologies has greatly exceeded the capabilities of today's materials and chemical processes. To convert sunlight to fuel, efficiently store energy, or enable a new generation of energy production and utilization technologies requires the development of new materials and processes of unprecedented functionality and performance. New materials and processes are critical pacing elements for progress in advanced energy systems and virtually all industrial technologies. Over the past two decades, the United States has developed and deployed the world's most powerful collection of tools for the synthesis, processing, characterization, and simulation and modeling of materials and chemical systems at the nanoscale, dimensions of a few atoms to a few hundred atoms across. These tools, which include world-leading x-ray and neutron sources, nanoscale science facilities, and high-performance computers, provide an unprecedented view of the atomic-scale structure and dynamics of materials and the molecular-scale basis of chemical processes. For the first time in history, we are able to synthesize, characterize, and model materials and chemical behavior at the length scale where this behavior is controlled. This ability is transformational for the discovery process and, as a result, confers a significant competitive advantage. Perhaps the most spectacular increase in capability has been demonstrated in high performance computing. Over the past decade, computational power has increased by a factor of a million due to advances in hardware and software. This rate of improvement, which shows no sign of

  11. Transcriptome profiling of the testis reveals genes involved in spermatogenesis and marker discovery in the oriental fruit fly, Bactrocera dorsalis.

    Science.gov (United States)

    Wei, D; Li, H-M; Yang, W-J; Wei, D-D; Dou, W; Huang, Y; Wang, J-J

    2015-02-01

    The testis is a highly specialized tissue that plays a vital role in ensuring fertility by producing spermatozoa, which are transferred to the female during mating. Spermatogenesis is a complex process, resulting in the production of mature sperm, and involves significant structural and biochemical changes in the seminiferous epithelium of the adult testis. The identification of genes involved in spermatogenesis of Bactrocera dorsalis (Hendel) is critical for a better understanding of its reproductive development. In this study, we constructed a cDNA library of testes from male B. dorsalis adults at different ages, and performed de novo transcriptome sequencing to produce a comprehensive transcript data set, using Illumina sequencing technology. The analysis yielded 52,016,732 clean reads, including a total of 4.65 Gb of nucleotides. These reads were assembled into 47,677 contigs (average 443 bp) and then clustered into 30,516 unigenes (average 756 bp). Based on BLAST hits with known proteins in different databases, 20,921 unigenes were annotated with a cut-off E-value of 10^-5. The transcriptome sequences were further annotated using the Clusters of Orthologous Groups, Gene Orthology and the Kyoto Encyclopedia of Genes and Genomes databases. Functional genes involved in spermatogenesis were analysed, including cell cycle proteins, metalloproteins, actin, and ubiquitin and antihyperthermia proteins. Several testis-specific genes were also identified. The transcripts database will help us to understand the molecular mechanisms underlying spermatogenesis in B. dorsalis. Furthermore, 2,913 simple sequence repeats and 151,431 single nucleotide polymorphisms were identified, which will be useful for investigating the genetic diversity of B. dorsalis in the future.

  12. De novo transcriptomic analysis of an oleaginous microalga: pathway description and gene discovery for production of next-generation biofuels.

    Directory of Open Access Journals (Sweden)

    LingLin Wan

    BACKGROUND: Eustigmatos cf. polyphem is a yellow-green unicellular soil microalga belonging to the eustigmatophytes, with high biomass and considerable production of triacylglycerols (TAGs) for biofuels, and is thus referred to as an oleaginous microalga. The paucity of microalgae genome sequences, however, limits development of gene-based biofuel feedstock optimization studies. Here we describe the sequencing and de novo transcriptome assembly for a non-model microalgae species, E. cf. polyphem, and identify pathways and genes of importance related to biofuel production. RESULTS: We performed the de novo assembly of the E. cf. polyphem transcriptome using Illumina paired-end sequencing technology. In a single run, we produced 29,199,432 sequencing reads corresponding to 2.33 Gb total nucleotides. These reads were assembled into 75,632 unigenes with a mean size of 503 bp and an N50 of 663 bp, ranging from 100 bp to >3,000 bp. Assembled unigenes were subjected to BLAST similarity searches and annotated with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology identifiers. These analyses identified the majority of carbohydrate, fatty acid, TAG and carotenoid biosynthesis and catabolism pathways in E. cf. polyphem. CONCLUSIONS: Our data provide a reconstruction of the metabolic pathways involved in the biosynthesis and catabolism of carbohydrates, fatty acids, TAGs and carotenoids in E. cf. polyphem, and a foundation for the molecular genetics and functional genomics required to direct metabolic engineering efforts that seek to enhance the quantity and character of microalgae-based biofuel feedstock.
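
    The assembly statistics quoted above (mean size and N50) follow the usual definitions, which a short sketch makes concrete. The five lengths are hypothetical and unrelated to the reported assembly.

```python
def n50(lengths):
    """Length L such that sequences of length >= L contain at least
    half of all assembled bases."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


def mean_length(lengths):
    """Arithmetic mean of the sequence lengths."""
    return sum(lengths) / len(lengths)


# Hypothetical unigene lengths in bp.
unigenes = [100, 200, 300, 400, 500]
assembly_n50 = n50(unigenes)       # 500 + 400 = 900 >= 750, so N50 is 400
assembly_mean = mean_length(unigenes)
```

    N50 is weighted by sequence length, so it exceeds the mean whenever a few long sequences carry much of the total, as in the reported 663 bp N50 against a 503 bp mean.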

  13. Enriching regulatory networks by bootstrap learning using optimised GO-based gene similarity and gene links mined from PubMed abstracts

    Energy Technology Data Exchange (ETDEWEB)

    Taylor, Ronald C.; Sanfilippo, Antonio P.; McDermott, Jason E.; Baddeley, Robert L.; Riensche, Roderick M.; Jensen, Russell S.; Verhagen, Marc; Pustejovsky, James

    2011-02-18

    Transcriptional regulatory networks are being determined using “reverse engineering” methods that infer connections based on correlations in gene state. Corroboration of such networks through independent means such as evidence from the biomedical literature is desirable. Here, we explore a novel approach, a bootstrapping version of our previous Cross-Ontological Analytic method (XOA) that can be used for semi-automated annotation and verification of inferred regulatory connections, as well as for discovery of additional functional relationships between the genes. First, we use our annotation and network expansion method on a biological network learned entirely from the literature. We show how new relevant links between genes can be iteratively derived using a gene similarity measure based on the Gene Ontology that is optimized on the input network at each iteration. Second, we apply our method to annotation, verification, and expansion of a set of regulatory connections found by the Context Likelihood of Relatedness algorithm.

  14. Analysis of an inactive cyanobactin biosynthetic gene cluster leads to discovery of new natural products from strains of the genus Microcystis.

    Directory of Open Access Journals (Sweden)

    Niina Leikoski

    Cyanobactins are cyclic peptides assembled through the cleavage and modification of short precursor proteins. An inactive cyanobactin gene cluster has been described from the genome of Microcystis aeruginosa NIES843. Here we report the discovery of active counterparts in strains of the genus Microcystis guided by this silent cyanobactin gene cluster. The end products of the gene clusters were structurally diverse cyclic peptides, which we named piricyclamides. Some of the piricyclamides consisted solely of proteinogenic amino acids while others contained disulfide bridges and some were prenylated or geranylated. The piricyclamide gene clusters encoded between 1 and 4 precursor genes. They encoded highly diverse core peptides ranging in length from 7 to 17 amino acids with just a single conserved amino acid. Heterologous expression of the pir gene cluster from Microcystis aeruginosa PCC7005 in Escherichia coli confirmed that this gene cluster is responsible for the biosynthesis of piricyclamides. Chemical analysis demonstrated that Microcystis strains could produce an array of piricyclamides, some of which are geranylated or prenylated. The genetic diversity of piricyclamides in a bloom sample was explored and 19 different piricyclamide precursor genes were found. This study provides evidence for a stunning array of piricyclamides in Microcystis, a bloom-forming cyanobacterium that occurs worldwide.

  15. Interestingness measures and strategies for mining multi-ontology multi-level association rules from gene ontology annotations for the discovery of new GO relationships.

    Science.gov (United States)

    Manda, Prashanti; McCarthy, Fiona; Bridges, Susan M

    2013-10-01

    The Gene Ontology (GO), a set of three sub-ontologies, is one of the most popular bio-ontologies used for describing gene product characteristics. GO annotation data containing terms from multiple sub-ontologies and at different levels in the ontologies is an important source of implicit relationships between terms from the three sub-ontologies. Data mining techniques such as association rule mining that are tailored to mine from multiple ontologies at multiple levels of abstraction are required for effective knowledge discovery from GO annotation data. We present a data mining approach, Multi-ontology data mining at All Levels (MOAL) that uses the structure and relationships of the GO to mine multi-ontology multi-level association rules. We introduce two interestingness measures: Multi-ontology Support (MOSupport) and Multi-ontology Confidence (MOConfidence) customized to evaluate multi-ontology multi-level association rules. We also describe a variety of post-processing strategies for pruning uninteresting rules. We use publicly available GO annotation data to demonstrate our methods with respect to two applications (1) the discovery of co-annotation suggestions and (2) the discovery of new cross-ontology relationships.
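
    For orientation, the classical support and confidence measures that association rule mining builds on can be sketched as below. This is not the MOSupport or MOConfidence definition from the paper, which additionally uses the ontology structure; the term labels and annotation sets are hypothetical placeholders rather than real GO identifiers.

```python
def support(itemset, transactions):
    """Fraction of transactions (annotation sets) containing every item."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)


# Each set holds the terms annotated to one gene product (hypothetical labels;
# the prefix marks the sub-ontology the term would come from).
annotations = [
    {"BP:transport", "MF:binding", "CC:membrane"},
    {"BP:transport", "MF:binding"},
    {"BP:transport", "CC:membrane"},
    {"MF:binding", "CC:nucleus"},
]

# Cross-ontology rule: BP:transport -> MF:binding
rule_support = support({"BP:transport", "MF:binding"}, annotations)
rule_confidence = confidence({"BP:transport"}, {"MF:binding"}, annotations)
```

    A rule whose antecedent and consequent come from different sub-ontologies, as here, is the kind of candidate cross-ontology relationship the abstract refers to.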

  16. A gene-based information gain method for detecting gene-gene interactions in case-control studies.

    Science.gov (United States)

    Li, Jin; Huang, Dongli; Guo, Maozu; Liu, Xiaoyan; Wang, Chunyu; Teng, Zhixia; Zhang, Ruijie; Jiang, Yongshuai; Lv, Hongchao; Wang, Limei

    2015-11-01

    Currently, most methods for detecting gene-gene interactions (GGIs) in genome-wide association studies are divided into SNP-based methods and gene-based methods. Generally, the gene-based methods can be more powerful than SNP-based methods. Some gene-based entropy methods can only capture the linear relationship between genes. We therefore proposed a nonparametric gene-based information gain method (GBIGM) that can capture both linear relationship and nonlinear correlation between genes. Through simulation with different odds ratio, sample size and prevalence rate, GBIGM was shown to be valid and more powerful than classic KCCU method and SNP-based entropy method. In the analysis of data from 17 genes on rheumatoid arthritis, GBIGM was more effective than the other two methods as it obtains fewer significant results, which was important for biological verification. Therefore, GBIGM is a suitable and powerful tool for detecting GGIs in case-control studies.
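
    Why an information-gain measure can capture nonlinear gene-gene relationships is easiest to see on a toy example. The sketch below uses the textbook definition IG = H(Y) - H(Y|X); it is not the GBIGM statistic itself, and the genotype codes and case/control labels are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (bits) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(features, labels):
    """H(labels) - H(labels | features) for discrete features."""
    n = len(labels)
    groups = {}
    for f, y in zip(features, labels):
        groups.setdefault(f, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data: disease status depends on the two genes only jointly (XOR pattern).
gene_a = [0, 0, 1, 1]
gene_b = [0, 1, 0, 1]
status = ["control", "case", "case", "control"]

ig_a = information_gain(gene_a, status)                       # no marginal signal
ig_b = information_gain(gene_b, status)                       # no marginal signal
ig_joint = information_gain(list(zip(gene_a, gene_b)), status)
interaction = ig_joint - ig_a - ig_b                          # gain from the pair only
```

    Neither gene is informative alone, and a linear correlation with disease status would be zero for both, yet the pair together determines the outcome completely.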

  17. Structure based discovery of small molecules to regulate the activity of human insulin degrading enzyme.

    Directory of Open Access Journals (Sweden)

    Bilal Çakir

    BACKGROUND: Insulin-degrading enzyme (IDE) is an allosteric Zn2+ metalloprotease involved in the degradation of many peptides, including amyloid-β and insulin, which play key roles in Alzheimer's disease (AD) and type 2 diabetes mellitus (T2DM), respectively. Therefore, the use of therapeutic agents that regulate the activity of IDE would be a viable approach towards generating pharmaceutical treatments for these diseases. The crystal structure of IDE revealed that the N-terminal region has an exosite which is ∼30 Å away from the catalytic region and serves as a regulation site by orienting the substrates of IDE towards the catalytic site. It is possible to find small molecules that bind to the exosite of IDE and enhance its proteolytic activity towards different substrates. METHODOLOGY/PRINCIPAL FINDINGS: In this study, we applied a structure based drug design method combined with experimental methods to discover four novel molecules that enhance the activity of human IDE. The novel compounds, designated as D3, D4, D6, and D10 enhanced IDE mediated proteolysis of substrate V, insulin and amyloid-β, while enhanced degradation profiles were obtained towards substrate V and insulin in the presence of D10 only. CONCLUSION/SIGNIFICANCE: This paper describes the first examples of a computer-aided discovery of IDE regulators, showing that in vitro and in vivo activation of this important enzyme with small molecules is possible.

  18. "Tools for Astrometry": A Windows-based Research Tool for Asteroid Discovery and Measurement

    Science.gov (United States)

    Snyder, G. A.; Marschall, L. A.; Good, R. F.; Hayden, M. B.; Cooper, P. R.

    1998-12-01

    We have developed a Windows-based interactive digital astrometry package with a simple, ergonomic interface, designed for the discovery, measurement, and recording of asteroid positions by individual observers. The software, "Tools For Astrometry", handles FITS and SBIG format images up to 2048 x 2048 (or larger, depending on RAM), and provides features for blinking images or subframes of images, and for measuring positions and magnitudes against both the HST Guide Star Catalog and the USNO SA-1 catalog. In addition, the program can calculate ephemerides from element tables, including the Lowell Asteroid Database available online; can generate charts of star fields showing the motion of asteroids from the ephemeris superimposed against the background star field; can project motions of measured asteroids ahead several days using linear interpolation for purposes of reacquisition; and can calculate projected baselines for asteroid parallax measurements. Images, charts, and tables of ephemerides can be printed as well as displayed, and reports can be generated in the standard format of the IAU Minor Planet Center. The software is designed ergonomically, and one can go from raw images to completed astrometric report in a matter of minutes. The software is an extension of software developed for introductory astronomy laboratories by Project CLEA, which is supported by grants from Gettysburg College and the National Science Foundation.
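
    Projecting a measured asteroid ahead by a few days from two positions amounts to a constant-rate calculation. The sketch below is an illustration only and does not reproduce the package's code: the function name and the coordinates are hypothetical, positions are in decimal degrees, and it ignores the cos(declination) factor and any right-ascension wraparound between the two measurements.

```python
def project_position(t1, pos1, t2, pos2, t_future):
    """Project (RA, Dec) in degrees to t_future, assuming the constant
    rates of motion implied by two measurements at times t1 < t2 (days)."""
    ra1, dec1 = pos1
    ra2, dec2 = pos2
    dt = t2 - t1
    ra_rate = (ra2 - ra1) / dt      # degrees per day
    dec_rate = (dec2 - dec1) / dt   # degrees per day
    ahead = t_future - t2
    ra = (ra2 + ra_rate * ahead) % 360.0   # keep RA within [0, 360)
    dec = dec2 + dec_rate * ahead
    return ra, dec


# Hypothetical measurements one day apart, projected two further days ahead.
ra3, dec3 = project_position(0.0, (10.0, 5.0), 1.0, (10.2, 5.1), 3.0)
```

    Such a projection is only a search aid for reacquisition; over longer spans the curvature of the apparent path makes an orbit-based ephemeris necessary.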

  19. Enabling Metabolomics Based Biomarker Discovery Studies Using Molecular Phenotyping of Exosome-Like Vesicles.

    Directory of Open Access Journals (Sweden)

    Tatiana Altadill

    Full Text Available Identification of sensitive and specific biomarkers with clinical and translational utility will require smart experimental strategies that would augment expanding the breadth and depth of molecular measurements within the constraints of currently available technologies. Exosomes represent an information-rich matrix to discern novel disease mechanisms that are thought to contribute to pathologies such as dementia and cancer. Although proteomics and transcriptomic studies have been reported using Exosome-Like Vesicles (ELVs) from different sources, exosomal metabolome characterization and its modulation in health and disease remains to be elucidated. Here we describe methodologies for UPLC-ESI-MS based small molecule profiling of ELVs from human plasma and cell culture media. In this study, we present evidence that indeed ELVs carry a rich metabolome that could not only augment the discovery of low abundance biomarkers but may also help explain the molecular basis of disease progression. This approach could be easily translated to other studies seeking to develop predictive biomarkers that can subsequently be used with simplified targeted approaches.

  20. Streamlining Metadata Ingest and Discovery Using ECHO's REST-based API

    Science.gov (United States)

    Ericson, R.; Baynes, K.; Pilone, D.

    2012-12-01

    Enabling user access to Earth science data is a primary goal of NASA's Earth Observing System Data and Information Systems (EOSDIS) programs. NASA's Earth Observing System ClearingHOuse (ECHO) acts as the core metadata repository for EOSDIS's data centers, providing a centralized mechanism for metadata and data discovery and retrieval. ECHO has recently made strides to restructure its API, allowing data partners to streamline and synchronize their metadata ingest using RESTful web services. ECHO's legacy ingest process involves data uploads via FTP with asynchronous result reporting. Data centers provide single XML files or compressed data (zip) files that are unpacked, indexed and stored in ECHO data tables for future search and retrieval. Any problems related to metadata validation and ingest are reported after batch processing of discrete jobs has been completed. With ECHO's new REST-based web services, data providers will receive immediate feedback about the status of their ingested data and can ensure that their data exports are successful as soon as the data is posted to our repository. This presentation will introduce ECHO's potential new and existing data partners to the process of implementing data ingest via its RESTful web services API, providing real-world examples of end-to-end metadata management. Examples of ECHO's support of multi-format metadata ingest using both ECHO10 and ISO 19115 metadata formats will be showcased. This presentation will also pay special attention to tuning a provider's metadata, making it more easily searched and accessed via ECHO's various interfaces.

  1. Gun possession among American youth: a discovery-based approach to understand gun violence.

    Directory of Open Access Journals (Sweden)

    Kelly V Ruggles

    Full Text Available OBJECTIVE: To apply discovery-based computational methods to nationally representative data from the Centers for Disease Control and Prevention's Youth Risk Behavior Surveillance System to better understand and visualize the behavioral factors associated with gun possession among adolescent youth. RESULTS: Our study uncovered the multidimensional nature of gun possession across nearly five million unique data points over a ten-year period (2001-2011). Specifically, we automated odds ratio calculations for 55 risk behaviors to assemble a comprehensive table of associations for every behavior combination. Downstream analyses included the hierarchical clustering of risk behaviors based on their association "fingerprint" to (1) visualize and assess which behaviors frequently co-occur and (2) evaluate which risk behaviors are consistently found to be associated with gun possession. From these analyses, we identified more than 40 behavioral factors, including heroin use, using snuff on school property, having been injured in a fight, and having been a victim of sexual violence, that have been and continue to be strongly associated with gun possession. Additionally, we identified six behavioral clusters based on association similarities: (1) physical activity and nutrition; (2) disordered eating, suicide and sexual violence; (3) weapon carrying and physical safety; (4) alcohol, marijuana and cigarette use; (5) drug use on school property; and (6) overall drug use. CONCLUSIONS: Use of computational methodologies identified multiple risk behaviors, beyond more commonly discussed indicators of poor mental health, that are associated with gun possession among youth. Implications for prevention efforts and future interdisciplinary work applying computational methods to behavioral science data are described.
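
    The automated odds ratio table described in this abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' pipeline: the record layout (one dict of behavior name to True/False per respondent), the function names, and the 0.5 continuity correction for empty cells are assumptions, and survey weighting is ignored.

```python
from itertools import combinations


def odds_ratio(a, b, c, d, correction=0.5):
    """Odds ratio of a 2x2 table.

    a: respondents reporting both behaviors
    b: respondents reporting only the first behavior
    c: respondents reporting only the second behavior
    d: respondents reporting neither behavior
    If any cell is zero, `correction` is added to every cell
    (an assumed choice; the source does not say how zeros were handled).
    """
    if 0 in (a, b, c, d):
        a, b, c, d = (x + correction for x in (a, b, c, d))
    return (a * d) / (b * c)


def pairwise_odds_ratios(records):
    """Build a table of odds ratios for every pair of behaviors.

    records: non-empty list of dicts mapping behavior name -> bool.
    Returns a dict keyed by (behavior_x, behavior_y) in sorted name order.
    """
    behaviors = sorted(records[0])
    table = {}
    for x, y in combinations(behaviors, 2):
        a = sum(1 for r in records if r[x] and r[y])
        b = sum(1 for r in records if r[x] and not r[y])
        c = sum(1 for r in records if not r[x] and r[y])
        d = sum(1 for r in records if not r[x] and not r[y])
        table[(x, y)] = odds_ratio(a, b, c, d)
    return table
```

    Each row of such a table (one behavior against all others) would form the association "fingerprint" that a clustering step could then compare.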

  2. Immunophenotype Discovery, Hierarchical Organization, and Template-based Classification of Flow Cytometry Samples

    Directory of Open Access Journals (Sweden)

    Ariful Azad

    2016-08-01

    Full Text Available We describe algorithms for discovering immunophenotypes from large collections of flow cytometry (FC) samples, and using them to organize the samples into a hierarchy based on phenotypic similarity. The hierarchical organization is helpful for effective and robust cytometry data mining, including the creation of collections of cell populations characteristic of different classes of samples, robust classification, and anomaly detection. We summarize a set of samples belonging to a biological class or category with a statistically derived template for the class. Whereas individual samples are represented in terms of their cell populations (clusters), a template consists of generic meta-populations (a group of homogeneous cell populations obtained from the samples in a class) that describe key phenotypes shared among all those samples. We organize an FC data collection in a hierarchical data structure that supports the identification of immunophenotypes relevant to clinical diagnosis. A robust template-based classification scheme is also developed, but our primary focus is on the discovery of phenotypic signatures and inter-sample relationships in an FC data collection. This collective analysis approach is more efficient and robust since templates describe phenotypic signatures common to cell populations in several samples, while ignoring noise and small sample-specific variations. We have applied the template-based scheme to analyze several data sets, including one representing a healthy immune system, and one of Acute Myeloid Leukemia (AML) samples. The last task is challenging due to the phenotypic heterogeneity of the several subtypes of AML. However, we identified thirteen immunophenotypes corresponding to subtypes of AML, and were able to distinguish Acute Promyelocytic Leukemia from other subtypes of AML.

  3. Accelerating Gene Discovery by Phenotyping Whole-Genome Sequenced Multi-mutation Strains and Using the Sequence Kernel Association Test (SKAT).

    Directory of Open Access Journals (Sweden)

    Tiffany A Timbers

    2016-08-01

    Full Text Available Forward genetic screens represent powerful, unbiased approaches to uncover novel components in any biological process. Such screens suffer from a major bottleneck, however, namely the cloning of corresponding genes causing the phenotypic variation. Reverse genetic screens have been employed as a way to circumvent this issue, but can often be limited in scope. Here we demonstrate an innovative approach to gene discovery. Using C. elegans as a model system, we used a whole-genome sequenced multi-mutation library, from the Million Mutation Project, together with the Sequence Kernel Association Test (SKAT), to rapidly screen for and identify genes associated with a phenotype of interest, namely defects in dye-filling of ciliated sensory neurons. Such anomalies in dye-filling are often associated with the disruption of cilia, organelles which in humans are implicated in sensory physiology (including vision, smell and hearing), development and disease. Beyond identifying several well characterised dye-filling genes, our approach uncovered three genes not previously linked to ciliated sensory neuron development or function. From these putative novel dye-filling genes, we confirmed the involvement of BGNT-1.1 in ciliated sensory neuron function and morphogenesis. BGNT-1.1 functions at the trans-Golgi network of sheath cells (glia) to influence dye-filling and cilium length, in a cell non-autonomous manner. Notably, BGNT-1.1 is the orthologue of human B3GNT1/B4GAT1, a glycosyltransferase associated with Walker-Warburg syndrome (WWS). WWS is a multigenic disorder characterised by muscular dystrophy as well as brain and eye anomalies. Together, our work unveils an effective and innovative approach to gene discovery, and provides the first evidence that B3GNT1-associated Walker-Warburg syndrome may be considered a ciliopathy.

  4. Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

    Directory of Open Access Journals (Sweden)

    Luo Ming-Cheng

    2011-01-01

    Full Text Available Abstract Background Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS) of total genomic DNA by making alignment and clustering of short reads generated by the NGS platforms difficult, particularly in the absence of a reference genome sequence. Results An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions from repetitive sequences and sequences shared by paralogous genes. Multiple genome equivalents of shotgun reads of another genotype generated with SOLiD or Solexa are then mapped to the annotated Roche 454 reads to identify putative SNPs. A pipeline program package, AGSNP, was developed and used for genome-wide SNP discovery in Aegilops tauschii, the diploid source of the wheat D genome, which has a genome size of 4.02 Gb, of which 90% is repetitive sequences. Genomic DNA of Ae. tauschii accession AL8/78 was sequenced with the Roche 454 NGS platform. Genomic DNA and cDNA of Ae. tauschii accession AS75 were sequenced primarily with SOLiD, although some Solexa and Roche 454 genomic sequences were also generated. A total of 195,631 putative SNPs were discovered in gene sequences, 155,580 putative SNPs were discovered in uncharacterized single-copy regions, and another 145,907 putative SNPs were discovered in repeat junctions. These SNPs were dispersed across the entire Ae. tauschii genome. To assess the false positive SNP discovery rate, DNA

  5. Guided Discoveries.

    Science.gov (United States)

    Ehrlich, Amos

    1991-01-01

    Presented are four mathematical discoveries made by students on an arithmetical function using the Fibonacci sequence. Discussed is the nature of the role of the teacher in directing the students' discovery activities. (KR)

  6. Xenogenomics: Genomic Bioprospecting in Indigenous and Exotic Plants Through EST Discovery, cDNA Microarray-Based Expression Profiling and Functional Genomics

    Directory of Open Access Journals (Sweden)

    German C. Spangenberg

    2006-04-01

    Full Text Available To date, the overwhelming majority of genomics programs in plants have been directed at model or crop plant species, meaning that very little of the naturally occurring sequence diversity found in plants is available for characterization and exploitation. In contrast, ‘xenogenomics’ refers to the discovery and functional analysis of novel genes and alleles from indigenous and exotic species, permitting bioprospecting of biodiversity using high-throughput genomics experimental approaches. Such a program has been initiated to bioprospect for genetic determinants of abiotic stress tolerance in indigenous Australian flora and native Antarctic plants. Uniquely adapted Poaceae and Fabaceae species with enhanced tolerance to salt, drought, elevated soil aluminium concentration, and freezing stress have been identified, based primarily on their eco-physiology, and have been subjected to structural and functional genomics analyses. For each species, EST collections have been derived from plants subjected to appropriate abiotic stresses. Transcript profiling with spotted unigene cDNA micro-arrays has been used to identify genes that are transcriptionally modulated in response to abiotic stress. Candidate genes identified on the basis of sequence annotation or transcript profiling have been assayed in planta and other in vivo systems for their capacity to confer novel phenotypes. Comparative genomics analysis of novel genes and alleles identified in the xenogenomics target plant species has subsequently been undertaken with reference to key model and crop plants.

  7. MobilomeFINDER: Web-Based Tools for In Silico and Experimental Discovery of Bacterial Genomic Islands

    OpenAIRE

    Ou, Hong-Yu; He, Xinyi; Harrison, Ewan M.; Kulasekara, Bridget R.; Thani, Ali Bin; Kadioglu, Aras; Hinton, Jay C. D.; Barer, Michael R.; Deng, Zixin; Rajakumar, Kumar; Lory, Stephen

    2007-01-01

    MobilomeFINDER (http://mml.sjtu.edu.cn/MobilomeFINDER) is an interactive online tool that facilitates bacterial genomic island or ‘mobile genome’ (mobilome) discovery; it integrates the ArrayOme and tRNAcc software packages. ArrayOme utilizes a microarray-derived comparative genomic hybridization input data set to generate ‘inferred contigs’ produced by merging adjacent genes classified as ‘present’. Collectively these ‘fragments’ represent a hypothetical ‘microarray-visualized genome (MVG)’....

  8. Application of multiple statistical tests to enhance mass spectrometry-based biomarker discovery

    Directory of Open Access Journals (Sweden)

    Garner Harold R

    2009-05-01

    Full Text Available Abstract Background Mass spectrometry-based biomarker discovery has long been hampered by the difficulty in reconciling lists of discriminatory peaks identified by different laboratories for the same diseases studied. We describe a multi-statistical analysis procedure that combines several independent computational methods. This approach capitalizes on the strengths of each to analyze the same high-resolution mass spectral data set to discover consensus differential mass peaks that should be robust biomarkers for distinguishing between disease states. Results The proposed methodology was applied to a pilot narcolepsy study using logistic regression, hierarchical clustering, t-test, and CART. Consensus, differential mass peaks with high predictive power were identified across three of the four statistical platforms. Based on the diagnostic accuracy measures investigated, the performance of the consensus-peak model was a compromise between logistic regression and CART, which produced better models than hierarchical clustering and t-test. However, consensus peaks confer a higher level of confidence in their ability to distinguish between disease states since they do not represent peaks that are a result of biases to a particular statistical algorithm. Instead, they were selected as differential across differing data distribution assumptions, demonstrating their true discriminatory potential. Conclusion The methodology described here is applicable to any high-resolution MALDI mass spectrometry-derived data set with minimal mass drift, which is essential for peak-to-peak comparison studies. Four statistical approaches with differing data distribution assumptions were applied to the same raw data set to obtain consensus peaks that were found to be statistically differential between the two groups compared. These consensus peaks demonstrated high diagnostic accuracy when used to form a predictive model as evaluated by receiver operating characteristics
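
    The idea of keeping only peaks flagged by several independent statistical methods can be sketched as a simple vote count. This is an illustrative sketch under stated assumptions, not the published procedure: the function name consensus_peaks, the dict-of-lists input, and the default threshold of three methods (chosen to mirror "three of the four statistical platforms") are hypothetical, and peaks are assumed to be already aligned to identical m/z labels.

```python
from collections import Counter


def consensus_peaks(selections, min_methods=3):
    """Return peaks flagged as differential by at least `min_methods` methods.

    selections: dict mapping a method name (e.g. "t-test") to an iterable of
                peak labels that method flagged. Peak labels must already be
                aligned across methods (assumes minimal mass drift).
    """
    votes = Counter()
    for peaks in selections.values():
        # set() so a method listing the same peak twice still casts one vote
        votes.update(set(peaks))
    return sorted(p for p, n in votes.items() if n >= min_methods)
```

    With four methods and min_methods=3, a peak flagged by only one or two methods is dropped, which is what reduces the influence of any single algorithm's distributional assumptions.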

  9. Ensemble-Based Virtual Screening Led to the Discovery of New Classes of Potent Tyrosinase Inhibitors.

    Science.gov (United States)

    Choi, Joonhyeok; Choi, Kwang-Eun; Park, Sung Jean; Kim, Sun Yeou; Jee, Jun-Goo

    2016-02-22

    In this study, we report new classes of potent tyrosinase inhibitors identified by enhanced structure-based virtual screening prediction; the predictions were confirmed by enzyme and melanin content assays. Tyrosinase, a type-3 copper protein, participates in two distinct reactions, hydroxylation of tyrosine to DOPA and conversion of DOPA to dopaquinone, in melanin biosynthesis. Although numerous inhibitors of this reaction have been reported, there is a lag in the discovery of new functional moieties. In order to improve the performance of virtual screening, we first produced an ensemble of 10,000 structures using molecular dynamics simulation. Quantum mechanical calculation was used to determine the partial charges of catalytic copper ions based on the met and deoxy states. Second, we selected a structure showing an optimal receiver operating characteristic (ROC) curve with known direct binders and their physicochemically matched decoys. The structure revealed more than 10-fold higher enrichment at 1% of the ROC curve than those observed in X-ray structures. Third, high-throughput virtual screening with DOCK 3.6 was performed using a library consisting of approximately 400,000 small molecules derived from the ZINC database. Fourth, we obtained the top 60 molecules and tested their inhibition of mushroom tyrosinase. The extended assays included 21 analogs of the 21 initial hits to test their inhibition properties. Here, the moieties of tetrazole and triazole were identified as new binding cores interacting with the dicopper catalytic center. All 42 inhibitors showed inhibitory constant, Ki, values ranging from 11.1 nM to 33.4 μM, with a tetrazole compound exhibiting the strongest activity. Among the 42 molecules, five displayed more than 30% reduction in melanin production when treated in B16F10 melanoma cells; cell viability was >90% at 20 μM. Particularly, a thiosemicarbazone-containing compound reduced melanin content by 55%.
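
    Early enrichment of known binders over decoys, the criterion used above to pick a receptor structure, is often summarized with an enrichment factor. The sketch below uses the common ranked-list definition (hit rate in the top fraction divided by the overall hit rate), which may differ from the ROC-based measure the authors report; the function name, the inputs, and the convention that lower scores rank better are assumptions for illustration.

```python
import math


def enrichment_factor(scores, is_active, fraction=0.01):
    """Enrichment factor of actives in the top `fraction` of a ranked list.

    scores:    docking-like scores, one per compound (lower = better; assumed).
    is_active: booleans marking known binders (True) versus decoys (False).
    fraction:  portion of the ranked list to inspect, e.g. 0.01 for the top 1%.
    """
    if len(scores) != len(is_active):
        raise ValueError("scores and is_active must have the same length")
    total_actives = sum(1 for a in is_active if a)
    if total_actives == 0:
        raise ValueError("at least one active compound is required")

    # Rank compounds from best (lowest) to worst score.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0])
    n_top = max(1, math.ceil(fraction * len(ranked)))
    hits_top = sum(1 for _, active in ranked[:n_top] if active)

    hit_rate_top = hits_top / n_top
    hit_rate_all = total_actives / len(ranked)
    return hit_rate_top / hit_rate_all
```

    A value of 1 means the ranking does no better than random selection; comparing this number across the structures of an ensemble is one way to choose the receptor conformation that best separates binders from decoys.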

  10. Volatility Discovery

    DEFF Research Database (Denmark)

    Dias, Gustavo Fruet; Scherrer, Cristina; Papailias, Fotis

    The price discovery literature investigates how homogenous securities traded on different markets incorporate information into prices. We take this literature one step further and investigate how these markets contribute to stochastic volatility (volatility discovery). We formally show...... that the realized measures from homogenous securities share a fractional stochastic trend, which is a combination of the price and volatility discovery measures. Furthermore, we show that volatility discovery is associated with the way that market participants process information arrival (market sensitivity...

  11. Fragment-Based Discovery of 5-Arylisatin-Based Inhibitors of Matrix Metalloproteinases 2 and 13.

    Science.gov (United States)

    Agamennone, Mariangela; Belov, Dmitry S; Laghezza, Antonio; Ivanov, Vladimir N; Novoselov, Anton M; Andreev, Ivan A; Ratmanova, Nina K; Altieri, Andrea; Tortorella, Paolo; Kurkin, Alexander V

    2016-09-06

    Matrix metalloproteinases (MMPs) are well-established targets for several pathologies. In particular, MMP-2 and MMP-13 play a prominent role in cancer progression. In this study, a structure-based screening campaign was applied to prioritize metalloproteinase-oriented fragments. This computational model was applied to a representative fragment set from the publicly available EDASA Scientific compound library. These fragments were prioritized, and the top-ranking hits were tested in a biological assay to validate the model. Two scaffolds showed consistent activity in the assay, and the isatin-based compounds were the most interesting. These latter fragments have significant potential as tools for the design and realization of novel MMP inhibitors. In addition to their micromolar activity, the chemical synthesis affords flexible and creative access to their analogues.

  12. Stem Cell-Based Gene Therapy.

    Science.gov (United States)

    Bagnis; Mannoni

    1997-01-01

    Many researchers and clinicians wonder if gene therapy remains a way to treat genetic or acquired life-threatening diseases. For the last few years, many experimental, pre-clinical, and clinical data have been published showing that it is possible to transfer with relatively high efficiency new genetic information (transgene) in many cells or tissues including both hematopoietic progenitor cells and differentiated cells. Based on experimental works, addition of the normal gene to cells with deletions, mutations, or alterations of the corresponding endogenous one has been shown to reverse the phenotype and to restore (in some cases) the functional defect. In spite of very attractive preliminary results suggesting the feasibility and safety of this process, however, therapeutically efficient gene transfer and expression in targeted cells or tissues must be proven. In this review, we will focus primarily on the attempts to use gene transfer in hematopoietic stem cells as a model for more general genetic manipulations of stem cells. Hematopoietic stem cells are included in a subset of bone marrow, cord blood, or peripheral blood cells identified by the expression of the CD34 antigen on their membrane.

  13. De novo transcriptome assembly of Ipomoea nil using Illumina sequencing for gene discovery and SSR marker identification.

    Science.gov (United States)

    Wei, Changhe; Tao, Xiang; Li, Ming; He, Bin; Yan, Lang; Tan, Xuemei; Zhang, Yizheng

    2015-10-01

    Ipomoea nil is widely used as an ornamental plant due to its abundance of flower color, but the limited transcriptome and genomic data hinder research on it. Using the Illumina platform, transcriptome profiling of I. nil was performed through high-throughput sequencing, which was proven to be a rapid and cost-effective means to characterize gene content. Our goal is to use the resulting information to facilitate the relevant research on flowering and flower color formation in I. nil. In total, 268 million unique Illumina RNA-Seq reads were produced and used in the transcriptome assembly. These reads were assembled into 220,117 contigs, of which 137,307 contigs were annotated using the GO and KEGG databases. Based on the result of functional annotations, a total of 89,781 contigs were assigned 455,335 GO term annotations. Meanwhile, 17,418 contigs were identified with pathway annotation and they were functionally assigned to 144 KEGG pathways. Our transcriptome revealed at least 55 contigs as probable flowering-related genes in I. nil, and we also identified 25 contigs that encode key enzymes in the phenylpropanoid biosynthesis pathway. Based on the analysis relating to gene expression profiles, in the phenylpropanoid biosynthesis pathway of I. nil, the repression of lignin biosynthesis might lead to the redirection of the metabolic flux into anthocyanin biosynthesis. This may be the most likely reason that I. nil has high anthocyanin content, especially in its flowers. Additionally, 15,537 simple sequence repeats (SSRs) were detected using the MISA software, and these SSRs will undoubtedly benefit future breeding work. Moreover, the information uncovered in this study will also serve as a valuable resource for understanding the flowering and flower color formation mechanisms in I. nil.

  14. Gene discovery in EST sequences from the wheat leaf rust fungus Puccinia triticina sexual spores, asexual spores and haustoria, compared to other rust and corn smut fungi

    Directory of Open Access Journals (Sweden)

    Wynhoven Brian

    2011-03-01

    Full Text Available Abstract Background Rust fungi are biotrophic basidiomycete plant pathogens that cause major diseases on plants and trees world-wide, affecting agriculture and forestry. Their biotrophic nature precludes many established molecular genetic manipulations and lines of research. The generation of genomic resources for these microbes is leading to novel insights into biology such as interactions with the hosts and guiding directions for breakthrough research in plant pathology. Results To support gene discovery and gene model verification in the genome of the wheat leaf rust fungus, Puccinia triticina (Pt), we have generated Expressed Sequence Tags (ESTs) by sampling several life cycle stages. We focused on several spore stages and isolated haustorial structures from infected wheat, generating 17,684 ESTs. We produced sequences from both the sexual (pycniospores, aeciospores and teliospores) and asexual (germinated urediniospores) stages of the life cycle. From pycniospores and aeciospores, produced by infecting the alternate host, meadow rue (Thalictrum speciosissimum), 4,869 and 1,292 reads were generated, respectively. We generated 3,703 ESTs from teliospores produced on the senescent primary wheat host. Finally, we generated 6,817 reads from haustoria isolated from infected wheat as well as 1,003 sequences from germinated urediniospores. Along with 25,558 previously generated ESTs, we compiled a database of 13,328 non-redundant sequences (4,506 singlets and 8,822 contigs). Fungal genes were predicted using the EST version of the self-training GeneMarkS algorithm. To refine the EST database, we compared EST sequences by BLASTN to a set of 454 pyrosequencing-generated contigs and Sanger BAC-end sequences derived both from the Pt genome, and to ESTs and genome reads from wheat. A collection of 6,308 fungal genes was identified and compared to sequences of the cereal rusts, Puccinia graminis f. sp. tritici (Pgt) and stripe rust, P. striiformis f. sp

  15. Tumour class prediction and discovery by microarray-based DNA methylation analysis

    OpenAIRE

    Adorján, Péter; Distler, Jürgen; Lipscher, Evelyne; Model, Fabian; Müller, Jürgen; Pelet, Cécile; Braun, Aron; Florl, Andrea R.; Gütig, David; Grabs, Gabi; Howe, André; Kursar, Mischo; Lesche, Ralf; Leu, Erik; Lewin, André

    2002-01-01

    Aberrant DNA methylation of CpG sites is among the earliest and most frequent alterations in cancer. Several studies suggest that aberrant methylation occurs in a tumour type-specific manner. However, large-scale analysis of candidate genes has so far been hampered by the lack of high throughput assays for methylation detection. We have developed the first microarray-based technique which allows genome-wide assessment of selected CpG dinucleotides as well as quantification of methylation at e...

  16. An Analysis on Runtime Associated QOS-Based Proficient Web Services Discovery Optimization

    Directory of Open Access Journals (Sweden)

    A. Amirthasaravanan

    Full Text Available ABSTRACT In today's web world, service-oriented architectures represent the main standard for IT infrastructures. With the advent of service-oriented architecture, Web services have seen remarkable growth. Web service discovery has become increasingly significant as the use of web services expands. Discovering the most appropriate web service from a vast collection of web services is decisive for the successful execution of applications. In the automation of web service discovery, there is always a need to consider Quality of Service (QoS) attributes during matching. A study of the literature concerning the evolution of different web service discovery optimization methods, with particular emphasis on quality-motivated service discovery, has been carried out in this work. This paper depicts bio-inspired algorithms optimizing the discovery process for semantic web services. A bio-inspired algorithm is a metaheuristic method that mimics nature in order to solve optimization problems; the paper systematically evaluates some popular bio-inspired optimization algorithms. This paper also focuses on the principle of each algorithm and its application with respect to run-time-oriented QoS attributes, and from the results the most suitable bio-inspired optimization algorithm is deployed.

  17. An agent-based peer-to-peer architecture for semantic discovery of manufacturing services across virtual enterprises

    Science.gov (United States)

    Zhang, Wenyu; Zhang, Shuai; Cai, Ming; Jian, Wu

    2015-04-01

    With the development of the virtual enterprise (VE) paradigm, the usage of service-oriented architecture (SOA) is increasingly being considered for facilitating the integration and utilisation of distributed manufacturing resources. However, due to the heterogeneous nature among VEs, the dynamic nature of a VE and the autonomous nature of each VE member, the lack of both a sophisticated coordination mechanism in the popular centralised infrastructure and semantic expressivity in the existing SOA standards makes the current centralised, syntactic service discovery method undesirable. This motivates the proposed agent-based peer-to-peer (P2P) architecture for semantic discovery of manufacturing services across VEs. Multi-agent technology provides autonomous and flexible problem-solving capabilities in dynamic and adaptive VE environments. Peer-to-peer overlay provides highly scalable coupling across decentralised VEs, each of which acts as a peer composed of multiple agents dealing with manufacturing services. The proposed architecture utilises a novel, efficient, two-stage search strategy - semantic peer discovery and semantic service discovery - to handle the complex searches of manufacturing services across VEs through fast peer filtering. The operation and experimental evaluation of the prototype system are presented to validate the implementation of the proposed approach.

  18. Computational drug discovery

    Institute of Scientific and Technical Information of China (English)

    Si-sheng OU-YANG; Jun-yan LU; Xiang-qian KONG; Zhong-jie LIANG; Cheng LUO; Hualiang JIANG

    2012-01-01

    Computational drug discovery is an effective strategy for accelerating and economizing the drug discovery and development process. Because of the dramatic increase in the availability of biological macromolecule and small molecule information, the applicability of computational drug discovery has been extended and broadly applied to nearly every stage in the drug discovery and development workflow, including target identification and validation, lead discovery and optimization, and preclinical tests. Over the past decades, computational drug discovery methods such as molecular docking, pharmacophore modeling and mapping, de novo design, molecular similarity calculation and sequence-based virtual screening have been greatly improved. In this review, we present an overview of these important computational methods, platforms and successful applications in this field.

  19. Practice-Based Knowledge Discovery for Comparative Effectiveness Research: An Organizing Framework.

    Science.gov (United States)

    Lucero, Robert J; Bakken, Suzanne

    2013-03-01

    Electronic health information systems can increase the ability of health-care organizations to investigate the effects of clinical interventions. The authors present an organizing framework that integrates outcomes and informatics research paradigms to guide knowledge discovery in electronic clinical databases. They illustrate its application using the example of hospital acquired pressure ulcers (HAPU). The Knowledge Discovery through Informatics for Comparative Effectiveness Research (KDI-CER) framework was conceived as a heuristic to conceptualize study designs and address potential methodological limitations imposed by using a single research perspective. Advances in informatics research can play a complementary role in advancing the field of outcomes research including CER. The KDI-CER framework can be used to facilitate knowledge discovery from routinely collected electronic clinical data.

  20. Common minor histocompatibility antigen discovery based upon patient clinical outcomes and genomic data.

    Directory of Open Access Journals (Sweden)

    Paul M Armistead

    Full Text Available BACKGROUND: Minor histocompatibility antigens (mHA) mediate much of the graft vs. leukemia (GvL) effect and graft vs. host disease (GvHD) in patients who undergo allogeneic stem cell transplantation (SCT). Therapeutic decision making and treatments based upon mHAs will require the evaluation of multiple candidate mHAs and the selection of those with the potential to have the greatest impact on clinical outcomes. We hypothesized that common, immunodominant mHAs, which are presented by HLA-A, B, and C molecules, can mediate clinically significant GvL and/or GvHD, and that these mHAs can be identified through association of genomic data with clinical outcomes. METHODOLOGY/PRINCIPAL FINDINGS: Because most mHAs result from donor/recipient cSNP disparities, we genotyped 57 myeloid leukemia patients and their donors at 13,917 cSNPs. We correlated the frequency of genetically predicted mHA disparities with clinical evidence of an immune response and then computationally screened all peptides mapping to the highly associated cSNPs for their ability to bind to HLA molecules. As proof-of-concept, we analyzed one predicted antigen, T4A, whose mHA mismatch trended towards improved overall and disease free survival in our cohort. T4A mHA mismatches occurred at the maximum theoretical frequency for any given SCT. T4A-specific CD8+ T lymphocytes (CTLs) were detected in 3 of 4 evaluable post-transplant patients predicted to have a T4A mismatch. CONCLUSIONS/SIGNIFICANCE: Our method is the first to combine clinical outcomes data with genomics and bioinformatics methods to predict and confirm a mHA. Refinement of this method should enable the discovery of clinically relevant mHAs in the majority of transplant patients and possibly lead to novel immunotherapeutics.

  1. Combining Metabolite-Based Pharmacophores with Bayesian Machine Learning Models for Mycobacterium tuberculosis Drug Discovery.

    Directory of Open Access Journals (Sweden)

    Sean Ekins

    Full Text Available Integrated computational approaches for Mycobacterium tuberculosis (Mtb) are useful to identify new molecules that could lead to future tuberculosis (TB) drugs. Our approach uses information derived from the TBCyc pathway and genome database, the Collaborative Drug Discovery TB database combined with 3D pharmacophores and dual event Bayesian models of whole-cell activity and lack of cytotoxicity. We have prioritized a large number of molecules that may act as mimics of substrates and metabolites in the TB metabolome. We computationally searched over 200,000 commercial molecules using 66 pharmacophores based on substrates and metabolites from Mtb and further filtering with Bayesian models. We ultimately tested 110 compounds in vitro that resulted in two compounds of interest, BAS 04912643 and BAS 00623753 (MIC of 2.5 and 5 μg/mL, respectively). These molecules were used as a starting point for hit-to-lead optimization. The most promising class proved to be the quinoxaline di-N-oxides, evidenced by transcriptional profiling to induce mRNA level perturbations most closely resembling known protonophores. One of these, SRI58, exhibited an MIC = 1.25 μg/mL versus Mtb and a CC50 in Vero cells of >40 μg/mL, while featuring fair Caco-2 A-B permeability (2.3 × 10⁻⁶ cm/s), kinetic solubility (125 μM at pH 7.4 in PBS) and mouse metabolic stability (63.6% remaining after 1 h incubation with mouse liver microsomes). Despite demonstration of how a combined bioinformatics/cheminformatics approach afforded a small molecule with promising in vitro profiles, we found that SRI58 did not exhibit quantifiable blood levels in mice.

  2. Genome-Based Studies of Marine Microorganisms to Maximize the Diversity of Natural Products Discovery for Medical Treatments

    Directory of Open Access Journals (Sweden)

    Xin-Qing Zhao

    2011-01-01

    Full Text Available Marine microorganisms are a rich source of natural products, which play important roles in the pharmaceutical industry. Over the past decade, genome-based studies of marine microorganisms have unveiled the tremendous diversity of the producers of natural products and have also made it more efficient to harness the strain diversity, chemical diversity and genetic diversity of marine microorganisms for the rapid discovery and generation of new natural products. In the meantime, genomic information retrieved from marine symbiotic microorganisms can also be employed for the discovery of new medical molecules from yet-unculturable microorganisms. In this paper, the recent progress in the genomic research of marine microorganisms is reviewed; new tools of genome mining as well as the advances in the activation of orphan pathways and metagenomic studies are summarized. Genome-based research of marine microorganisms will maximize the biodiscovery process and solve the problems of supply and sustainability of drug molecules for medical treatments.

  3. A wavelet-based approach to the discovery of themes and sections in monophonic melodies

    DEFF Research Database (Denmark)

    Velarde, Gissel; Meredith, David

    We present the computational method submitted to the MIREX 2014 Discovery of Repeated Themes & Sections task, and the results on the monophonic version of the JKU Patterns Development Database. In the context of pattern discovery in monophonic music, the idea behind our method is that, with a good...... melodic structure in terms of segments, it should be possible to gather similar segments into clusters and rank their salience within the piece. We present an approach to this problem and how we address it. In general terms, we represent melodies either as raw 1D pitch signals or as these signals filtered...

  4. Optimizing Neighbor Discovery for Ad hoc Networks based on the Bluetooth PAN Profile

    DEFF Research Database (Denmark)

    Kuijpers, Gerben; Nielsen, Thomas Toftegaard; Prasad, Ramjee

    2002-01-01

    IP layer neighbor discovery mechanisms rely highly on broadcast/multicast capabilities of the underlying link layer. The Bluetooth personal area network (PAN) profile has no native link layer broadcast/multicast capabilities and can only emulate this by repeatedly unicast link layer frames....... This paper introduces a neighbor discovery mechanism that utilizes the resources in the Bluetooth PAN profile more efficiently. The performance of the new mechanism is investigated using an IPv6 network simulator and compared with emulated broadcasting. It is shown that the signaling overhead can...... be significantly reduced at the cost of a slight increase of processing complexity at the Bluetooth master device....

  5. Milp-hyperbox classification for structure-based drug design in the discovery of small molecule inhibitors of SIRTUIN6

    OpenAIRE

    Tardu, Mehmet; Rahim, Fatih; Kavaklı, İbrahim Halil; Türkay, Metin

    2016-01-01

    Virtual screening of chemical libraries following experimental assays of drug candidates is a common procedure in structure-based drug discovery. However, virtual screening of chemical libraries with millions of compounds requires a lot of time for computing and data analysis. A priori classification of compounds in the libraries as low- and high-binding free energy sets decreases the number of compounds for virtual screening experiments. This classification also reduces the required computati...

  6. Recent progresses in gene delivery-based bone tissue engineering.

    Science.gov (United States)

    Lu, Chia-Hsin; Chang, Yu-Han; Lin, Shih-Yeh; Li, Kuei-Chang; Hu, Yu-Chen

    2013-12-01

    Gene therapy has converged with bone engineering over the past decade, by which a variety of therapeutic genes have been delivered to stimulate bone repair. These genes can be administered via an in vivo or ex vivo approach using either viral or nonviral vectors. This article reviews the fundamental aspects of and recent progress in gene therapy-based bone engineering, with emphasis on the new genes, viral vectors and gene delivery approaches.

  7. Automated discovery of tissue-targeting enhancers and transcription factors from binding motif and gene function data.

    Directory of Open Access Journals (Sweden)

    Geetu Tuteja

    2014-01-01

    Full Text Available Identifying enhancers regulating gene expression remains an important and challenging task. While recent sequencing-based methods provide epigenomic characteristics that correlate well with enhancer activity, it remains onerous to comprehensively identify all enhancers across development. Here we introduce a computational framework to identify tissue-specific enhancers evolving under purifying selection. First, we incorporate high-confidence binding site predictions with target gene functional enrichment analysis to identify transcription factors (TFs) likely functioning in a particular context. We then search the genome for clusters of binding sites for these TFs, overcoming previous constraints associated with biased manual curation of TFs or enhancers. Applying our method to the placenta, we find 33 known and implicate 17 novel TFs in placental function, and discover 2,216 putative placenta enhancers. Using luciferase reporter assays, 31/36 (86%) tested candidates drive activity in placental cells. Our predictions agree well with recent epigenomic data in human and mouse, yet over half our loci, including 7/8 (87%) tested regions, are novel. Finally, we establish that our method is generalizable by applying it to 5 additional tissues: heart, pancreas, blood vessel, bone marrow, and liver.

  8. Endophytes : exploiting biodiversity for the improvement of natural product-based drug discovery

    NARCIS (Netherlands)

    Staniek, Agata; Woerdenbag, Herman J.; Kayser, Oliver

    2008-01-01

    Endophytes, microorganisms that colonize internal tissues of all plant species, create a huge biodiversity with yet unknown novel natural products, presumed to push forward the frontiers of drug discovery. Next to the clinically acknowledged antineoplastic agent, paclitaxel, endophyte research has y

  9. Towards a goal-based service framework for dynamic service discovery and composition

    NARCIS (Netherlands)

    Bonino da Silva Santos, Luiz Olavo; Silva, Eduardo Goncalves; Ferreira Pires, Luis; Sinderen, van Marten

    2009-01-01

    Service-Oriented Computing allows new applications to be developed by using and/or combining services offered by different providers. Service discovery and composition are performed aiming to comply with the client’s request in terms of functionality and expected outcome. In this paper we present a

  10. The discovery of new isocyanide-based multi-component reactions

    NARCIS (Netherlands)

    Dömling, Alexander

    2000-01-01

    Multi-component reactions are finding increasing use in the discovery process of new drugs and agrochemicals. Some years ago they were considered as highly exotic types of organic reactions. Recently, many groups have realized that the field of multi-component reactions is full of new opportunities.

  11. Ontology-Based Context-Aware Service Discovery for Pervasive Environments

    NARCIS (Netherlands)

    Pawar, P.; Tokmakoff, A.

    2006-01-01

    Existing service discovery protocols use a service matching process in order to offer services of interest to the clients. Potentially, the context information of the services and client can be used to improve the quality of service matching. To make use of context information in service matching, s

  12. Bond-based linear indices in QSAR: computational discovery of novel anti-trichomonal compounds

    Science.gov (United States)

    Marrero-Ponce, Yovani; Meneses-Marcel, Alfredo; Rivera-Borroto, Oscar M.; García-Domenech, Ramón; De Julián-Ortiz, Jesus Vicente; Montero, Alina; Escario, José Antonio; Barrio, Alicia Gómez; Pereira, David Montero; Nogal, Juan José; Grau, Ricardo; Torrens, Francisco; Vogel, Christian; Arán, Vicente J.

    2008-08-01

    Trichomonas vaginalis (Tv) is the causative agent of the most common, non-viral, sexually transmitted disease in women and men worldwide. Since 1959, metronidazole (MTZ) has been the drug of choice in the systemic treatment of trichomoniasis. However, resistance to MTZ in some patients and the great cost associated with the development of new trichomonacidals make it necessary to develop computational methods that shorten the drug discovery pipeline. Toward this end, bond-based linear indices, new TOMOCOMD-CARDD molecular descriptors, and linear discriminant analysis were used to discover novel trichomonacidal chemicals. The obtained models, using non-stochastic and stochastic indices, are able to classify correctly 89.01% (87.50%) and 82.42% (84.38%) of the chemicals in the training (test) sets, respectively. These results validate the models for their use in ligand-based virtual screening. In addition, they show large Matthews' correlation coefficients (C) of 0.78 (0.71) and 0.65 (0.65) for the training (test) sets, respectively. The result of predictions on the 10% full-out cross-validation test also evidences the robustness of the obtained models. Later, both models are applied to the virtual screening of 12 compounds already tested against Tv. As a result, they correctly classify 10 out of 12 (83.33%) and 9 out of 12 (75.00%) of the chemicals, respectively, which is the most important criterion for validating the models. In addition, these classification functions are applied to a library of seven chemicals in order to find novel antitrichomonal agents. These compounds are synthesized and tested for in vitro activity against Tv. As a result, experimental observations agreed closely with theoretical predictions, with a correct classification of 85.71% (6 out of 7) of the chemicals. Moreover, out of the seven compounds that are screened, synthesized and biologically assayed, six compounds (VA7-34, VA7-35, VA7-37, VA7-38, VA7-68, VA7-70) show
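
    The classification percentages quoted in this abstract follow directly from the reported counts, and the Matthews correlation coefficient is computed from a confusion matrix. The sketch below reproduces the percentages and shows the standard coefficient formula. The confusion-matrix counts in the last call are hypothetical and chosen only for illustration, since the abstract does not report them, and the function names are not taken from the paper.

```python
import math


def accuracy_percent(correct, total):
    """Percentage of correctly classified compounds."""
    return 100.0 * correct / total


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # conventional value when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


# Counts reported in the abstract: 10 of 12, 9 of 12, and 6 of 7 compounds.
print(round(accuracy_percent(10, 12), 2))
print(round(accuracy_percent(9, 12), 2))
print(round(accuracy_percent(6, 7), 2))

# Hypothetical confusion matrix (NOT from the paper), for illustration only.
print(round(matthews_cc(tp=40, tn=45, fp=5, fn=10), 2))
```

    Unlike plain accuracy, the coefficient uses all four cells of the confusion matrix, which is presumably why the abstract reports both measures.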

  13. A two-genome microarray for the rice pathogens Xanthomonas oryzae pv. oryzae and X. oryzae pv. oryzicola and its use in the discovery of a difference in their regulation of hrp genes

    Directory of Open Access Journals (Sweden)

    Lin Ye

    2008-06-01

    Full Text Available Abstract Background Xanthomonas oryzae pv. oryzae (Xoo) and X. oryzae pv. oryzicola (Xoc) are bacterial pathogens of the worldwide staple and grass model, rice. Xoo and Xoc are closely related but Xoo invades rice vascular tissue to cause bacterial leaf blight, a serious disease of rice in many parts of the world, and Xoc colonizes the mesophyll parenchyma to cause bacterial leaf streak, a disease of emerging importance. Both pathogens depend on hrp genes for type III secretion to infect their host. We constructed a 50–70 mer oligonucleotide microarray based on available genome data for Xoo and Xoc and compared gene expression in Xoo strain PXO99A and Xoc strain BLS256 grown in the rich medium PSB vs. XOM2, a minimal medium previously reported to induce hrp genes in Xoo strain T7174. Results Three biological replicates of the microarray experiment to compare global gene expression in representative strains of Xoo and Xoc grown in PSB vs. XOM2 were carried out. The non-specific error rate and the correlation coefficients across biological replicates and among duplicate spots revealed that the microarray data were robust. 247 genes of Xoo and 39 genes of Xoc were differentially expressed in the two media with a false discovery rate of 5% and with a minimum fold-change of 1.75. Semi-quantitative-RT-PCR assays confirmed differential expression of each of 16 genes selected for validation for each of Xoo and Xoc. The differentially expressed genes represent 17 functional categories. Conclusion We describe here the construction and validation of a two-genome microarray for the two pathovars of X. oryzae. Microarray analysis revealed that using representative strains, a greater number of Xoo genes than Xoc genes are differentially expressed in XOM2 relative to PSB, and that these include hrp genes and other genes important in interactions with rice. An exception was the rax genes, which are required for production of the host resistance elicitor AvrXa21
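
    The selection rule quoted above (a false discovery rate of 5% together with a minimum fold-change of 1.75) can be sketched as a two-step filter. The abstract does not say which FDR procedure was used, so the Benjamini-Hochberg adjustment below is an assumption. The gene names, ratios and p-values are invented for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def differentially_expressed(genes, fdr=0.05, min_fold=1.75):
    """Select genes passing both the FDR and the fold-change thresholds.

    genes: list of (name, ratio, p_value), where ratio is expression in
    one medium divided by expression in the other (must be positive).
    """
    qvalues = benjamini_hochberg([p for _, _, p in genes])
    selected = []
    for (name, ratio, _), q in zip(genes, qvalues):
        fold = max(ratio, 1.0 / ratio)  # treat up- and down-regulation alike
        if q <= fdr and fold >= min_fold:
            selected.append(name)
    return selected


# Invented toy data: (gene, XOM2/PSB ratio, p-value).
toy = [("geneA", 2.4, 0.001), ("geneB", 1.2, 0.002),
       ("geneC", 0.5, 0.03), ("geneD", 1.9, 0.40)]
print(differentially_expressed(toy))  # ['geneA', 'geneC']
```

    In this toy set, geneB is significant but changes too little, and geneD changes enough but is not significant, so only genes meeting both criteria are kept.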

  14. Design Process Optimization Based on Design Process Gene Mapping

    Institute of Scientific and Technical Information of China (English)

    LI Bo; TONG Shu-rong

    2011-01-01

    The idea of genetic engineering is introduced into the area of product design to improve design efficiency. A method for design process optimization based on the design process gene is proposed through analyzing the correlation between the design process gene and the characteristics of the design process. The concept of the design process gene is analyzed and divided into five categories, namely the task specification gene, the concept design gene, the overall design gene, the detailed design gene and the processing design gene, in light of the five design phases. The elements and their interactions involved in each kind of design process gene signprocess gene mapping is drawn with its structure disclosed based on its function that process gene.

  15. Identifying Liver Cancer and Its Relations with Diseases, Drugs, and Genes: A Literature-Based Approach

    Science.gov (United States)

    Song, Min

    2016-01-01

    In biomedicine, scientific literature is a valuable source for knowledge discovery. Mining knowledge from textual data has become an ever more important task as the volume of scientific literature is growing at an unprecedented rate. In this paper, we propose a framework for examining a certain disease based on existing information provided by scientific literature. Disease-related entities that include diseases, drugs, and genes are systematically extracted and analyzed using a three-level network-based approach. A paper-entity network and an entity co-occurrence network (macro-level) are explored and used to construct six entity specific networks (meso-level). Important diseases, drugs, and genes as well as salient entity relations (micro-level) are identified from these networks. Results obtained from the literature-based literature mining can serve to assist clinical applications. PMID:27195695

  16. The Analysis of Multiple Genome Comparisons in Genus Escherichia and Its Application to the Discovery of Uncharacterised Metabolic Genes in Uropathogenic Escherichia coli CFT073

    Directory of Open Access Journals (Sweden)

    William A. Bryant

    2009-01-01

    Full Text Available A survey of a complete gene synteny comparison has been carried out between twenty fully sequenced strains from the genus Escherichia with the aim of finding yet uncharacterised genes implicated in the metabolism of uropathogenic strains of E. coli (UPEC). Several sets of adjacent colinear genes have been identified which are present in all four UPEC included in this study (CFT073, F11, UTI89, and 536), annotated with putative metabolic functions, but are not found in any other strains considered. An operon closely homologous to that encoding the L-sorbose degradation pathway in Klebsiella pneumoniae has been identified in E. coli CFT073; this operon is present in all of the UPEC considered, but only in 7 of the other 16 strains. The operon's function has been confirmed by cloning the genes into E. coli DH5α and testing for growth on L-sorbose. The functional genomic approach combining in silico and in vitro work presented here can be used as a basis for the discovery of other uncharacterised genes contributing to bacterial survival in specific environments.

  17. False-Positive Rate Determination of Protein Target Discovery using a Covalent Modification- and Mass Spectrometry-Based Proteomics Platform

    Science.gov (United States)

    Strickland, Erin C.; Geer, M. Ariel; Hong, Jiyong; Fitzgerald, Michael C.

    2014-01-01

    Detection and quantitation of protein-ligand binding interactions is important in many areas of biological research. Stability of proteins from rates of oxidation (SPROX) is an energetics-based technique for identifying the protein targets of ligands in complex biological mixtures. Knowing the false-positive rate of protein target discovery in proteome-wide SPROX experiments is important for the correct interpretation of results. Reported here are the results of a control SPROX experiment in which chemical denaturation data is obtained on the proteins in two samples that originated from the same yeast lysate, as would be done in a typical SPROX experiment except that one sample would be spiked with the test ligand. False-positive rates of 1.2-2.2 % and manassantin A. The impact of ion purity in the tandem mass spectral analyses and of background oxidation on the false-positive rate of protein target discovery using SPROX is also discussed.

  18. False Positive Rate Determination of Protein Target Discovery using a Covalent Modification- and Mass Spectrometry-Based Proteomics Platform

    Science.gov (United States)

    Strickland, Erin C.; Geer, M. Ariel; Hong, Jiyong; Fitzgerald, Michael C.

    2013-01-01

    Detection and quantitation of protein-ligand binding interactions is important in many areas of biological research. The Stability of Proteins from Rates of Oxidation (SPROX) technique is an energetics-based technique for identifying the protein targets of ligands in complex biological mixtures. Knowing the false positive rate of protein target discovery in proteome-wide SPROX experiments is important for the correct interpretation of results. Reported here are the results of a control SPROX experiment in which chemical denaturation data is obtained on the proteins in two samples that originated from the same yeast lysate, as would be done in a typical SPROX experiment except that one sample would be spiked with the test ligand. False positive rates of 1.2–2.2% and manassantin A. The impact of ion purity in the tandem mass spectral analyses and of background oxidation on the false positive rate of protein target discovery using SPROX is also discussed. PMID:24114261

  19. Knowledge discovery about quality of life changes of spinal cord injury patients: clustering based on rules by states.

    Science.gov (United States)

    Gibert, Karina; García-Rudolph, Alejandro; Curcoll, Lluïsa; Soler, Dolors; Pla, Laura; Tormos, José María

    2009-01-01

    In this paper, an integral Knowledge Discovery Methodology, named Clustering based on rules by States, which incorporates artificial intelligence (AI) and statistical methods as well as interpretation-oriented tools, is used for extracting knowledge patterns about the evolution over time of the Quality of Life (QoL) of patients with Spinal Cord Injury. The methodology incorporates the interaction with experts as a crucial element with the clustering methodology to guarantee usefulness of the results. Four typical patterns are discovered by taking into account prior expert knowledge. Several hypotheses are elaborated about the reasons for psychological distress or decreases in QoL of patients over time. The knowledge discovery from data (KDD) approach turns out, once again, to be a suitable formal framework for handling multidimensional complexity of the health domains.

  20. Function-Based Metagenomic Library Screening and Heterologous Expression Strategy for Genes Encoding Phosphatase Activity.

    Science.gov (United States)

    Villamizar, Genis A Castillo; Nacke, Heiko; Daniel, Rolf

    2017-01-01

    The release of phosphate from inorganic and organic phosphorus compounds can be mediated enzymatically. Phosphate-releasing enzymes, comprising acid and alkaline phosphatases, are recognized as useful biocatalysts in applications such as plant and animal nutrition, bioremediation and diagnostic analysis. Metagenomic approaches provide access to novel phosphatase-encoding genes. Here, we describe a function-based screening approach for rapid identification of genes conferring phosphatase activity from small-insert and large-insert metagenomic libraries derived from various environments. This approach bears the potential for discovery of entirely novel phosphatase families or subfamilies and members of known enzyme classes hydrolyzing phosphomonoester bonds such as phytases. In addition, we provide a strategy for efficient heterologous phosphatase gene expression.

  1. Rapid countermeasure discovery against Francisella tularensis based on a metabolic network reconstruction.

    Directory of Open Access Journals (Sweden)

    Sidhartha Chaudhury

    Full Text Available In the future, we may be faced with the need to provide treatment for an emergent biological threat against which existing vaccines and drugs have limited efficacy or availability. To prepare for this eventuality, our objective was to use a metabolic network-based approach to rapidly identify potential drug targets and prospectively screen and validate novel small-molecule antimicrobials. Our target organism was the fully virulent Francisella tularensis subspecies tularensis Schu S4 strain, a highly infectious intracellular pathogen that is the causative agent of tularemia and is classified as a category A biological agent by the Centers for Disease Control and Prevention. We proceeded with a staggered computational and experimental workflow that used a strain-specific metabolic network model, homology modeling and X-ray crystallography of protein targets, and ligand- and structure-based drug design. Selected compounds were subsequently filtered based on physiological-based pharmacokinetic modeling, and we selected a final set of 40 compounds for experimental validation of antimicrobial activity. We began screening these compounds in whole bacterial cell-based assays in biosafety level 3 facilities in the 20th week of the study and completed the screens within 12 weeks. Six compounds showed significant growth inhibition of F. tularensis, and we determined their respective minimum inhibitory concentrations and mammalian cell cytotoxicities. The most promising compound had a low molecular weight, was non-toxic, and abolished bacterial growth at 13 µM, with putative activity against pantetheine-phosphate adenylyltransferase, an enzyme involved in the biosynthesis of coenzyme A, encoded by gene coaD. The novel antimicrobial compounds identified in this study serve as starting points for lead optimization, animal testing, and drug development against tularemia. Our integrated in silico/in vitro approach had an overall 15% success rate in terms of

  2. TOXICOGENOMICS DRUG DISCOVERY AND THE PATHOLOGIST

    Science.gov (United States)

    Toxicogenomics, drug discovery, and pathologist. The field of toxicogenomics, which currently focuses on the application of large-scale differential gene expression (DGE) data to toxicology, is starting to influence drug discovery and development in the pharmaceutical indu...

  3. A unified view of Automata-based algorithms for Frequent Episode Discovery

    CERN Document Server

    Achar, Avinash; Sastry, P S

    2010-01-01

    The Frequent Episode Discovery framework is a popular framework in Temporal Data Mining with many applications. Over the years, many different notions of episode frequency have been proposed, along with different algorithms for episode discovery. In this paper we present a unified view of all such frequency counting algorithms. We present a generic algorithm such that all current algorithms are special cases of it. This unified view allows one to gain insights into different frequencies, and we present quantitative relationships among different frequencies. Our unified view also helps in obtaining correctness proofs for various algorithms, as we show here. We also point out how this unified view helps us to consider generalizations of the algorithm that can discover episodes with general partial orders.
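
    As a concrete instance of automata-based frequency counting, the sketch below counts non-overlapped occurrences of a serial episode with a single finite-state automaton. This is only one of the frequency notions in this literature, shown for illustration. It is not the generic algorithm of the paper, and the function name and toy sequence are assumptions.

```python
def count_non_overlapped(events, episode):
    """Count non-overlapped occurrences of a serial episode.

    events: event types in time order.
    episode: tuple of event types that must occur in this order.
    """
    if not episode:
        return 0
    state = 0  # index of the next episode element the automaton waits for
    count = 0
    for event in events:
        if event == episode[state]:
            state += 1
            if state == len(episode):
                # A full occurrence was recognized: count it and reset,
                # so the next occurrence cannot share events with this one.
                count += 1
                state = 0
    return count


sequence = list("ABACBCABC")
print(count_non_overlapped(sequence, ("A", "B", "C")))  # 2
```

    Other frequency notions differ mainly in how many automata are kept active at once and when they are reset.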

  4. Intact-protein analysis system for discovery of serum-based disease biomarkers.

    Science.gov (United States)

    Wang, Hong; Hanash, Samir

    2011-01-01

    Profiling of serum and plasma proteins has substantial relevance to the discovery of circulating disease biomarkers. However, the extreme complexity and vast dynamic range of protein abundance in serum and plasma present a formidable challenge for protein analysis. Thus, integration of multiple technologies is required to achieve high-resolution and high-sensitivity proteomic analysis of serum or plasma. In this chapter, we describe an orthogonal multidimensional intact-protein analysis system (IPAS) (Wang et al., Mol Cell Proteomics 4:618-625, 2005) coupled with protein tagging (Faca et al., J Proteome Res 5:2009-2018, 2006) to profile the serum and plasma proteomes quantitatively, which we have applied in our biomarker discovery studies (Katayama et al., Genome Med 1:47, 2009; Faca et al., PLoS Med 5:e123, 2008; Zhang et al. Genome Biol 9:R93, 2008).

  5. Progress in Chimeric Vector and Chimeric Gene Based Cardiovascular Gene Therapy

    Institute of Scientific and Technical Information of China (English)

    HU Chun-Song; YOON Young-sup; ISNER Jeffrey M.; LOSORDO Douglas W.

    2003-01-01

    Gene therapy for cardiovascular diseases has developed from preliminary animal experiments to clinical trials. However, the vectors and target genes currently used in gene therapy are mainly viral or nonviral vectors and single target genes (monogenes). Each vector system has a series of advantages and limitations. Chimeric vectors, which combine the advantages of viral and nonviral vectors, chimeric target genes, which combine two or more target genes, and novel gene delivery modes are being developed. In this article, we summarize the progress in chimeric vector and chimeric gene based cardiovascular gene therapy, covering proliferative or occlusive vascular diseases such as atherosclerosis and restenosis, hypertonic vascular disease such as hypertension, and cardiac diseases such as myocardial ischemia, dilated cardiomyopathy and heart failure, and even heart transplantation. The development of chimeric vectors, chimeric genes and their use in cardiovascular gene therapy is promising.

  6. Statistical design for biospecimen cohort size in proteomics-based biomarker discovery and verification studies.

    Science.gov (United States)

    Skates, Steven J; Gillette, Michael A; LaBaer, Joshua; Carr, Steven A; Anderson, Leigh; Liebler, Daniel C; Ransohoff, David; Rifai, Nader; Kondratovich, Marina; Težak, Živana; Mansfield, Elizabeth; Oberg, Ann L; Wright, Ian; Barnes, Grady; Gail, Mitchell; Mesri, Mehdi; Kinsinger, Christopher R; Rodriguez, Henry; Boja, Emily S

    2013-12-01

    Protein biomarkers are needed to deepen our understanding of cancer biology and to improve our ability to diagnose, monitor, and treat cancers. Important analytical and clinical hurdles must be overcome to allow the most promising protein biomarker candidates to advance into clinical validation studies. Although contemporary proteomics technologies support the measurement of large numbers of proteins in individual clinical specimens, sample throughput remains comparatively low. This problem is amplified in typical clinical proteomics research studies, which routinely suffer from a lack of proper experimental design, resulting in analysis of too few biospecimens to achieve adequate statistical power at each stage of a biomarker pipeline. To address this critical shortcoming, a joint workshop was held by the National Cancer Institute (NCI), National Heart, Lung, and Blood Institute (NHLBI), and American Association for Clinical Chemistry (AACC) with participation from the U.S. Food and Drug Administration (FDA). An important output from the workshop was a statistical framework for the design of biomarker discovery and verification studies. Herein, we describe the use of quantitative clinical judgments to set statistical criteria for clinical relevance and the development of an approach to calculate biospecimen sample size for proteomic studies in the discovery and verification stages prior to the clinical validation stage. This represents a first step toward building a consensus on quantitative criteria for statistical design of proteomics biomarker discovery and verification research.
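
    To illustrate how statistical power and effect size drive cohort size, the sketch below uses the textbook normal-approximation formula for comparing two group means. This is a generic formula and not the workshop's framework, which the abstract does not specify. The function name and default thresholds are assumptions.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate specimens needed per group for a two-sided, two-sample test.

    effect_size: difference in group means divided by the common
    standard deviation (Cohen's d); must be positive.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


print(samples_per_group(1.0))  # 16
print(samples_per_group(0.5))  # 63
```

    Halving the detectable effect size roughly quadruples the required cohort, which shows how low sample throughput can leave a study underpowered.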

  7. Aptamer-based detection of disease biomarkers in mouse models for chagas drug discovery.

    Directory of Open Access Journals (Sweden)

    Fernanda Fortes de Araujo

    2015-01-01

    Full Text Available Drug discovery initiatives, aimed at Chagas treatment, have been hampered by the lack of standardized drug screening protocols and the absence of simple pre-clinical assays to evaluate treatment efficacy in animal models. In this study, we used a simple Enzyme Linked Aptamer (ELA) assay to detect T. cruzi biomarker in blood and validate murine drug discovery models of Chagas disease. In two mice models, Apt-29 ELA assay demonstrated that biomarker levels were significantly higher in the infected group compared to the control group, and upon Benznidazole treatment, their levels reduced. However, biomarker levels in the infected treated group did not reduce to those seen in the non-infected treated group, with 100% of the mice above the assay cutoff, suggesting that parasitemia was reduced but cure was not achieved. The ELA assay was capable of detecting circulating biomarkers in mice infected with various strains of T. cruzi parasites. Our results showed that the ELA assay could detect residual parasitemia in treated mice by providing an overall picture of the infection in the host. They suggest that the ELA assay can be used in drug discovery applications to assess treatment efficacy in-vivo.

  8. Aptamer-based detection of disease biomarkers in mouse models for chagas drug discovery.

    Science.gov (United States)

    de Araujo, Fernanda Fortes; Nagarkatti, Rana; Gupta, Charu; Marino, Ana Paula; Debrabant, Alain

    2015-01-01

    Drug discovery initiatives, aimed at Chagas treatment, have been hampered by the lack of standardized drug screening protocols and the absence of simple pre-clinical assays to evaluate treatment efficacy in animal models. In this study, we used a simple Enzyme Linked Aptamer (ELA) assay to detect T. cruzi biomarker in blood and validate murine drug discovery models of Chagas disease. In two mice models, Apt-29 ELA assay demonstrated that biomarker levels were significantly higher in the infected group compared to the control group, and upon Benznidazole treatment, their levels reduced. However, biomarker levels in the infected treated group did not reduce to those seen in the non-infected treated group, with 100% of the mice above the assay cutoff, suggesting that parasitemia was reduced but cure was not achieved. The ELA assay was capable of detecting circulating biomarkers in mice infected with various strains of T. cruzi parasites. Our results showed that the ELA assay could detect residual parasitemia in treated mice by providing an overall picture of the infection in the host. They suggest that the ELA assay can be used in drug discovery applications to assess treatment efficacy in-vivo.

  9. In-depth cDNA Library Sequencing Provides Quantitative Gene Expression Profiling in Cancer Biomarker Discovery

    Institute of Scientific and Technical Information of China (English)

    Wanling Yang; Dingge Ying; Yu-Lung Lau

    2009-01-01

    procedures may allow detection of many expression features for less abundant gene variants. With the reduction of sequencing cost and the emergence of new-generation sequencing technology, in-depth sequencing of cDNA pools or libraries may represent a better and more powerful tool for gene expression profiling and cancer biomarker detection. We also propose using sequence-specific subtraction to remove hundreds of the most abundant housekeeping genes to increase sequencing depth without affecting the relative expression ratios of other genes, as transcripts from as few as the 300 most abundantly expressed genes constitute about 20% of the total transcriptome. In-depth sequencing also offers the unique advantage of detecting unknown forms of transcripts, such as alternative splicing variants, fusion genes, and regulatory RNAs, as well as detecting mutations and polymorphisms that may play important roles in disease pathogenesis.

  10. Discovery and evaluation of candidate sex-determining genes and xenobiotics in the gonads of lake sturgeon (Acipenser fulvescens).

    Science.gov (United States)

    Hale, Matthew C; Jackson, James R; Dewoody, J Andrew

    2010-07-01

    Modern pyrosequencing has the potential to uncover many interesting aspects of genome evolution, even in lineages where genomic resources are scarce. In particular, 454 pyrosequencing of nonmodel species has been used to characterize expressed sequence tags, xenobiotics, gene ontologies, and relative levels of gene expression. Herein, we use pyrosequencing to study the evolution of genes expressed in the gonads of a polyploid fish, the lake sturgeon (Acipenser fulvescens). Using 454 pyrosequencing of transcribed genes, we produced more than 125 MB of sequence data from 473,577 high-quality sequencing reads. Sequences that passed stringent quality control thresholds were assembled into 12,791 male contigs and 32,629 female contigs. Average depth of coverage was 4.2x for the male assembly and 5.5x for the female assembly. Analytical rarefaction indicates that our assemblies include most of the genes expressed in lake sturgeon gonads. Over 86,700 sequencing reads were assigned gene ontologies, many to general housekeeping genes like protein, RNA, and ion binding genes. We searched specifically for sex determining genes and documented significant sex differences in the expression of two genes involved in animal sex determination, DMRT1 and TRA-1. DMRT1 is the master sex determining gene in birds and in medaka (Oryzias latipes) whereas TRA-1 helps direct sexual differentiation in nematodes. We also searched the lake sturgeon assembly for evidence of xenobiotic organisms that may exist as endosymbionts. Our results suggest that exogenous parasites (trematodes) and pathogens (protozoans) apparently have infected lake sturgeon gonads, and the trematodes have horizontally transferred some genes to the lake sturgeon genome.

  11. MICROBLOG-BASED THEME DISCOVERY

    Institute of Scientific and Technical Information of China (English)

    何翔; 顾春华; 丁军

    2013-01-01

    To meet the need of microblog marketing to find delivery targets, we propose a method for discovering microblog theme communities that combines content-oriented analysis with link-relationship analysis. In a novel step, the method incorporates link-analysis techniques for authority (leader) discovery, text classification, and max-flow community discovery, and it employs multiple pruning strategies to yield an efficient and accurate microblog theme crawler. Experiments were carried out on collected real-world data, and the results were analyzed along several dimensions.

  12. Discovery potential of xenon-based neutrinoless double beta decay experiments in light of small angular scale CMB observations

    CERN Document Server

    Gomez-Cadenas, J J; Vidal, J Muñoz; Peña-Garay, C

    2013-01-01

    The South Pole Telescope (SPT) has probed an expanded angular range of the CMB temperature power spectrum. Their recent analysis of the latest cosmological data prefers nonzero neutrino masses, m_ν = 0.32 ± 0.11 eV. This result, if confirmed by the upcoming Planck data, has deep implications on the discovery of the nature of neutrinos. In particular, the values of the effective neutrino mass involved in neutrinoless double beta decay (ββ0ν) are severely constrained for both the direct and inverse hierarchy, making a discovery much more likely. In this paper, we focus on xenon-based ββ0ν experiments, on the double grounds of their good performance and the suitability of the technology to large-mass scaling. We show that the current generation, with effective masses in the range of 100 kg and conceivable exposures in the range of 500 kg year, could already have a sizable opportunity to observe ββ0ν events, and their combined discovery potential is quite large. The next generation, with an exposure in the rang...

  13. HANDS: a tool for genome-wide discovery of subgenome-specific base-identity in polyploids.

    KAUST Repository

    Mithani, Aziz

    2013-09-24

    The analysis of polyploid genomes is problematic because homeologous subgenome sequences are closely related. This relatedness makes it difficult to assign individual sequences to the specific subgenome from which they are derived, and hinders the development of polyploid whole genome assemblies. We here present a next-generation sequencing (NGS)-based approach for assignment of subgenome-specific base-identity at sites containing homeolog-specific polymorphisms (HSPs): 'HSP base Assignment using NGS data through Diploid Similarity' (HANDS). We show that HANDS correctly predicts subgenome-specific base-identity at >90% of assayed HSPs in the hexaploid bread wheat (Triticum aestivum) transcriptome, thus providing a substantial increase in accuracy versus previous methods for homeolog-specific base assignment. We conclude that HANDS enables rapid and accurate genome-wide discovery of homeolog-specific base-identity, a capability having multiple applications in polyploid genomics.

  14. Recent progress in polymer-based gene delivery vectors

    Institute of Scientific and Technical Information of China (English)

    HUANG Shiwen; ZHUO Renxi

    2003-01-01

    The gene delivery system, one of the three components of a gene medicine, is the bottleneck of current gene therapy. Nonviral vectors offer advantages over viral systems in safety, ease of manufacturing, etc. As important nonviral vectors, polymer gene delivery systems have gained increasing attention and have begun to show increasing promise. In this review, the fundamentals of and recent progress in polymer-based gene delivery vectors are reviewed.

  15. Evidence based selection of housekeeping genes.

    Directory of Open Access Journals (Sweden)

    Hendrik J M de Jonge

    Full Text Available For accurate and reliable gene expression analysis, normalization of gene expression data against housekeeping genes (reference or internal control genes) is required. It is known that commonly used housekeeping genes (e.g. ACTB, GAPDH, HPRT1, and B2M) vary considerably under different experimental conditions and therefore their use for normalization is limited. We performed a meta-analysis of 13,629 human gene array samples in order to identify the most stable expressed genes. Here we show novel candidate housekeeping genes (e.g. RPS13, RPL27, RPS20 and OAZ1) with enhanced stability among a multitude of different cell types and varying experimental conditions. None of the commonly used housekeeping genes were present in the top 50 of the most stable expressed genes. In addition, using 2,543 diverse mouse gene array samples we were able to confirm the enhanced stability of the candidate novel housekeeping genes in another mammalian species. Therefore, the identified novel candidate housekeeping genes seem to be the most appropriate choice for normalizing gene expression data.
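    The abstract does not say which stability measure the meta-analysis used. The sketch below assumes one simple option, the coefficient of variation (standard deviation divided by mean) across samples, and ranks genes from most to least stable. The gene labels and expression values are made up for illustration.

```python
from statistics import mean, pstdev


def rank_by_stability(expression):
    """Return gene names sorted from most to least stable.

    expression maps a gene name to its expression values across samples.
    Stability is scored by the coefficient of variation (lower is more
    stable); every gene is assumed to have a nonzero mean.
    """
    cv = {gene: pstdev(values) / mean(values) for gene, values in expression.items()}
    return sorted(cv, key=cv.get)


# Hypothetical expression values across four samples.
toy_expression = {
    "candidate_stable": [10.0, 10.5, 9.5, 10.0],
    "candidate_variable": [2.0, 20.0, 8.0, 10.0],
}
ranking = rank_by_stability(toy_expression)
```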

  16. Discovery potential of xenon-based neutrinoless double beta decay experiments in light of small angular scale CMB observations

    Energy Technology Data Exchange (ETDEWEB)

    Gómez-Cadenas, J.J.; Martín-Albo, J.; Vidal, J. Muñoz; Peña-Garay, C. [Instituto de Física Corpuscular (IFIC), CSIC and Universitat de Valencia, Calle Catedrático José Beltrán, 2, 46090 Paterna, Valencia (Spain)]

    2013-03-01

    The South Pole Telescope (SPT) has probed an expanded angular range of the CMB temperature power spectrum. Their recent analysis of the latest cosmological data prefers nonzero neutrino masses, with Σm_ν = (0.32±0.11) eV. This result, if confirmed by the upcoming Planck data, has deep implications on the discovery of the nature of neutrinos. In particular, the values of the effective neutrino mass m_ββ involved in neutrinoless double beta decay (ββ0ν) are severely constrained for both the direct and inverse hierarchy, making a discovery much more likely. In this paper, we focus on xenon-based ββ0ν experiments, on the double grounds of their good performance and the suitability of the technology to large-mass scaling. We show that the current generation, with effective masses in the range of 100 kg and conceivable exposures in the range of 500 kg·year, could already have a sizeable opportunity to observe ββ0ν events, and their combined discovery potential is quite large. The next generation, with an exposure in the range of 10 ton·year, would have a much more enhanced sensitivity, in particular due to the very low specific background that all the xenon technologies (liquid xenon, high-pressure xenon and xenon dissolved in liquid scintillator) can achieve. In addition, a high-pressure xenon gas TPC also features superb energy resolution. We show that such a detector can fully explore the range of allowed effective Majorana masses, thus making a discovery very likely.

  17. Take-Home Challenges: Extending Discovery-Based Activities beyond the General Chemistry Classroom

    Science.gov (United States)

    Mason, P. K.; Sarquis, A. M.

    1996-04-01

    In an effort to more effectively integrate the experimental nature of chemistry into our students' experiences, we are developing and implementing discovery-based activities into both the laboratory and lecture components of general chemistry. Below we describe and provide an example of a "take-home challenge" intended to supplement the lecture component of the course. These take-home challenges involve the student in chemistry exploration outside of class and extend the context of content and experimentation into a nontraditional laboratory environment. Over 25 take-home challenges have been developed to date. Preliminary evaluation of the impact of the take-home challenges shows that students reporting themselves as receiving a B or C grade in the course find the challenges very useful in helping them gain a conceptual understanding of the phenomena addressed. Students earning an A grade report little or no impact on their learning. Prepared as one-page handouts, each take-home challenge begins with a scene-setting introduction followed by pertinent background information, a list of materials to be collected, and any appropriate safety precautions. The exploration component of the activity integrates leading questions with the procedural instructions to help guide the students through the discovery process and challenge them to stretch their understanding of the chemistry. After completing a take-home challenge activity, students submit written reports containing responses to the questions posed, observations of data collected, and their responses to the challenge. The accompanying sample take-home challenge activity is provided as a novel adaptation of the belch phenomenon that challenges students to experiment in order to explain the factors that account for the observed behavior. Persons interested in field testing the take-home challenges with their classes should contact the authors. 
Belch Bottle Challenge: What factors are responsible for the behavior of a

  18. IMG-ABC: new features for bacterial secondary metabolism analysis and targeted biosynthetic gene cluster discovery in thousands of microbial genomes

    Science.gov (United States)

    Hadjithomas, Michalis; Chen, I-Min A.; Chu, Ken; Huang, Jinghua; Ratner, Anna; Palaniappan, Krishna; Andersen, Evan; Markowitz, Victor; Kyrpides, Nikos C.; Ivanova, Natalia N.

    2017-01-01

    Secondary metabolites produced by microbes have diverse biological functions, which makes them a great potential source of biotechnologically relevant compounds with antimicrobial, anti-cancer and other activities. The proteins needed to synthesize these natural products are often encoded by clusters of co-located genes called biosynthetic gene clusters (BCs). In order to advance the exploration of microbial secondary metabolism, we developed the largest publicly available database of experimentally verified and predicted BCs, the Integrated Microbial Genomes Atlas of Biosynthetic gene Clusters (IMG-ABC) (https://img.jgi.doe.gov/abc/). Here, we describe an update of IMG-ABC, which includes ClusterScout, a tool for targeted identification of custom biosynthetic gene clusters across 40 000 isolate microbial genomes, and a new search capability to query more than 700 000 BCs from isolate genomes for clusters with similar Pfam composition. Additional features enable fast exploration and analysis of BCs through two new interactive visualization features, a BC function heatmap and a BC similarity network graph. These new tools and features add to the value of IMG-ABC's vast body of BC data, facilitating their in-depth analysis and accelerating secondary metabolite discovery. PMID:27903896

  19. New insight into genes in association with asthma: literature-based mining and network centrality analysis

    Institute of Scientific and Technical Information of China (English)

    LIANG Rui; WANG Lei; WANG Gang

    2013-01-01

    Background Asthma is a heterogeneous disease for which a strong genetic basis has been firmly established. Until now no studies have been undertaken to systemically explore the network of asthma-related genes using an internally developed literature-based discovery approach. This study was to explore asthma-related genes by using literature-based mining and network centrality analysis. Methods Literature involving asthma-related genes were searched in PubMed from 2001 to 2011. Integration of natural language processing with network centrality analysis was used to identify asthma susceptibility genes and their interaction network. Asthma susceptibility genes were classified into three functional groups by gene ontology (GO) analysis and the key genes were confirmed by establishing asthma-related networks and pathways. Results Three hundred and twenty-six genes related with asthma such as IGHE (IgE), interleukin (IL)-4, 5, 6, 10, 13, 17A, and tumor necrosis factor (TNF)-alpha were identified. GO analysis indicated some biological processes (developmental processes, signal transduction, death, etc.), cellular components (non-structural extracellular, plasma membrane and extracellular matrix), and molecular functions (signal transduction activity) that were involved in asthma. Furthermore, 22 asthma-related pathways such as the Toll-like receptor signaling pathway, hematopoietic cell lineage, JAK-STAT signaling pathway, chemokine signaling pathway, and cytokine-cytokine receptor interaction, and 17 hub genes, such as JAK3, CCR1-3, CCR5-7, CCR8, were found. Conclusions Our study provides a remarkably detailed and comprehensive picture of asthma susceptibility genes and their interacting network. Further identification of these genes and molecular pathways may play a prominent role in establishing rational therapeutic approaches for asthma.
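    The abstract does not state which centrality measure was used to pick hub genes. As a minimal sketch, assuming degree centrality (the fraction of other genes a gene is directly linked to), hubs can be found as follows. The node labels and edges are placeholders, not interactions reported in the study.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Degree centrality for an undirected gene interaction network.

    edges is an iterable of (gene_a, gene_b) pairs. Self-loops are ignored
    and duplicate edges count once. Returns {gene: degree / (n - 1)}.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {gene: 0.0 for gene in neighbors}
    return {gene: len(adjacent) / (n - 1) for gene, adjacent in neighbors.items()}


# Placeholder network: gene_A is linked to every other node.
toy_edges = [
    ("gene_A", "gene_B"),
    ("gene_A", "gene_C"),
    ("gene_A", "gene_D"),
    ("gene_B", "gene_C"),
]
scores = degree_centrality(toy_edges)
hub = max(scores, key=scores.get)
```

    Other measures such as betweenness or closeness centrality would rank genes differently; degree is used here only because it is the simplest to show.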

  20. Developmental gene discovery in a hemimetabolous insect: de novo assembly and annotation of a transcriptome for the cricket Gryllus bimaculatus.

    Directory of Open Access Journals (Sweden)

    Victor Zeng

    Full Text Available Most genomic resources available for insects represent the Holometabola, which are insects that undergo complete metamorphosis like beetles and flies. In contrast, the Hemimetabola (direct developing insects), representing the basal branches of the insect tree, have very few genomic resources. We have therefore created a large and publicly available transcriptome for the hemimetabolous insect Gryllus bimaculatus (cricket), a well-developed laboratory model organism whose potential for functional genetic experiments is currently limited by the absence of genomic resources. cDNA was prepared using mRNA obtained from adult ovaries containing all stages of oogenesis, and from embryo samples on each day of embryogenesis. Using 454 Titanium pyrosequencing, we sequenced over four million raw reads, and assembled them into 21,512 isotigs (predicted transcripts) and 120,805 singletons with an average coverage per base pair of 51.3. We annotated the transcriptome manually for over 400 conserved genes involved in embryonic patterning, gametogenesis, and signaling pathways. BLAST comparison of the transcriptome against the NCBI non-redundant protein database (nr) identified significant similarity to nr sequences for 55.5% of transcriptome sequences, and suggested that the transcriptome may contain 19,874 unique transcripts. For predicted transcripts without significant similarity to known sequences, we assessed their similarity to other orthopteran sequences, and determined that these transcripts contain recognizable protein domains, largely of unknown function. We created a searchable, web-based database to allow public access to all raw, assembled and annotated data. This database is to our knowledge the largest de novo assembled and annotated transcriptome resource available for any hemimetabolous insect. We therefore anticipate that these data will contribute significantly to more effective and higher-throughput deployment of molecular analysis tools in

  1. Coupled Transcriptome and Proteome Analysis of Human Lymphotropic Tumor Viruses: Insights on the Detection and Discovery of Viral Genes

    Energy Technology Data Exchange (ETDEWEB)

    Dresang, Lindsay R.; Teuton, Jeremy R.; Feng, Huichen; Jacobs, Jon M.; Camp, David G.; Purvine, Samuel O.; Gritsenko, Marina A.; Li, Zhihua; Smith, Richard D.; Sugden, Bill; Moore, Patrick S.; Chang, Yuan

    2011-12-20

    Kaposi's sarcoma-associated herpesvirus (KSHV) and Epstein-Barr virus (EBV) are related human tumor viruses that cause primary effusion lymphomas (PEL) and Burkitt's lymphomas (BL), respectively. Viral genes expressed in naturally-infected cancer cells contribute to disease pathogenesis; knowing which viral genes are expressed is critical in understanding how these viruses cause cancer. To evaluate the expression of viral genes, we used high-resolution separation and mass spectrometry coupled with custom tiling arrays to align the viral proteomes and transcriptomes of three PEL and two BL cell lines under latent and lytic culture conditions. Results The majority of viral genes were efficiently detected at the transcript and/or protein level on manipulating the viral life cycle. Overall the correlation of expressed viral proteins and transcripts was highly complementary in both validating and providing orthogonal data with latent/lytic viral gene expression. Our approach also identified novel viral genes in both KSHV and EBV, and extends viral genome annotation. Several previously uncharacterized genes were validated at both transcript and protein levels. Conclusions This systems biology approach coupling proteome and transcriptome measurements provides a comprehensive view of viral gene expression that could not have been attained using each methodology independently. Detection of viral proteins in combination with viral transcripts is a potentially powerful method for establishing virus-disease relationships.

  2. Coupled transcriptome and proteome analysis of human lymphotropic tumor viruses: insights on the detection and discovery of viral genes

    Directory of Open Access Journals (Sweden)

    Dresang Lindsay R

    2011-12-01

    Full Text Available Abstract Background Kaposi's sarcoma-associated herpesvirus (KSHV) and Epstein-Barr virus (EBV) are related human tumor viruses that cause primary effusion lymphomas (PEL) and Burkitt's lymphomas (BL), respectively. Viral genes expressed in naturally-infected cancer cells contribute to disease pathogenesis; knowing which viral genes are expressed is critical in understanding how these viruses cause cancer. To evaluate the expression of viral genes, we used high-resolution separation and mass spectrometry coupled with custom tiling arrays to align the viral proteomes and transcriptomes of three PEL and two BL cell lines under latent and lytic culture conditions. Results The majority of viral genes were efficiently detected at the transcript and/or protein level on manipulating the viral life cycle. Overall the correlation of expressed viral proteins and transcripts was highly complementary in both validating and providing orthogonal data with latent/lytic viral gene expression. Our approach also identified novel viral genes in both KSHV and EBV, and extends viral genome annotation. Several previously uncharacterized genes were validated at both transcript and protein levels. Conclusions This systems biology approach coupling proteome and transcriptome measurements provides a comprehensive view of viral gene expression that could not have been attained using each methodology independently. Detection of viral proteins in combination with viral transcripts is a potentially powerful method for establishing virus-disease relationships.

  3. Combining SNP discovery from next-generation sequencing data with bulked segregant analysis (BSA) to fine-map genes in polyploid wheat

    Directory of Open Access Journals (Sweden)

    Trick Martin

    2012-01-01

    Full Text Available Abstract Background Next generation sequencing (NGS) technologies are providing new ways to accelerate fine-mapping and gene isolation in many species. To date, the majority of these efforts have focused on diploid organisms with readily available whole genome sequence information. In this study, as a proof of concept, we tested the use of NGS for SNP discovery in tetraploid wheat lines differing for the previously cloned grain protein content (GPC) gene GPC-B1. Bulked segregant analysis (BSA) was used to define a subset of putative SNPs within the candidate gene region, which were then used to fine-map GPC-B1. Results We used Illumina paired end technology to sequence mRNA (RNAseq) from near isogenic lines differing across a ~30-cM interval including the GPC-B1 locus. After discriminating for SNPs between the two homoeologous wheat genomes and additional quality filtering, we identified inter-varietal SNPs in wheat unigenes between the parental lines. The relative frequency of these SNPs was examined by RNAseq in two bulked samples made up of homozygous recombinant lines differing for their GPC phenotype. SNPs that were enriched at least 3-fold in the corresponding pool (6.5% of all SNPs) were further evaluated. Marker assays were designed for a subset of the enriched SNPs and mapped using DNA from individuals of each bulk. Thirty nine new SNP markers, corresponding to 67% of the validated SNPs, mapped across a 12.2-cM interval including GPC-B1. This translated to 1 SNP marker per 0.31 cM defining the GPC-B1 gene to within 13-18 genes in syntenic cereal genomes and to a 0.4 cM interval in wheat. Conclusions This study exemplifies the use of RNAseq for SNP discovery in polyploid species and supports the use of BSA as an effective way to target SNPs to specific genetic intervals to fine-map genes in unsequenced genomes.
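    The 3-fold enrichment filter and the marker density quoted above can be illustrated with a short sketch. The abstract does not define how enrichment was computed, so the allele-frequency ratio with a pseudocount used here is an assumption, and all read counts are hypothetical. Only the 39 markers and the 12.2-cM interval come from the abstract.

```python
def fold_enrichment(alt_a, total_a, alt_b, total_b, pseudocount=0.5):
    """Ratio of alternate-allele frequencies between bulk A and bulk B.

    alt_* are read counts supporting the alternate allele and total_* are
    total read depths at the SNP. The pseudocount avoids division by zero.
    """
    freq_a = (alt_a + pseudocount) / (total_a + 2 * pseudocount)
    freq_b = (alt_b + pseudocount) / (total_b + 2 * pseudocount)
    return freq_a / freq_b


def is_enriched(alt_a, total_a, alt_b, total_b, min_fold=3.0):
    """True if the allele is at least min_fold enriched in either bulk."""
    ratio = fold_enrichment(alt_a, total_a, alt_b, total_b)
    return ratio >= min_fold or ratio <= 1 / min_fold


# Marker density reported in the abstract: 39 markers over 12.2 cM.
cm_per_marker = 12.2 / 39
```

    With the abstract's numbers, `cm_per_marker` rounds to the reported 0.31 cM per marker.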

  4. Determinants of Power in Gene-Based Burden Testing for Monogenic Disorders.

    Science.gov (United States)

    Guo, Michael H; Dauber, Andrew; Lippincott, Margaret F; Chan, Yee-Ming; Salem, Rany M; Hirschhorn, Joel N

    2016-09-01

    Whole-exome sequencing has enabled new approaches for discovering genes associated with monogenic disorders. One such approach is gene-based burden testing, in which the aggregate frequency of "qualifying variants" is compared between case and control subjects for each gene. Despite substantial successes of this approach, the genetic causes for many monogenic disorders remain unknown or only partially known. It is possible that particular genetic architectures lower rates of discovery, but the influence of these factors on power has not been rigorously evaluated. Here, we leverage large-scale exome-sequencing data to create an empirically based simulation framework to evaluate the impact of key parameters (background variation rates, locus heterogeneity, mode of inheritance, penetrance) on power in gene-based burden tests in the context of monogenic disorders. Our results demonstrate that across genes, there is a wide range in sample sizes needed to achieve power due to differences in the background rate of rare variants in each gene. Increasing locus heterogeneity results in rapid increases in sample sizes needed to achieve adequate power, particularly when individual genes contribute to less than 5% of cases under a dominant model. Interestingly, incomplete penetrance as low as 10% had little effect on power due to the low prevalence of monogenic disorders. Our results suggest that moderate incomplete penetrance is not an obstacle in this gene-based burden testing approach but that dominant disorders with high locus heterogeneity will require large sample sizes. Our simulations also provide guidance on sample size needs and inform study design under various genetic architectures.
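    To make the comparison of "qualifying variant" frequencies concrete, the sketch below tests a single gene with a one-sided Fisher's exact test on carrier counts. The abstract does not name the test statistic used in the simulations, so Fisher's exact test is an assumption here, and the carrier counts in the example are hypothetical.

```python
from math import comb


def burden_test_p_value(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher's exact test for excess carriers among cases.

    Returns the probability, under no association, of seeing at least
    case_carriers carriers among the cases given the total number of
    carriers (hypergeometric upper tail).
    """
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    upper = min(carriers, n_cases)
    tail = sum(
        comb(n_cases, k) * comb(n_controls, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / comb(total, carriers)


# Hypothetical gene: 5 of 10 cases carry a qualifying variant, 0 of 10 controls.
p_value = burden_test_p_value(5, 10, 0, 10)
```

    In an exome-wide scan this test would be repeated for every gene, so the resulting p-values need a multiple-testing correction before any gene is declared significant.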

  5. Computing gene expression data with a knowledge-based gene clustering approach.

    Science.gov (United States)

    Rosa, Bruce A; Oh, Sookyung; Montgomery, Beronda L; Chen, Jin; Qin, Wensheng

    2010-01-01

    Computational analysis methods for gene expression data gathered in microarray experiments can be used to identify the functions of previously unstudied genes. While obtaining the expression data is not a difficult task, interpreting and extracting the information from the datasets is challenging. In this study, a knowledge-based approach which identifies and saves important functional genes before filtering based on variability and fold change differences was utilized to study light regulation. Two clustering methods were used to cluster the filtered datasets, and clusters containing a key light regulatory gene were located. The common genes to both of these clusters were identified, and the genes in the common cluster were ranked based on their coexpression to the key gene. This process was repeated for 11 key genes in 3 treatment combinations. The initial filtering method reduced the dataset size from 22,814 probes to an average of 1134 genes, and the resulting common cluster lists contained an average of only 14 genes. These common cluster lists scored higher gene enrichment scores than two individual clustering methods. In addition, the filtering method increased the proportion of light responsive genes in the dataset from 1.8% to 15.2%, and the cluster lists increased this proportion to 18.4%. The relatively short length of these common cluster lists compared to gene groups generated through typical clustering methods or coexpression networks narrows the search for novel functional genes while increasing the likelihood that they are biologically relevant.
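    The final step described above, taking the genes common to two clusters and ranking them by coexpression with the key gene, can be sketched as follows. Pearson correlation is assumed as the coexpression measure because the abstract does not specify one. Gene names, cluster memberships, and expression profiles are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def rank_common_cluster(cluster_a, cluster_b, expression, key_gene):
    """Genes found in both clusters, ranked by correlation with key_gene."""
    common = (set(cluster_a) & set(cluster_b)) - {key_gene}
    scored = [(g, pearson(expression[g], expression[key_gene])) for g in common]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical profiles across four conditions.
toy_expression = {
    "KEY": [1, 2, 3, 4],
    "g1": [2, 4, 6, 8],
    "g2": [1, 3, 2, 4],
    "g3": [4, 3, 2, 1],
    "g4": [1, 1, 2, 2],
}
ranked = rank_common_cluster(
    {"KEY", "g1", "g2", "g3"}, {"KEY", "g1", "g2", "g4"}, toy_expression, "KEY"
)
```

    Only g1 and g2 appear in both clusters, so they are the only genes ranked; genes present in just one cluster are dropped, which is what keeps the common cluster lists short.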

  6. Discovery of genes related to witches broom disease in Paulownia tomentosa × Paulownia fortunei by a De Novo assembled transcriptome.

    Science.gov (United States)

    Liu, Rongning; Dong, Yanpeng; Fan, Guoqiang; Zhao, Zhenli; Deng, Minjie; Cao, Xibing; Niu, Suyan

    2013-01-01

    In spite of its economic importance, very little molecular genetics and genomic research has been targeted at the family Paulownia spp. The little genetic information on this plant is a big obstacle to studying the mechanisms of its ability to resist Paulownia Witches' Broom (PaWB) disease. Analysis of the Paulownia transcriptome and its expression profile data is essential to extending the genetic resources on this species, and thus will greatly improve our studies on Paulownia. In the current study, we performed the de novo assembly of a transcriptome on P. tomentosa × P. fortunei using the short-read sequencing technology (Illumina). 203,664 unigenes with a mean length of 1,328 bp were obtained. Of these unigenes, 32,976 (30% of all unigenes) containing complete structures were chosen. Eukaryotic clusters of orthologous groups, gene orthology, and the Kyoto Encyclopedia of Genes and Genomes annotations were performed on these unigenes. Genes related to PaWB disease resistance were analyzed in detail. To our knowledge, this is the first study to elucidate the genetic makeup of Paulownia. This transcriptome provides a quick way to understand Paulownia, increases the number of gene sequences available for further functional genomics studies and provides clues to the identification of potential PaWB disease resistance genes. This study has provided a comprehensive insight into gene expression profiles at different states, which facilitates the study of each gene's roles in the developmental process and in PaWB disease resistance.

  7. Discovery of genes related to witches broom disease in Paulownia tomentosa × Paulownia fortunei by a De Novo assembled transcriptome.

    Directory of Open Access Journals (Sweden)

    Rongning Liu

    Full Text Available In spite of its economic importance, very little molecular genetics and genomic research has been targeted at the family Paulownia spp. The scarcity of genetic information on this plant is a major obstacle to studying the mechanisms of its ability to resist Paulownia Witches' Broom (PaWB) disease. Analysis of the Paulownia transcriptome and its expression profile data is essential to extending the genetic resources on this species and will greatly improve studies of Paulownia. In the current study, we performed the de novo assembly of a transcriptome on P. tomentosa × P. fortunei using short-read sequencing technology (Illumina). A total of 203,664 unigenes with a mean length of 1,328 bp were obtained. Of these unigenes, 32,976 (30% of all unigenes) containing complete structures were chosen. Eukaryotic clusters of orthologous groups, gene orthology, and Kyoto Encyclopedia of Genes and Genomes annotations were performed on these unigenes. Genes related to PaWB disease resistance were analyzed in detail. To our knowledge, this is the first study to elucidate the genetic makeup of Paulownia. This transcriptome provides a quick way to understand Paulownia, increases the number of gene sequences available for further functional genomics studies and provides clues to the identification of potential PaWB disease resistance genes. This study has provided a comprehensive insight into gene expression profiles at different states, which facilitates the study of each gene's roles in the developmental process and in PaWB disease resistance.

  8. The Discovery of Quinoxaline-Based Metathesis Catalysts from Synthesis of Grazoprevir (MK-5172).

    Science.gov (United States)

    Williams, Michael J; Kong, Jongrock; Chung, Cheol K; Brunskill, Andrew; Campeau, Louis-Charles; McLaughlin, Mark

    2016-05-01

    Olefin metathesis (OM) is a reliable and practical synthetic methodology for challenging carbon-carbon bond formations. While existing catalysts can effect many of these transformations, the synthesis and development of new catalysts is essential to increase the application breadth of OM and to achieve improved catalyst activity. The unexpected initial discovery of a novel olefin metathesis catalyst derived from synthetic efforts toward the HCV therapeutic agent grazoprevir (MK-5172) is described. This initial finding has evolved into a class of tunable, shelf-stable ruthenium OM catalysts that are easily prepared and exhibit unique catalytic activity.

  9. Genome-based discovery, structure prediction and functional analysis of cyclic lipopeptide antibiotics in Pseudomonas species

    NARCIS (Netherlands)

    Bruijn, de I.; Kock, de M.J.D.; Meng, Y.; Waard, de P.; Beek, van T.A.; Raaijmakers, J.M.

    2007-01-01

    Analysis of microbial genome sequences has revealed numerous genes involved in antibiotic biosynthesis. In Pseudomonads, several gene clusters encoding non-ribosomal peptide synthetases (NRPSs) were predicted to be involved in the synthesis of cyclic lipopeptide (CLP) antibiotics. Most of these pre

  10. Structuring osteosarcoma knowledge: an osteosarcoma-gene association database based on literature mining and manual annotation.

    Science.gov (United States)

    Poos, Kathrin; Smida, Jan; Nathrath, Michaela; Maugg, Doris; Baumhoer, Daniel; Neumann, Anna; Korsching, Eberhard

    2014-01-01

    Osteosarcoma (OS) is the most common primary bone cancer exhibiting high genomic instability. This genomic instability affects multiple genes and microRNAs to a varying extent depending on patient and tumor subtype. Massive research is ongoing to identify genes including their gene products and microRNAs that correlate with disease progression and might be used as biomarkers for OS. However, the genomic complexity hampers the identification of reliable biomarkers. Up to now, clinico-pathological factors are the key determinants to guide prognosis and therapeutic treatments. Each day, new studies about OS are published and complicate the acquisition of information to support biomarker discovery and therapeutic improvements. Thus, it is necessary to provide a structured and annotated view on the current OS knowledge that is quickly and easily accessible to researchers of the field. Therefore, we developed a publicly available database and Web interface that serves as a resource for OS-associated genes and microRNAs. Genes and microRNAs were collected using an automated dictionary-based gene recognition procedure followed by manual review and annotation by experts of the field. In total, 911 genes and 81 microRNAs related to 1331 PubMed abstracts were collected (last update: 29 October 2013). Users can evaluate genes and microRNAs according to their potential prognostic and therapeutic impact, the experimental procedures, the sample types, the biological contexts and microRNA target gene interactions. Additionally, a pathway enrichment analysis of the collected genes highlights different aspects of OS progression. OS requires pathways commonly deregulated in cancer but also features OS-specific alterations like deregulated osteoclast differentiation. To our knowledge, this is the first effort of an OS database containing manually reviewed and annotated up-to-date OS knowledge. It might be a useful resource especially for the bone tumor research community, as specific

  11. De novo assembly, gene annotation, and marker discovery in stored-product pest Liposcelis entomophila (Enderlein) using transcriptome sequences.

    Directory of Open Access Journals (Sweden)

    Dan-Dan Wei

    Full Text Available BACKGROUND: As a major stored-product pest insect, Liposcelis entomophila has developed high levels of resistance to various insecticides in grain storage systems. However, the molecular mechanisms underlying resistance and environmental stress have not been characterized. To date, there is a lack of genomic information for this species. Therefore, studies aimed at profiling the L. entomophila transcriptome would provide a better understanding of the biological functions at the molecular levels. METHODOLOGY/PRINCIPAL FINDINGS: We applied Illumina sequencing technology to sequence the transcriptome of L. entomophila. A total of 54,406,328 clean reads were obtained and de novo assembled into 54,220 unigenes, with an average length of 571 bp. Through a similarity search, 33,404 (61.61%) unigenes were matched to known proteins in the NCBI non-redundant (Nr) protein database. These unigenes were further functionally annotated with gene ontology (GO), cluster of orthologous groups of proteins (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. A large number of genes potentially involved in insecticide resistance were manually curated, including 68 putative cytochrome P450 genes, 37 putative glutathione S-transferase (GST) genes, 19 putative carboxyl/cholinesterase (CCE) genes, and 126 other transcripts containing target site sequences or encoding detoxification genes representing eight types of resistance enzymes. Furthermore, to gain insight into the molecular basis of the L. entomophila response to thermal stresses, 25 heat shock protein (Hsp) genes were identified. In addition, 1,100 SSRs and 57,757 SNPs were detected and 231 pairs of SSR primers were designed for investigating the genetic diversity in the future. CONCLUSIONS/SIGNIFICANCE: We developed a comprehensive transcriptomic database for L. entomophila. These sequences and putative molecular markers would further promote our understanding of the molecular mechanisms underlying

  12. Discovery of genes related to insecticide resistance in Bactrocera dorsalis by functional genomic analysis of a de novo assembled transcriptome.

    Directory of Open Access Journals (Sweden)

    Ju-Chun Hsu

    Full Text Available Insecticide resistance has recently become a critical concern for control of many insect pest species. Genome sequencing and global quantization of gene expression through analysis of the transcriptome can provide useful information relevant to this challenging problem. The oriental fruit fly, Bactrocera dorsalis, is one of the world's most destructive agricultural pests, and recently it has been used as a target for studies of genetic mechanisms related to insecticide resistance. However, prior to this study, the molecular data available for this species was largely limited to genes identified through homology. To provide a broader pool of gene sequences of potential interest with regard to insecticide resistance, this study uses whole transcriptome analysis developed through de novo assembly of short reads generated by next-generation sequencing (NGS). The transcriptome of B. dorsalis was initially constructed using Illumina's Solexa sequencing technology. Qualified reads were assembled into contigs and potential splicing variants (isotigs). A total of 29,067 isotigs have putative homologues in the non-redundant (nr) protein database from NCBI, and 11,073 of these correspond to distinct D. melanogaster proteins in the RefSeq database. Approximately 5,546 isotigs contain coding sequences that are at least 80% complete and appear to represent B. dorsalis genes. We observed a strong correlation between the completeness of the assembled sequences and the expression intensity of the transcripts. The assembled sequences were also used to identify large numbers of genes potentially belonging to families related to insecticide resistance. A total of 90 P450-, 42 GST- and 37 COE-related genes, representing three major enzyme families involved in insecticide metabolism and resistance, were identified. In addition, 36 isotigs were discovered to contain target site sequences related to four classes of resistance genes. Identified sequence motifs were also

  13. Impact of computational structure-based predictive toxicology in drug discovery.

    Science.gov (United States)

    Mohan, Chethampadi Gopi

    2011-06-01

    Computational tools for predicting toxicity have been envisioned to have the potential to broadly impact upon the attrition rate of compounds in pre-clinical drug discovery and development. An integrated approach of computer-assisted, predictive, and physico-chemical properties of a compound, along with its in vitro and in vivo analysis, needs to be routinely exercised in the lead identification and lead optimization processes. Starting with a good lead can save a lot of money and can significantly shorten the entire drug discovery process. The journey towards the triple R's (reduce, replace and refine) further proves to be successful in predicting adverse drug reactions in patients (or animals) enrolled in clinical trials. However, the impact of predictive toxicity analysis was modest and relatively narrow in scope, due to the limited domain knowledge in this field. It is important to note that advances within medical science and newer approaches in drug development will require predictive toxicology applications to be viable. The field of computational toxicology has been heading in a direction more relevant to human diseases by reducing the adverse drug reactions. Therefore, efforts must be directed to integrating these tools relevant to the goal of preventing undesired toxicity in pre-clinical trials followed by different phases of clinical trials.

  14. A DHT-Based Discovery Service for the Internet of Things

    Directory of Open Access Journals (Sweden)

    Federica Paganelli

    2012-01-01

    Full Text Available Current trends towards the Future Internet are envisaging the conception of novel services endowed with context-aware and autonomic capabilities to improve end users’ quality of life. The Internet of Things paradigm is expected to contribute towards this ambitious vision by proposing models and mechanisms enabling the creation of networks of “smart things” on a large scale. It is widely recognized that efficient mechanisms for discovering available resources and capabilities are required to realize such vision. The contribution of this work consists in a novel discovery service for the Internet of Things. The proposed solution adopts a peer-to-peer approach for guaranteeing scalability, robustness, and easy maintenance of the overall system. While most existing peer-to-peer discovery services proposed for the IoT support solely exact match queries on a single attribute (i.e., the object identifier), our solution can handle multiattribute and range queries. We defined a layered approach by distinguishing three main aspects: multiattribute indexing, range query support, peer-to-peer routing. We chose to adopt an over-DHT indexing scheme to guarantee ease of design and implementation principles. We report on the implementation of a Proof of Concept in a dangerous goods monitoring scenario, and, finally, we discuss test results for structural properties and query performance evaluation.
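
    To make the layering concrete, the sketch below shows one simple way to answer multiattribute range queries using only the exact-match put/get that a DHT offers. It is an assumption-laden toy, not the indexing scheme of the paper: the in-memory `ToyDHT` class, the fixed bucket width, the key format and the sensor attributes are all hypothetical, and fixed-width bucketing is swapped in for whatever over-DHT index the authors actually used.

```python
import hashlib
from collections import defaultdict

BUCKET = 10  # assumed bucket width shared by all attributes


class ToyDHT:
    """In-memory stand-in for a DHT: exact-match put/get on hashed keys."""

    def __init__(self):
        self._store = defaultdict(set)

    @staticmethod
    def _hash(key):
        return hashlib.sha1(key.encode()).hexdigest()

    def put(self, key, value):
        self._store[self._hash(key)].add(value)

    def get(self, key):
        return set(self._store.get(self._hash(key), set()))


def index_object(dht, object_id, attrs):
    """Store one entry per attribute under the key '<attribute>:<bucket>'."""
    for name, value in attrs.items():
        dht.put(f"{name}:{int(value // BUCKET)}", (object_id, name, value))


def range_query(dht, name, lo, hi):
    """Fetch every bucket overlapping [lo, hi], then filter exact values."""
    hits = set()
    for bucket in range(int(lo // BUCKET), int(hi // BUCKET) + 1):
        for object_id, _, value in dht.get(f"{name}:{bucket}"):
            if lo <= value <= hi:
                hits.add(object_id)
    return hits


def multi_query(dht, ranges):
    """Intersect the per-attribute range results."""
    results = [range_query(dht, n, lo, hi) for n, (lo, hi) in ranges.items()]
    return set.intersection(*results) if results else set()


dht = ToyDHT()
index_object(dht, "tank1", {"temp": 25, "pressure": 101})
index_object(dht, "tank2", {"temp": 42, "pressure": 99})
index_object(dht, "tank3", {"temp": 18, "pressure": 150})

matches = multi_query(dht, {"temp": (20, 45), "pressure": (95, 105)})
print(sorted(matches))
```

    The design choice illustrated here is that range and multiattribute support live entirely above the DHT, so the routing layer needs no changes; the cost is one lookup per bucket touched by the range.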

  15. Biomarker discovery by proteomics-based approaches for early detection and personalized medicine in colorectal cancer.

    Science.gov (United States)

    Corbo, Claudia; Cevenini, Armando; Salvatore, Francesco

    2016-12-26

    About one million people per year develop colorectal cancer (CRC) and approximately half of them die. The extent of the disease (i.e. local invasion at the time of diagnosis) is a key prognostic factor. The 5-year survival rate is almost 90% in the case of delimited CRC and 10% in the case of metastasized CRC. Hence, one of the great challenges in the battle against CRC is to improve early diagnosis strategies. Large-scale proteomic approaches are widely used in cancer research to search for novel biomarkers. Such biomarkers can help in improving the accuracy of the diagnosis and in the optimization of personalized therapy. Herein, we provide an overview of studies published in the last 5 years on CRC that led to the identification of protein biomarkers suitable for clinical application by using proteomic approaches. We discussed these findings according to biomarker application, including also the role of protein phosphorylation and cancer stem cells in biomarker discovery. Our review provides a cross section of scientific approaches and can furnish suggestions for future experimental strategies to be used as reference by scientists, clinicians and researchers interested in proteomics for biomarker discovery.

  16. Accelerating Novel Candidate Gene Discovery in Neurogenetic Disorders via Whole-Exome Sequencing of Prescreened Multiplex Consanguineous Families

    Directory of Open Access Journals (Sweden)

    Anas M. Alazami

    2015-01-01

    Full Text Available Our knowledge of disease genes in neurological disorders is incomplete. With the aim of closing this gap, we performed whole-exome sequencing on 143 multiplex consanguineous families in whom known disease genes had been excluded by autozygosity mapping and candidate gene analysis. This prescreening step led to the identification of 69 recessive genes not previously associated with disease, of which 33 are here described (SPDL1, TUBA3E, INO80, NID1, TSEN15, DMBX1, CLHC1, C12orf4, WDR93, ST7, MATN4, SEC24D, PCDHB4, PTPN23, TAF6, TBCK, FAM177A1, KIAA1109, MTSS1L, XIRP1, KCTD3, CHAF1B, ARV1, ISCA2, PTRH2, GEMIN4, MYOCD, PDPR, DPH1, NUP107, TMEM92, EPB41L4A, and FAM120AOS). We also encountered instances in which the phenotype departed significantly from the established clinical presentation of a known disease gene. Overall, a likely causal mutation was identified in >73% of our cases. This study contributes to the global effort toward a full compendium of disease genes affecting brain function.

  17. Accelerating novel candidate gene discovery in neurogenetic disorders via whole-exome sequencing of prescreened multiplex consanguineous families.

    Science.gov (United States)

    Alazami, Anas M; Patel, Nisha; Shamseldin, Hanan E; Anazi, Shamsa; Al-Dosari, Mohammed S; Alzahrani, Fatema; Hijazi, Hadia; Alshammari, Muneera; Aldahmesh, Mohammed A; Salih, Mustafa A; Faqeih, Eissa; Alhashem, Amal; Bashiri, Fahad A; Al-Owain, Mohammed; Kentab, Amal Y; Sogaty, Sameera; Al Tala, Saeed; Temsah, Mohamad-Hani; Tulbah, Maha; Aljelaify, Rasha F; Alshahwan, Saad A; Seidahmed, Mohammed Zain; Alhadid, Adnan A; Aldhalaan, Hesham; AlQallaf, Fatema; Kurdi, Wesam; Alfadhel, Majid; Babay, Zainab; Alsogheer, Mohammad; Kaya, Namik; Al-Hassnan, Zuhair N; Abdel-Salam, Ghada M H; Al-Sannaa, Nouriya; Al Mutairi, Fuad; El Khashab, Heba Y; Bohlega, Saeed; Jia, Xiaofei; Nguyen, Henry C; Hammami, Rakad; Adly, Nouran; Mohamed, Jawahir Y; Abdulwahab, Firdous; Ibrahim, Niema; Naim, Ewa A; Al-Younes, Banan; Meyer, Brian F; Hashem, Mais; Shaheen, Ranad; Xiong, Yong; Abouelhoda, Mohamed; Aldeeri, Abdulrahman A; Monies, Dorota M; Alkuraya, Fowzan S

    2015-01-13

    Our knowledge of disease genes in neurological disorders is incomplete. With the aim of closing this gap, we performed whole-exome sequencing on 143 multiplex consanguineous families in whom known disease genes had been excluded by autozygosity mapping and candidate gene analysis. This prescreening step led to the identification of 69 recessive genes not previously associated with disease, of which 33 are here described (SPDL1, TUBA3E, INO80, NID1, TSEN15, DMBX1, CLHC1, C12orf4, WDR93, ST7, MATN4, SEC24D, PCDHB4, PTPN23, TAF6, TBCK, FAM177A1, KIAA1109, MTSS1L, XIRP1, KCTD3, CHAF1B, ARV1, ISCA2, PTRH2, GEMIN4, MYOCD, PDPR, DPH1, NUP107, TMEM92, EPB41L4A, and FAM120AOS). We also encountered instances in which the phenotype departed significantly from the established clinical presentation of a known disease gene. Overall, a likely causal mutation was identified in >73% of our cases. This study contributes to the global effort toward a full compendium of disease genes affecting brain function.

  18. Sleeping Beauty Transposon Mutagenesis as a Tool for Gene Discovery in the NOD Mouse Model of Type 1 Diabetes

    Science.gov (United States)

    Elso, Colleen M.; Chu, Edward P. F.; Alsayb, May A.; Mackin, Leanne; Ivory, Sean T.; Ashton, Michelle P.; Bröer, Stefan; Silveira, Pablo A.; Brodnicki, Thomas C.

    2015-01-01

    A number of different strategies have been used to identify genes for which genetic variation contributes to type 1 diabetes (T1D) pathogenesis. Genetic studies in humans have identified >40 loci that affect the risk for developing T1D, but the underlying causative alleles are often difficult to pinpoint or have subtle biological effects. A complementary strategy to identifying “natural” alleles in the human population is to engineer “artificial” alleles within inbred mouse strains and determine their effect on T1D incidence. We describe the use of the Sleeping Beauty (SB) transposon mutagenesis system in the nonobese diabetic (NOD) mouse strain, which harbors a genetic background predisposed to developing T1D. Mutagenesis in this system is random, but a green fluorescent protein (GFP)-polyA gene trap within the SB transposon enables early detection of mice harboring transposon-disrupted genes. The SB transposon also acts as a molecular tag to, without additional breeding, efficiently identify mutated genes and prioritize mutant mice for further characterization. We show here that the SB transposon is functional in NOD mice and can produce a null allele in a novel candidate gene that increases diabetes incidence. We propose that SB transposon mutagenesis could be used as a complementary strategy to traditional methods to help identify genes that, when disrupted, affect T1D pathogenesis. PMID:26438296

  19. Gene discovery from Jatropha curcas by sequencing of ESTs from normalized and full-length enriched cDNA library from developing seeds

    Directory of Open Access Journals (Sweden)

    Sugantham Priyanka Annabel

    2010-10-01

    Full Text Available Abstract Background Jatropha curcas L. is promoted as an important non-edible biodiesel crop worldwide. Jatropha oil, which is a triacylglycerol, can be directly blended with petro-diesel or transesterified with methanol and used as biodiesel. Genetic improvement in jatropha is needed to increase the seed yield, oil content, drought and pest resistance, and to modify oil composition so that it becomes a technically and economically preferred source for biodiesel production. However, genetic improvement efforts in jatropha could not take advantage of genetic engineering methods due to lack of cloned genes from this species. To overcome this hurdle, the current gene discovery project was initiated with the objective of isolating as many functional genes as possible from J. curcas by large-scale sequencing of expressed sequence tags (ESTs). Results A normalized and full-length enriched cDNA library was constructed from developing seeds of J. curcas. The cDNA library contained about 1 × 10⁶ clones and the average insert size of the clones was 2.1 kb. In total, 12,084 ESTs were sequenced to an average high-quality read length of 576 bp. Contig analysis revealed 2258 contigs and 4751 singletons. Contig size ranged from 2-23 and there were 7333 ESTs in the contigs. This resulted in 7009 unigenes, which were annotated by BLASTX. It showed 3982 unigenes with significant similarity to known genes and 2836 unigenes with significant similarity to genes of unknown, hypothetical and putative proteins. The remaining 191 unigenes, which did not show similarity with any genes in the public database, may encode unique genes. Functional classification revealed unigenes related to a broad range of cellular, molecular and biological functions. Among the 7009 unigenes, 6233 unigenes were identified to be potential full-length genes. Conclusions The high quality normalized cDNA library was constructed from developing seeds of J. curcas for the first time and 7009 unigenes coding
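
    The counts reported in this abstract are internally consistent, which a few lines of arithmetic confirm. The sketch below only re-adds the figures quoted above; the variable names are ours, and the full-length percentage is derived here rather than stated in the record.

```python
# Figures quoted in the abstract.
total_ests = 12084
contigs = 2258
singletons = 4751
ests_in_contigs = 7333
unigenes_reported = 7009

# Unigenes are contigs plus singletons.
unigenes = contigs + singletons

# Every EST is either assembled into a contig or left as a singleton.
ests_accounted_for = ests_in_contigs + singletons

# BLASTX annotation classes should partition the unigene set.
annotation = {
    "known genes": 3982,
    "unknown/hypothetical/putative proteins": 2836,
    "no similarity": 191,
}
annotated_total = sum(annotation.values())

# Share of unigenes flagged as potential full-length genes (derived).
full_length_pct = round(100 * 6233 / unigenes_reported, 1)

print(unigenes, ests_accounted_for, annotated_total, full_length_pct)
```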

  20. GO-based Functional Dissimilarity of Gene Sets

    Directory of Open Access Journals (Sweden)

    Aguilar-Ruiz Jesús S

    2011-09-01

    Full Text Available Abstract Background The Gene Ontology (GO) provides a controlled vocabulary for describing the functions of genes and can be used to evaluate the functional coherence of gene sets. Many functional coherence measures consider each pair of gene functions in a set and produce an output based on all pairwise distances. A single gene can encode multiple proteins that may differ in function. For each functionality, other proteins that exhibit the same activity may also participate. Therefore, an identification of the most common function for all of the genes involved in a biological process is important in evaluating the functional similarity of groups of genes, and a quantification of functional coherence can help to clarify the role of a group of genes working together. Results To implement this approach to functional assessment, we present GFD (GO-based Functional Dissimilarity), a novel dissimilarity measure for evaluating groups of genes based on the most relevant functions of the whole set. The measure assigns a numerical value to the gene set for each of the three GO sub-ontologies. Conclusions Results show that GFD performs robustly when applied to gene sets of known functionality (extracted from KEGG). It performs particularly well on randomly generated gene sets. An ROC analysis reveals that the performance of GFD in evaluating the functional dissimilarity of gene sets is very satisfactory. A comparative analysis against other functional measures, such as GS2 and those presented by Resnik and Wang, also demonstrates the robustness of GFD.

  1. Climate Discovery: Integrating Research With Exhibit, Public Tours, K-12, and Web-based EPO Resources

    Science.gov (United States)

    Foster, S. Q.; Carbone, L.; Gardiner, L.; Johnson, R.; Russell, R.; Advisory Committee, S.; Ammann, C.; Lu, G.; Richmond, A.; Maute, A.; Haller, D.; Conery, C.; Bintner, G.

    2005-12-01

    The Climate Discovery Exhibit at the National Center for Atmospheric Research (NCAR) Mesa Lab provides an exciting conceptual outline for the integration of several EPO activities with other well-established NCAR educational resources and programs. The exhibit is organized into four topic areas intended to build understanding among NCAR's 80,000 annual visitors, including 10,000 school children, about Earth system processes and scientific methods contributing to a growing body of knowledge about climate and global change. These topics include: 'Sun-Earth Connections,' 'Climate Now,' 'Climate Past,' and 'Climate Future.' Exhibit text, graphics, film and electronic media, and interactives are developed and updated through collaborations between NCAR's climate research scientists and staff in the Office of Education and Outreach (EO) at the University Corporation for Atmospheric Research (UCAR). With funding from NCAR, paleoclimatologists have contributed data and ideas for a new exhibit Teachers' Guide unit about 'Climate Past.' This collection of middle-school level, standards-aligned lessons is intended to help students gain understanding about how scientists use proxy data and direct observations to describe past climates. Two NASA EPO's have funded the development of 'Sun-Earth Connection' lessons, visual media, and tips for scientists and teachers. Integrated with related content and activities from the NASA-funded Windows to the Universe web site, these products have been adapted to form a second unit in the Climate Discovery Teachers' Guide about the Sun's influence on Earth's climate. Other lesson plans, previously developed through ongoing efforts of EO staff and NSF's previously funded Project Learn program, are providing content for a third Teachers' Guide unit on 'Climate Now' - the dynamic atmospheric and geological processes that regulate Earth's climate. EO has plans to collaborate with NCAR climatologists and computer modelers in the next year to develop

  2. Knowledge-Based, Central Nervous System (CNS) Lead Selection and Lead Optimization for CNS Drug Discovery.

    Science.gov (United States)

    Ghose, Arup K; Herbertz, Torsten; Hudkins, Robert L; Dorsey, Bruce D; Mallamo, John P

    2012-01-18

    The central nervous system (CNS) is the major area that is affected by aging. Alzheimer's disease (AD), Parkinson's disease (PD), brain cancer, and stroke are the CNS diseases that will cost trillions of dollars for their treatment. Achievement of appropriate blood-brain barrier (BBB) penetration is often considered a significant hurdle in the CNS drug discovery process. On the other hand, BBB penetration may be a liability for many of the non-CNS drug targets, and a clear understanding of the physicochemical and structural differences between CNS and non-CNS drugs may assist both research areas. Because of the numerous and challenging issues in CNS drug discovery and the low success rates, pharmaceutical companies are beginning to deprioritize their drug discovery efforts in the CNS arena. Prompted by these challenges and to aid in the design of high-quality, efficacious CNS compounds, we analyzed the physicochemical property and the chemical structural profiles of 317 CNS and 626 non-CNS oral drugs. The conclusions derived provide an ideal property profile for lead selection and the property modification strategy during the lead optimization process. A list of substructural units that may be useful for CNS drug design was also provided here. A classification tree was also developed to differentiate between CNS drugs and non-CNS oral drugs. The combined analysis provided the following guidelines for designing high-quality CNS drugs: (i) topological molecular polar surface area of <76 Å² (25-60 Å²), (ii) at least one (one or two, including one aliphatic amine) nitrogen, (iii) fewer than seven (two to four) linear chains outside of rings, (iv) fewer than three (zero or one) polar hydrogen atoms, (v) volume of 740-970 Å³, (vi) solvent accessible surface area of 460-580 Å², and (vii) positive QikProp parameter CNS. The ranges within parentheses may be used during lead optimization. One violation to this proposed profile may be acceptable. The
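
    The seven guidelines lend themselves to a simple rule check. The sketch below encodes the lead-selection thresholds exactly as quoted in the abstract and allows one violation; the `Descriptors` class, its field names and the example values are hypothetical, and the descriptors are assumed to have been computed elsewhere (the code does not call QikProp or any cheminformatics library).

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties of one compound (field names are assumptions)."""
    tpsa: float             # topological polar surface area, in Å²
    nitrogen_count: int     # number of nitrogen atoms
    linear_chains: int      # linear chains outside of rings
    polar_hydrogens: int    # polar hydrogen atoms
    volume: float           # molecular volume, in Å³
    sasa: float             # solvent accessible surface area, in Å²
    qikprop_cns: float      # QikProp CNS parameter


def cns_violations(d):
    """Return the names of the guidelines (i)-(vii) that the compound fails."""
    checks = {
        "(i) TPSA < 76": d.tpsa < 76,
        "(ii) at least one nitrogen": d.nitrogen_count >= 1,
        "(iii) fewer than seven linear chains": d.linear_chains < 7,
        "(iv) fewer than three polar hydrogens": d.polar_hydrogens < 3,
        "(v) volume 740-970": 740 <= d.volume <= 970,
        "(vi) SASA 460-580": 460 <= d.sasa <= 580,
        "(vii) positive QikProp CNS": d.qikprop_cns > 0,
    }
    return [name for name, ok in checks.items() if not ok]


def fits_cns_profile(d, max_violations=1):
    """One violation of the profile is tolerated, as the abstract suggests."""
    return len(cns_violations(d)) <= max_violations


# Invented example compounds.
good = Descriptors(45.0, 1, 3, 1, 800.0, 500.0, 1.0)
poor = Descriptors(90.0, 1, 3, 4, 800.0, 500.0, 1.0)

print(fits_cns_profile(good), cns_violations(poor))
```

    Tightening the check to the lead-optimization ranges given in parentheses would only require changing the bounds in `checks`.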

  3. Novel definition files for human GeneChips based on GeneAnnot

    Directory of Open Access Journals (Sweden)

    Ferrari Sergio

    2007-11-01

    Full Text Available Abstract Background Improvements in genome sequence annotation revealed discrepancies in the original probeset/gene assignment in Affymetrix microarray and the existence of differences between annotations and effective alignments of probes and transcription products. In the current generation of Affymetrix human GeneChips, most probesets include probes matching transcripts from more than one gene and probes which do not match any transcribed sequence. Results We developed a novel set of custom Chip Definition Files (CDF) and the corresponding Bioconductor libraries for Affymetrix human GeneChips, based on the information contained in the GeneAnnot database. GeneAnnot-based CDFs are composed of unique custom-probesets, including only probes matching a single gene. Conclusion GeneAnnot-based custom CDFs solve the problem of a reliable reconstruction of expression levels and eliminate the existence of more than one probeset per gene, which often leads to discordant expression signals for the same transcript when gene differential expression is the focus of the analysis. GeneAnnot CDFs are freely distributed and fully compliant with Affymetrix standards and all available software for gene expression analysis. The CDF libraries are available from http://www.xlab.unimo.it/GA_CDF, along with supplementary information (CDF libraries, installation guidelines and R code, CDF statistics, and analysis results).

  4. Discovery of novel small-molecule inhibitors of BRD4 using structure-based virtual screening.

    Science.gov (United States)

    Vidler, Lewis R; Filippakopoulos, Panagis; Fedorov, Oleg; Picaud, Sarah; Martin, Sarah; Tomsett, Michael; Woodward, Hannah; Brown, Nathan; Knapp, Stefan; Hoelder, Swen

    2013-10-24

    Bromodomains (BRDs) are epigenetic readers that recognize acetylated-lysine (KAc) on proteins and are implicated in a number of diseases. We describe a virtual screening approach to identify BRD inhibitors. Key elements of this approach are the extensive design and use of substructure queries to compile a set of commercially available compounds featuring novel putative KAc mimetics and docking this set for final compound selection. We describe the validation of this approach by applying it to the first BRD of BRD4. The selection and testing of 143 compounds led to the discovery of six novel hits, including four unprecedented KAc mimetics. We solved the crystal structure of four hits, determined their binding mode, and improved their potency through synthesis and the purchase of derivatives. This work provides a validated virtual screening approach that is applicable to other BRDs and describes novel KAc mimetics that can be further explored to design more potent inhibitors.

  5. Comprehensive Phenotyping in Multiple Sclerosis: Discovery Based Proteomics and the Current Understanding of Putative Biomarkers

    Directory of Open Access Journals (Sweden)

    Kevin C. O’Connor

    2006-01-01

    Full Text Available Currently, there is no single test for multiple sclerosis (MS). Diagnosis is confirmed through clinical evaluation, abnormalities revealed by magnetic resonance imaging (MRI), and analysis of cerebrospinal fluid (CSF) chemistry. The early and accurate diagnosis of the disease, monitoring of progression, and gauging of therapeutic intervention are important but elusive elements of patient care. Moreover, a deeper understanding of the disease pathology is needed, including discovery of accurate biomarkers for MS. Herein we review putative biomarkers of MS relating to neurodegeneration and contributions to neuropathology, with particular focus on autoimmunity. In addition, novel assessments of biomarkers not driven by hypotheses are discussed, featuring our application of advanced proteomics and metabolomics for comprehensive phenotyping of CSF and blood. This strategy allows comparison of component expression levels in CSF and serum between MS and control groups. Examination of these preliminary data suggests that several CSF proteins in MS are differentially expressed, and thus, represent putative biomarkers deserving of further evaluation.

  6. HMM-Based Gene Annotation Methods

    Energy Technology Data Exchange (ETDEWEB)

    Haussler, David; Hughey, Richard; Karplus, Keven

    1999-09-20

    Development of new statistical methods and computational tools to identify genes in human genomic DNA, and to provide clues to their functions by identifying features such as transcription factor binding sites, tissue-specific expression and splicing patterns, and remote homologies at the protein level with genes of known function.

  7. Gene-based GWAS and biological pathway analysis of the resilience of executive functioning.

    Science.gov (United States)

    Mukherjee, Shubhabrata; Kim, Sungeun; Ramanan, Vijay K; Gibbons, Laura E; Nho, Kwangsik; Glymour, M Maria; Ertekin-Taner, Nilüfer; Montine, Thomas J; Saykin, Andrew J; Crane, Paul K

    2014-03-01

    Resilience in executive functioning (EF) is characterized by high EF measured by neuropsychological test performance despite structural brain damage from neurodegenerative conditions. We previously reported single nucleotide polymorphism (SNP) genome-wide association study (GWAS) results for EF resilience. Here, we report gene- and pathway-based analyses of the same resilience phenotype, using an optimal SNP-set (Sequence) Kernel Association Test (SKAT) for gene-based analyses (conservative threshold for genome-wide significance = 0.05/18,123 = 2.8 × 10(-6)) and the gene-set enrichment package GSA-SNP for biological pathway analyses (False discovery rate (FDR) resilience (p = 1.33 × 10(-7)). Genetic pathways involved with dendritic/neuron spine, presynaptic membrane, postsynaptic density, etc., were enriched with association to EF resilience. Although replication of these results is necessary, our findings indicate the potential value of gene- and pathway-based analyses in research on determinants of cognitive resilience.
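
    The genome-wide significance threshold quoted in the abstract above (0.05/18,123) is a Bonferroni-style division of the significance level by the number of gene-based tests. A minimal sketch of that arithmetic follows; the function name is hypothetical and is not taken from SKAT or GSA-SNP.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold obtained by dividing alpha by the number of tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Values quoted in the abstract: alpha = 0.05 across 18,123 gene-based tests.
threshold = bonferroni_threshold(0.05, 18_123)
print(f"{threshold:.1e}")  # prints 2.8e-06
```

    The printed value agrees with the 2.8 × 10(-6) threshold reported in the record.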

  8. Discovery of gene-gene interactions across multiple independent data sets of late onset Alzheimer disease from the Alzheimer Disease Genetics Consortium.

    Science.gov (United States)

    Hohman, Timothy J; Bush, William S; Jiang, Lan; Brown-Gentry, Kristin D; Torstenson, Eric S; Dudek, Scott M; Mukherjee, Shubhabrata; Naj, Adam; Kunkle, Brian W; Ritchie, Marylyn D; Martin, Eden R; Schellenberg, Gerard D; Mayeux, Richard; Farrer, Lindsay A; Pericak-Vance, Margaret A; Haines, Jonathan L; Thornton-Wells, Tricia A

    2016-02-01

    Late-onset Alzheimer disease (AD) has a complex genetic etiology, involving locus heterogeneity, polygenic inheritance, and gene-gene interactions; however, the investigation of interactions in recent genome-wide association studies has been limited. We used a biological knowledge-driven approach to evaluate gene-gene interactions for consistency across 13 data sets from the Alzheimer Disease Genetics Consortium. Fifteen single nucleotide polymorphism (SNP)-SNP pairs within 3 gene-gene combinations were identified: SIRT1 × ABCB1, PSAP × PEBP4, and GRIN2B × ADRA1A. In addition, we extend a previously identified interaction from an endophenotype analysis between RYR3 × CACNA1C. Finally, post hoc gene expression analyses of the implicated SNPs further implicate SIRT1 and ABCB1, and implicate CDH23 which was most recently identified as an AD risk locus in an epigenetic analysis of AD. The observed interactions in this article highlight ways in which genotypic variation related to disease may depend on the genetic context in which it occurs. Further, our results highlight the utility of evaluating genetic interactions to explain additional variance in AD risk and identify novel molecular mechanisms of AD pathogenesis.

  9. A population of deletion mutants and an integrated mapping and Exome-seq pipeline for gene discovery in maize

    Science.gov (United States)

    To better understand maize endosperm filling and maturation, we developed a novel functional genomics platform that combined Bulked Segregant RNA and Exome sequencing (BSREx-seq) to map causative mutations and identify candidate genes within mapping intervals. Using gamma-irradiation of B73 maize to...

  10. Large-scale gene discovery in the oomycete Phytophthora infestans reveals likely components of phytopathogenicity shared with true fungi

    NARCIS (Netherlands)

    Randall, T.A.; Dwyer, R.A.; Huitema, E.; Beyer, K.; Cvitanich, C.; Kelkar, H.; Ah Fong, A.M.V.; Gates, K.; Roberts, S.; Yatzkan, E.; Gaffney, T.; Law, M.; Testa, A.; Torto-Alalibo, T.; Zhang Meng,; Zheng Li,; Mueller, E.; Windass, J.; Binder, A.; Birch, P.R.J.; Gisi, U.; Govers, F.; Gow, N.A.; Mauch, F.; West, van P.; Waugh, M.E.; Yu Jun,; Boller, T.; Kamoun, S.; Lam, S.T.; Judelson, H.S.

    2005-01-01

    To overview the gene content of the important pathogen Phytophthora infestans, large-scale cDNA and genomic sequencing was performed. A set of 75,757 high-quality expressed sequence tags (ESTs) from P. infestans was obtained from 20 cDNA libraries representing a broad range of growth conditions, stre

  11. Transcriptome analysis of the white body of the squid Euprymna tasmanica with emphasis on immune and hematopoietic gene discovery.

    Directory of Open Access Journals (Sweden)

    Karla A Salazar

    Full Text Available In the mutualistic relationship between the squid Euprymna tasmanica and the bioluminescent bacterium Vibrio fischeri, several host factors, including immune-related proteins, are known to interact and respond specifically and exclusively to the presence of the symbiont. In squid and octopus, the white body is considered to be an immune organ mainly due to the fact that blood cells, or hemocytes, are known to be present in high numbers and in different developmental stages. Hence, the white body has been described as the site of hematopoiesis in cephalopods. However, to our knowledge, there are no studies showing any molecular evidence of such functions. In this study, we performed a transcriptomic analysis of white body tissue of the Southern dumpling squid, E. tasmanica. Our primary goal was to gain insights into the functions of this tissue and to test for the presence of gene transcripts associated with hematopoietic and immune processes. Several hematopoiesis genes including CPSF1, GATA 2, TFIID, and FGFR2 were found to be expressed in the white body. In addition, transcripts associated with immune-related signal transduction pathways, such as the toll-like receptor/NF-κB and MAPK pathways, were also found, as well as other immune genes previously identified in E. tasmanica's sister species, E. scolopes. This study is the first to analyze an immune organ within cephalopods, and to provide gene expression data supporting the white body as a hematopoietic tissue.

  12. Librarian-Initiated Publications Discovery: How Do Digital Depository Librarians Discover and Select Web-Based Government Publications for State Digital Depositories?

    Science.gov (United States)

    Lin, Chi-Shiou; Eschenfelder, Kristin R.

    2010-01-01

    This paper reports on a study of librarian initiated publications discovery (LIPD) in U.S. state digital depository programs using the OCLC Digital Archive to preserve web-based government publications for permanent public access. This paper describes a model of LIPD processes based on empirical investigations of four OCLC DA-based digital…

  13. Discovery of new risk loci for IgA nephropathy implicates genes involved in immunity against intestinal pathogens

    Science.gov (United States)

    Kiryluk, Krzysztof; Li, Yifu; Scolari, Francesco; Sanna-Cherchi, Simone; Choi, Murim; Verbitsky, Miguel; Fasel, David; Lata, Sneh; Prakash, Sindhuri; Shapiro, Samantha; Fischman, Clara; Snyder, Holly J.; Appel, Gerald; Izzi, Claudia; Viola, Battista Fabio; Dallera, Nadia; Vecchio, Lucia Del; Barlassina, Cristina; Salvi, Erika; Bertinetto, Francesca Eleonora; Amoroso, Antonio; Savoldi, Silvana; Rocchietti, Marcella; Amore, Alessandro; Peruzzi, Licia; Coppo, Rosanna; Salvadori, Maurizio; Ravani, Pietro; Magistroni, Riccardo; Ghiggeri, Gian Marco; Caridi, Gianluca; Bodria, Monica; Lugani, Francesca; Allegri, Landino; Delsante, Marco; Maiorana, Mariarosa; Magnano, Andrea; Frasca, Giovanni; Boer, Emanuela; Boscutti, Giuliano; Ponticelli, Claudio; Mignani, Renzo; Marcantoni, Carmelita; Di Landro, Domenico; Santoro, Domenico; Pani, Antonello; Polci, Rosaria; Feriozzi, Sandro; Chicca, Silvana; Galliani, Marco; Gigante, Maddalena; Gesualdo, Loreto; Zamboli, Pasquale; Maixnerová, Dita; Tesar, Vladimir; Eitner, Frank; Rauen, Thomas; Floege, Jürgen; Kovacs, Tibor; Nagy, Judit; Mucha, Krzysztof; Pączek, Leszek; Zaniew, Marcin; Mizerska-Wasiak, Małgorzata; Roszkowska-Blaim, Maria; Pawlaczyk, Krzysztof; Gale, Daniel; Barratt, Jonathan; Thibaudin, Lise; Berthoux, Francois; Canaud, Guillaume; Boland, Anne; Metzger, Marie; Panzer, Ulf; Suzuki, Hitoshi; Goto, Shin; Narita, Ichiei; Caliskan, Yasar; Xie, Jingyuan; Hou, Ping; Chen, Nan; Zhang, Hong; Wyatt, Robert J.; Novak, Jan; Julian, Bruce A.; Feehally, John; Stengel, Benedicte; Cusi, Daniele; Lifton, Richard P.; Gharavi, Ali G.

    2014-01-01

    We performed a genome-wide association study (GWAS) of IgA nephropathy (IgAN), the most common form of glomerulonephritis, with discovery and follow-up in 20,612 individuals of European and East Asian ancestry. We identified six novel genome-wide significant associations, four in ITGAM-ITGAX, VAV3 and CARD9 and two new independent signals at HLA-DQB1 and DEFA. We replicated the nine previously reported signals, including known SNPs in the HLA-DQB1 and DEFA loci. The cumulative burden of risk alleles is strongly associated with age at disease onset. Most loci are either directly associated with risk of inflammatory bowel disease (IBD) or maintenance of the intestinal epithelial barrier and response to mucosal pathogens. The geo-spatial distribution of risk alleles is highly suggestive of multi-locus adaptation and the genetic risk correlates strongly with variation in local pathogens, particularly helminth diversity, suggesting a possible role for host-intestinal pathogen interactions in shaping the genetic landscape of IgAN. PMID:25305756

  14. The discovery of archaea origin phosphomannomutase in algae based on the algal transcriptome

    Institute of Scientific and Technical Information of China (English)

    FENG Yanjing; CHI Shan; LIU Cui; CHEN Shengping; YU Jun; WANG Xumin; LIU Tao

    2014-01-01

    Phosphomannomutase (PMM; EC 5.4.2.8) is an enzyme that catalyzes the interconversion reaction between mannose-6-phosphate and mannose-1-phosphate. However, its systematic molecular and functional investigations in algae have not hitherto been reported. In this work, with the accomplishment of the 1000 Plant Project (OneKP) in which more than 218 species of Chromista, including 19 marine phaeophytes, 22 marine rhodophytes, 171 chlorophytes, 5 cryptophytes, 4 haptophytes, and 5 glaucophytes were sequenced, we used a gene analysis method to analyze the PMM gene sequences in algae and confirm the existence of the PMM gene in the transcriptomic sequencing data of Rhodophyta and Ochrophyta. Our results showed that only one type of PMM with four conserved motifs exists in Chromista which is similar to human PMM. Moreover, the phylogenetic tree revealed that algae PMM possibly originated from archaea.

  15. A global test for gene-gene interactions based on random matrix theory.

    Science.gov (United States)

    Frost, H Robert; Amos, Christopher I; Moore, Jason H

    2016-12-01

    Statistical interactions between markers of genetic variation, or gene-gene interactions, are believed to play an important role in the etiology of many multifactorial diseases and other complex phenotypes. Unfortunately, detecting gene-gene interactions is extremely challenging due to the large number of potential interactions and ambiguity regarding marker coding and interaction scale. For many data sets, there is insufficient statistical power to evaluate all candidate gene-gene interactions. In these cases, a global test for gene-gene interactions may be the best option. Global tests have much greater power relative to multiple individual interaction tests and can be used on subsets of the markers as an initial filter prior to testing for specific interactions. In this paper, we describe a novel global test for gene-gene interactions, the global epistasis test (GET), that is based on results from random matrix theory. As we show via simulation studies based on previously proposed models for common diseases including rheumatoid arthritis, type 2 diabetes, and breast cancer, our proposed GET method has superior performance characteristics relative to existing global gene-gene interaction tests. A glaucoma GWAS data set is used to demonstrate the practical utility of the GET method.

  16. The Complete Genome Sequence of Plodia Interpunctella Granulovirus: Evidence for Horizontal Gene Transfer and Discovery of an Unusual Inhibitor-of-Apoptosis Gene.

    Science.gov (United States)

    Harrison, Robert L; Rowley, Daniel L; Funk, C Joel

    2016-01-01

    The Indianmeal moth, Plodia interpunctella (Lepidoptera: Pyralidae), is a common pest of stored goods with a worldwide distribution. The complete genome sequence for a larval pathogen of this moth, the baculovirus Plodia interpunctella granulovirus (PiGV), was determined by next-generation sequencing. The PiGV genome was found to be 112,536 bp in length with a 44.2% G+C nucleotide distribution. A total of 123 open reading frames (ORFs) and seven homologous regions (hrs) were identified and annotated. Phylogenetic inference using concatenated alignments of 36 baculovirus core genes placed PiGV in the "b" clade of viruses from genus Betabaculovirus with a branch length suggesting that PiGV represents a distinct betabaculovirus species. In addition to the baculovirus core genes and orthologues of other genes found in other betabaculovirus genomes, the PiGV genome sequence contained orthologues of the bidensovirus NS3 gene, as well as ORFs that occur in alphabaculoviruses but not betabaculoviruses. While PiGV contained an orthologue of inhibitor of apoptosis-5 (iap-5), an orthologue of inhibitor of apoptosis-3 (iap-3) was not present. Instead, the PiGV sequence contained an ORF (PiGV ORF81) encoding an IAP homologue with sequence similarity to insect cellular IAPs, but not to viral IAPs. Phylogenetic analysis of baculovirus and insect IAP amino acid sequences suggested that the baculovirus IAP-3 genes and the PiGV ORF81 IAP homologue represent different lineages arising from more than one acquisition event. The presence of genes from other sources in the PiGV genome highlights the extent to which baculovirus gene content is shaped by horizontal gene transfer.

  17. Discovery of New Anti-Schistosomal Hits by Integration of QSAR-Based Virtual Screening and High Content Screening.

    Science.gov (United States)

    Neves, Bruno J; Dantas, Rafael F; Senger, Mario R; Melo-Filho, Cleber C; Valente, Walter C G; de Almeida, Ana C M; Rezende-Neto, João M; Lima, Elid F C; Paveley, Ross; Furnham, Nicholas; Muratov, Eugene; Kamentsky, Lee; Carpenter, Anne E; Braga, Rodolpho C; Silva-Junior, Floriano P; Andrade, Carolina Horta

    2016-08-11

    Schistosomiasis is a debilitating neglected tropical disease caused by flatworms of the genus Schistosoma. The treatment relies on a single drug, praziquantel (PZQ), making the discovery of new compounds extremely urgent. In this work, we integrated QSAR-based virtual screening (VS) of Schistosoma mansoni thioredoxin glutathione reductase (SmTGR) inhibitors and high content screening (HCS), aiming to discover new antischistosomal agents. Initially, binary QSAR models for inhibition of SmTGR were developed and validated using the Organization for Economic Co-operation and Development (OECD) guidance. Using these models, we prioritized 29 compounds for further testing in two HCS platforms based on image analysis of assay plates. Among them, 2-[2-(3-methyl-4-nitro-5-isoxazolyl)vinyl]pyridine and 2-(benzylsulfonyl)-1,3-benzothiazole, two compounds representing new chemical scaffolds, have activity against schistosomula and adult worms at low micromolar concentrations and therefore represent promising antischistosomal hits for further hit-to-lead optimization.

  18. Identifying disease feature genes based on cellular localized gene functional modules and regulation networks

    Institute of Scientific and Technical Information of China (English)

    ZHANG Min; ZHU Jing; GUO Zheng; LI Xia; YANG Da; WANG Lei; RAO Shaoqi

    2006-01-01

    Identifying disease-relevant genes and functional modules, based on gene expression profiles and gene functional knowledge, is of high importance for studying disease mechanisms and subtyping disease phenotypes. Using gene categories of biological process and cellular component in Gene Ontology, we propose an approach to selecting functional modules enriched with differentially expressed genes, and identifying the feature functional modules of high disease discriminating abilities. Using the differentially expressed genes in each feature module as the feature genes, we reveal the relevance of the modules to the studied diseases. Using three datasets for prostate cancer, gastric cancer, and leukemia, we have demonstrated that the proposed modular approach is of high power in identifying functionally integrated feature gene subsets that are highly relevant to the disease mechanisms. Our analysis has also shown that the critical disease-relevant genes might be better recognized from the gene regulation network, which is constructed using the characterized functional modules, giving important clues to the concerted mechanisms of the modules responding to complex disease states. In addition, the proposed approach to selecting the disease-relevant genes by jointly considering the gene functional knowledge suggests a new way for precisely classifying disease samples with clear biological interpretations, which is critical for the clinical diagnosis and the elucidation of the pathogenic basis of complex diseases.

  19. In silico network topology-based prediction of gene essentiality

    CERN Document Server

    da Silva, Joao Paulo Muller; Mombach, Jose Carlos Merino; Vieira, Renata; da Silva, Jose Guliherme Camargo; Lemke, Ney; Sinigaglia, Marialva

    2007-01-01

    The identification of genes essential for survival is important for the understanding of the minimal requirements for cellular life and for drug design. As experimental studies with the purpose of building a catalog of essential genes for a given organism are time-consuming and laborious, a computational approach which could predict gene essentiality with high accuracy would be of great value. We present here a novel computational approach, called NTPGE (Network Topology-based Prediction of Gene Essentiality), that relies on network topology features of a gene to estimate its essentiality. The first step of NTPGE is to construct the integrated molecular network for a given organism comprising protein physical, metabolic and transcriptional regulation interactions. The second step consists in training a decision tree-based machine learning algorithm on known essential and non-essential genes of the organism of interest, considering as learning attributes the network topology information for each of these genes...

  20. Development of urinary pseudotargeted LC-MS-based metabolomics method and its application in hepatocellular carcinoma biomarker discovery.

    Science.gov (United States)

    Shao, Yaping; Zhu, Bin; Zheng, Ruiyin; Zhao, Xinjie; Yin, Peiyuan; Lu, Xin; Jiao, Binghua; Xu, Guowang; Yao, Zhenzhen

    2015-02-01

    Hepatocellular carcinoma (HCC) is one of the pestilent malignancies leading to cancer-related death. Discovering effective biomarkers for HCC diagnosis is an urgent demand. To identify potential metabolite biomarkers, we developed a urinary pseudotargeted method based on liquid chromatography-hybrid triple quadrupole linear ion trap mass spectrometry (LC-QTRAP MS). Compared with nontargeted method, the pseudotargeted method can achieve better data quality, which benefits differential metabolites discovery. The established method was applied to cirrhosis (CIR) and HCC investigation. It was found that urinary nucleosides, bile acids, citric acid, and several amino acids were significantly changed in liver disease groups compared with the controls, featuring the dysregulation of purine metabolism, energy metabolism, and amino metabolism in liver diseases. Furthermore, some metabolites such as cyclic adenosine monophosphate, glutamine, and short- and medium-chain acylcarnitines were the differential metabolites of HCC and CIR. On the basis of binary logistic regression, butyrylcarnitine (carnitine C4:0) and hydantoin-5-propionic acid were defined as combinational markers to distinguish HCC from CIR. The area under curve was 0.786 and 0.773 for discovery stage and validation stage samples, respectively. These data show that the established pseudotargeted method is a complementary one of targeted and nontargeted methods for metabolomics study.

  1. Gene Discovery and Advances in Finger millet [Eleusine coracana (L.) Gaertn.] Genomics - An Important Nutri-cereal of Future

    Directory of Open Access Journals (Sweden)

    Salej Sood

    2016-11-01

    Full Text Available The rapid strides in molecular marker technologies followed by genomics, and next generation sequencing advancements in three major crops (rice, maize and wheat) of the world have given opportunities for their use in the orphan, but highly valuable future crops, including finger millet [Eleusine coracana (L.) Gaertn.]. Finger millet has many special agronomic and nutritional characteristics, which make it an indispensable crop in arid, semi-arid, hilly and tribal areas of India and Africa. The crop has proven its adaptability in harsh conditions and has shown resilience to climate change. The adaptability traits of finger millet have shown the advantage over major cereal grains under stress conditions, revealing it as a storehouse of important genomic resources for crop improvement. Although new technologies for genomic studies are now available, progress in identifying and tapping these important alleles or genes is lacking. RAPDs were the default choice for genetic diversity studies in the crop until the last decade, but the subsequent development of SSRs and comparative genomics paved the way for the marker assisted selection in finger millet. Resistance gene homologues from NBS-LRR region of finger millet for blast and sequence variants for nutritional traits from other cereals have been developed and used invariably. Population structure analysis studies exhibit 2-4 sub-populations in the finger millet gene pool with separate grouping of Indian and exotic genotypes. Recently, the omics technologies have been efficiently applied to understand the nutritional variation, drought tolerance and gene mining. Progress has also occurred with respect to transgenics development. This review presents the current biotechnological advancements along with research gaps and future perspective of genomic research in finger millet.

  2. Genetic Determinants for Promoter Hypermethylation in the Lungs of Smokers: A Candidate Gene-Based Study

    OpenAIRE

    Leng, Shuguang; Stidley, Christine A.; Liu, Yushi; Edlund, Christopher K.; Willink, Randall P.; Han, Younghun; Landi, Maria Teresa; Thun, Michael; Picchi, Maria A.; Bruse, Shannon E.; Crowell, Richard E.; Van Den Berg, David; Caporaso, Neil E.; Amos, Christopher I.; Siegfried, Jill M.

    2011-01-01

    The detection of tumor suppressor gene promoter methylation in sputum-derived exfoliated cells predicts early lung cancer. Here we identified genetic determinants for this epigenetic process and examined their biological effects on gene regulation. A two-stage approach involving discovery and replication was employed to assess the association between promoter hypermethylation of a 12-gene panel and common variation in 40 genes involved in carcinogen metabolism, regulation of methylation, and ...

  3. Automated Sample Preparation Platform for Mass Spectrometry-Based Plasma Proteomics and Biomarker Discovery

    Directory of Open Access Journals (Sweden)

    Vilém Guryča

    2014-03-01

    Full Text Available The identification of novel biomarkers from human plasma remains a critical need in order to develop and monitor drug therapies for nearly all disease areas. The discovery of novel plasma biomarkers is, however, significantly hampered by the complexity and dynamic range of proteins within plasma, as well as the inherent variability in composition from patient to patient. In addition, it is widely accepted that most soluble plasma biomarkers for diseases such as cancer will be represented by tissue leakage products, circulating in plasma at low levels. It is therefore necessary to find approaches with the prerequisite level of sensitivity in such a complex biological matrix. Strategies for fractionating the plasma proteome have been suggested, but improvements in sensitivity are often negated by the resultant process variability. Here we describe an approach using multidimensional chromatography and on-line protein derivatization, which allows for higher sensitivity, whilst minimizing the process variability. In order to evaluate this automated process fully, we demonstrate three levels of processing and compare sensitivity, throughput and reproducibility. We demonstrate that high sensitivity analysis of the human plasma proteome is possible down to the low ng/mL or even high pg/mL level with a high degree of technical reproducibility.

  4. An Innovative Cell Microincubator for Drug Discovery Based on 3D Silicon Structures

    Directory of Open Access Journals (Sweden)

    Francesca Aredia

    2016-01-01

    Full Text Available We recently employed three-dimensional (3D) silicon microstructures (SMSs), consisting of arrays of 3 μm-thick silicon walls separated by 50 μm-deep, 5 μm-wide gaps, as microincubators for monitoring the biomechanical properties of tumor cells. They were here applied to investigate the in vitro behavior of HT1080 human fibrosarcoma cells driven to apoptosis by the chemotherapeutic drug Bleomycin. Our results, obtained by fluorescence microscopy, demonstrated that HT1080 cells exhibited a great ability to colonize the narrow gaps. Remarkably, HT1080 cells grown on 3D-SMS, when treated with the DNA-damaging agent Bleomycin under conditions leading to apoptosis, tended to shrink, reducing their volume and mimicking the normal behavior of apoptotic cells, and were prone to leave the gaps. Finally, we performed label-free detection of cells adherent to the vertical silicon wall, inside the gap of 3D-SMS, by exploiting optical low coherence reflectometry using infrared, low power radiation. This kind of approach may become a new tool for increasing automation in the drug discovery area. Our results open new perspectives in view of future applications of the 3D-SMS as the core element of a lab-on-a-chip suitable for screening the effect of new molecules potentially able to kill tumor cells.

  5. Potential of Glutamate-Based Drug Discovery for Next Generation Antidepressants

    Directory of Open Access Journals (Sweden)

    Shigeyuki Chaki

    2015-09-01

    Full Text Available Recently, ketamine has been demonstrated to exert rapid-acting antidepressant effects in patients with depression, including those with treatment-resistant depression, and this discovery has been regarded as the most significant advance in drug development for the treatment of depression in over 50 years. To overcome unwanted side effects of ketamine, numerous approaches targeting glutamatergic systems have been vigorously investigated. For example, among agents targeting the NMDA receptor, the efficacies of selective GluN2B receptor antagonists and a low-trapping antagonist, as well as glycine site modulators such as GLYX-13 and sarcosine, have been demonstrated clinically. Moreover, agents acting on metabotropic glutamate receptors, such as mGlu2/3 and mGlu5 receptors, have been proposed as useful approaches to mimicking the antidepressant effects of ketamine. Neural and synaptic mechanisms underlying the antidepressant effects of ketamine are being delineated, most of which indicate that ketamine improves abnormalities in synaptic transmission and connectivity observed in depressive states via the AMPA receptor and brain-derived neurotrophic factor-dependent mechanisms. Interestingly, some of the above agents may share some neural and synaptic mechanisms with ketamine. These studies should provide important insights for the development of superior pharmacotherapies for depression with greater potency and faster onset of action.

  6. DNA-energetics-based analyses suggest additional genes in prokaryotes

    Indian Academy of Sciences (India)

    Garima Khandelwal; Jalaj Gupta; B Jayaram

    2012-07-01

    We present here a novel methodology for predicting new genes in prokaryotic genomes on the basis of inherent energetics of DNA. Regions of higher thermodynamic stability were identified, which were filtered based on already known annotations to yield a set of potentially new genes. These were then processed for their compatibility with the stereo-chemical properties of proteins and tripeptide frequencies of proteins in Swissprot data, which results in a reliable set of new genes in a genome. Quite surprisingly, the methodology identifies new genes even in well-annotated genomes. Also, the methodology can handle genomes of any GC-content, size and number of annotated genes.

  7. Study on gene sensor based on primer extension

    Institute of Scientific and Technical Information of China (English)

    陈誉华; 宋今丹; 李大为

    1997-01-01

    Based on the fact that the resonant frequency of a piezoelectric crystal is the function of its surface deposit, and that the primer extends after it hybridizes with the template, the primer extension gene sensor technique was developed. The prominent feature of the technique is that fast and sensitive frequency signals are used as the monitoring system of gene hybridization and primer strand extension. Results show that this technique may be used in homologous analysis of nucleic acid, trace DNA detection, and determining the integration of DNA. It may also be used for isolation of target gene, gene mutation analysis, and predicting the location of a gene in its genome.

  8. Gene-based and semantic structure of the Gene Ontology as a complex network

    Science.gov (United States)

    Coronnello, Claudia; Tumminello, Michele; Miccichè, Salvatore

    2016-09-01

    The last decade has seen the advent and consolidation of ontology based tools for the identification and biological interpretation of classes of genes, such as the Gene Ontology. The Gene Ontology (GO) is constantly evolving over time. The information accumulated time-by-time and included in the GO is encoded in the definition of terms and in the setting up of semantic relations amongst terms. Here we investigate the Gene Ontology from a complex network perspective. We consider the semantic network of terms naturally associated with the semantic relationships provided by the Gene Ontology consortium. Moreover, the GO is a natural example of bipartite network of terms and genes. Here we are interested in studying the properties of the projected network of terms, i.e. a gene-based weighted network of GO terms, in which a link between any two terms is set if at least one gene is annotated in both terms. One aim of the present paper is to compare the structural properties of the semantic and the gene-based network. The relative importance of terms is very similar in the two networks, but the community structure changes. We show that in some cases GO terms that appear to be distinct from a semantic point of view are instead connected, and appear in the same community when considering their gene content. The identification of such gene-based communities of terms might therefore be the basis of a simple protocol aiming at improving the semantic structure of GO. Information about terms that share large gene content might also be important from a biomedical point of view, as it might reveal how genes over-expressed in a certain term also affect other biological processes, molecular functions and cellular components not directly linked according to GO semantics.
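
    The projection described in the abstract above (a link between two terms whenever at least one gene is annotated in both) can be sketched in a few lines. This is an illustration only: the term and gene names are hypothetical toy data, and weighting each link by the number of shared genes is an assumption, since the abstract does not state how weights are defined.

```python
from itertools import combinations


def project_terms(annotations: dict[str, set[str]]) -> dict[tuple[str, str], int]:
    """Project a term-gene bipartite network onto its terms.

    Two terms are linked if at least one gene is annotated to both.
    The weight used here is the count of shared genes (an assumed choice).
    """
    edges = {}
    for a, b in combinations(sorted(annotations), 2):
        shared = annotations[a] & annotations[b]
        if shared:
            edges[(a, b)] = len(shared)
    return edges


# Hypothetical toy annotations; not real GO terms or genes.
toy = {
    "term_A": {"g1", "g2", "g3"},
    "term_B": {"g2", "g3"},
    "term_C": {"g4"},
}
print(project_terms(toy))  # {('term_A', 'term_B'): 2}
```

    Terms with no gene in common (term_C here) stay unlinked in the gene-based network, however they are related semantically.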

  9. Gene-Set Local Hierarchical Clustering (GSLHC)--A Gene Set-Based Approach for Characterizing Bioactive Compounds in Terms of Biological Functional Groups.

    Directory of Open Access Journals (Sweden)

    Feng-Hsiang Chung

    Full Text Available Gene-set-based analysis (GSA), which uses the relative importance of functional gene-sets, or molecular signatures, as units for analysis of genome-wide gene expression data, has exhibited major advantages with respect to greater accuracy, robustness, and biological relevance, over individual gene analysis (IGA), which uses log-ratios of individual genes for analysis. Yet IGA remains the dominant mode of analysis of gene expression data. The Connectivity Map (CMap), an extensive database on genomic profiles of effects of drugs and small molecules and widely used for studies related to repurposed drug discovery, has been mostly employed in IGA mode. Here, we constructed a GSA-based version of CMap, Gene-Set Connectivity Map (GSCMap), in which all the genomic profiles in CMap are converted, using gene-sets from the Molecular Signatures Database, to functional profiles. We showed that GSCMap essentially eliminated cell-type dependence, a weakness of CMap in IGA mode, and yielded significantly better performance on sample clustering and drug-target association. As a first application of GSCMap we constructed the platform Gene-Set Local Hierarchical Clustering (GSLHC) for discovering insights on coordinated actions of biological functions and facilitating classification of heterogeneous subtypes on drug-driven responses. GSLHC was shown to tightly cluster drugs of known similar properties. We used GSLHC to identify the therapeutic properties and putative targets of 18 compounds of previously unknown characteristics listed in CMap, eight of which suggest anti-cancer activities. The GSLHC website http://cloudr.ncu.edu.tw/gslhc/ contains 1,857 local hierarchical clusters accessible by querying 555 of the 1,309 drugs and small molecules listed in CMap. We expect GSCMap and GSLHC to be widely useful in providing new insights into the biological effects of bioactive compounds, in drug repurposing, and in function-based classification of complex diseases.

  10. Beyond Discovery

    DEFF Research Database (Denmark)

    Korsgaard, Steffen; Sassmannshausen, Sean Patrick

    2015-01-01

    In this chapter we explore four alternatives to the dominant discovery view of entrepreneurship: the development view, the construction view, the evolutionary view, and the Neo-Austrian view. We outline the main critique points of the discovery view presented in these four alternatives, as well as their central concepts and conceptualization of the entrepreneurial function. On this basis we discuss three central themes that cut across the four alternatives: process, uncertainty, and agency. These themes provide new foci for entrepreneurship research and can help to generate new research questions...

  11. Insights into shell deposition in the Antarctic bivalve Laternula elliptica: gene discovery in the mantle transcriptome using 454 pyrosequencing

    Directory of Open Access Journals (Sweden)

    Power Deborah M

    2010-06-01

    Full Text Available Abstract Background The Antarctic clam, Laternula elliptica, is an infaunal stenothermal bivalve mollusc with a circumpolar distribution. It plays a significant role in bentho-pelagic coupling and hence has been proposed as a sentinel species for climate change monitoring. Previous studies have shown that this mollusc displays a high level of plasticity with regard to shell deposition and damage repair against a background of genetic homogeneity. The Southern Ocean has amongst the lowest present-day CaCO3 saturation rate of any ocean region, and is predicted to be among the first to become undersaturated under current ocean acidification scenarios. Hence, this species presents as an ideal candidate for studies into the processes of calcium regulation and shell deposition in our changing ocean environments. Results 454 sequencing of L. elliptica mantle tissue generated 18,290 contigs with an average size of 535 bp (ranging between 142 bp-5.591 kb). BLAST sequence similarity searching assigned putative function to 17% of the data set, with a significant proportion of these transcripts being involved in binding and potentially of a secretory nature, as defined by GO molecular function and biological process classifications. These results indicated that the mantle is a transcriptionally active tissue which is actively proliferating. All transcripts were screened against an in-house database of genes shown to be involved in extracellular matrix formation and calcium homeostasis in metazoans. Putative identifications were made for a number of classical shell deposition genes, such as tyrosinase, carbonic anhydrase and metalloprotease 1, along with novel members of the family 2 G-Protein Coupled Receptors (GPCRs). A membrane transport protein (SEC61) was also characterised and this demonstrated the utility of the clam sequence data as a resource for examining cold adapted amino acid substitutions. The sequence data contained 46,235 microsatellites and 13

  12. Case-Based Reasoning System Based on Knowledge Discovery

    Institute of Scientific and Technical Information of China (English)

    倪志伟; 蔡庆生

    2003-01-01

    The research and exploitation of case-based systems are attracting more and more attention. Case-Based Reasoning (CBR) is a strategy for solving object cases based on the source cases that are prompted by the object ones. CBR is not only a psychological theory of human knowledge, but will also be a new cornerstone of intelligent computer system technology. Case-based systems are being adopted in more and more application fields in order to obtain better results, especially in fields that are ill-defined and lack expert knowledge. However, CBR requires a great deal of knowledge, and it faces the same knowledge acquisition bottleneck as expert systems. Data Mining (DM) and Knowledge Discovery in Database (KDD) are among the most useful means of addressing this problem, making knowledge acquisition more automated. In this paper, we discuss data mining technology in CBR; in particular, we introduce knowledge discovery in the case base (KDC) and discuss this concept in detail. Finally, the structure of CBR based on DM is put forward.

  13. Bridging the gap between data acquisition and inference ontologies: toward ontology-based link discovery

    Science.gov (United States)

    Goldstein, Michel L.; Morris, Steven A.; Yen, Gary G.

    2003-09-01

    Bridging the gap between low level ontologies used for data acquisition and high level ontologies used for inference is essential to enable the discovery of high-level links between low-level entities. This is of utmost importance in many applications, where the semantic distance between the observable evidence and the target relations is large. Examples of these applications include detection of terrorist activity, crime analysis, and technology monitoring, among others. Currently this inference gap is filled by expert knowledge. However, with the increase of the data and system size, it has become too costly to perform such manual inference. This paper proposes a semi-automatic system to bridge the inference gap using network correlation methods, similar to Bayesian Belief Networks, combined with hierarchical clustering, to group and organize data so that experts can observe and build the inference gap ontologies quickly and efficiently, decreasing the cost of this labor-intensive process. A simple application of this method is shown here, where the co-author collaboration structure ontology is inferred from the analysis of a collection of journal publications on the subject of anthrax. This example uncovers co-author collaboration structures (a well defined ontology) from a scientific publication dataset (also a well defined ontology). Nevertheless, the evidence of author collaboration is poorly defined, requiring the use of evidence from keywords, citations, publication dates, and paper co-authorship. The proposed system automatically suggests candidate collaboration group patterns for evaluation by experts. Using an intuitive graphic user interface, these experts identify, confirm and refine the proposed ontologies and add them to the ontology database to be used in subsequent processes.

  14. Comparison of sequencing based CNV discovery methods using monozygotic twin quartets.

    Directory of Open Access Journals (Sweden)

    Marc-André Legault

    Full Text Available The advent of high throughput sequencing methods raises a substantial number of technical challenges. Among those is the one raised by the discovery of copy-number variations (CNVs) using whole-genome sequencing data. CNVs are genomic structural variations defined as a variation in the number of copies of a large genomic fragment, usually more than one kilobase. Here, we aim to compare different CNV calling methods in order to assess their ability to consistently identify CNVs by comparison of the calls in 9 quartets of identical twin pairs. The use of monozygotic twins provides a means of estimating the error rate of each algorithm by observing CNVs that are inconsistently called when considering the rules of Mendelian inheritance and the assumption of an identical genome between twins. The similarity between the calls from the different tools and the advantage of combining call sets were also considered. ERDS and CNVnator obtained the best performance when considering the inherited CNV rate with a mean of 0.74 and 0.70, respectively. Venn diagrams were generated to show the agreement between the different algorithms, before and after filtering out familial inconsistencies. This filtering revealed a high number of false positives for CNVer and Breakdancer. A low overall agreement between the methods suggested a high complementarity of the different tools when calling CNVs. The breakpoint sensitivity analysis indicated that CNVnator and ERDS achieved better resolution of CNV borders than the other tools. The highest inherited CNV rate was achieved through the intersection of these two tools (81%). This study showed that ERDS and CNVnator provide good performance on whole genome sequencing data with respect to CNV consistency across families, CNV breakpoint resolution and CNV call specificity. The intersection of the calls from the two tools would be valuable for CNV genotyping pipelines.
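    The idea of scoring a caller by how consistently it reports the same CNV in two genetically identical twins can be sketched as follows. This is only an illustrative Python sketch, not the procedure used in the study: the 50% reciprocal-overlap rule for deciding that two calls are the same event, the function names, and the toy coordinates are all assumptions.

```python
def reciprocal_overlap(a, b):
    """Return the smaller of the two overlap fractions for calls a and b.

    Each call is a (chromosome, start, end) tuple with end > start.
    """
    if a[0] != b[0]:
        return 0.0
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[2] - a[1]), overlap / (b[2] - b[1]))


def concordant_fraction(calls_a, calls_b, threshold=0.5):
    """Fraction of calls in calls_a matched by some call in calls_b.

    The 0.5 threshold is an assumed matching criterion for this sketch.
    """
    if not calls_a:
        return 0.0
    matched = sum(
        1
        for a in calls_a
        if any(reciprocal_overlap(a, b) >= threshold for b in calls_b)
    )
    return matched / len(calls_a)


# Toy call sets for two twins (hypothetical coordinates).
twin1 = [("chr1", 1000, 5000), ("chr2", 200, 1200), ("chr3", 10, 2010)]
twin2 = [("chr1", 1500, 5200), ("chr2", 5000, 6000)]
rate = concordant_fraction(twin1, twin2)
```

    Only the chr1 call in twin1 has a sufficiently overlapping counterpart in twin2, so one of three calls is concordant. The same matching function could be reused to intersect the call sets of two tools.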

  15. De novo characterization of the Dialeurodes citri transcriptome: mining genes involved in stress resistance and simple sequence repeats (SSRs) discovery.

    Science.gov (United States)

    Chen, E-H; Wei, D-D; Shen, G-M; Yuan, G-R; Bai, P-P; Wang, J-J

    2014-02-01

    The citrus whitefly, Dialeurodes citri (Ashmead), is one of the three economically important whitefly species that infest citrus plants around the world; however, limited genetic research has been focused on D. citri, partly because of lack of genomic resources. In this study, we performed de novo assembly of a transcriptome using Illumina paired-end sequencing technology (Illumina Inc., San Diego, CA, USA). In total, 36,766 unigenes with a mean length of 497 bp were identified. Of these unigenes, we identified 17,788 matched known proteins in the National Center for Biotechnology Information database, as determined by Blast search, with 5731, 4850 and 14,441 unigenes assigned to clusters of orthologous groups (COG), gene ontology (GO), and SwissProt, respectively. In total, 7507 unigenes were assigned to 308 known pathways. In-depth analysis of the data showed that 117 unigenes were identified as potentially involved in the detoxification of xenobiotics and 67 heat shock protein (Hsp) genes were associated with environmental stress. In addition, these enzymes were searched against the GO and COG database, and the results showed that the three major detoxification enzymes and Hsps were classified into 18 and 3, 6, and 8 annotations, respectively. In addition, 149 simple sequence repeats were detected. The results facilitate the investigation of molecular resistance mechanisms to insecticides and environmental stress, and contribute to molecular marker development. The findings greatly improve our genetic understanding of D. citri, and lay the foundation for future functional genomics studies on this species.
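    Detection of simple sequence repeats in assembled sequences can be illustrated with a short regular-expression sketch in Python. The abstract does not say which tool or thresholds were used, so the unit lengths (2 to 6 bp) and the minimum of four tandem copies below are assumptions chosen only for illustration.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Find tandem repeats of short units in a DNA sequence.

    Returns a list of (start_index, unit, copy_number) tuples.
    The thresholds are illustrative assumptions, not published settings.
    Note: a homopolymer run such as AAAAAAAA is reported with unit "AA".
    """
    # A lazy unit match prefers the shortest repeating unit; the backreference
    # then requires at least (min_repeats - 1) further copies of that unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Toy sequence: five copies of AC and four copies of ATG with flanking bases.
toy = "TTG" + "AC" * 5 + "TT" + "ATG" * 4 + "CC"
ssrs = find_ssrs(toy)
```

    Dedicated SSR tools typically apply a different minimum copy number for each unit length; a single threshold is used here to keep the sketch short.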

  16. Chronicles in drug discovery.

    Science.gov (United States)

    Davies, Shelley L; Moral, Maria Angels; Bozzo, Jordi

    2007-03-01

    Chronicles in Drug Discovery features special interest reports on advances in drug discovery. This month we highlight agents that target and deplete immunosuppressive regulatory T cells, which are produced by tumor cells to hinder innate immunity against, or chemotherapies targeting, tumor-associated antigens. Antiviral treatments for respiratory syncytial virus, a severe and prevalent infection in children, are limited due to their side effect profiles and cost. New strategies currently under clinical development include monoclonal antibodies, siRNAs, vaccines and oral small molecule inhibitors. Recent therapeutic lines for Huntington's disease include gene therapies that target the mutated human huntingtin gene or deliver neuroprotective growth factors and cellular transplantation in apoptotic regions of the brain. Finally, we highlight the antiinflammatory and antinociceptive properties of new compounds targeting the somatostatin receptor subtype sst4, which warrant further study for their potential application as clinical analgesics.

  17. Towards agile large-scale predictive modelling in drug discovery with flow-based programming design principles.

    Science.gov (United States)

    Lampa, Samuel; Alvarsson, Jonathan; Spjuth, Ola

    2016-01-01

    Predictive modelling in drug discovery is challenging to automate as it often contains multiple analysis steps and might involve cross-validation and parameter tuning that create complex dependencies between tasks. With large-scale data or when using computationally demanding modelling methods, e-infrastructures such as high-performance or cloud computing are required, adding to the existing challenges of fault-tolerant automation. Workflow management systems can aid in many of these challenges, but the currently available systems are lacking in the functionality needed to enable agile and flexible predictive modelling. We here present an approach inspired by elements of the flow-based programming paradigm, implemented as an extension of the Luigi system which we name SciLuigi. We also discuss the experiences from using the approach when modelling a large set of biochemical interactions using a shared computer cluster.

  18. Discovery of TNF inhibitors from a DNA-encoded chemical library based on Diels-Alder cycloaddition.

    Science.gov (United States)

    Buller, Fabian; Zhang, Yixin; Scheuermann, Jörg; Schäfer, Juliane; Bühlmann, Peter; Neri, Dario

    2009-10-30

    DNA-encoded chemical libraries are promising tools for the discovery of ligands toward protein targets of pharmaceutical relevance. DNA-encoded small molecules can be enriched in affinity-based selections and their unique DNA "barcode" allows the amplification and identification by high-throughput sequencing. We describe selection experiments using a DNA-encoded 4000-compound library generated by Diels-Alder cycloadditions. High-throughput sequencing enabled the identification and relative quantification of library members before and after selection. Sequence enrichment profiles corresponding to the "bar-coded" library members were validated by affinity measurements of single compounds. We were able to affinity mature trypsin inhibitors and identify a series of albumin binders for the conjugation of pharmaceuticals. Furthermore, we discovered a ligand for the antiapoptotic Bcl-xL protein and a class of tumor necrosis factor (TNF) binders that completely inhibited TNF-mediated killing of L-M fibroblasts in vitro.
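    The relative quantification of library members before and after selection can be illustrated by a simple enrichment ratio computed from barcode read counts. The Python sketch below is an assumption-laden illustration, not the authors' analysis: the function name, the pseudocount, and the toy counts are hypothetical.

```python
def enrichment(before, after, pseudocount=1.0):
    """Fold change in relative barcode frequency after selection.

    before, after: dicts mapping a barcode to its read count.
    A pseudocount (an assumed smoothing choice) avoids division by zero
    for barcodes missing from one of the two sequencing runs.
    """
    codes = set(before) | set(after)
    total_before = sum(before.values()) + pseudocount * len(codes)
    total_after = sum(after.values()) + pseudocount * len(codes)
    result = {}
    for code in codes:
        freq_before = (before.get(code, 0) + pseudocount) / total_before
        freq_after = (after.get(code, 0) + pseudocount) / total_after
        result[code] = freq_after / freq_before
    return result


# Toy counts for two hypothetical barcodes.
counts_before = {"cmpd_A": 100, "cmpd_B": 100}
counts_after = {"cmpd_A": 180, "cmpd_B": 20}
fold = enrichment(counts_before, counts_after)
```

    A ratio above 1 suggests that a compound was retained by the target during selection. As the abstract notes, such profiles still need validation by affinity measurements of single compounds.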

  19. A general co-expression network-based approach to gene expression analysis: comparison and applications

    Directory of Open Access Journals (Sweden)

    Zhang Weixiong

    2010-02-01

    Full Text Available Abstract Background Co-expression network-based approaches have become popular in analyzing microarray data, such as for detecting functional gene modules. However, co-expression networks are often constructed by ad hoc methods, and network-based analyses have not been shown to outperform the conventional cluster analyses, partially due to the lack of an unbiased evaluation metric. Results Here, we develop a general co-expression network-based approach for analyzing both genes and samples in microarray data. Our approach consists of a simple but robust rank-based network construction method, a parameter-free module discovery algorithm and a novel reference network-based metric for module evaluation. We report some interesting topological properties of rank-based co-expression networks that are very different from those of value-based networks in the literature. Using a large set of synthetic and real microarray data, we demonstrate the superior performance of our approach over several popular existing algorithms. Applications of our approach to yeast, Arabidopsis and human cancer microarray data reveal many interesting modules, including a fatal subtype of lymphoma and a gene module regulating yeast telomere integrity, which were missed by the existing methods. Conclusions We demonstrated that our novel approach is very effective in discovering the modular structures in microarray data, both for genes and for samples. As the method is essentially parameter-free, it may be applied to large data sets where the number of clusters is difficult to estimate. The method is also very general and can be applied to other types of data. A MATLAB implementation of our algorithm can be downloaded from http://cs.utsa.edu/~jruan/Software.html.

  20. A contig-based strategy for the genome-wide discovery of microRNAs without complete genome resources.

    Directory of Open Access Journals (Sweden)

    Jun-Zhi Wen

    Full Text Available MicroRNAs (miRNAs) are important regulators of many cellular processes and exist in a wide range of eukaryotes. High-throughput sequencing is a mainstream method of miRNA identification through which it is possible to obtain the complete small RNA profile of an organism. Currently, most approaches to miRNA identification rely on a reference genome for the prediction of hairpin structures. However, many species of economic and phylogenetic importance are non-model organisms without complete genome sequences, and this limits miRNA discovery. Here, to overcome this limitation, we have developed a contig-based miRNA identification strategy. We applied this method to a triploid species of edible banana (GCTCV-119, Musa spp. AAA group) and identified 180 pre-miRNAs and 314 mature miRNAs, which is three times more than were predicted by the available dataset-based methods (represented by EST+GSS). Based on the recently published miRNA data set of Musa acuminata, the recall rate and precision of our strategy are estimated to be 70.6% and 92.2%, respectively, significantly better than those of the EST+GSS-based strategy (10.2% and 50.0%, respectively). Our novel, efficient and cost-effective strategy facilitates the study of the functional and evolutionary role of miRNAs, as well as miRNA-based molecular breeding, in non-model species of economic or evolutionary interest.

  1. First discovery of two polyketide synthase genes for mitorubrinic acid and mitorubrinol yellow pigment biosynthesis and implications in virulence of Penicillium marneffei.

    Directory of Open Access Journals (Sweden)

    Patrick C Y Woo

    Full Text Available BACKGROUND: The genome of P. marneffei, the most important thermal dimorphic fungus causing respiratory, skin and systemic mycosis in China and Southeast Asia, possesses 23 polyketide synthase (PKS) genes and 2 polyketide synthase nonribosomal peptide synthase hybrid (PKS-NRPS) genes, which is of high diversity compared to other thermal dimorphic pathogenic fungi. We hypothesized that the yellow pigment in the mold form of P. marneffei could also be synthesized by one or more PKS genes. METHODOLOGY/PRINCIPAL FINDINGS: All 23 PKS and 2 PKS-NRPS genes of P. marneffei were systematically knocked down. A loss of the yellow pigment was observed in the mold form of the pks11 knockdown, pks12 knockdown and pks11pks12 double knockdown mutants. Sequence analysis showed that PKS11 and PKS12 are fungal non-reducing PKSs. Ultra high performance liquid chromatography-photodiode array detector/electrospray ionization-quadrupole time of flight-mass spectrometry (MS) and MS/MS analysis of the culture filtrates of wild type P. marneffei and the pks11 knockdown, pks12 knockdown and pks11pks12 double knockdown mutants showed that the yellow pigment is composed of mitorubrinic acid and mitorubrinol. The survival of mice challenged with the pks11 knockdown, pks12 knockdown and pks11pks12 double knockdown mutants was significantly better than those challenged with wild type P. marneffei (P<0.05). There was also a statistically significant decrease in survival of pks11 knockdown, pks12 knockdown and pks11pks12 double knockdown mutants compared to wild type P. marneffei in both J774 and THP1 macrophages (P<0.05). CONCLUSIONS/SIGNIFICANCE: The yellow pigment of the mold form of P. marneffei is composed of mitorubrinol and mitorubrinic acid. This represents the first discovery of PKS genes responsible for mitorubrinol and mitorubrinic acid biosynthesis. pks12 and pks11 are probably responsible for sequential use in the biosynthesis of mitorubrinol and mitorubrinic acid

  2. Gene discovery in Eimeria tenella by immunoscreening cDNA expression libraries of sporozoites and schizonts with chicken intestinal antibodies.

    Science.gov (United States)

    Réfega, Susana; Girard-Misguich, Fabienne; Bourdieu, Christiane; Péry, Pierre; Labbé, Marie

    2003-04-01

    Specific antibodies were produced ex vivo from intestinal culture of Eimeria tenella infected chickens. The specificity of these intestinal antibodies was tested against different parasite stages. These antibodies were used to immunoscreen first generation schizont and sporozoite cDNA libraries permitting the identification of new E. tenella antigens. We obtained a total of 119 cDNA clones which were subjected to sequence analysis. The sequences coding for the proteins inducing local immune responses were compared with nucleotide or protein databases and with expressed sequence tags (ESTs) databases. We identified new Eimeria genes coding for heat shock proteins, a ribosomal protein, a pyruvate kinase and a pyridoxine kinase. Specific features of other sequences are discussed.

  3. Cultivation of Hard-To-Culture Subsurface Mercury-Resistant Bacteria and Discovery of New merA Gene Sequences

    DEFF Research Database (Denmark)

    Rasmussen, Lasse Dam; Zawadsky, C.; Binnerup, Svend Jørgen

    2008-01-01

    Mercury-resistant bacteria may be important players in mercury biogeochemistry. To assess the potential for mercury reduction by two subsurface microbial communities, resistant subpopulations and their merA genes were characterized by a combined molecular and cultivation-dependent approach...... was increased up to 2,800 times and numbers of mCFU were similar to the total number of mercury-resistant bacteria in the soils. Denaturing gradient gel electrophoresis analysis of DNA extracted from membranes suggested stimulation of growth of hard-to-culture bacteria during the preincubation. A total of 25...... of the sequences did not result in a match in the BLAST search. The results illustrate the power of integrating advanced cultivation methodology with molecular techniques for the characterization of the diversity of mercury-resistant populations and assessing the potential for mercury reduction in contaminated...

  4. MOLECULAR MODELING AND DRUG DISCOVERY OF POTENTIAL INHIBITORS FOR ANTICANCER TARGET GENE MELK (MATERNAL EMBRYONIC LEUCINE ZIPPER KINASE)

    Directory of Open Access Journals (Sweden)

    Sabitha. K

    2011-12-01

    Full Text Available Maternal embryonic leucine zipper kinase (MELK), a member of the AMP serine/threonine kinase family, exhibits multiple features consistent with the potential utility of this gene as an anticancer target. Reports show that MELK functions as a cancer-specific protein kinase, and that down-regulation of MELK results in growth suppression of breast cancer cells. Many inhibitors that bind to kinases are already in clinical trials. In our study we took a library of different inhibitors and docked them using GLIDE Induced Fit. From the docking results we conclude that Syk inhibitor II, Rho kinase inhibitor IV, p38 MAP Kinase Inhibitor III, HA 1004, Dihydrochloride and IKK-2 inhibitor VI have good binding affinity towards MELK and may have anticancer activity.

  5. AAV-Based Targeting Gene Therapy

    Directory of Open Access Journals (Sweden)

    Wenfang Shi

    2008-01-01

    Full Text Available Since the first parvovirus serotype, AAV2, was isolated from humans and used as a vector for gene therapy, there has been significant progress in AAV vector development. AAV vectors have been extensively investigated for a broad range of gene therapy applications and have been considered the first choice of vector because of their efficient infectivity, stable expression and non-pathogenicity. However, untoward events in AAV-mediated in vivo gene therapy studies have posed new challenges for their further application. A deeper understanding of the viral life cycle, viral structure and replication, infection mechanism, and efficiency of AAV DNA integration, in terms of the contributing viral and host-cell factors and circumstances, would help in evaluating the advantages and disadvantages of these vectors and provide more insight for possible clinical applications. This review focuses on recent progress in gene delivery to target cells via receptor-ligand interaction and in the regulation of specific DNA integration. AAV receptors and the intracellular trafficking of virus particles are also discussed.

  6. Cynomolgus monkey testicular cDNAs for discovery of novel human genes in the human genome sequence

    Directory of Open Access Journals (Sweden)

    Terao Keiji

    2002-12-01

    Full Text Available Abstract Background In order to contribute to the establishment of a complete map of transcribed regions of the human genome, we constructed a testicular cDNA library for the cynomolgus monkey, and attempted to find novel transcripts for identification of their human homologues. Results The full-insert sequences of 512 cDNA clones were determined. Ultimately we found 302 non-redundant cDNAs carrying open reading frames of 300 bp or longer. Among them, 89 cDNAs were found not to be annotated previously in the Ensembl human database. After searching against the Ensembl mouse database, we also found that 69 putative coding sequences have no homologous cDNAs in the annotated human and mouse genome sequences in Ensembl. We subsequently designed a DNA microarray including 396 non-redundant cDNAs (with and without open reading frames) to examine the expression of the full-sequenced genes. With the testicular probe and a mixture of probes of 10 other tissues, 316 of 332 effective spots showed intense hybridized signals and 75 cDNAs were shown to be expressed very highly in the cynomolgus monkey testis, but not ubiquitously. Conclusions In this report, we determined 302 full-insert sequences of cynomolgus monkey cDNAs with open reading frames of sufficient length to discover novel transcripts as human homologues. Among the 302 cDNA sequences, human homologues of 89 cDNAs have not been predicted in the annotated human genome sequence in Ensembl. Additionally, we identified 75 dominantly expressed genes in testis among the full-sequenced clones by using a DNA microarray. Our cDNA clones and analytical results will be valuable resources for future functional genomic studies.

  7. Drug discovery in a multidimensional world: systems, patterns, and networks.

    Science.gov (United States)

    Dudley, Joel T; Schadt, Eric; Sirota, Marina; Butte, Atul J; Ashley, Euan

    2010-10-01

    Despite great strides in revealing and understanding the physiological and molecular bases of cardiovascular disease, efforts to translate this understanding into needed therapeutic interventions continue to lag far behind the initial discoveries. Although pharmaceutical companies continue to increase investments into research and development, the number of drugs gaining federal approval is in decline. Many factors underlie these trends, and a vast number of technological and scientific innovations are being sought through efforts to reinvigorate drug discovery pipelines. Recent advances in molecular profiling technologies and development of sophisticated computational approaches for analyzing these data are providing new, systems-oriented approaches towards drug discovery. Unlike the traditional approach to drug discovery, which is typified by a one-drug-one-target mindset, systems-oriented approaches to drug discovery leverage the parallelism and high-dimensionality of the molecular data to construct more comprehensive molecular models that aim to model broader biomolecular systems. These models offer a means to explore complex molecular states (e.g., disease) where thousands to millions of molecular entities comprising multiple molecular data types (e.g., proteomics and gene expression) can be evaluated simultaneously as components of a cohesive biomolecular system. In this paper, we discuss emerging approaches towards systems-oriented drug discovery and contrast these efforts with the traditional, unidimensional approach to drug discovery. We also highlight several applications of these system-oriented approaches across various aspects of drug discovery, including target discovery, drug repositioning and drug toxicity. When available, specific applications to cardiovascular drug discovery are highlighted and discussed.

  8. Integrative Genomics-Based Discovery of Novel Regulators of the Innate Antiviral Response.

    Science.gov (United States)

    van der Lee, Robin; Feng, Qian; Langereis, Martijn A; Ter Horst, Rob; Szklarczyk, Radek; Netea, Mihai G; Andeweg, Arno C; van Kuppeveld, Frank J M; Huynen, Martijn A

    2015-10-01

    The RIG-I-like receptor (RLR) pathway is essential for detecting cytosolic viral RNA to trigger the production of type I interferons (IFNα/β) that initiate an innate antiviral response. Through systematic assessment of a wide variety of genomics data, we discovered 10 molecular signatures of known RLR pathway components that collectively predict novel members. We demonstrate that RLR pathway genes, among others, tend to evolve rapidly, interact with viral proteins, contain a limited set of protein domains, are regulated by specific transcription factors, and form a tightly connected interaction network. Using a Bayesian approach to integrate these signatures, we propose likely novel RLR regulators. RNAi knockdown experiments revealed a high prediction accuracy, identifying 94 genes among 187 candidates tested (~50%) that affected viral RNA-induced production of IFNβ. The discovered antiviral regulators may participate in a wide range of processes that highlight the complexity of antiviral defense (e.g. MAP3K11, CDK11B, PSMA3, TRIM14, HSPA9B, CDC37, NUP98, G3BP1), and include uncharacterized factors (DDX17, C6orf58, C16orf57, PKN2, SNW1). Our validated RLR pathway list (http://rlr.cmbi.umcn.nl/), obtained using a combination of integrative genomics and experiments, is a new resource for innate antiviral immunity research.
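    One common way to integrate several binary signatures in a Bayesian manner is a naive Bayes score: the sum of per-signature log-likelihood ratios. The Python sketch below illustrates that general technique only. It assumes conditional independence between signatures, and the signature names and probabilities are invented for illustration; the abstract does not specify the authors' exact model.

```python
import math


def naive_bayes_score(gene_features, signature_rates):
    """Sum of log-likelihood ratios over binary signatures.

    gene_features: dict mapping a signature name to True/False for one gene.
    signature_rates: dict mapping a signature name to a pair
        (P(signature | known pathway gene), P(signature | background gene)).
    Assumes signatures are conditionally independent (naive Bayes).
    """
    score = 0.0
    for name, (p_known, p_background) in signature_rates.items():
        if gene_features.get(name, False):
            score += math.log(p_known / p_background)
        else:
            score += math.log((1 - p_known) / (1 - p_background))
    return score


# Hypothetical rates for two of the signature types named in the abstract.
rates = {
    "evolves_rapidly": (0.6, 0.2),
    "interacts_with_viral_protein": (0.5, 0.1),
}
candidate = {"evolves_rapidly": True, "interacts_with_viral_protein": True}
unlikely = {"evolves_rapidly": False, "interacts_with_viral_protein": False}

score_candidate = naive_bayes_score(candidate, rates)
score_unlikely = naive_bayes_score(unlikely, rates)

# Validation rate reported in the abstract: 94 of 187 tested candidates.
validation_rate = 94 / 187
```

    Genes would be ranked by this score, with the top candidates taken forward to knockdown experiments. The reported 94 of 187 corresponds to a validation rate of about 50%.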

  9. Integrative Genomics-Based Discovery of Novel Regulators of the Innate Antiviral Response.

    Directory of Open Access Journals (Sweden)

    Robin van der Lee

    2015-10-01

    The RIG-I-like receptor (RLR) pathway is essential for detecting cytosolic viral RNA to trigger the production of type I interferons (IFNα/β) that initiate an innate antiviral response. Through systematic assessment of a wide variety of genomics data, we discovered 10 molecular signatures of known RLR pathway components that collectively predict novel members. We demonstrate that RLR pathway genes, among others, tend to evolve rapidly, interact with viral proteins, contain a limited set of protein domains, are regulated by specific transcription factors, and form a tightly connected interaction network. Using a Bayesian approach to integrate these signatures, we propose likely novel RLR regulators. RNAi knockdown experiments revealed a high prediction accuracy, identifying 94 genes among 187 candidates tested (~50%) that affected viral RNA-induced production of IFNβ. The discovered antiviral regulators may participate in a wide range of processes that highlight the complexity of antiviral defense (e.g. MAP3K11, CDK11B, PSMA3, TRIM14, HSPA9B, CDC37, NUP98, G3BP1), and include uncharacterized factors (DDX17, C6orf58, C16orf57, PKN2, SNW1). Our validated RLR pathway list (http://rlr.cmbi.umcn.nl/), obtained using a combination of integrative genomics and experiments, is a new resource for innate antiviral immunity research.

  10. Increased complexity of gene structure and base composition in vertebrates

    Institute of Scientific and Technical Information of China (English)

    Ying Wu; Huizhong Yuan; Shengjun Tan; Jian-Qun Chen; Dacheng Tian; Haiwang Yang

    2011-01-01

    How the structure and base composition of genes changed with the evolution of vertebrates remains a puzzling question. Here we analyzed 895 orthologous protein-coding genes in six multicellular animals: human, chicken, zebrafish, sea squirt, fruit fly, and worm. Our analyses reveal that many gene regions, particularly introns and 3' UTRs, gradually expanded throughout the evolution of vertebrates from their invertebrate ancestors, and that the number of exons per gene increased. Studies based on all protein-coding genes in each genome provide consistent results. We also find that GC-content increased in many gene regions (especially the 5' UTR) in the evolution of endotherms, except in coding exons. Analysis of individual genomes shows that the 3' UTR demonstrated a stronger length and GC-content correlation with introns than the 5' UTR did, and that genes with large introns in all six species demonstrated relatively similar GC-content. Our data indicate a great increase in complexity in vertebrate genes, and we propose that the requirement for morphological and functional changes is probably the driving force behind the evolution of structure and base composition complexity in multicellular animal genes.
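
    For readers unfamiliar with the measure, GC-content is the fraction of bases in a sequence that are G or C. A minimal sketch follows; the function name and the choice to ignore ambiguous bases such as N are assumptions made here, not details taken from the study.

```python
def gc_content(sequence):
    """Return the fraction of unambiguous bases (A, C, G, T) that are G or C."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(base in "GC" for base in counted) / len(counted)

# Toy sequences, not real UTR or intron data.
print(gc_content("ATGC"))
print(gc_content("ggccNN"))
```

    Applied separately to the 5' UTR, coding exons, introns, and 3' UTR of each gene, a function of this kind yields the per-region values that such comparisons rely on.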

  11. From amplification to gene in thyroid cancer: A high-resolution mapped bacterial-artificial-chromosome resource for cancer chromosome aberrations guides gene discovery after comparative genome hybridization

    Energy Technology Data Exchange (ETDEWEB)

    Chen, X.N.; Gonsky, R.; Korenberg, J.R. [UCLA School of Medicine, Los Angeles, CA (United States). Cedars-Sinai Research Inst.; Knauf, J.A.; Fagin, J.A. [Univ. of Cincinnati, OH (United States). Div. of Endocrinology/Metabolism; Wang, M.; Lai, E.H. [Univ. of North Carolina, Chapel Hill, NC (United States). Dept. of Pharmacology; Chissoe, S. [Washington Univ. School of Medicine, St. Louis, MO (United States). Genome Sequencing

    1998-08-01

    Chromosome rearrangements associated with neoplasms provide a rich resource for definition of the pathways of tumorigenesis. The power of comparative genome hybridization (CGH) to identify novel genes depends on the existence of suitable markers, which are lacking throughout most of the genome. The authors now report a general approach that translates CGH data into higher-resolution genomic-clone data that are then used to define the genes located in aneuploid regions. They used CGH to study 33 thyroid-tumor DNAs and two tumor-cell-line DNAs. The results revealed amplifications of chromosome band 2p21, with less-intense amplification on 2p13, 19q13.1, and 1p36 and with least-intense amplification on 1p34, 1q42, 5q31, 5q33-34, 9q32-34, and 14q32. To define the 2p21 region amplified, a dense array of 373 FISH-mapped chromosome 2 bacterial artificial chromosomes (BACs) was constructed, and 87 of these were hybridized to a tumor-cell line. Four BACs carried genomic DNA that was amplified in these cells. The maximum amplified region was narrowed to 3-6 Mb by multicolor FISH with the flanking BACs, and the minimum amplicon size was defined by a contig of 420 kb. Sequence analysis of the amplified BAC 1D9 revealed a fragment of the gene, encoding protein kinase C epsilon (PKCε), that was then shown to be amplified and rearranged in tumor cells. In summary, CGH combined with a dense mapped resource of BACs and large-scale sequencing has led directly to the definition of PKCε as a previously unmapped candidate gene involved in thyroid tumorigenesis.

  12. Discovery of miRNAs and Their Corresponding miRNA Genes in Atlantic Cod (Gadus morhua): Use of Stable miRNAs as Reference Genes Reveals Subgroups of miRNAs That Are Highly Expressed in Particular Organs.

    Directory of Open Access Journals (Sweden)

    Rune Andreassen

    Atlantic cod (Gadus morhua) is among the economically most important species in the northern Atlantic Ocean and a model species for studying development of the immune system in vertebrates. MicroRNAs (miRNAs) are an abundant class of small RNA molecules that regulate fundamental biological processes at the post-transcriptional level. Detailed knowledge about a species' miRNA repertoire is necessary to study how the miRNA transcriptome modulates gene expression. We have therefore discovered and characterized mature miRNAs and their corresponding miRNA genes in Atlantic cod. We have also performed a validation study to identify suitable reference genes for RT-qPCR analysis of miRNA expression in Atlantic cod. Finally, we utilized the newly characterized miRNA repertoire and the dedicated RT-qPCR method to reveal miRNAs that are highly expressed in certain organs. The discovery analysis revealed 490 mature miRNAs (401 unique sequences) along with precursor sequences and genomic location of the miRNA genes. Twenty-six of these were novel miRNA genes. Validation studies ranked gmo-miR-17-1-5p or the two-gene combination gmo-miR25-3p and gmo-miR210-5p as most suitable qPCR reference genes. Analysis by RT-qPCR revealed 45 miRNAs with significantly higher expression in tissues from one or a few organs. Comparisons to other vertebrates indicate that some of these miRNAs may regulate processes like growth, lipid metabolism, immune response to microbial infections and scar damage repair. Three teleost-specific and three novel Atlantic cod miRNAs were among the differentially expressed miRNAs. The number of known mature miRNAs was considerably increased by our identification of miRNAs and miRNA genes in Atlantic cod. This will benefit further functional studies of miRNA expression using deep sequencing methods. The validation study showed that stable miRNAs are suitable reference genes for RT-qPCR analysis of miRNA expression. Applying RT-qPCR we have identified

  13. Fast and Rigorous Computation of Gene and Pathway Scores from SNP-Based Summary Statistics.

    Science.gov (United States)

    Lamparter, David; Marbach, Daniel; Rueedi, Rico; Kutalik, Zoltán; Bergmann, Sven

    2016-01-01

    Integrating single nucleotide polymorphism (SNP) p-values from genome-wide association studies (GWAS) across genes and pathways is a strategy to improve statistical power and gain biological insight. Here, we present Pascal (Pathway scoring algorithm), a powerful tool for computing gene and pathway scores from SNP-phenotype association summary statistics. For gene score computation, we implemented analytic and efficient numerical solutions to calculate test statistics. We examined in particular the sum and the maximum of chi-squared statistics, which measure the average and the strongest association signals per gene, respectively. For pathway scoring, we use a modified Fisher method, which not only offers a significant power improvement over more traditional enrichment strategies, but also eliminates the problem of arbitrary threshold selection inherent in any binary membership-based pathway enrichment approach. We demonstrate the marked increase in power by analyzing summary statistics from dozens of large meta-studies for various traits. Our extensive testing indicates that our method not only excels in rigorous type I error control, but also results in more biologically meaningful discoveries.
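
    The two gene-level statistics and the pathway-level combination mentioned above can be sketched as follows. This is a simplified illustration, not Pascal itself: it assumes independent SNPs (no correction for linkage disequilibrium) and uses the standard Fisher method rather than the modified version the abstract refers to. All function names are hypothetical.

```python
import math
from statistics import NormalDist

def chi2_from_pvalue(p):
    """Convert a two-sided SNP p-value (0 < p <= 1) to a 1-d.f. chi-squared statistic.

    Very small p-values may fail because 1 - p/2 rounds to 1.0 in floating point.
    """
    z = NormalDist().inv_cdf(1.0 - p / 2.0)
    return z * z

def gene_statistics(snp_pvalues):
    """Return (sum, max) of the per-SNP chi-squared statistics for one gene."""
    stats = [chi2_from_pvalue(p) for p in snp_pvalues]
    return sum(stats), max(stats)

def chi2_sf_even_df(x, df):
    """Survival function of a chi-squared variable with even d.f. (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def fisher_combined_pvalue(gene_pvalues):
    """Standard Fisher method: -2 * sum(ln p) follows chi-squared with 2k d.f."""
    statistic = -2.0 * sum(math.log(p) for p in gene_pvalues)
    return chi2_sf_even_df(statistic, 2 * len(gene_pvalues))

# Made-up p-values for illustration only.
print(gene_statistics([0.05, 0.5]))
print(fisher_combined_pvalue([0.05, 0.05]))
```

    Because every gene contributes to the combined statistic, no significance cutoff is needed to decide which genes count as members of an enriched set.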

  14. Representation Discovery using Harmonic Analysis

    CERN Document Server

    Mahadevan, Sridhar

    2008-01-01

    Representations are at the heart of artificial intelligence (AI). This book is devoted to the problem of representation discovery: how can an intelligent system construct representations from its experience? Representation discovery re-parameterizes the state space - prior to the application of information retrieval, machine learning, or optimization techniques - facilitating later inference processes by constructing new task-specific bases adapted to the state space geometry. This book presents a general approach to representation discovery using the framework of harmonic analysis, in particu

  15. Transcriptome sequencing and annotation of the microalgae Dunaliella tertiolecta: Pathway description and gene discovery for production of next-generation biofuels

    Directory of Open Access Journals (Sweden)

    Bibby Kyle

    2011-03-01

    Background: Biodiesel or ethanol derived from lipids or starch produced by microalgae may overcome many of the sustainability challenges previously ascribed to petroleum-based fuels and first generation plant-based biofuels. The paucity of microalgae genome sequences, however, limits gene-based biofuel feedstock optimization studies. Here we describe the sequencing and de novo transcriptome assembly for the non-model microalgae species, Dunaliella tertiolecta, and identify pathways and genes of importance related to biofuel production. Results: Next generation DNA pyrosequencing technology applied to D. tertiolecta transcripts produced 1,363,336 high quality reads with an average length of 400 bases. Following quality and size trimming, ~45% of the high quality reads were assembled into 33,307 isotigs with a 31-fold coverage and 376,482 singletons. Assembled sequences and singletons were subjected to BLAST similarity searches and annotated with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology (KO) identifiers. These analyses identified the majority of lipid and starch biosynthesis and catabolism pathways in D. tertiolecta. Conclusions: The construction of metabolic pathways involved in the biosynthesis and catabolism of fatty acids, triacylglycerols, and starch in D. tertiolecta as well as the assembled transcriptome provide a foundation for the molecular genetics and functional genomics required to direct metabolic engineering efforts that seek to enhance the quantity and character of microalgae-based biofuel feedstock.

  16. Discovery Learning, Representation, and Explanation within a Computer-Based Simulation: Finding the Right Mix

    Science.gov (United States)

    Rieber, Lloyd P.; Tzeng, Shyh-Chii; Tribble, Kelly

    2004-01-01

    The purpose of this research was to explore how adult users interact and learn during an interactive computer-based simulation supplemented with brief multimedia explanations of the content. A total of 52 college students interacted with a computer-based simulation of Newton's laws of motion in which they had control over the motion of a simple…

  17. Cys-loop ligand-gated ion channel gene discovery in the Locusta migratoria manilensis through the neuron transcriptome.

    Science.gov (United States)

    Wang, Xin; Meng, Xiangkun; Liu, Chuanjun; Gao, Hongli; Zhang, Yixi; Liu, Zewen

    2015-05-01

    As an ideal model, Locusta migratoria manilensis (Meyen) has been widely used in the study of endocrinological and neurobiological processes. Here we created a large transcriptome of the locust neurons, which is enriched for ion channels whose potential for functional genetic experiments is currently limited. With high-throughput Illumina sequencing technology, we obtained more than 50 million raw reads, which were assembled into 61,056 unique sequences with an average size of 737 bp. Among the unigenes, a total of 24,884 sequences had significant similarities with proteins in the five public databases (NR, SwissProt, GO, COG and KEGG) with a cut-off E-value of 10^-5 using BLASTx. Moreover, the potential genes of the cys-loop ligand-gated ion channels (LGICs) were manually curated, including 39 putative nicotinic acetylcholine receptors (nAChRs), 6 putative γ-aminobutyric acid (GABA) gated anion channels, 21 putative glutamate-gated chloride channels (GluCls) and 1 histamine-gated chloride channel (HisCl). In addition, the full-length sequences of 11 nAChR subunits (9 alpha and 2 beta) were obtained by the RACE technique, which should help further studies of nAChR neurochemistry and pharmacology. To our knowledge, this is the first study to characterize the locust neuron transcriptome, which will provide a useful resource especially for future studies on the neuro-function and behavior of the locust.

  18. Discovery of binding proteins for a protein target using protein-protein docking-based virtual screening.

    Science.gov (United States)

    Zhang, Changsheng; Tang, Bo; Wang, Qian; Lai, Luhua

    2014-10-01

    Target structure-based virtual screening, which employs protein-small molecule docking to identify potential ligands, has been widely used in small-molecule drug discovery. In the present study, we used a protein-protein docking program to identify proteins that bind to a specific target protein. In the testing phase, an all-to-all protein-protein docking run on a large dataset was performed. The three-dimensional rigid docking program SDOCK was used to examine protein-protein docking on all protein pairs in the dataset. Both the binding affinity and features of the binding energy landscape were considered in the scoring function in order to distinguish positive binding pairs from negative binding pairs. Thus, the lowest docking score, the average Z-score, and the convergence of the low-score solutions were incorporated in the analysis. The hybrid scoring function was optimized in the all-to-all docking test. The docking method and the hybrid scoring function were then used to screen for proteins that bind to tumor necrosis factor-α (TNFα), which is a well-known therapeutic target for rheumatoid arthritis and other autoimmune diseases. A protein library containing 677 proteins was used for the screen. Proteins with scores among the top 20% were further examined. Sixteen proteins from the top-ranking 67 proteins were selected for experimental study. Two of these proteins showed significant binding to TNFα in an in vitro binding study. The results of the present study demonstrate the power and potential application of protein-protein docking for the discovery of novel binding proteins for specific protein targets.

  19. Prediction of Tumor Outcome Based on Gene Expression Data

    Institute of Scientific and Technical Information of China (English)

    Liu Juan; Hitoshi Iba

    2004-01-01

    Gene expression microarray data can be used to classify tumor types. We propose a new procedure to classify human tumor samples based on microarray gene expression by using a hybrid supervised learning method called MOEA+WV (Multi-Objective Evolutionary Algorithm+Weighted Voting). MOEA is used to search the high-dimensional gene space for a relatively small number of informative gene subsets, and WV is used as the classification tool. This new method has been applied to predict the subtypes of lymphoma and the outcomes of medulloblastoma. The results are relatively accurate and meaningful compared to those from other methods.
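
    The weighted voting (WV) step can be sketched as below. The abstract does not give the formula, so this assumes one common formulation in which each selected gene votes with a signal-to-noise weight and a midpoint decision boundary. The MOEA gene-selection step is not shown, and the function names and toy data are hypothetical.

```python
from statistics import mean, stdev

def train_weighted_voting(class_a, class_b):
    """Fit per-gene (weight, boundary) pairs from two lists of training samples.

    Each sample is a list of expression values for the already selected genes.
    Needs at least two samples per class and nonzero spread in every gene.
    """
    params = []
    for g in range(len(class_a[0])):
        a_vals = [sample[g] for sample in class_a]
        b_vals = [sample[g] for sample in class_b]
        mu_a, mu_b = mean(a_vals), mean(b_vals)
        weight = (mu_a - mu_b) / (stdev(a_vals) + stdev(b_vals))  # signal-to-noise
        boundary = (mu_a + mu_b) / 2.0                            # midpoint of class means
        params.append((weight, boundary))
    return params

def predict(params, sample):
    """Sum the signed votes of all genes; a positive total means class A."""
    total = sum(w * (x - b) for (w, b), x in zip(params, sample))
    return "A" if total > 0 else "B"

# Toy two-gene data set, invented for illustration.
class_a = [[5.0, 1.0], [6.0, 2.0], [7.0, 1.5]]
class_b = [[1.0, 4.0], [2.0, 5.0], [1.5, 6.0]]
model = train_weighted_voting(class_a, class_b)
print(predict(model, [6.5, 1.0]), predict(model, [1.0, 5.5]))
```

    Genes whose class means are well separated relative to their spread receive larger weights, so a small informative subset can dominate the vote.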

  20. Genome-scale identification of cell-wall related genes in Arabidopsis based on co-expression network analysis

    Directory of Open Access Journals (Sweden)

    Wang Shan

    2012-08-01

    Background: Identification of novel genes relevant to plant cell-wall (PCW) synthesis represents a highly important and challenging problem. Although substantial efforts have been invested into studying this problem, the vast majority of the PCW related genes remain unknown. Results: Here we present a computational study focused on identification of novel PCW genes in Arabidopsis based on co-expression analyses of transcriptomic data collected under 351 conditions, using a bi-clustering technique. Our analysis identified 217 highly co-expressed gene clusters (modules) under some experimental conditions, each containing at least one gene annotated as PCW related according to the Purdue Cell Wall Gene Families database. These co-expression modules cover 349 known/annotated PCW genes and 2,438 new candidates. For each candidate gene, we annotated the specific PCW synthesis stages in which it is involved and predicted the detailed function. In addition, for the co-expressed genes in each module, we predicted and analyzed their cis regulatory motifs in the promoters using our motif discovery pipeline, providing strong evidence that the genes in each co-expression module are transcriptionally co-regulated. From all the co-expression modules, we infer that 108 modules are related to four major PCW synthesis components, using three complementary methods. Conclusions: We believe our approach and data presented here will be useful for further identification and characterization of PCW genes. All the predicted PCW genes, co-expression modules, motifs and their annotations are available at a web-based database: http://csbl.bmb.uga.edu/publications/materials/shanwang/CWRPdb/index.html.

  1. A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus

    Directory of Open Access Journals (Sweden)

    Alamar Santiago

    2009-09-01

    Background: Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results: We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6,142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2,113 new citrus ESTs, representing 1,831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. Conclusion: The new