WorldWideScience

Sample records for based gene discovery

  1. Gene set-based module discovery in the breast cancer transcriptome

    Directory of Open Access Journals (Sweden)

    Zhang Michael Q

    2009-02-01

    Full Text Available. Abstract. Background: Although microarray-based studies have revealed a global view of gene expression in cancer cells, we still have little knowledge about the regulatory mechanisms underlying the transcriptome. Several computational methods applied to yeast data have recently succeeded in identifying expression modules, which are defined as co-expressed gene sets under common regulatory mechanisms. However, such module discovery methods have not been applied to cancer transcriptome data. Results: In order to decode oncogenic regulatory programs in cancer cells, we developed a novel module discovery method termed EEM by extending a previously reported module discovery method, and applied it to breast cancer expression data. Starting from seed gene sets prepared based on cis-regulatory elements, ChIP-chip data, and gene locus information, EEM identified 10 principal expression modules in breast cancer based on their expression coherence. Moreover, EEM depicted their activity profiles, which predict regulatory programs in each subtype of breast tumors. For example, our analysis revealed that the expression module regulated by the Polycomb repressive complex 2 (PRC2) is downregulated in triple negative breast cancers, suggesting similarity of transcriptional programs between stem cells and aggressive breast cancer cells. We also found that the activity of the PRC2 expression module is negatively correlated with the expression of EZH2, a component of PRC2 which belongs to the E2F expression module. E2F-driven EZH2 overexpression may be responsible for the repression of the PRC2 expression modules in triple negative tumors. Furthermore, our network analysis predicts regulatory circuits in breast cancer cells. Conclusion: These results demonstrate that the gene set-based module discovery approach is a powerful tool to decode regulatory programs in cancer cells.

  2. A genomics based discovery of secondary metabolite biosynthetic gene clusters in Aspergillus ustus.

    Directory of Open Access Journals (Sweden)

    Borui Pi

    Full Text Available. Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, and little is known about rare pathogens. In this study, we performed a genomics-based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that is commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters that were possibly acquired by horizontal gene transfer were identified in Aspergillus for the first time, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but far fewer with other Aspergilli such as A. niger and A. oryzae. These findings should help in understanding the diversity and evolution of SM biosynthesis pathways in the genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in the clinic.

  3. Sleeping Beauty transposon insertional mutagenesis based mouse models for cancer gene discovery

    Science.gov (United States)

    Moriarity, Branden S; Largaespada, David A

    2016-01-01

    Large-scale genomic efforts to study human cancer, such as The Cancer Genome Atlas (TCGA), have identified numerous cancer drivers in a wide variety of tumor types. However, there are limitations to this approach: the mutations and expression or copy number changes that are identified are not always clearly functionally relevant, and only annotated genes and genetic elements are thoroughly queried. The use of complementary, nonbiased, functional approaches to identify drivers of cancer development and progression is ideal to maximize the rate at which cancer discoveries are achieved. One such approach that has been successful is the use of the Sleeping Beauty (SB) transposon-based mutagenesis system in mice. This system uses a conditionally expressed transposase and mutagenic transposon allele to target mutagenesis to somatic cells of a given tissue in mice to cause random mutations leading to tumor development. Analysis of tumors for transposon common insertion sites (CIS) identifies candidate cancer genes specific to that tumor type. While similar screens have been performed in mice with the PiggyBac (PB) transposon and viral approaches, we limit extensive discussion to SB. Here we discuss the basic structure of these screens, screens that have been performed, and methods used to identify CIS. PMID:26051241

  4. Discovery of time-delayed gene regulatory networks based on temporal gene expression profiling

    Directory of Open Access Journals (Sweden)

    Guo Zheng

    2006-01-01

    Full Text Available. Abstract. Background: It is one of the ultimate goals of modern biological research to fully elucidate the intricate interplay and regulation of the molecular determinants that propel and characterize the progression of diverse life phenomena, such as cell cycling, development, aging, and the progressive and recurrent pathogenesis of complex diseases. A vast amount of large-scale, genome-wide, time-resolved data is becoming increasingly available, which provides a golden opportunity to tackle the challenging reverse-engineering problem of time-delayed gene regulatory networks. Results: In particular, this methodological paper aims to reconstruct regulatory networks from temporal gene expression data by using delayed correlations between genes, i.e., pairwise overlaps of expression levels shifted in time relative to each other. We have thus developed a novel model-free computational toolbox termed TdGRN (Time-delayed Gene Regulatory Network) to address the underlying regulations of genes that can span any unit(s) of time intervals. This bioinformatics toolbox provides a unified approach to uncovering time trends of gene regulations through decision analysis of the newly designed time-delayed gene expression matrix. We have applied the proposed method to yeast cell cycling and human HeLa cell cycling and have discovered most of the underlying time-delayed regulations that are supported by multiple lines of experimental evidence and that are remarkably consistent with the current knowledge on phase characteristics for the cell cycles. Conclusion: We established a usable and powerful model-free approach to dissecting high-order dynamic trends of gene-gene interactions. We have carefully validated the proposed algorithm by applying it to two publicly available cell cycling datasets. In addition to uncovering the time trends of gene regulations for cell cycling, this unified approach can also be used to study the complex
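
    The delayed-correlation idea in this abstract can be illustrated with a small sketch. It is not the TdGRN toolbox (whose decision-analysis step the abstract does not detail); it only shifts one expression profile against another and reports the lag with the strongest Pearson correlation. The function names `pearson` and `best_delay` and the toy profiles are assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def best_delay(regulator, target, max_lag):
    """Return (lag, r) with the largest |r| between regulator[t] and target[t + lag]."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        x = regulator[: len(regulator) - lag]  # drop the last `lag` points
        y = target[lag:]                       # drop the first `lag` points
        if len(x) < 3:                         # too few overlapping points to correlate
            break
        r = pearson(x, y)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


# Hypothetical toy data: the target repeats the regulator two time points later.
regulator = [0, 1, 4, 9, 4, 1, 0, 1, 4, 9]
target = [5, 5] + regulator[:-2]
lag, r = best_delay(regulator, target, max_lag=4)
print(lag, round(r, 3))
```

    A negative correlation at the best lag would suggest delayed repression rather than activation; a real analysis would also need a significance threshold, which this sketch omits.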

  5. Seed-based systematic discovery of specific transcription factor target genes.

    Science.gov (United States)

    Mrowka, Ralf; Blüthgen, Nils; Fähling, Michael

    2008-06-01

    Reliable prediction of specific transcription factor target genes is a major challenge in systems biology and functional genomics. Current sequence-based methods yield many false predictions, due to the short and degenerate DNA-binding motifs. Here, we describe a new systematic genome-wide approach, the seed-distribution-distance method, that searches large-scale genome-wide expression data for genes that are expressed similarly to known targets. This method is used to identify genes that are likely targets, allowing sequence-based methods to focus on a subset of genes, giving rise to fewer false-positive predictions. We show by cross-validation that this method is robust in recovering specific target genes. Furthermore, this method identifies genes with typical functions and binding motifs of the seed. The method is illustrated by predicting novel targets of the transcription factor nuclear factor kappaB (NF-kappaB). Among the new targets is optineurin, which plays a key role in the pathogenesis of acquired blindness caused by adult-onset primary open-angle glaucoma. We show experimentally that the optineurin gene and other predicted genes are targets of NF-kappaB. Thus, our data provide a missing link in the signalling of NF-kappaB and the damping function of optineurin in signalling feedback of NF-kappaB. We present a robust and reliable method to enhance the genome-wide prediction of specific transcription factor target genes that exploits the vast amount of expression information available in public databases today. PMID:18485006
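
    The general idea of ranking genes by how closely their expression follows a set of known targets (the "seed") can be sketched as below. This is not the seed-distribution-distance method itself, whose scoring the abstract does not specify. The sketch simply assumes mean Pearson correlation to the seed profiles as the similarity score, and all gene names and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def rank_by_seed_similarity(expression, seeds):
    """Rank non-seed genes by their mean correlation to the seed gene profiles."""
    scores = {}
    for gene, profile in expression.items():
        if gene in seeds:
            continue  # seeds are the known targets; only candidates are scored
        scores[gene] = sum(pearson(profile, expression[s]) for s in seeds) / len(seeds)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical expression profiles across five conditions.
expression = {
    "seedA": [1, 2, 3, 4, 5],
    "seedB": [2, 4, 6, 8, 11],
    "geneX": [1.1, 2.0, 3.2, 3.9, 5.1],  # tracks the seeds closely
    "geneY": [5, 4, 3, 2, 1],            # anti-correlated with the seeds
    "geneZ": [3, 1, 4, 1, 5],            # weakly related
}
ranking = rank_by_seed_similarity(expression, ["seedA", "seedB"])
print(ranking)
```

    The top of such a ranking would then be handed to a sequence-based motif search, which is the division of labour the abstract describes.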

  6. Independent Gene Discovery and Testing

    Science.gov (United States)

    Palsule, Vrushalee; Coric, Dijana; Delancy, Russell; Dunham, Heather; Melancon, Caleb; Thompson, Dennis; Toms, Jamie; White, Ashley; Shultz, Jeffry

    2010-01-01

    A clear understanding of basic gene structure is critical when teaching molecular genetics, the central dogma and the biological sciences. We sought to create a gene-based teaching project to improve students' understanding of gene structure and to integrate this into a research project that can be implemented by instructors at the secondary level…

  7. SSHscreen and SSHdb, generic software for microarray based gene discovery: application to the stress response in cowpea

    Directory of Open Access Journals (Sweden)

    Oelofse Dean

    2010-04-01

    Full Text Available. Abstract. Background: Suppression subtractive hybridization is a popular technique for gene discovery from non-model organisms without an annotated genome sequence, such as cowpea (Vigna unguiculata (L.) Walp.). We aimed to use this method to enrich for genes expressed during drought stress in a drought tolerant cowpea line. However, current methods were inefficient in screening libraries and management of the sequence data, and thus there was a need to develop software tools to facilitate the process. Results: Forward and reverse cDNA libraries enriched for cowpea drought response genes were screened on microarrays, and the R software package SSHscreen 2.0.1 was developed (i) to normalize the data effectively using spike-in control spot normalization, and (ii) to select clones for sequencing based on the calculation of enrichment ratios with associated statistics. Enrichment ratio 3 values for each clone showed that 62% of the forward library and 34% of the reverse library clones were significantly differentially expressed by drought stress (adjusted p value 88% of the clones in both libraries were derived from rare transcripts in the original tester samples, thus supporting the notion that suppression subtractive hybridization enriches for rare transcripts. A set of 118 clones were chosen for sequencing, and drought-induced cowpea genes were identified, the most interesting encoding a late embryogenesis abundant Lea5 protein, a glutathione S-transferase, a thaumatin, a universal stress protein, and a wound induced protein. A lipid transfer protein and several components of photosynthesis were down-regulated by the drought stress. Reverse transcriptase quantitative PCR confirmed the enrichment ratio values for the selected cowpea genes. SSHdb, a web-accessible database, was developed to manage the clone sequences and combine the SSHscreen data with sequence annotations derived from BLAST and Blast2GO. The self-BLAST function within SSHdb grouped

  8. A new essential protein discovery method based on the integration of protein-protein interaction and gene expression data

    Directory of Open Access Journals (Sweden)

    Li Min

    2012-03-01

    Full Text Available. Abstract. Background: Identification of essential proteins is always a challenging task since it requires experimental approaches that are time-consuming and laborious. With the advances in high throughput technologies, a large number of protein-protein interactions are available, which have produced unprecedented opportunities for detecting proteins' essentialities at the network level. A series of computational approaches have been proposed for predicting essential proteins based on network topologies. However, the network topology-based centrality measures are very sensitive to the robustness of the network. Therefore, a new robust essential protein discovery method would be of great value. Results: In this paper, we propose a new centrality measure, named PeC, based on the integration of protein-protein interaction and gene expression data. The performance of PeC is validated based on the protein-protein interaction network of Saccharomyces cerevisiae. The experimental results show that the prediction precision of PeC clearly exceeds that of the other fifteen previously proposed centrality measures: Degree Centrality (DC), Betweenness Centrality (BC), Closeness Centrality (CC), Subgraph Centrality (SC), Eigenvector Centrality (EC), Information Centrality (IC), Bottle Neck (BN), Density of Maximum Neighborhood Component (DMNC), Local Average Connectivity-based method (LAC), Sum of ECC (SoECC), Range-Limited Centrality (RL), L-index (LI), Leader Rank (LR), Normalized α-Centrality (NC), and Moduland-Centrality (MC). In particular, the improvement of PeC over the classic centrality measures (BC, CC, SC, EC, and BN) is more than 50% when predicting no more than 500 proteins. Conclusions: We demonstrate that the integration of protein-protein interaction network and gene expression data can help improve the precision of predicting essential proteins. The new centrality measure, PeC, is an effective essential protein discovery method.
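
    The abstract does not state how PeC combines the two data sources. The sketch below assumes one plausible form: for each protein, sum over its interaction partners the product of the edge clustering coefficient (ECC, the number of triangles through the edge divided by the maximum possible) and the Pearson correlation of the two genes' expression profiles. The function names, the toy network and the expression values are hypothetical and should not be read as the published definition.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def ecc(adj, u, v):
    """Edge clustering coefficient: triangles through (u, v) over the maximum possible."""
    triangles = len(adj[u] & adj[v])                  # shared neighbours close a triangle
    possible = min(len(adj[u]) - 1, len(adj[v]) - 1)
    return triangles / possible if possible > 0 else 0.0


def pec_scores(adj, expression):
    """Assumed PeC-style score: sum of ECC * co-expression over each protein's edges."""
    return {
        u: sum(ecc(adj, u, v) * pearson(expression[u], expression[v]) for v in adj[u])
        for u in adj
    }


# Hypothetical network: triangle A-B-C with a pendant protein D attached to C.
adj = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
expression = {"A": [1, 2, 3], "B": [2, 4, 6], "C": [1, 2, 3.5], "D": [3, 2, 1]}
scores = pec_scores(adj, expression)
print(sorted(scores, key=scores.get, reverse=True))
```

    Under this assumption a protein scores highly only when its interactions are both topologically clustered and co-expressed, which is one way expression data could dampen the effect of spurious edges.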

  9. High-Throughput, Motility-Based Sorter for Microswimmers and Gene Discovery Platform

    Science.gov (United States)

    Yuan, Jinzhou; Raizen, David; Bau, Haim

    2015-11-01

    Animal motility varies with genotype, disease progression, aging, and environmental conditions. In many studies, it is desirable to carry out high throughput motility-based sorting to isolate rare animals for, among other things, forward genetic screens to identify genetic pathways that regulate phenotypes of interest. Many commonly used screening processes are labor-intensive, lack sensitivity, and require extensive investigator training. Here, we describe a sensitive, high throughput, automated, motility-based method for sorting nematodes. Our method was implemented in a simple microfluidic device capable of sorting many thousands of animals per hour per module, and is amenable to parallelism. The device successfully enriched for known C. elegans motility mutants. Furthermore, using this device, we isolated low-abundance mutants capable of suppressing the somnogenic effects of the flp-13 gene, which regulates sleep-like quiescence in C. elegans. Subsequent genomic sequencing led to the identification of a flp-13-suppressor gene. This research was supported, in part, by NIH NIA Grant 5R03AG042690-02.

  10. Genomics-Based Discovery of Plant Genes for Synthetic Biology of Terpenoid Fragrances: A Case Study in Sandalwood oil Biosynthesis.

    Science.gov (United States)

    Celedon, J M; Bohlmann, J

    2016-01-01

    Terpenoid fragrances are powerful mediators of ecological interactions in nature and have a long history of traditional and modern industrial applications. Plants produce a great diversity of fragrant terpenoid metabolites, which make them a superb source of biosynthetic genes and enzymes. Advances in fragrance gene discovery have enabled new approaches in synthetic biology of high-value speciality molecules toward applications in the fragrance and flavor, food and beverage, cosmetics, and other industries. Rapid developments in transcriptome and genome sequencing of nonmodel plant species have accelerated the discovery of fragrance biosynthetic pathways. In parallel, advances in metabolic engineering of microbial and plant systems have established platforms for synthetic biology applications of some of the thousands of plant genes that underlie fragrance diversity. While many fragrance molecules (eg, simple monoterpenes) are abundant in readily renewable plant materials, some highly valuable fragrant terpenoids (eg, santalols, ambroxides) are rare in nature and interesting targets for synthetic biology. As a representative example for genomics/transcriptomics enabled gene and enzyme discovery, we describe a strategy used successfully for elucidation of a complete fragrance biosynthetic pathway in sandalwood (Santalum album) and its reconstruction in yeast (Saccharomyces cerevisiae). We address questions related to the discovery of specific genes within large gene families and recovery of rare gene transcripts that are selectively expressed in recalcitrant tissues. To substantiate the validity of the approaches, we describe the combination of methods used in the gene and enzyme discovery of a cytochrome P450 in the fragrant heartwood of tropical sandalwood, responsible for the fragrance defining, final step in the biosynthesis of (Z)-santalols. PMID:27480682

  11. Microfluidic droplet-based PCR instrumentation for high-throughput gene expression profiling and biomarker discovery

    Directory of Open Access Journals (Sweden)

    Christopher J. Hayes

    2015-06-01

    Full Text Available. PCR is a common and often indispensable technique used in medical and biological research labs for a variety of applications. Real-time quantitative PCR (RT-qPCR) has become a definitive technique for quantitating differences in gene expression levels between samples. Yet, in spite of this importance, reliable methods to quantitate nucleic acid amounts at higher throughput remain elusive. In the following paper, a unique design to quantify gene expression levels at the nanoscale in a continuous flow system is presented. Fully automated, high-throughput, low volume amplification of deoxynucleotides (DNA) in a droplet-based microfluidic system is described. Unlike some conventional qPCR instrumentation that uses integrated fluidic circuits or plate arrays, the instrument performs qPCR in a continuous, micro-droplet flowing process with droplet generation, distinctive reagent mixing, thermal cycling and optical detection platforms all combined in one complete instrument. Detailed experimental profiling of reactions of less than 300 nl total volume is achieved using the platform, demonstrating a dynamic range of 4 log orders and consistent instrument sensitivity. Furthermore, a reduction in pipetting steps of as much as 90% and a unique degree of hands-free automation make the analytical possibilities for this instrumentation far reaching. In conclusion, a discussion of the first demonstrations of this approach to perform novel, continuous high-throughput biological screens is presented. The results generated from the instrument, when compared with commercial instrumentation, demonstrate the instrument's reliability and robustness to carry out further studies of clinical significance with added throughput and economic benefits.

  12. An ensemble method for gene discovery based on DNA microarray data

    Institute of Scientific and Technical Information of China (English)

    LI Xia; RAO Shaoqi; ZHANG Tianwen; GUO Zheng; ZHANG Qingpu; Kathy L. MOSER; Eric J. TOPOL

    2004-01-01

    The advent of DNA microarray technology has offered the promise of casting new insights onto deciphering secrets of life by monitoring activities of thousands of genes simultaneously. Current analyses of microarray data focus on precise classification of biological types, for example, tumor versus normal tissues. A further scientifically challenging task is to extract disease-relevant genes from the bewildering amounts of raw data, which is one of the most critical themes in the post-genomic era, but it is generally ignored due to lack of an efficient approach. In this paper, we present a novel ensemble method for gene extraction that can be tailored to fulfill multiple biological tasks including (i) precise classification of biological types; (ii) disease gene mining; and (iii) target-driven gene networking. We also give a numerical application for (i) and (ii) using a public microarray data set and set aside a separate paper to address (iii).

  13. Discovery of molecular associations among aging, stem cells, and cancer based on gene expression profiling

    Institute of Scientific and Technical Information of China (English)

    Xiaosheng Wang

    2013-01-01

    The emergence of a huge volume of "omics" data enables a computational approach to the investigation of the biology of cancer. The cancer informatics approach is a useful supplement to the traditional experimental approach. I reviewed several reports that used a bioinformatics approach to analyze the associations among aging, stem cells, and cancer by microarray gene expression profiling. The high expression of aging- or human embryonic stem cell-related molecules in cancer suggests that certain important mechanisms commonly underlie aging, stem cells, and cancer. These mechanisms are involved in cell cycle regulation, metabolic process, DNA damage response, apoptosis, p53 signaling pathway, immune/inflammatory response, and other processes, suggesting that cancer is a developmental and evolutional disease that is strongly related to aging. Moreover, these mechanisms demonstrate that the initiation, proliferation, and metastasis of cancer are associated with the deregulation of stem cells. These findings provide insights into the biology of cancer. Certainly, the findings that are obtained by the informatics approach should be justified by experimental validation. This review also noted that next-generation sequencing data provide enriched sources for cancer informatics study.

  14. Gene discovery in Triatoma infestans

    Directory of Open Access Journals (Sweden)

    de Burgos Nelia

    2011-03-01

    Full Text Available. Abstract. Background: Triatoma infestans is the most relevant vector of Chagas disease in the southern cone of South America. Since its genome has not yet been studied, sequencing of Expressed Sequence Tags (ESTs) is one of the most powerful tools for efficiently identifying large numbers of expressed genes in this insect vector. Results: In this work, we generated 826 ESTs, resulting in an increase of 47% in the number of ESTs available for T. infestans. These ESTs were assembled into 471 unique sequences, 151 of which represent 136 new genes for the Reduviidae family. Conclusions: Among the putative new genes for the Reduviidae family, we identified and described an interesting subset of genes involved in development and reproduction, which constitute potential targets for insecticide development.

  15. Gene discovery and molecular marker development, based on high-throughput transcript sequencing of Paspalum dilatatum Poir.

    Directory of Open Access Journals (Sweden)

    Andrea Giordano

    Full Text Available. BACKGROUND: Paspalum dilatatum Poir. (common name dallisgrass) is a native grass species of South America, with special relevance to dairy and red meat production. P. dilatatum exhibits higher forage quality than other C4 forage grasses and is tolerant to frost and water stress. This species is predominantly cultivated in an apomictic monoculture, with an inherent high risk that biotic and abiotic stresses could potentially devastate productivity. Therefore, advanced breeding strategies that characterise and use available genetic diversity, or assess germplasm collections effectively, are required to deliver advanced cultivars for production systems. However, there are limited genomic resources available for this forage grass species. RESULTS: Transcriptome sequencing using second-generation sequencing platforms has been employed using pooled RNA from different tissues (stems, roots, leaves and inflorescences) at the final reproductive stage of P. dilatatum cultivar Primo. A total of 324,695 sequence reads were obtained, corresponding to c. 102 Mbp. The sequences were assembled, generating 20,169 contigs of a combined length of 9,336,138 nucleotides. The contigs were BLAST analysed against the fully sequenced grass species of Oryza sativa subsp. japonica, Brachypodium distachyon, the closely related Sorghum bicolor and foxtail millet (Setaria italica) genomes as well as against the UniRef 90 protein database, allowing a comprehensive gene ontology analysis to be performed. The contigs generated from the transcript sequencing were also analysed for the presence of simple sequence repeats (SSRs). A total of 2,339 SSR motifs were identified within 1,989 contigs and corresponding primer pairs were designed. Empirical validation of a cohort of 96 SSRs was performed, with 34% being polymorphic between sexual and apomictic biotypes. CONCLUSIONS: The development of genetic and genomic resources for P. dilatatum will contribute to gene discovery and expression
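
    Scanning contigs for simple sequence repeats, as described in this abstract, can be sketched with a regular expression. The abstract does not name the tool or thresholds that were used, so the unit lengths (2 to 6 nt), the minimum of four tandem copies, the function name `find_ssrs` and the example contig are all assumptions for illustration.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, copies) for tandem repeats of short motifs in a DNA string."""
    # A lazily matched unit of min_unit..max_unit bases, followed by at least
    # (min_repeats - 1) further copies of exactly the same unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Hypothetical contig containing an (AC)5 and an (AGG)4 repeat.
contig = "TTG" + "AC" * 5 + "GGTC" + "AGG" * 4 + "T"
ssrs = find_ssrs(contig)
print(ssrs)
```

    Because the shortest unit is tried first, a long mononucleotide run would be reported as a dinucleotide repeat such as (AA)n; a production tool would normally filter or handle homopolymers separately.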

  16. STARNET 2: a web-based tool for accelerating discovery of gene regulatory networks using microarray co-expression data

    OpenAIRE

    Jupiter, Daniel; Chen, Hailin; VanBuren, Vincent

    2009-01-01

    Background Although expression microarrays have become a standard tool used by biologists, analysis of data produced by microarray experiments may still present challenges. Comparison of data from different platforms, organisms, and labs may involve complicated data processing, and inferring relationships between genes remains difficult. Results STARNET 2 is a new web-based tool that allows post hoc visual analysis of correlations that are derived from expression microarray data. STARNET 2 fa...

  17. Functional gene-based discovery of phenazines from the actinobacteria associated with marine sponges in the South China Sea.

    Science.gov (United States)

    Karuppiah, Valliappan; Li, Yingxin; Sun, Wei; Feng, Guofang; Li, Zhiyong

    2015-07-01

    Phenazines represent a large group of nitrogen-containing heterocyclic compounds produced by a diverse group of bacteria including actinobacteria. In this study, a total of 197 actinobacterial strains were isolated from seven different marine sponge species in the South China Sea using five different culture media. Eighty-seven morphologically different actinobacterial strains were selected and grouped into 13 genera, including Actinoalloteichus, Kocuria, Micrococcus, Micromonospora, Mycobacterium, Nocardiopsis, Prauserella, Rhodococcus, Saccharopolyspora, Salinispora, Serinicoccus, and Streptomyces by phylogenetic analysis of the 16S rRNA gene. Based on the screening of phzE genes, ten strains, including five Streptomyces, two Nocardiopsis, one Salinispora, one Micrococcus, and one Serinicoccus, were found to have potential for phenazine production. The phzE gene was highly expressed in Nocardiopsis sp. 13-33-15, 13-12-13, and Serinicoccus sp. 13-12-4 on the fifth day of fermentation. Finally, 1,6-dihydroxy phenazine (1) from Nocardiopsis sp. 13-33-15 and 13-12-13, and 1,6-dimethoxy phenazine (2) from Nocardiopsis sp. 13-33-15 were isolated and identified successfully based on ESI-MS and NMR analysis. Compounds 1 and 2 showed antibacterial activity against Bacillus mycoides SJ14, Staphylococcus aureus SJ51, Escherichia coli SJ42, and Micrococcus luteus SJ47. This study suggests that the integrated approach of gene screening and chemical analysis is an effective strategy to find the target compounds and lays the basis for the production of phenazine from the sponge-associated actinobacteria. PMID:25820602

  18. The Genetics of Obsessive-Compulsive Disorder and Tourette Syndrome: An Epidemiological and Pathway-Based Approach for Gene Discovery

    Science.gov (United States)

    Grados, Marco A.

    2010-01-01

    Objective: To provide a contemporary perspective on genetic discovery methods applied to obsessive-compulsive disorder (OCD) and Tourette syndrome (TS). Method: A review of research trends in genetics research in OCD and TS is conducted, with emphasis on novel approaches. Results: Genome-wide association studies (GWAS) are now in progress in OCD…

  19. Maximizing biomarker discovery by minimizing gene signatures

    Directory of Open Access Journals (Sweden)

    Chang Chang

    2011-12-01

    Full Text Available. Abstract. Background: The use of gene signatures can potentially be of considerable value in the field of clinical diagnosis. However, gene signatures defined with different methods can be quite different even when applied to the same disease and the same endpoint. Previous studies have shown that the correct selection of subsets of genes from microarray data is key for the accurate classification of disease phenotypes, and a number of methods have been proposed for the purpose. However, these methods refine the subsets by only considering each single feature, and they do not confirm the association between the genes identified in each gene signature and the phenotype of the disease. We proposed an innovative new method termed Minimize Feature's Size (MFS), based on multiple level similarity analyses and association between the genes and disease for breast cancer endpoints, by comparing classifier models generated from the second phase of MicroArray Quality Control (MAQC-II), with the aim of developing effective meta-analysis strategies to transform the MAQC-II signatures into a robust and reliable set of biomarkers for clinical applications. Results: We analyzed the similarity of the multiple gene signatures in an endpoint and between the two endpoints of breast cancer at probe and gene levels; the results indicate that disease-related genes can be preferably selected as the components of a gene signature, and that the gene signatures for the two endpoints could be interchangeable. The minimized signatures were built at probe level by using MFS for each endpoint. By applying the approach, we generated a much smaller set of gene signatures with similar predictive power compared with the gene signatures from MAQC-II. Conclusions: Our results indicate that gene signatures of both large and small sizes could perform equally well in clinical applications. Besides, consistency and biological significances can be detected among different gene signatures, reflecting the

  20. Mitigating false-positive associations in rare disease gene discovery.

    Science.gov (United States)

    Akle, Sebastian; Chun, Sung; Jordan, Daniel M; Cassa, Christopher A

    2015-10-01

    Clinical sequencing is expanding, but causal variants are still not identified in the majority of cases. These unsolved cases can aid in gene discovery when individuals with similar phenotypes are identified in systems such as the Matchmaker Exchange. We describe risks for gene discovery in this growing set of unsolved cases. In a set of rare disease cases with the same phenotype, it is not difficult to find two individuals with the same phenotype that carry variants in the same gene. We quantify the risk of false-positive association in a cohort of individuals with the same phenotype, using the prior probability of observing a variant in each gene from over 60,000 individuals (Exome Aggregation Consortium). Based on the number of individuals with a genic variant, cohort size, specific gene, and mode of inheritance, we calculate a P value that the match represents a true association. A match in two of 10 patients in MECP2 is statistically significant (P = 0.0014), whereas a match in TTN would not reach significance, as expected (P > 0.999). Finally, we analyze the probability of matching in clinical exome cases to estimate the number of cases needed to identify genes related to different disorders. We offer Rare Disease Match, an online tool to mitigate the uncertainty of false-positive associations. PMID:26378430
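
    The intuition behind this abstract (a match in a gene that rarely carries variants is more informative than a match in a gene that often does) can be sketched with a binomial tail probability. This is a simplification and not the Rare Disease Match model, which per the abstract also accounts for the specific gene and mode of inheritance. The per-individual probabilities below are invented for illustration and are not taken from the Exome Aggregation Consortium.

```python
from math import comb


def prob_at_least_k(n, k, p):
    """Chance that at least k of n unrelated individuals carry a qualifying variant
    in a gene, if each does so independently with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical per-individual probabilities of a qualifying variant.
p_constrained_gene = 0.001   # a gene that rarely carries such variants
p_tolerant_gene = 0.5        # a very large, variant-tolerant gene

chance_constrained = prob_at_least_k(10, 2, p_constrained_gene)
chance_tolerant = prob_at_least_k(10, 2, p_tolerant_gene)
print(f"{chance_constrained:.2e} {chance_tolerant:.3f}")
```

    With these made-up inputs, two matching patients out of ten would be very unlikely by chance for the constrained gene and almost certain for the tolerant one, which mirrors the contrast the abstract draws between MECP2 and TTN.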

  1. Species-independent MicroRNA Gene Discovery

    KAUST Repository

    Kamanu, Timothy K.

    2012-12-01

    MicroRNA (miRNA) are a class of small endogenous non-coding RNA that are mainly negative transcriptional and post-transcriptional regulators in both plants and animals. Recent studies have shown that miRNA are involved in different types of cancer and other incurable diseases such as autism and Alzheimer’s. Functional miRNAs are excised from hairpin-like sequences that are known as miRNA genes. There are about 21,000 known miRNA genes, most of which have been determined using experimental methods. miRNA genes are classified into different groups (miRNA families). This study reports about 19,000 unknown miRNA genes in nine species whereby approximately 15,300 predictions were computationally validated to contain at least one experimentally verified functional miRNA product. The predictions are based on a novel computational strategy which relies on miRNA family groupings and exploits the physics and geometry of miRNA genes to unveil the hidden palindromic signals and symmetries in miRNA gene sequences. Unlike conventional computational miRNA gene discovery methods, the algorithm developed here is species-independent: it allows prediction at higher accuracy and resolution from arbitrary RNA/DNA sequences in any species and thus enables examination of repeat-prone genomic regions which are thought to be non-informative or ’junk’ sequences. The information non-redundancy of uni-directional RNA sequences compared to information redundancy of bi-directional DNA is demonstrated, a fact that is overlooked by most pattern discovery algorithms. A novel method for computing upstream and downstream miRNA gene boundaries based on mathematical/statistical functions is suggested, as well as cutoffs for annotation of miRNA genes in different miRNA families. Another tool is proposed to allow hypotheses generation and visualization of data matrices, intra- and inter-species chromosomal distribution of miRNA genes or miRNA families. Our results indicate that: miRNA and mi

  2. Automated discovery of functional generality of human gene expression programs.

    Directory of Open Access Journals (Sweden)

    Georg K Gerber

    2007-08-01

    Full Text Available An important research problem in computational biology is the identification of expression programs, sets of co-expressed genes orchestrating normal or pathological processes, and the characterization of the functional breadth of these programs. The use of human expression data compendia for discovery of such programs presents several challenges including cellular inhomogeneity within samples, genetic and environmental variation across samples, uncertainty in the numbers of programs and sample populations, and temporal behavior. We developed GeneProgram, a new unsupervised computational framework based on Hierarchical Dirichlet Processes that addresses each of the above challenges. GeneProgram uses expression data to simultaneously organize tissues into groups and genes into overlapping programs with consistent temporal behavior, to produce maps of expression programs, which are sorted by generality scores that exploit the automatically learned groupings. Using synthetic and real gene expression data, we showed that GeneProgram outperformed several popular expression analysis methods. We applied GeneProgram to a compendium of 62 short time-series gene expression datasets exploring the responses of human cells to infectious agents and immune-modulating molecules. GeneProgram produced a map of 104 expression programs, a substantial number of which were significantly enriched for genes involved in key signaling pathways and/or bound by NF-kappaB transcription factors in genome-wide experiments. Further, GeneProgram discovered expression programs that appear to implicate surprising signaling pathways or receptor types in the response to infection, including Wnt signaling and neurotransmitter receptors. We believe the discovered map of expression programs involved in the response to infection will be useful for guiding future biological experiments; genes from programs with low generality scores might serve as new drug targets that exhibit minimal

  3. The von Hippel-Lindau Gene: Turning Discovery Into Therapy

    OpenAIRE

    Clark, Peter E.; Cookson, Michael S.

    2008-01-01

    Mutations or aberrations of the von Hippel-Lindau gene are responsible for the hereditary neoplastic syndrome that bears the same name, as well as for the majority of sporadic clear cell renal cell carcinomas. The discovery of this gene and subsequent clarification of its mechanism of action have led to a series of targeted treatments for advanced kidney cancer and have dramatically changed how we manage this disease. The discovery of the VHL gene is a prime example of how discoveries at the ...

  4. Characterization of Capsicum annuum genetic diversity and population structure based on parallel polymorphism discovery with a 30K unigene Pepper GeneChip.

    Directory of Open Access Journals (Sweden)

    Theresa A Hill

    Full Text Available The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines; however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm which was confirmed with statistical support by principal components analysis (PCA) and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and

  5. Characterization of Capsicum annuum genetic diversity and population structure based on parallel polymorphism discovery with a 30K unigene Pepper GeneChip.

    Science.gov (United States)

    Hill, Theresa A; Ashrafi, Hamid; Reyes-Chin-Wo, Sebastian; Yao, JiQiang; Stoffel, Kevin; Truco, Maria-Jose; Kozik, Alexander; Michelmore, Richard W; Van Deynze, Allen

    2013-01-01

    The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines; however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm which was confirmed with statistical support by principal components analysis (PCA) and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and application of genome

  6. DNA Coding Based Knowledge Discovery Algorithm

    Institute of Scientific and Technical Information of China (English)

    LI Ji-yun; GENG Zhao-feng; SHAO Shi-huang

    2002-01-01

    A novel DNA coding based knowledge discovery algorithm was proposed, and an example verifying its validity was given. It is shown that this algorithm can efficiently discover new simplified rules from the original rule set.

  7. Bioinformatics Assisted Gene Discovery and Annotation of Human Genome

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    As the sequencing stage of the human genome project nears its end, work has begun on discovering novel genes from genome sequences and annotating their biological functions. This article reviews the major bioinformatics tools and technologies currently available for large-scale gene discovery and annotation from human genome sequences. Some ideas about possible future development are also provided.

  8. Sugarcane Functional Genomics: Gene Discovery for Agronomic Trait Development

    Directory of Open Access Journals (Sweden)

    G. M. Souza

    2007-12-01

    Full Text Available Sugarcane is a highly productive crop used for centuries as the main source of sugar and recently to produce ethanol, a renewable bio-fuel energy source. There is increased interest in this crop due to the impending need to decrease fossil fuel usage. Sugarcane has a highly polyploid genome. Expressed sequence tag (EST) sequencing has significantly contributed to gene discovery and expression studies used to associate function with sugarcane genes. A significant amount of data exists on regulatory events controlling responses to herbivory, drought, and phosphate deficiency, which cause important constraints on yield and on endophytic bacteria, which are highly beneficial. The means to reduce drought, phosphate deficiency, and herbivory by the sugarcane borer have a negative impact on the environment. Improved tolerance for these constraints is being sought. Sugarcane's ability to accumulate sucrose up to 16% of its culm dry weight is a challenge for genetic manipulation. Genome-based technology such as cDNA microarray data indicates genes associated with sugar content that may be used to develop new varieties improved for sucrose content or for traits that restrict the expansion of the cultivated land. The genes can also be used as molecular markers of agronomic traits in traditional breeding programs.

  9. Indexer Based Dynamic Web Services Discovery

    CERN Document Server

    Bashir, Saba; Javed, M Younus; Khan, Aihab; Khiyal, Malik Sikandar Hayat

    2010-01-01

    Recent advancement in web services plays an important role in business to business and business to consumer interaction. Discovery mechanism is not only used to find a suitable service but also provides collaboration between service providers and consumers by using standard protocols. A static web service discovery mechanism is not only time consuming but requires continuous human interaction. This paper proposed an efficient dynamic web services discovery mechanism that can locate relevant and updated web services from service registries and repositories with timestamp based on indexing value and categorization for faster and efficient discovery of service. The proposed prototype focuses on quality of service issues and introduces concept of local cache, categorization of services, indexing mechanism, CSP (Constraint Satisfaction Problem) solver, aging and usage of translator. Performance of proposed framework is evaluated by implementing the algorithm and correctness of our method is shown. The results of p...

  10. Sugarcane Functional Genomics: Gene Discovery for Agronomic Trait Development

    OpenAIRE

    G. M. Souza; M.-A. Van-Sluys; Vincentz, M.; Silva-Filho, M. C.; Menossi, M.

    2007-01-01

    Sugarcane is a highly productive crop used for centuries as the main source of sugar and recently to produce ethanol, a renewable bio-fuel energy source. There is increased interest in this crop due to the impending need to decrease fossil fuel usage. Sugarcane has a highly polyploid genome. Expressed sequence tag (EST) sequencing has significantly contributed to gene discovery and expression studies used to associate function with sugarcane genes. A significant amount of data exists on regul...

  11. Novel venom gene discovery in the platypus

    OpenAIRE

    Mitreva, Makedonka; Papenfuss, Antony T.; Whittington, Camilla M; Locke, Devin P.; Mardis, Elaine; Wilson, Richard K.; Abubucker, Sahar; Wong, Emily Sw; Hsu, Artur; Kuchei, Philip W.; Belov, Katherine; Warren, Wesley

    2010-01-01

    Background: To date, few peptides in the complex mixture of platypus venom have been identified and sequenced, in part due to the limited amounts of platypus venom available to study. We have constructed and sequenced a cDNA library from an active platypus venom gland to identify the remaining components. Results: We identified 83 novel putative platypus venom genes from 13 toxin families, which are homologous to known toxins from a wide range of vertebrates (fish, reptiles, insectivores)...

  12. Bioinformatics and the discovery of gene function

    OpenAIRE

    Casari, G; Daruvar, Dea; Sander, C.; Schneider, Reinhard

    1996-01-01

    Scientific history was made in completing the yeast genome sequence, yet its 13 Mb are a mere starting point. Two challenges loom large: to decipher the function of all genes and to describe the workings of the eukaryotic cell in full molecular detail. A combination of experimental and theoretical approaches will be brought to bear on these challenges. What will be next in yeast genome analysis from the point of view of bioinformatics?

  13. Adaptation Knowledge Discovery from a Case Base

    OpenAIRE

    D'Aquin, Mathieu; Badra, Fadi; Lafrogne, Sandrine; Lieber, Jean; Napoli, Amedeo; Szathmary, Laszlo

    2006-01-01

    In case-based reasoning, the adaptation step depends in general on domain-dependent knowledge, which motivates studies on adaptation knowledge acquisition (AKA). CABAMAKA is an AKA system based on principles of knowledge discovery from databases. This system explores the variations within the case base to elicit adaptation knowledge. It has been successfully tested in an application of case-based decision support to breast cancer treatment.

  14. Gene Prioritization for Imaging Genetics Studies Using Gene Ontology and a Stratified False Discovery Rate Approach

    Science.gov (United States)

    Patel, Sejal; Park, Min Tae M.; Chakravarty, M. Mallar; Knight, Jo

    2016-01-01

    Imaging genetics is an emerging field in which the association between genes and neuroimaging-based quantitative phenotypes is used to explore the functional role of genes in neuroanatomy and neurophysiology in the context of healthy function and neuropsychiatric disorders. The main obstacle for researchers in the field is the high dimensionality of the data in both the imaging phenotypes and the genetic variants commonly typed. In this article, we develop a novel method that utilizes Gene Ontology, an online database, to select and prioritize certain genes, employing a stratified false discovery rate (sFDR) approach to investigate their associations with imaging phenotypes. sFDR has the potential to increase power in genome wide association studies (GWAS), and is quickly gaining traction as a method for multiple testing correction. Our novel approach addresses the pressing need in genetic research to move beyond candidate gene studies without being overburdened by a loss of power due to multiple testing. As an example of our methodology, we perform a GWAS of hippocampal volume using both the Enhancing NeuroImaging Genetics through Meta-Analysis (ENIGMA2) and the Alzheimer's Disease Neuroimaging Initiative datasets. The analysis of ENIGMA2 data yielded a set of SNPs with sFDR values between 10 and 20%. Our approach demonstrates a potential method to prioritize genes based on biological systems impaired in a disease. PMID:27092072
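
    The idea behind a stratified false discovery rate can be sketched as follows, assuming the common formulation in which Benjamini-Hochberg q-values are computed separately inside each stratum (here, SNPs in prioritized genes versus all others). The abstract does not describe the authors' exact procedure, and the stratum labels and P values below are made up for illustration.

```python
from collections import defaultdict


def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted P values (q-values), in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so each q-value is monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals


def stratified_qvalues(pvals, strata):
    """Apply Benjamini-Hochberg within each stratum independently."""
    members = defaultdict(list)
    for i, label in enumerate(strata):
        members[label].append(i)
    qvals = [0.0] * len(pvals)
    for indices in members.values():
        stratum_q = bh_qvalues([pvals[i] for i in indices])
        for i, q in zip(indices, stratum_q):
            qvals[i] = q
    return qvals


# Hypothetical SNP P values; "prior" marks SNPs in prioritized genes.
pvals = [0.001, 0.5, 0.002, 0.9]
strata = ["prior", "other", "prior", "other"]
print(bh_qvalues(pvals))                  # one pooled correction
print(stratified_qvalues(pvals, strata))  # correction within each stratum
```

    In this toy input the prioritized SNPs receive smaller q-values under stratification than under the pooled correction, which is the source of the potential power gain when the prioritized stratum is enriched for true associations.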

  15. Beegle: from literature mining to disease-gene discovery.

    Science.gov (United States)

    ElShal, Sarah; Tranchevent, Léon-Charles; Sifrim, Alejandro; Ardeshirdavani, Amin; Davis, Jesse; Moreau, Yves

    2016-01-29

    Disease-gene identification is a challenging process that has multiple applications within functional genomics and personalized medicine. Typically, this process involves both finding genes known to be associated with the disease (through literature search) and carrying out preliminary experiments or screens (e.g. linkage or association studies, copy number analyses, expression profiling) to determine a set of promising candidates for experimental validation. This requires extensive time and monetary resources. We describe Beegle, an online search and discovery engine that attempts to simplify this process by automating the typical approaches. It starts by mining the literature to quickly extract a set of genes known to be linked with a given query, then it integrates the learning methodology of Endeavour (a gene prioritization tool) to train a genomic model and rank a set of candidate genes to generate novel hypotheses. In a realistic evaluation setup, Beegle has an average recall of 84% in the top 100 returned genes as a search engine, which improves the discovery engine by 12.6% in the top 5% prioritized genes. Beegle is publicly available at http://beegle.esat.kuleuven.be/. PMID:26384564
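
    The recall figure quoted for the top 100 returned genes is a recall-at-k measurement. A minimal sketch of that metric is given below; the gene identifiers are placeholders, and the function is not part of Beegle or Endeavour.

```python
def recall_at_k(ranked_genes, known_genes, k):
    """Fraction of known disease genes that appear in the top k of a ranking."""
    if not known_genes:
        return 0.0
    top_k = set(ranked_genes[:k])
    return len(top_k & set(known_genes)) / len(known_genes)


# Placeholder ranking and gold standard (hypothetical identifiers).
ranking = ["G1", "G2", "G3", "G4"]
gold = {"G2", "G4", "G9"}
print(recall_at_k(ranking, gold, k=2))
print(recall_at_k(ranking, gold, k=4))
```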

  16. Microarray Assisted Gene Discovery in Ulcerative Colitis

    DEFF Research Database (Denmark)

    Brusgaard, Klaus

    on the activation of different downstream pathways. Thus it seems that different genetic backgrounds can lead to similar clinical manifestations, and as well determines the susceptibility to IBD. In the previous micro array based expression studies on UC the main target has been to point to new...

  17. Ontology Based Qos Driven Web Service Discovery

    Directory of Open Access Journals (Sweden)

    R Suganyakala

    2011-07-01

    Full Text Available In today's scenario web services have become a grand vision to implement the business process functionalities. With the increase in the number of similar web services, one of the essential challenges is to discover the relevant web service with regard to user specification. Relevancy of web service discovery can be improved by augmenting semantics through expressive formats like OWL. QoS based service selection will play a significant role in meeting the non-functional user requirements. Hence QoS and semantics have been used as finer search constraints to discover the most relevant service. In this paper, we describe a QoS framework for ontology based web service discovery. The QoS factors taken into consideration are execution time, response time, throughput, scalability, reputation, accessibility and availability. The behavior of each web service at various instances is observed over a period of time and their QoS based performance is analyzed.
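
    One simple way to turn QoS factors like those listed above into a ranking is a weighted sum of min-max normalized metrics. The sketch below is an assumption for illustration only: the abstract does not say how the paper combines its QoS factors, and the service names, metric values, and weights are invented.

```python
# Metrics where a smaller value is better; all others are treated as larger-is-better.
LOWER_IS_BETTER = {"response_time", "execution_time"}


def rank_services(services, weights):
    """Rank service names by a weighted sum of min-max normalized QoS metrics."""
    names = list(services)
    scores = {name: 0.0 for name in names}
    for metric, weight in weights.items():
        values = [services[name][metric] for name in names]
        low, high = min(values), max(values)
        for name in names:
            if high == low:
                normalized = 0.5  # metric does not discriminate between services
            else:
                normalized = (services[name][metric] - low) / (high - low)
                if metric in LOWER_IS_BETTER:
                    normalized = 1.0 - normalized
            scores[name] += weight * normalized
    return sorted(names, key=lambda name: scores[name], reverse=True)


# Hypothetical candidate services and equal weights.
candidates = {
    "ServiceA": {"response_time": 100, "throughput": 50, "availability": 0.99},
    "ServiceB": {"response_time": 300, "throughput": 80, "availability": 0.95},
}
weights = {"response_time": 1.0, "throughput": 1.0, "availability": 1.0}
print(rank_services(candidates, weights))
```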

  18. The discovery of the microphthalmia locus and its gene, Mitf

    OpenAIRE

    Arnheiter, Heinz

    2010-01-01

    The history of the discovery of the microphthalmia locus and its gene, now called Mitf, is a testament to the triumph of serendipity. Although the first microphthalmia mutation was discovered among the descendants of a mouse that was irradiated for the purpose of mutagenesis, the mutation most likely was not radiation-induced but occurred spontaneously in one of the parents of a later breeding. Although Mitf might eventually have been identified by other molecular genetic techniques, it was f...

  19. Technology development for gene discovery and full-length sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Marcelo Bento Soares

    2004-07-19

    In previous years, with support from the U.S. Department of Energy, we developed methods for construction of normalized and subtracted cDNA libraries, and constructed hundreds of high-quality libraries for production of Expressed Sequence Tags (ESTs). Our clones were made widely available to the scientific community through the IMAGE Consortium, and millions of ESTs were produced from our libraries either by collaborators or by our own sequencing laboratory at the University of Iowa. During this grant period, we focused on (1) the development of a method for preferential cloning of tissue-specific and/or rare transcripts, (2) its utilization to expedite EST-based gene discovery for the NIH Mouse Brain Molecular Anatomy Project, (3) further development and optimization of a method for construction of full-length-enriched cDNA libraries, and (4) modification of a plasmid vector to maximize efficiency of full-length cDNA sequencing by the transposon-mediated approach. It is noteworthy that the technology developed for preferential cloning of rare mRNAs enabled identification of over 2,000 mouse transcripts differentially expressed in the hippocampus. In addition, the method that we optimized for construction of full-length-enriched cDNA libraries was successfully utilized for the production of approximately fifty libraries from the developing mouse nervous system, from which over 2,500 full-ORF-containing cDNAs have been identified and accurately sequenced in their entirety either by our group or by the NIH-Mammalian Gene Collection Program Sequencing Team.

  20. Does Discovery-Based Instruction Enhance Learning?

    OpenAIRE

    Alfieri, L.; Brooks, PJ; Aldrich, NJ; Tenenbaum, HR

    2011-01-01

    Discovery learning approaches to education have recently come under scrutiny (Tobias & Duffy, 2009), with many studies indicating limitations to discovery learning practices. Therefore, 2 meta-analyses were conducted using a sample of 164 studies: The 1st examined the effects of unassisted discovery learning versus explicit instruction, and the 2nd examined the effects of enhanced and/or assisted discovery versus other types of instruction (e.g., explicit, unassisted discovery). Random effect...

  1. Analyzing Interaction of μ-, δ- and κ-opioid Receptor Gene Variants on Alcohol or Drug Dependence Using a Pattern Discovery-based Method

    OpenAIRE

    Li, Zhong; Zhang, Huiping

    2013-01-01

    Background Polymorphisms in the μ-, δ- and κ-opioid receptor genes (OPRM1, OPRD1 and OPRK1) have been reported to be associated with substance (alcohol or drug) dependence. The influence of an individual gene on a disease trait should be more evident when analyzed in the context of gene-gene interactions. Thus, we assessed the joint effect of variants in these three opioid receptor genes on alcohol, cocaine, or opioid dependence. Methods Genotype data for 13 OPRM1 Single Nucleotide Polymorphi...

  2. Does Discovery-Based Instruction Enhance Learning?

    Science.gov (United States)

    Alfieri, Louis; Brooks, Patricia J.; Aldrich, Naomi J.; Tenenbaum, Harriet R.

    2011-01-01

    Discovery learning approaches to education have recently come under scrutiny (Tobias & Duffy, 2009), with many studies indicating limitations to discovery learning practices. Therefore, 2 meta-analyses were conducted using a sample of 164 studies: The 1st examined the effects of unassisted discovery learning versus explicit instruction, and the…

  3. Psychiatric gene discoveries shape evidence on ADHD's biology

    Science.gov (United States)

    Thapar, A; Martin, J; Mick, E; Arias Vásquez, A; Langley, K; Scherer, S W; Schachar, R; Crosbie, J; Williams, N; Franke, B; Elia, J; Glessner, J; Hakonarson, H; Owen, M J; Faraone, S V; O'Donovan, M C; Holmans, P

    2016-01-01

    A strong motivation for undertaking psychiatric gene discovery studies is to provide novel insights into unknown biology. Although attention-deficit hyperactivity disorder (ADHD) is highly heritable, and large, rare copy number variants (CNVs) contribute to risk, little is known about its pathogenesis and it remains commonly misunderstood. We assembled and pooled five ADHD and control CNV data sets from the United Kingdom, Ireland, United States of America, Northern Europe and Canada. Our aim was to test for enrichment of neurodevelopmental gene sets, implicated by recent exome-sequencing studies of (a) schizophrenia and (b) autism as a means of testing the hypothesis that common pathogenic mechanisms underlie ADHD and these other neurodevelopmental disorders. We also undertook hypothesis-free testing of all biological pathways. We observed significant enrichment of individual genes previously found to harbour schizophrenia de novo non-synonymous single-nucleotide variants (SNVs; P=5.4 × 10^-4) and targets of the Fragile X mental retardation protein (P=0.0018). No enrichment was observed for activity-regulated cytoskeleton-associated protein (P=0.23) or N-methyl-D-aspartate receptor (P=0.74) post-synaptic signalling gene sets previously implicated in schizophrenia. Enrichment of ADHD CNV hits for genes impacted by autism de novo SNVs (P=0.019 for non-synonymous SNV genes) did not survive Bonferroni correction. Hypothesis-free testing yielded several highly significantly enriched biological pathways, including ion channel pathways. Enrichment findings were robust to multiple testing corrections and to sensitivity analyses that excluded the most significant sample. The findings reveal that CNVs in ADHD converge on biologically meaningful gene clusters, including ones now established as conferring risk of other neurodevelopmental disorders. PMID:26573769
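
    The Bonferroni step mentioned in this abstract can be sketched with the five P values it reports. The number of tests the authors actually corrected for is not stated in the abstract, so the count of five and the significance level of 0.05 used here are assumptions, as are the short labels given to each gene set.

```python
def bonferroni_survivors(pvals, alpha=0.05):
    """Map each test name to True if its P value passes the Bonferroni threshold."""
    threshold = alpha / len(pvals)  # family-wise level divided by number of tests
    return {name: p <= threshold for name, p in pvals.items()}


# P values quoted in the abstract; labels and the test count are assumptions.
reported = {
    "schizophrenia_de_novo_SNV_genes": 5.4e-4,
    "FMRP_targets": 0.0018,
    "ARC_complex": 0.23,
    "NMDAR_complex": 0.74,
    "autism_de_novo_SNV_genes": 0.019,
}
for name, survives in bonferroni_survivors(reported).items():
    print(f"{name}: {'survives' if survives else 'does not survive'}")
```

    With five assumed tests the threshold is 0.01, so the autism gene set (P=0.019) falls short while the first two gene sets pass, in line with the pattern the abstract describes.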

  4. Psychiatric gene discoveries shape evidence on ADHD's biology.

    Science.gov (United States)

    Thapar, A; Martin, J; Mick, E; Arias Vásquez, A; Langley, K; Scherer, S W; Schachar, R; Crosbie, J; Williams, N; Franke, B; Elia, J; Glessner, J; Hakonarson, H; Owen, M J; Faraone, S V; O'Donovan, M C; Holmans, P

    2016-09-01

    A strong motivation for undertaking psychiatric gene discovery studies is to provide novel insights into unknown biology. Although attention-deficit hyperactivity disorder (ADHD) is highly heritable, and large, rare copy number variants (CNVs) contribute to risk, little is known about its pathogenesis and it remains commonly misunderstood. We assembled and pooled five ADHD and control CNV data sets from the United Kingdom, Ireland, United States of America, Northern Europe and Canada. Our aim was to test for enrichment of neurodevelopmental gene sets, implicated by recent exome-sequencing studies of (a) schizophrenia and (b) autism as a means of testing the hypothesis that common pathogenic mechanisms underlie ADHD and these other neurodevelopmental disorders. We also undertook hypothesis-free testing of all biological pathways. We observed significant enrichment of individual genes previously found to harbour schizophrenia de novo non-synonymous single-nucleotide variants (SNVs; P=5.4 × 10^-4) and targets of the Fragile X mental retardation protein (P=0.0018). No enrichment was observed for activity-regulated cytoskeleton-associated protein (P=0.23) or N-methyl-D-aspartate receptor (P=0.74) post-synaptic signalling gene sets previously implicated in schizophrenia. Enrichment of ADHD CNV hits for genes impacted by autism de novo SNVs (P=0.019 for non-synonymous SNV genes) did not survive Bonferroni correction. Hypothesis-free testing yielded several highly significantly enriched biological pathways, including ion channel pathways. Enrichment findings were robust to multiple testing corrections and to sensitivity analyses that excluded the most significant sample. The findings reveal that CNVs in ADHD converge on biologically meaningful gene clusters, including ones now established as conferring risk of other neurodevelopmental disorders. PMID:26573769

  5. Genome Enabled Discovery of Carbon Sequestration Genes in Poplar

    Energy Technology Data Exchange (ETDEWEB)

    Filichkin, Sergei; Etherington, Elizabeth; Ma, Caiping; Strauss, Steve

    2007-02-22

    The goals of the S.H. Strauss laboratory portion of 'Genome-enabled discovery of carbon sequestration genes in poplar' are (1) to explore the functions of candidate genes using Populus transformation by inserting genes provided by Oak Ridge National Laboratory (ORNL) and the University of Florida (UF) into poplar; (2) to expand the poplar transformation toolkit by developing transformation methods for important genotypes; and (3) to allow induced expression, and efficient gene suppression, in roots and other tissues. As part of the transformation improvement effort, OSU developed transformation protocols for Populus trichocarpa 'Nisqually-1' clone and an early flowering P. alba clone, 6K10. Complete descriptions of the transformation systems were published (Ma et al. 2004, Meilan et al. 2004). Twenty-one 'Nisqually-1' and 622 6K10 transgenic plants were generated. To identify root predominant promoters, a set of three promoters were tested for their tissue-specific expression patterns in poplar and in Arabidopsis as a model system. A novel gene, ET304, was identified by analyzing a collection of poplar enhancer trap lines generated at OSU (Filichkin et al. 2006a, 2006b). Other promoters include the pGgMT1 root-predominant promoter from Casuarina glauca and the pAtPIN2 promoter from Arabidopsis root specific PIN2 gene. OSU tested two induction systems, alcohol- and estrogen-inducible, in multiple poplar transgenics. Ethanol proved to be the more efficient when tested in tissue culture and greenhouse conditions. Two estrogen-inducible systems were evaluated in transgenic Populus, neither of which functioned reliably in tissue culture conditions. GATEWAY-compatible plant binary vectors were designed to compare the silencing efficiency of homologous (direct) RNAi vs. heterologous (transitive) RNAi inverted repeats. A set of genes was targeted for post transcriptional silencing in the model Arabidopsis system; these include the floral

  6. Database systems for knowledge-based discovery.

    Science.gov (United States)

    Jagarlapudi, Sarma A R P; Kishan, K V Radha

    2009-01-01

    Several database systems have been developed to provide valuable information from the bench chemist to biologist, medical practitioner to pharmaceutical scientist in a structured format. The advent of information technology and computational power enhanced the ability to access large volumes of data in the form of a database where one could do compilation, searching, archiving, analysis, and finally knowledge derivation. Although data are of variable types, the tools used for database creation, searching and retrieval are similar. GVK BIO has been developing databases from publicly available scientific literature in specific areas like medicinal chemistry, clinical research, and mechanism-based toxicity so that the structured databases containing vast data could be used in several areas of research. These databases were classified as reference centric or compound centric depending on the way the database systems were designed. Integration of these databases with knowledge derivation tools would enhance the value of these systems toward better drug design and discovery. PMID:19727614

  7. Indexer Based Dynamic Web Services Discovery

    OpenAIRE

    Saba Bashir; Farhan Hassan Khan; M. Younus Javed; Aihab Khan; Malik Sikandar Hayat Khiyal

    2010-01-01

    Recent advancement in web services plays an important role in business to business and business to consumer interaction. Discovery mechanism is not only used to find a suitable service but also provides collaboration between service providers and consumers by using standard protocols. A static web service discovery mechanism is not only time consuming but requires continuous human interaction. This paper proposed an efficient dynamic web services discovery mechanism that can locate relevant a...

  8. The Matchmaker Exchange: a platform for rare disease gene discovery.

    Science.gov (United States)

    Philippakis, Anthony A; Azzariti, Danielle R; Beltran, Sergi; Brookes, Anthony J; Brownstein, Catherine A; Brudno, Michael; Brunner, Han G; Buske, Orion J; Carey, Knox; Doll, Cassie; Dumitriu, Sergiu; Dyke, Stephanie O M; den Dunnen, Johan T; Firth, Helen V; Gibbs, Richard A; Girdea, Marta; Gonzalez, Michael; Haendel, Melissa A; Hamosh, Ada; Holm, Ingrid A; Huang, Lijia; Hurles, Matthew E; Hutton, Ben; Krier, Joel B; Misyura, Andriy; Mungall, Christopher J; Paschall, Justin; Paten, Benedict; Robinson, Peter N; Schiettecatte, François; Sobreira, Nara L; Swaminathan, Ganesh J; Taschner, Peter E; Terry, Sharon F; Washington, Nicole L; Züchner, Stephan; Boycott, Kym M; Rehm, Heidi L

    2015-10-01

    There are few better examples of the need for data sharing than in the rare disease community, where patients, physicians, and researchers must search for "the needle in a haystack" to uncover rare, novel causes of disease within the genome. Impeding the pace of discovery has been the existence of many small siloed datasets within individual research or clinical laboratory databases and/or disease-specific organizations, hoping for serendipitous occasions when two distant investigators happen to learn they have a rare phenotype in common and can "match" these cases to build evidence for causality. However, serendipity has never proven to be a reliable or scalable approach in science. As such, the Matchmaker Exchange (MME) was launched to provide a robust and systematic approach to rare disease gene discovery through the creation of a federated network connecting databases of genotypes and rare phenotypes using a common application programming interface (API). The core building blocks of the MME have been defined and assembled. Three MME services have now been connected through the API and are available for community use. Additional databases that support internal matching are anticipated to join the MME network as it continues to grow. PMID:26295439

  9. Amyotrophic Lateral Sclerosis: An Emerging Era of Collaborative Gene Discovery

    Science.gov (United States)

    Gwinn, Katrina; Corriveau, Roderick A.; Mitsumoto, Hiroshi; Bednarz, Kate; Brown, Robert H.; Cudkowicz, Merit; Gordon, Paul H.; Hardy, John; Kasarskis, Edward J.; Kaufmann, Petra; Miller, Robert; Sorenson, Eric; Tandan, Rup; Traynor, Bryan J.; Nash, Josefina; Sherman, Alex; Mailman, Matthew D.; Ostell, James; Bruijn, Lucie; Cwik, Valerie; Rich, Stephen S.; Singleton, Andrew; Refolo, Larry; Andrews, Jaime; Zhang, Ran; Conwit, Robin; Keller, Margaret A.

    2007-01-01

    Amyotrophic lateral sclerosis (ALS) is the most common form of motor neuron disease (MND). It is currently incurable and treatment is largely limited to supportive care. Family history is associated with an increased risk of ALS, and many Mendelian causes have been discovered. However, most forms of the disease are not obviously familial. Recent advances in human genetics have enabled genome-wide analyses of single nucleotide polymorphisms (SNPs) that make it possible to study complex genetic contributions to human disease. Genome-wide SNP analyses require a large sample size and thus depend upon collaborative efforts to collect and manage the biological samples and corresponding data. Public availability of biological samples (such as DNA), phenotypic and genotypic data further enhances research endeavors. Here we discuss a large collaboration among academic investigators, government, and non-government organizations which has created a public repository of human DNA, immortalized cell lines, and clinical data to further gene discovery in ALS. This resource currently maintains samples and associated phenotypic data from 2332 MND subjects and 4692 controls. This resource should facilitate genetic discoveries which we anticipate will ultimately provide a better understanding of the biological mechanisms of neurodegeneration in ALS. PMID:18060051

  10. Allele discovery of ten candidate drought-response genes in Austrian oak using a systematically informatics approach based on 454 amplicon sequencing

    Directory of Open Access Journals (Sweden)

    Homolka Andreas

    2012-04-01

    Full Text Available Abstract Background Rising temperatures and reduced water availability as a result of predicted climate change will impose significant pressure on long-lived forest tree species. Discovering allelic variation present in drought related genes of two Austrian oak species can be the key to understanding mechanisms of natural selection and provide forestry with key tools to cope with future challenges. Results In the present study we have used Roche 454 sequencing and developed a bioinformatic pipeline to process multiplexed tagged amplicons in order to identify single nucleotide polymorphisms and allelic sequences of ten candidate genes related to drought/osmotic stress from pedunculate oak (Quercus robur) and sessile oak (Q. petraea) individuals. Out of these, eight genes of 336 oak individuals growing in Austria have been detected with a total number of 158 polymorphic sites. Allele numbers ranged from ten to 52 with observed heterozygosity ranging from 0.115 to 0.640. All loci deviated from Hardy-Weinberg equilibrium and linkage disequilibrium was found among six combinations of loci. Conclusions We have characterized 183 alleles of drought related genes from oak species and detected first evidence of natural selection. Besides the potential for marker development, we have created an expandable bioinformatic pipeline for the analysis of next generation sequencing data.
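
    The observed heterozygosity reported above can be compared with the value expected under Hardy-Weinberg equilibrium. The following is a minimal sketch of that calculation for a single locus, assuming diploid genotypes are available as allele pairs; the function name and the toy genotypes are hypothetical and are not taken from the authors' pipeline.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    Expected heterozygosity is 1 - sum(p_i^2) over allele frequencies p_i,
    i.e. the Hardy-Weinberg expectation (no small-sample correction).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("need at least one genotype")
    # Observed: fraction of individuals carrying two different alleles.
    observed = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies are counted over all 2n gene copies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n
    expected = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return observed, expected


# Hypothetical toy locus with two alleles, for illustration only.
toy_locus = [("A", "A"), ("A", "B"), ("B", "B"), ("A", "A")]
ho, he = heterozygosity(toy_locus)
```

    A large gap between the two values at a locus is what a formal Hardy-Weinberg test would then assess for significance.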

  11. Network-Based Protein Biomarker Discovery Platforms.

    Science.gov (United States)

    Kim, Minhyung; Hwang, Daehee

    2016-03-01

    The advances in mass spectrometry-based proteomics technologies have enabled the generation of global proteome data from tissue or body fluid samples collected from a broad spectrum of human diseases. Comparative proteomic analysis of global proteome data identifies and prioritizes the proteins showing altered abundances, called differentially expressed proteins (DEPs), in disease samples, compared to control samples. Protein biomarker candidates that can serve as indicators of disease states are then selected as key molecules among these proteins. Recently, it has been reported that cellular pathways can provide better indications of disease states than individual molecules, and that network analysis of the DEPs enables effective identification of cellular pathways altered in disease conditions and of key molecules representing the altered cellular pathways. Accordingly, a number of network-based approaches to identify disease-related pathways and representative molecules of such pathways have been developed. In this review, we summarize analytical platforms for network-based protein biomarker discovery and key components in the platforms. PMID:27103885

  12. Graph-Based Methods for Discovery Browsing with Semantic Predications

    DEFF Research Database (Denmark)

    Wilkowski, Bartlomiej; Fiszman, Marcelo; Miller, Christopher M; Hristovski, Dimitar; Arabandi, Sivaram; Rosemblat, Graciela; Rindflesch, Thomas C

    2011-01-01

    We present an extension to literature-based discovery that goes beyond making discoveries to a principled way of navigating through selected aspects of some biomedical domain. The method is a type of "discovery browsing" that guides the user through the research literature on a specified phenomen...... illustrated with depressive disorder and focuses on the interaction of inflammation, circadian phenomena, and the neurotransmitter norepinephrine. Insight provided may contribute to enhanced understanding of the pathophysiology, treatment, and prevention of this disorder....

  13. Graph-Based Methods for Discovery Browsing with Semantic Predications

    OpenAIRE

    Wilkowski, Bartlomiej; Fiszman, Marcelo; Miller, Christopher M.; Hristovski, Dimitar; Arabandi, Sivaram; Rosemblat, Graciela; Rindflesch, Thomas C.

    2011-01-01

    We present an extension to literature-based discovery that goes beyond making discoveries to a principled way of navigating through selected aspects of some biomedical domain. The method is a type of “discovery browsing” that guides the user through the research literature on a specified phenomenon. Poorly understood relationships may be explored through novel points of view, and potentially interesting relationships need not be known ahead of time. In a process of “cooperative reciprocity” t...

  14. Ontological Discovery Environment: a system for integrating gene-phenotype associations.

    Science.gov (United States)

    Baker, Erich J; Jay, Jeremy J; Philip, Vivek M; Zhang, Yun; Li, Zuopan; Kirova, Roumyana; Langston, Michael A; Chesler, Elissa J

    2009-12-01

    The wealth of genomic technologies has enabled biologists to rapidly ascribe phenotypic characters to biological substrates. Central to effective biological investigation is the operational definition of the process under investigation. We propose an elucidation of categories of biological characters, including disease relevant traits, based on natural endogenous processes and experimentally observed biological networks, pathways and systems rather than on externally manifested constructs and current semantics such as disease names and processes. The Ontological Discovery Environment (ODE) is an Internet accessible resource for the storage, sharing, retrieval and analysis of phenotype-centered genomic data sets across species and experimental model systems. Any type of data set representing gene-phenotype relationships, such as quantitative trait loci (QTL) positional candidates, literature reviews, microarray experiments, ontological or even meta-data, may serve as inputs. To demonstrate a use case leveraging the homology capabilities of ODE and its ability to synthesize diverse data sets, we conducted an analysis of genomic studies related to alcoholism. The core of ODE's gene set similarity, distance and hierarchical analysis is the creation of a bipartite network of gene-phenotype relations, a unique discrete graph approach to analysis that enables set-set matching of non-referential data. Gene sets are annotated with several levels of metadata, including community ontologies, while gene set translations compare models across species. Computationally derived gene sets are integrated into hierarchical trees based on gene-derived phenotype interdependencies. Automated set identifications are augmented by statistical tools which enable users to interpret the confidence of modeled results. This approach allows data integration and hypothesis discovery across multiple experimental contexts, regardless of the face similarity and semantic annotation of the experimental
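
    The bipartite gene-phenotype network and set-set matching described above can be illustrated with a small sketch. It assumes that each gene set is simply a named collection of gene identifiers and uses Jaccard similarity as the set-set measure; the abstract does not state which similarity measure ODE uses, so this is a stand-in, and all names and identifiers below are hypothetical.

```python
from itertools import combinations


def build_bipartite(gene_sets):
    """Map each gene to the labels of the gene sets (phenotypes) containing it.

    Together with gene_sets itself this gives both sides of a bipartite
    graph whose edges link a gene to a phenotype-centered gene set.
    """
    gene_to_sets = {}
    for label, genes in gene_sets.items():
        for gene in genes:
            gene_to_sets.setdefault(gene, set()).add(label)
    return gene_to_sets


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b|; defined as 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(gene_sets):
    """Jaccard similarity for every unordered pair of gene sets."""
    return {
        (x, y): jaccard(gene_sets[x], gene_sets[y])
        for x, y in combinations(sorted(gene_sets), 2)
    }


# Hypothetical inputs; placeholder gene identifiers, not real associations.
toy_sets = {
    "qtl_candidates": {"g1", "g2", "g3"},
    "microarray_hits": {"g2", "g3", "g4"},
    "literature": {"g4"},
}
edges = build_bipartite(toy_sets)
similarity = pairwise_similarity(toy_sets)
```

    Genes linked to several sets in `edges` are the ones that connect otherwise unrelated experiments, which is what makes matching across non-referential data possible.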

  15. Discovery of error-tolerant biclusters from noisy gene expression data

    Directory of Open Access Journals (Sweden)

    Gupta Rohit

    2011-11-01

    Full Text Available Abstract Background An important analysis performed on microarray gene-expression data is to discover biclusters, which denote groups of genes that are coherently expressed for a subset of conditions. Various biclustering algorithms have been proposed to find different types of biclusters from these real-valued gene-expression data sets. However, these algorithms suffer from several limitations such as inability to explicitly handle errors/noise in the data; difficulty in discovering small biclusters due to their top-down approach; inability of some of the approaches to find overlapping biclusters, which is crucial as many genes participate in multiple biological processes. Association pattern mining also produces biclusters as its result and can naturally address some of these limitations. However, traditional association mining only finds exact biclusters, which limits its applicability in real-life data sets where the biclusters may be fragmented due to random noise/errors. Moreover, as these methods only work with binary or boolean attributes, their application to gene-expression data requires transforming real-valued attributes to binary attributes, which often results in loss of information. Many past approaches have tried to address the issues of noise and of handling real-valued attributes independently, but there is no systematic approach that addresses both of these issues together. Results In this paper, we first propose a novel error-tolerant biclustering model, ‘ET-bicluster’, and then propose a bottom-up heuristic-based mining algorithm to sequentially discover error-tolerant biclusters directly from real-valued gene-expression data. The efficacy of our proposed approach is illustrated by comparing it with a recent approach RAP in the context of two biological problems: discovery of functional modules and discovery of biomarkers. For the first problem, two real-valued S. cerevisiae microarray gene-expression data sets are used to demonstrate
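
    The difference between exact and error-tolerant patterns in the binary setting described above can be sketched as follows. This is not the ET-bicluster model, which the abstract says works directly on real-valued data; it only illustrates, under the assumption of a 0/1 matrix and a per-gene error budget `eps` (a hypothetical parameter), why exact association mining fragments a noisy block while a tolerant criterion keeps it together.

```python
def row_error_rates(matrix, rows, cols):
    """Fraction of zeros in each selected row, restricted to the selected columns."""
    return [
        sum(1 for c in cols if matrix[r][c] == 0) / len(cols)
        for r in rows
    ]


def is_exact_bicluster(matrix, rows, cols):
    """Exact pattern: every selected gene is 'on' in every selected condition."""
    return all(rate == 0 for rate in row_error_rates(matrix, rows, cols))


def is_error_tolerant_bicluster(matrix, rows, cols, eps):
    """Tolerant pattern: each selected gene may miss at most a fraction eps of conditions."""
    return all(rate <= eps for rate in row_error_rates(matrix, rows, cols))


# Hypothetical binarized expression matrix (rows = genes, columns = conditions).
toy = [
    [1, 1, 1, 1],
    [1, 0, 1, 1],  # one noisy entry
    [1, 1, 1, 0],  # one noisy entry
    [0, 0, 1, 0],  # mostly off: should not join the block
]
all_conditions = [0, 1, 2, 3]
exact = is_exact_bicluster(toy, [0, 1, 2], all_conditions)
tolerant = is_error_tolerant_bicluster(toy, [0, 1, 2], all_conditions, eps=0.25)
too_loose = is_error_tolerant_bicluster(toy, [0, 1, 2, 3], all_conditions, eps=0.25)
```

    With this toy input the three-gene block fails the exact test because of two flipped entries, yet passes the tolerant test, while the fourth gene is still rejected.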

  16. Gene expression, single nucleotide variant and fusion transcript discovery in archival material from breast tumors.

    Directory of Open Access Journals (Sweden)

    Nadine Norton

    Full Text Available Advantages of RNA-Seq over array based platforms are quantitative gene expression and discovery of expressed single nucleotide variants (eSNVs) and fusion transcripts from a single platform, but the sensitivity for each of these characteristics is unknown. We measured gene expression in a set of manually degraded RNAs, nine pairs of matched fresh-frozen, and FFPE RNA isolated from breast tumor with the hybridization-based NanoString nCounter (226 gene panel) and with whole transcriptome RNA-Seq using RiboZeroGold ScriptSeq V2 library preparation kits. We performed correlation analyses of gene expression between samples and across platforms. We then specifically assessed whole transcriptome expression of lincRNA and discovery of eSNVs and fusion transcripts in the FFPE RNA-Seq data. For gene expression in the manually degraded samples, we observed Pearson correlations of >0.94 and >0.80 with NanoString and ScriptSeq protocols, respectively. Gene expression data for matched fresh-frozen and FFPE samples yielded mean Pearson correlations of 0.874 and 0.783 for NanoString (226 genes) and ScriptSeq whole transcriptome protocols respectively, p < 2 × 10^-16. Specifically for lincRNAs, we observed superb Pearson correlation (0.988) between matched fresh-frozen and FFPE pairs. FFPE samples across NanoString and RNA-Seq platforms gave a mean Pearson correlation of 0.838. In FFPE libraries, we detected 53.4% of high confidence SNVs and 24% of high confidence fusion transcripts. Sensitivity of fusion transcript detection was not overcome by an increase in depth of sequencing up to 3-fold (increase from ~56 to ~159 million reads). Both NanoString and ScriptSeq RNA-Seq technologies yield reliable gene expression data for degraded and FFPE material. The high degree of correlation between NanoString and RNA-Seq platforms suggests discovery based whole transcriptome studies from FFPE material will produce reliable expression data. The RiboZeroGold ScriptSeq protocol
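
    The correlation analyses above rest on the Pearson coefficient between paired expression measurements. A minimal sketch of that calculation follows, assuming two equal-length lists of per-gene expression values for one matched sample pair; the numbers are hypothetical placeholders, not data from the study, and any normalization or log transformation the authors applied is not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    if len(x) < 2:
        raise ValueError("need at least two paired values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of centered products; the common 1/(n-1) factor cancels out.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-gene expression values for one fresh-frozen/FFPE pair.
fresh_frozen = [1.0, 2.0, 3.0, 4.0]
ffpe = [1.1, 1.9, 3.2, 3.8]
r = pearson(fresh_frozen, ffpe)
```

    Averaging `r` over all matched pairs would give a mean correlation of the kind quoted in the abstract.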

  17. Gene discovery for the carcinogenic human liver fluke, Opisthorchis viverrini

    Directory of Open Access Journals (Sweden)

    Gasser Robin B

    2007-06-01

    Full Text Available Abstract Background Cholangiocarcinoma (CCA), cancer of the bile ducts, is associated with chronic infection with the liver fluke, Opisthorchis viverrini. Despite being the only eukaryote that is designated as a 'class I carcinogen' by the International Agency for Research on Cancer, little is known about its genome. Results Approximately 5,000 randomly selected cDNAs from the adult stage of O. viverrini were characterized and accounted for 1,932 contigs, representing ~14% of the entire transcriptome, and, presently, the largest sequence dataset for any species of liver fluke. Twenty percent of contigs were assigned GO classifications. Abundantly represented protein families included those involved in physiological functions that are essential to parasitism, such as anaerobic respiration, reproduction, detoxification, surface maintenance and feeding. GO assignments were well conserved in relation to other parasitic flukes; however, some categories were over-represented in O. viverrini, such as structural and motor proteins. An assessment of evolutionary relationships showed that O. viverrini was more similar to other parasitic (Clonorchis sinensis and Schistosoma japonicum) than to free-living (Schmidtea mediterranea) flatworms, and 105 sequences had close homologues in both parasitic species but not in S. mediterranea. A total of 164 O. viverrini contigs contained ORFs with signal sequences, many of which were platyhelminth-specific. Examples of convergent evolution between host and parasite secreted/membrane proteins were identified as were homologues of vaccine antigens from other helminths. Finally, ORFs representing secreted proteins with known roles in tumorigenesis were identified, and these might play roles in the pathogenesis of O. viverrini-induced CCA. Conclusion This gene discovery effort for O. viverrini should expedite molecular studies of cholangiocarcinogenesis and accelerate research focused on developing new interventions

  18. Literature-based knowledge discovery: the state of the art

    CERN Document Server

    Liu, Xiaoyong

    2012-01-01

    The literature-based knowledge discovery method was introduced by Dr. Swanson in 1986. He hypothesized a connection between Raynaud's phenomenon and dietary fish oil, and the field of literature-based discovery (LBD) was born from then on. During the subsequent two decades, LBD research attracted scientists from information science, computer science, biomedical science, and other fields. It has become a part of knowledge discovery and text mining. This paper summarizes the development of LBD in recent years in two parts, methodology research and applied research. Lastly, some problems are pointed out as future research directions.

  19. Data mining as a discovery tool for imprinted genes.

    Science.gov (United States)

    Brideau, Chelsea; Soloway, Paul

    2012-01-01

    This chapter serves as an introduction to the collection of genome-wide sequence and epigenomic data, as well as the use of these data in training generalized linear models (glm) to predict imprinted status. This is meant to be an introduction to the method, so only the most straightforward examples will be covered. For instance, the examples given below refer to 11 classes of genomic regions (the entire gene body, introns, exons, 5' UTR, 3' UTR, and 1, 10, and 100 kb upstream and downstream of each gene). One could also build models based on combinations of these regions. Likewise, models could be built on combinations of epigenetic features, or on combinations of both genomic regions and epigenetic features. This chapter relies heavily on computational methods, including basic programming. However, this chapter is not meant to be an introduction to programming. Throughout the chapter, the reader will be provided with example code in the Perl programming language. PMID:22907493

  20. SPARCoC: a new framework for molecular pattern discovery and cancer gene identification.

    Directory of Open Access Journals (Sweden)

    Shiqian Ma

    Full Text Available It is challenging to cluster cancer patients of a certain histopathological type into molecular subtypes of clinical importance and identify gene signatures directly relevant to the subtypes. Current clustering approaches have inherent limitations, which prevent them from gauging the subtle heterogeneity of the molecular subtypes. In this paper we present a new framework: SPARCoC (Sparse-CoClust), which is based on a novel Common-background and Sparse-foreground Decomposition (CSD) model and the Maximum Block Improvement (MBI) co-clustering technique. SPARCoC has clear advantages compared with widely-used alternative approaches: hierarchical clustering (Hclust) and nonnegative matrix factorization (NMF). We apply SPARCoC to the study of lung adenocarcinoma (ADCA), an extremely heterogeneous histological type, and a significant challenge for molecular subtyping. For testing and verification, we use high quality gene expression profiling data of lung ADCA patients, and identify prognostic gene signatures which could cluster patients into subgroups that are significantly different in their overall survival (with p-values < 0.05). Our results are only based on gene expression profiling data analysis, without incorporating any other feature selection or clinical information; we are able to replicate our findings with completely independent datasets. SPARCoC is broadly applicable to large-scale genomic data to empower pattern discovery and cancer gene identification.

  1. TILLING in forage grasses for gene discovery and breeding improvement.

    Science.gov (United States)

    Manzanares, Chloe; Yates, Steven; Ruckle, Michael; Nay, Michelle; Studer, Bruno

    2016-09-25

    Mutation breeding has a long-standing history and in some major crop species, many of the most important cultivars have their origin in germplasm generated by mutation induction. For almost two decades, methods for TILLING (Targeting Induced Local Lesions IN Genomes) have been established in model plant species such as Arabidopsis (Arabidopsis thaliana L.), enabling the functional analysis of genes. Recent advances in mutation detection by second generation sequencing technology have brought its utility to major crop species. However, it has remained difficult to apply similar approaches in forage and turf grasses, mainly due to their outbreeding nature maintained by an efficient self-incompatibility system. Starting with a description of the extent to which traditional mutagenesis methods have contributed to crop yield increase in the past, this review focuses on technological approaches to implement TILLING-based strategies for the improvement of forage grass breeding through forward and reverse genetics. We present first results from TILLING in allogamous forage grasses for traits such as stress tolerance and evaluate prospects for rapid implementation of beneficial alleles to forage grass breeding. In conclusion, large-scale induced mutation resources, used for forward genetic screens, constitute a valuable tool to increase the genetic diversity for breeding and can be generated with relatively small investments in forage grasses. Furthermore, large libraries of sequenced mutations can be readily established, providing enhanced opportunities to discover mutations in genes controlling traits of agricultural importance and to study gene functions by reverse genetics. PMID:26924175

  2. Resource Discovery in Activity-Based Sensor Networks

    DEFF Research Database (Denmark)

    Bucur, Doina; Bardram, Jakob

    This paper proposes a service discovery protocol for sensor networks that is specifically tailored for use in human-centered pervasive environments. It uses the high-level concept of computational activities (as logical bundles of data and resources) to give sensors in Activity-Based Sensor Networks...... (ABSNs) knowledge about their usage even at the network layer. ABSN redesigns classical network-level service discovery protocols to include and use this logical structuring of the network for a more practically applicable service discovery scheme. Noting that in practical settings activity-based sensor...... patches are localized, ABSN designs a completely distributed, hybrid discovery protocol which is proactive in a neighbourhood zone and reactive outside, tailored so that any query among the sensors of one activity is routed through the network with minimum overhead, guided by the bounds of that activity...

  3. Gene discovery using next-generation pyrosequencing to develop ESTs for Phalaenopsis orchids

    Directory of Open Access Journals (Sweden)

    Fu Chih-Hsiung

    2011-07-01

    Full Text Available Abstract Background Orchids are among the most diversified angiosperms, but few genomic resources are available for these non-model plants. In addition to its ecological significance, Phalaenopsis is economically important to the floriculture industry worldwide. We aimed to use massively parallel 454 pyrosequencing for a global characterization of the Phalaenopsis transcriptome. Results To maximize sequence diversity, we pooled RNA from 10 samples of different tissues, various developmental stages, and biotic- or abiotic-stressed plants. We obtained 206,960 expressed sequence tags (ESTs) with an average read length of 228 bp. These reads were assembled into 8,233 contigs and 34,630 singletons. The unigenes were searched against the NCBI non-redundant (NR) protein database. Based on sequence similarity with known proteins, these analyses identified 22,234 different genes (E-value cutoff, e-7). Assembled sequences were annotated with Gene Ontology, Gene Family and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Among these annotations, over 780 unigenes encoding putative transcription factors were identified. Conclusion Pyrosequencing was effective in identifying a large set of unigenes from Phalaenopsis. The informative EST dataset we developed constitutes a much-needed resource for discovery of genes involved in various biological processes in Phalaenopsis and other orchid species. These transcribed sequences will narrow the gap between study of model organisms with many genomic resources and species that are important for ecological and evolutionary studies.

  4. Toward the discovery of itemsets with significant variations in gene expression matrices

    OpenAIRE

    Kaytoue-Uberall, Mehdi; Duplessis, Sébastien; Napoli, Amedeo

    2008-01-01

    This paper presents new syntactic constraints for itemset mining in gene expression matrices. Biologists are interested in identifying gene expression profiles which present similar quantitative variation features. A two dimensional gene expression profile representation is introduced and adapted to itemset mining allowing to control gene expression. Syntactic constraints introduce expert knowledge at the beginning of the Knowledge Discovery in Databases process and are used to discover items...

  5. Discovery of signature genes in gastric cancer associated with prognosis.

    Science.gov (United States)

    Zhao, X; Cai, H; Wang, X; Ma, L

    2016-01-01

    Gene expression profiles of gastric cancer (GC) were analyzed with bioinformatics tools to identify signature genes associated with prognosis. Four gene expression data sets (accession number: GSE2685, GSE30727, GSE38932 and GSE26253) were downloaded from Gene Expression Omnibus. Differentially expressed genes (DEGs) were screened out using the significance analysis of microarrays (SAM) algorithm. P-value 1 were set as the threshold. A co-expression network was constructed for the GC-related genes with package WGCNA of R. Modules were disclosed with the WGCNA algorithm. Survival-related signature genes were screened out via COX single-variable regression. A total of 3210 GC-related genes were identified from the 3 data sets. Significantly enriched GO biological process terms included cell death, cell proliferation, apoptosis, response to hormone and phosphorylation. Pathways like viral carcinogenesis, metabolism, EBV viral infection, and PI3K-AKT signaling pathway were significantly over-represented in the DEGs. A gene co-expression network including 2414 genes was constructed, from which 7 modules were revealed. A total of 17 genes were identified as signature genes, such as DAB2, ALDH2, CD58, CITED2, BNIP3L, SLC43A2, FAU and COL5A1. Many signature genes associated with prognosis of GC were identified in the present study, some of which have been implicated in the pathogenesis of GC. These findings could not only improve the knowledge about GC, but also provide clues for clinical treatments. PMID:26774142

  6. Gene discovery in Trypanosoma vivax through GSS and comparative genomics

    International Nuclear Information System (INIS)

    Full text: Trypanosoma vivax is a hemoparasite affecting the livestock industry in South America and Africa. According to Seidl et al., more than 11 million cattle valued at more than 3 billion dollars are found in the Pantanal region of Brazil and other lowlands in Bolivia. According to the same authors, if the outbreak reported in Pocone-MT (Center-East of Brazil) had gone untreated, the estimated losses would have exceeded US$140,000 on the seven ranches, $200 million in the Pantanal and $700 million regionwide. Despite the high economic relevance of the disease caused by T. vivax, little research on its molecular characterization has been done compared with human trypanosomes such as T. brucei spp. and T. cruzi. The main reason is the difficulty of growing the parasite in laboratory rodents and in vitro. Very few (West African) strains have been adapted to laboratory rodents. Furthermore, most field isolates cannot be characterized by tools such as RAPD, since parasitemias are usually very low, which makes it difficult to separate parasites from animal blood for subsequent extraction of parasite DNA. These characteristics have limited the research on T. vivax during the last decades; consequently, very few markers have been described for its molecular characterization. A search in Genbank showed that there are only 22 entries for T. vivax compared with nearly 98289, 38577 and 23507 available for T. brucei, T. cruzi and Leishmania, respectively. T. vivax (molecular) biology is also little understood, even considering major differences such as mechanical transmission in South America and both cyclical and mechanical transmission in Africa. In a consultation with several experts on genomics, it was emphasized that T. vivax and T. congolense are underrepresented species in the molecular parasitology and genomics age and should therefore be considered for genome sequencing. In order to discover new markers to be explored in the molecular characterization of T. vivax, we decided to

  7. Discovery of mammalian genes that participate in virus infection

    Directory of Open Access Journals (Sweden)

    Sheng Jinsong

    2004-11-01

    Full Text Available Abstract Background Viruses are obligate intracellular parasites that rely upon the host cell for different steps in their life cycles. The characterization of cellular genes required for virus infection and/or cell killing will be essential for understanding viral life cycles, and may provide cellular targets for new antiviral therapies. Results Candidate genes required for lytic reovirus infection were identified by tagged sequence mutagenesis, a process that permits rapid identification of genes disrupted by gene entrapment. One hundred fifty-one reovirus resistant clones were selected from cell libraries containing 2 × 10^5 independently disrupted genes, of which 111 contained mutations in previously characterized genes and functionally anonymous transcription units. Collectively, the genes associated with reovirus resistance differed from genes targeted by random gene entrapment in that known mutational hot spots were underrepresented, and a number of mutations appeared to cluster around specific cellular processes, including: IGF-II expression/signalling, vesicular transport/cytoskeletal trafficking and apoptosis. Notably, several of the genes have been directly implicated in the replication of reovirus and other viruses at different steps in the viral lifecycle. Conclusions Tagged sequence mutagenesis provides a rapid, genome-wide strategy to identify candidate cellular genes required for virus infection. The candidate genes provide a starting point for mechanistic studies of cellular processes that participate in the virus lifecycle and may provide targets for novel anti-viral therapies.

  8. Computational method for discovery of estrogen responsive genes

    DEFF Research Database (Denmark)

    Tang, Suisheng; Tan, Sin Lam; Ramadoss, Suresh Kumar;

    2004-01-01

    Estrogen has a profound impact on human physiology and affects numerous genes. The classical estrogen reaction is mediated by its receptors (ERs), which bind to the estrogen response elements (EREs) in target gene's promoter region. Due to tedious and expensive experiments, a limited number of...... human genes are functionally well characterized. It is still unclear how many and which human genes respond to estrogen treatment. We propose a simple, economic, yet effective computational method to predict a subclass of estrogen responsive genes. Our method relies on the similarity of ERE frames...... across different promoters in the human genome. Matching ERE frames of a test set of 60 known estrogen responsive genes to the collection of over 18,000 human promoters, we obtained 604 candidate genes. Evaluating our result by comparison with the published microarray data and literature, we found that...
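
    The promoter-matching idea in the record above can be illustrated with a minimal sketch. It assumes the widely cited palindromic ERE consensus GGTCAnnnTGACC and a simple mismatch threshold; the abstract does not specify how the authors define an ERE frame, so the function name find_ere_like_sites and the max_mismatches parameter are hypothetical and stand in for whatever similarity measure the paper actually uses.

```python
# Assumed consensus estrogen response element; N marks the 3-bp spacer.
ERE_CONSENSUS = "GGTCANNNTGACC"


def find_ere_like_sites(promoter, max_mismatches=1):
    """Return (start, site, mismatches) for every ERE-like window in a promoter.

    Hypothetical helper: a window counts as a hit when it differs from the
    assumed consensus at no more than max_mismatches non-spacer positions.
    """
    seq = promoter.upper()
    width = len(ERE_CONSENSUS)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        mismatches = sum(
            1
            for expected, observed in zip(ERE_CONSENSUS, window)
            if expected != "N" and expected != observed
        )
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits
```

    Running such a scan over each promoter in a collection and keeping promoters with at least one hit would give a candidate gene list, which would then need checking against expression data as the abstract describes.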

  9. Biomarker discovery in mass spectrometry-based urinary proteomics.

    Science.gov (United States)

    Thomas, Samuel; Hao, Ling; Ricke, William A; Li, Lingjun

    2016-04-01

    Urinary proteomics has become one of the most attractive topics in disease biomarker discovery. MS-based proteomic analysis has advanced continuously and emerged as a prominent tool in the field of clinical bioanalysis. However, only few protein biomarkers have made their way to validation and clinical practice. Biomarker discovery is challenged by many clinical and analytical factors including, but not limited to, the complexity of urine and the wide dynamic range of endogenous proteins in the sample. This article highlights promising technologies and strategies in the MS-based biomarker discovery process, including study design, sample preparation, protein quantification, instrumental platforms, and bioinformatics. Different proteomics approaches are discussed, and progresses in maximizing urinary proteome coverage and standardization are emphasized in this review. MS-based urinary proteomics has great potential in the development of noninvasive diagnostic assays in the future, which will require collaborative efforts between analytical scientists, systems biologists, and clinicians. PMID:26703953

  10. SECURE SERVICE DISCOVERY BASED ON PROBE PACKET MECHANISM FOR MANETS

    Directory of Open Access Journals (Sweden)

    S. Pariselvam

    2015-03-01

    Full Text Available In MANETs, the service discovery process is always considered crucial, since these networks do not possess a centralized infrastructure for communication. Moreover, different services available through the network necessitate varying categories. Hence, a need arises for devising a secure probe-based service discovery mechanism to reduce the complexity of providing services to network users. In this paper, we propose a Secure Service Discovery Based on Probe Packet Mechanism (SSDPPM) for identifying DoS attacks in MANETs, which presents a new approach for estimating the level of trust present in each routing path of a mobile ad hoc network by using probe packets. Probing-based service discovery mechanisms mainly identify a mobile node's genuineness using a test packet called a probe, which travels the entire network to compute the degree of trust maintained between the mobile nodes and its impact on network performance. The performance of SSDPPM is investigated through a wide range of network-related parameters such as packet delivery, throughput, control overhead and total overhead using the ns-2.26 network simulator. SSDPPM improves the performance of the network on average by 23% and 19% in terms of packet delivery ratio and throughput, respectively, compared with the existing service discovery mechanisms available in the literature.

  11. Gene Expression Data Knowledge Discovery using Global and Local Clustering

    OpenAIRE

    H, Swathi.

    2010-01-01

    To understand complex biological systems, the research community has produced huge corpus of gene expression data. A large number of clustering approaches have been proposed for the analysis of gene expression data. However, extracting important biological knowledge is still harder. To address this task, clustering techniques are used. In this paper, hybrid Hierarchical k-Means algorithm is used for clustering and biclustering gene expression data is used. To discover both local and global cl...

  12. PiggyBac Transposon Mutagenesis: A Tool for Cancer Gene Discovery in Mice

    OpenAIRE

    Rad, Roland; Rad, Lena; Wang, Wei; Cadinanos, Juan; Vassiliou, George; Rice, Stephen; Campos, Lia S.; Yusa, Kosuke; Banerjee, Ruby; Li, Meng Amy; de la Rosa, Jorge; Strong, Alexander; Lu, Dong; Ellis, Peter; Conte, Nathalie

    2010-01-01

    Transposons are mobile DNA segments that can disrupt gene function by inserting in or near genes. Here we show that insertional mutagenesis by the PiggyBac transposon can be used for cancer gene discovery in mice. PiggyBac transposition in genetically engineered transposon/transposase mice induced cancers whose type (hematopoietic versus solid) and latency were dependent on the regulatory elements introduced into transposons. Analysis of 63 hematopoietic tumors revealed the unique qualities o...

  13. Marinopyrroles: Unique Drug Discoveries Based on Marine Natural Products.

    Science.gov (United States)

    Li, Rongshi

    2016-01-01

    Natural products provide a successful supply of new chemical entities (NCEs) for drug discovery to treat human diseases. Approximately half of the NCEs are based on natural products and their derivatives. Notably, marine natural products, a largely untapped resource, have contributed to drug discovery and development with eight drugs or cosmeceuticals approved by the U.S. Food and Drug Administration and European Medicines Agency, and ten candidates undergoing clinical trials. Collaborative efforts from drug developers, biologists, organic, medicinal, and natural product chemists have elevated drug discoveries to new levels. These efforts are expected to continue to improve the efficiency of natural product-based drugs. Marinopyrroles are examined here as a case study for potential anticancer and antibiotic agents. PMID:26332654

  14. De novo transcriptome sequencing and discovery of genes related to copper tolerance in Paeonia ostii.

    Science.gov (United States)

    Wang, Yanjie; Dong, Chunlan; Xue, Zeyun; Jin, Qijiang; Xu, Yingchun

    2016-01-15

    Paeonia ostii, an important ornamental and medicinal plant, grows normally on copper (Cu) mines with widespread Cu contamination of soils, and it has the ability to lower Cu contents in the Cu-contaminated soils. However, very little molecular information concerned with Cu resistance of P. ostii is available. In this study, high-throughput de novo transcriptome sequencing was carried out for P. ostii with and without Cu treatment using Illumina HiSeq 2000 platform. A total of 77,704 All-unigenes were obtained with a mean length of 710 bp. Of these unigenes, 47,461 were annotated with public databases based on sequence similarities. Comparative transcript profiling allowed the discovery of 4324 differentially expressed genes (DEGs), with 2207 up-regulated and 2117 down-regulated unigenes in Cu-treated library as compared to the control counterpart. Based on these DEGs, Gene Ontology (GO) enrichment analysis indicated Cu stress-relevant terms, such as 'membrane' and 'antioxidant activity'. Meanwhile, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis uncovered some important pathways, including 'biosynthesis of secondary metabolites' and 'metabolic pathways'. In addition, expression patterns of 12 selected DEGs derived from quantitative real-time polymerase chain reaction (qRT-PCR) were consistent with their transcript abundance changes obtained by transcriptomic analyses, suggesting that all the 12 genes were authentically involved in Cu tolerance in P. ostii. This is the first report to identify genes related to Cu stress responses in P. ostii, which could offer valuable information on the molecular mechanisms of Cu resistance, and provide a basis for further genomics research on this and related ornamental species for phytoremediation. PMID:26435192

  15. A comparative review of estimates of the proportion unchanged genes and the false discovery rate

    Directory of Open Access Journals (Sweden)

    Broberg Per

    2005-08-01

    Full Text Available Abstract Background In the analysis of microarray data one generally produces a vector of p-values that for each gene give the likelihood of obtaining equally strong evidence of change by pure chance. The distribution of these p-values is a mixture of two components corresponding to the changed genes and the unchanged ones. The focus of this article is how to estimate the proportion unchanged and the false discovery rate (FDR) and how to make inferences based on these concepts. Six published methods for estimating the proportion unchanged genes are reviewed, two alternatives are presented, and all are tested on both simulated and real data. All estimates but one make do without any parametric assumptions concerning the distributions of the p-values. Furthermore, the estimation and use of the FDR and the closely related q-value is illustrated with examples. Five published estimates of the FDR and one new are presented and tested. Implementations in R code are available. Results A simulation model based on the distribution of real microarray data plus two real data sets were used to assess the methods. The proposed alternative methods for estimating the proportion unchanged fared very well, and gave evidence of low bias and very low variance. Different methods perform well depending upon whether there are few or many regulated genes. Furthermore, the methods for estimating FDR showed a varying performance, and were sometimes misleading. The new method had a very low error. Conclusion The concept of the q-value or false discovery rate is useful in practical research, despite some theoretical and practical shortcomings. However, it seems possible to challenge the performance of the published methods, and there is likely scope for further developing the estimates of the FDR. The new methods provide the scientist with more options to choose a suitable method for any particular experiment. The article advocates the use of the conjoint information
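
    To make the two quantities in the record above concrete, here is a minimal sketch of one common way to compute them. It is not claimed to be any of the specific estimators compared in the article: estimate_pi0 uses a simple threshold-based estimate of the proportion of unchanged genes (the cut-off lam = 0.5 is an assumption), and q_values applies a step-up, Benjamini-Hochberg-style adjustment scaled by that proportion. Both function names are hypothetical.

```python
def estimate_pi0(pvalues, lam=0.5):
    """Estimate the proportion of unchanged genes from a list of p-values.

    Assumes p-values of unchanged genes are uniform on [0, 1], so the share
    of p-values above lam, rescaled by 1 / (1 - lam), approximates pi0.
    """
    above = sum(1 for p in pvalues if p > lam)
    return min(1.0, above / (len(pvalues) * (1.0 - lam)))


def q_values(pvalues, pi0=1.0):
    """Return q-values in the same order as the input p-values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the smallest
    # estimated FDR over all thresholds that still include that gene.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * pvalues[i] * m / rank)
        q[i] = running_min
    return q
```

    With pi0 left at 1.0 the adjustment is the conservative default; passing the output of estimate_pi0 gives less conservative q-values when many genes are regulated.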

  16. Fragment-based approach in drug discovery

    Czech Academy of Sciences Publication Activity Database

    Vrzal, Lukáš; Dvořáková, H.; Veverka, Václav

    Brno : Masaryk University Press, 2015 - (Sklenář, V.). s. 888 ISBN 978-80-210-7890-1. [EUROMAR 2015. 05.07.2015-10.07.2015, Praha] Institutional support: RVO:61388963 Keywords : fragment-based drug design * NMR experiments Subject RIV: CE - Biochemistry

  17. Fragment-based approach in drug discovery

    Czech Academy of Sciences Publication Activity Database

    Vrzal, Lukáš; Dvořáková, H.; Veverka, Václav

    Brno : Stuare, 2015 - (Novotný, J.). C5 ISBN 978-80-86441-46-7. [NMR Valtice. Central European NMR Meeting /30./. 19.04.2015-22.04.2015, Valtice] Institutional support: RVO:61388963 Keywords : fragment-based drug design * NMR experiments Subject RIV: CE - Biochemistry

  18. GWAS as a Driver of Gene Discovery in Cardiometabolic Diseases.

    Science.gov (United States)

    Atanasovska, Biljana; Kumar, Vinod; Fu, Jingyuan; Wijmenga, Cisca; Hofker, Marten H

    2015-12-01

    Cardiometabolic diseases represent a common complex disorder with a strong genetic component. Currently, genome-wide association studies (GWAS) have yielded some 755 single-nucleotide polymorphisms (SNPs) encompassing 366 independent loci that may help to decipher the molecular basis of cardiometabolic diseases. Going from a disease SNP to the underlying disease mechanisms is a huge challenge because the associated SNPs rarely disrupt protein function. Many disease SNPs are located in noncoding regions, and therefore attention is now focused on linking genetic SNP variation to effects on gene expression levels. By integrating genetic information with large-scale gene expression data, and with data from epigenetic roadmaps revealing gene regulatory regions, we expect to be able to identify candidate disease genes and the regulatory potential of disease SNPs. PMID:26596674

  19. GENOME-ENABLED DISCOVERY OF CARBON SEQUESTRATION GENES IN POPLAR

    Energy Technology Data Exchange (ETDEWEB)

    DAVIS J M

    2007-10-11

    Plants utilize carbon by partitioning the reduced carbon obtained through photosynthesis into different compartments and into different chemistries within a cell and subsequently allocating such carbon to sink tissues throughout the plant. Since the phytohormones auxin and cytokinin are known to influence sink strength in tissues such as roots (Skoog & Miller 1957, Nordstrom et al. 2004), we hypothesized that altering the expression of genes that regulate auxin-mediated (e.g., AUX/IAA or ARF transcription factors) or cytokinin-mediated (e.g., RR transcription factors) control of root growth and development would impact carbon allocation and partitioning belowground (Fig. 1 - Renewal Proposal). Specifically, the ARF, AUX/IAA and RR transcription factor gene families mediate the effects of the growth regulators auxin and cytokinin on cell expansion, cell division and differentiation into root primordia. Invertases (IVR), whose transcript abundance is enhanced by both auxin and cytokinin, are critical components of carbon movement and therefore of carbon allocation. Thus, we initiated comparative genomic studies to identify the AUX/IAA, ARF, RR and IVR gene families in the Populus genome that could impact carbon allocation and partitioning. Bioinformatics searches using Arabidopsis gene sequences as queries identified regions with high degrees of sequence similarities in the Populus genome. These Populus sequences formed the basis of our transgenic experiments. Transgenic modification of gene expression involving members of these gene families was hypothesized to have profound effects on carbon allocation and partitioning.

  20. Phylogeny based discovery of regulatory elements

    Directory of Open Access Journals (Sweden)

    Cohen Barak A

    2006-05-01

    Full Text Available Abstract Background Algorithms that locate evolutionarily conserved sequences have become powerful tools for finding functional DNA elements, including transcription factor binding sites; however, most methods do not take advantage of an explicit model for the constrained evolution of functional DNA sequences. Results We developed a probabilistic framework that combines an HKY85 model, which assigns probabilities to different base substitutions between species, and weight matrix models of transcription factor binding sites, which describe the probabilities of observing particular nucleotides at specific positions in the binding site. The method incorporates the phylogenies of the species under consideration and takes into account the position specific variation of transcription factor binding sites. Using our framework we assessed the suitability of alignments of genomic sequences from commonly used species as substrates for comparative genomic approaches to regulatory motif finding. We then applied this technique to Saccharomyces cerevisiae and related species by examining all possible six base pair DNA sequences (hexamers) and identifying sequences that are conserved in a significant number of promoters. By combining similar conserved hexamers we reconstructed known cis-regulatory motifs and made predictions of previously unidentified motifs. We tested one prediction experimentally, finding it to be a regulatory element involved in the transcriptional response to glucose. Conclusion The experimental validation of a regulatory element prediction missed by other large-scale motif finding studies demonstrates that our approach is a useful addition to the current suite of tools for finding regulatory motifs.
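
    The hexamer-counting step in the record above can be sketched in a much simplified form. The code below is deliberately not the probabilistic HKY85 and weight matrix framework of the paper: it assumes pre-aligned promoter sequences and treats a hexamer as conserved only when it is identical, and free of gaps, at the same alignment columns in every species. The function names and the input layout (one list of aligned sequences per promoter) are hypothetical.

```python
from collections import Counter


def conserved_hexamers(alignment, k=6):
    """Return k-mers that are identical at the same columns in all sequences.

    alignment is a list of aligned sequences for one promoter; windows that
    contain gaps or ambiguous bases are ignored.
    """
    length = min(len(seq) for seq in alignment)
    found = set()
    for start in range(length - k + 1):
        words = {seq[start:start + k].upper() for seq in alignment}
        if len(words) == 1:
            word = next(iter(words))
            if set(word) <= set("ACGT"):
                found.add(word)
    return found


def count_promoters_with_conserved_hexamer(alignments, k=6):
    """Count, for each k-mer, the number of promoters in which it is conserved."""
    counts = Counter()
    for alignment in alignments:
        counts.update(conserved_hexamers(alignment, k))
    return counts
```

    Deciding whether a count is significant would still require a background model, which is where an explicit substitution model such as the one described in the abstract comes in.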

  1. Gene Expression Data Knowledge Discovery using Global and Local Clustering

    CERN Document Server

    H, Swathi

    2010-01-01

    To understand complex biological systems, the research community has produced a huge corpus of gene expression data. A large number of clustering approaches have been proposed for the analysis of gene expression data. However, extracting important biological knowledge is still harder. To address this task, clustering techniques are used. In this paper, a hybrid Hierarchical k-Means algorithm is used for clustering and biclustering gene expression data. To discover both local and global clustering structure, biclustering and clustering algorithms are utilized. A validation technique, Figure of Merit, is used to determine the quality of clustering results. Appropriate knowledge is mined from the clusters by embedding a BLAST similarity search program into the clustering and biclustering process. Appropriate ...

  2. Transposons for cancer gene discovery: Sleeping Beauty and beyond

    OpenAIRE

    Collier, Lara S.; Largaespada, David A

    2007-01-01

    The use of Sleeping Beauty transposons as somatic mutagens to discover cancer genes in hematopoietic tumors and sarcomas has been documented. Here, we discuss the future of Sleeping Beauty for cancer genetic studies and the potential use of additional transposable elements for somatic mutagenesis.

  3. Gene Discovery and Functional Analyses in the Model Plant Arabidopsis

    Institute of Scientific and Technical Information of China (English)

    Cai-Ping Feng; John Mundy

    2006-01-01

    The present mini-review describes newer methods and strategies, including transposon and T-DNA insertions, TILLING, Deleteagene, and RNA interference, to functionally analyze genes of interest in the model plant Arabidopsis. The relative advantages and disadvantages of the systems are also discussed.

  4. Gene Discovery and Functional Analyses in the Model Plant Arabidopsis

    DEFF Research Database (Denmark)

    Feng, Cai-ping; Mundy, J.

    2006-01-01

    The present mini-review describes newer methods and strategies, including transposon and T-DNA insertions, TILLING, Deleteagene, and RNA interference, to functionally analyze genes of interest in the model plant Arabidopsis. The relative advantages and disadvantages of the systems are also...

  5. Motif discovery in promoters of genes co-localized and co-expressed during myeloid cells differentiation

    Science.gov (United States)

    Coppe, Alessandro; Ferrari, Francesco; Bisognin, Andrea; Danieli, Gian Antonio; Ferrari, Sergio; Bicciato, Silvio; Bortoluzzi, Stefania

    2009-01-01

    Genes co-expressed may be under similar promoter-based and/or position-based regulation. Although data on expression, position and function of human genes are available, their true integration still represents a challenge for computational biology, hampering the identification of regulatory mechanisms. We carried out an integrative analysis of genomic position, functional annotation and promoters of genes expressed in myeloid cells. Promoter analysis was conducted by a novel multi-step method for discovering putative regulatory elements, i.e. over-represented motifs, in a selected set of promoters, as compared with a background model. The combination of transcriptional, structural and functional data allowed the identification of sets of promoters pertaining to groups of genes co-expressed and co-localized in regions of the human genome. The application of motif discovery to 26 groups of genes co-expressed in myeloid cells differentiation and co-localized in the genome showed that there are more over-represented motifs in promoters of co-expressed and co-localized genes than in promoters of simply co-expressed genes (CEG). Motifs, which are similar to the binding sequences of known transcription factors, non-uniformly distributed along promoter sequences and/or occurring in highly co-expressed subset of genes were identified. Co-expressed and co-localized gene sets were grouped in two co-expressed genomic meta-regions, putatively representing functional domains of a high-level expression regulation. PMID:19059999

  6. Improving functional modules discovery by enriching interaction networks with gene profiles

    KAUST Repository

    Salem, Saeed

    2013-05-01

    Recent advances in proteomic and transcriptomic technologies have resulted in the accumulation of vast amounts of high-throughput data that span multiple biological processes and characteristics in different organisms. Much of the data come in the form of interaction networks and mRNA expression arrays. An important task in systems biology is functional module discovery, where the goal is to uncover well-connected sub-networks (modules). These discovered modules help to unravel the underlying mechanisms of the observed biological processes. While most of the existing module discovery methods use only the interaction data, in this work we propose CLARM, which discovers biological modules by incorporating gene profile data with protein-protein interaction networks. We demonstrate the effectiveness of CLARM on Yeast and Human interaction datasets, and gene expression and molecular function profiles. Experiments on these real datasets show that the CLARM approach is competitive with well-established functional module discovery methods.

  7. Gene discovery for facioscapulohumeral muscular dystrophy by machine learning techniques.

    Science.gov (United States)

    González-Navarro, Félix F; Belanche-Muñoz, Lluís A; Gámez-Moreno, María G; Flores-Ríos, Brenda L; Ibarra-Esquer, Jorge E; López-Morteo, Gabriel A

    2016-04-28

    Facioscapulohumeral muscular dystrophy (FSHD) is a neuromuscular disorder that shows a preference for the facial, shoulder and upper arm muscles. FSHD affects about one in 20-400,000 people, and no effective therapeutic strategies are known to halt disease progression or reverse muscle weakness or atrophy. Many genes may be incorrectly regulated in affected muscle tissue, but the mechanisms responsible for the progressive muscle weakness remain largely unknown. Although machine learning (ML) has made significant inroads in biomedical disciplines such as cancer research, no reports have yet addressed FSHD analysis using ML techniques. This study explores a specific FSHD data set from a ML perspective. We report results showing a very promising small group of genes that clearly separates FSHD samples from healthy samples. In addition to numerical prediction figures, we show data visualizations and biological evidence illustrating the potential usefulness of these results. PMID:26960968

  8. Grouped graphical Granger modeling for gene expression regulatory networks discovery

    OpenAIRE

    Lozano, Aurélie C.; Abe, Naoki; Yan LIU; Rosset, Saharon

    2009-01-01

    We consider the problem of discovering gene regulatory networks from time-series microarray data. Recently, graphical Granger modeling has gained considerable attention as a promising direction for addressing this problem. These methods apply graphical modeling methods on time-series data and invoke the notion of ‘Granger causality’ to make assertions on causality through inference on time-lagged effects. Existing algorithms, however, have neglected an important aspect of the problem—the grou...

  9. Quadratic regression analysis for gene discovery and pattern recognition for non-cyclic short time-course microarray experiments

    Directory of Open Access Journals (Sweden)

    Getchell Thomas V

    2005-04-01

    Full Text Available Abstract Background Cluster analyses are used to analyze microarray time-course data for gene discovery and pattern recognition. However, in general, these methods do not take advantage of the fact that time is a continuous variable, and existing clustering methods often group biologically unrelated genes together. Results We propose a quadratic regression method for identification of differentially expressed genes and classification of genes based on their temporal expression profiles for non-cyclic short time-course microarray data. This method treats time as a continuous variable, therefore preserves actual time information. We applied this method to a microarray time-course study of gene expression at short time intervals following deafferentation of olfactory receptor neurons. Nine regression patterns have been identified and shown to fit gene expression profiles better than k-means clusters. EASE analysis identified over-represented functional groups in each regression pattern and each k-means cluster, which further demonstrated that the regression method provided more biologically meaningful classifications of gene expression profiles than the k-means clustering method. Comparison with Peddada et al.'s order-restricted inference method showed that our method provides a different perspective on the temporal gene profiles. Reliability study indicates that regression patterns have the highest reliabilities. Conclusion Our results demonstrate that the proposed quadratic regression method improves gene discovery and pattern recognition for non-cyclic short time-course microarray data. With a freely accessible Excel macro, investigators can readily apply this method to their microarray data.
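
    As an illustration of the regression step in the record above, the following minimal sketch fits a quadratic in time to one gene's expression profile by ordinary least squares. It only shows the fit itself; the significance testing and the nine regression patterns reported in the abstract are not reproduced, and the function names fit_quadratic and curve_shape, as well as the tolerance used to call a profile linear, are hypothetical.

```python
def fit_quadratic(times, values):
    """Least-squares fit of values ~ b0 + b1*t + b2*t**2; returns (b0, b1, b2).

    Requires at least three distinct time points.
    """
    # Power sums that make up the normal equations (X^T X) b = X^T y.
    s = [sum(t ** k for t in times) for k in range(5)]
    r = [sum(y * t ** k for t, y in zip(times, values)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] + [r[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the 3x4 system.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(3):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [x - factor * y for x, y in zip(a[row], a[col])]
    return tuple(a[i][3] / a[i][i] for i in range(3))


def curve_shape(b2, tol=1e-6):
    """Label the curvature of a fitted profile from its quadratic coefficient."""
    if abs(b2) <= tol:
        return "linear"
    return "convex" if b2 > 0 else "concave"
```

    Because time enters as a number rather than as an unordered label, unevenly spaced sampling times are handled without any change to the code.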

  10. Mobility Prediction Based Neighborhood Discovery for Mobile Ad Hoc Networks

    OpenAIRE

    Li, Xu; Mitton, Nathalie; Simplot-Ryl, David

    2010-01-01

    Hello protocol is the basic technique for neighborhood discovery in wireless ad hoc networks. It requires nodes to claim their existence/aliveness by periodic `hello' messages. Central to any hello protocol is the determination of `hello' message transmission rate. No fixed optimal rate exists in the presence of node mobility. The rate should in fact adapt to it, high for high mobility and low for low mobility. In this paper, we propose a novel mobility prediction based hello protocol, named ...

  11. Fragment Based Drug Discovery with Surface Plasmon Resonance Technology

    OpenAIRE

    Nordström, Helena

    2013-01-01

    Fragment based drug discovery (FBDD) has been applied to two protease drug targets, MMP-12 and HIV-1 protease. The primary screening and characterization of hit fragments were performed with surface plasmon resonance -technology. Further evaluation of the interaction was done by inhibition studies and in one case with X-ray crystallography. The focus of the two projects was different. Many MMP inhibitors contain a strong zinc chelating group, hydroxamate, interacting with the catalytic zinc a...

  12. Web Service Description and Discovery Based on Semantic Model

    Institute of Scientific and Technical Information of China (English)

    YANG Xuemei; XU Lizhen; DONG Yisheng; WANG Yongli

    2006-01-01

    A novel semantic model of Web service description and discovery is proposed in this paper by extending the profile model of the Web ontology language for services (OWL-S). Similarity matching of Web services was implemented by computing a weighted summation of the semantic similarity value based on a specific domain ontology and a dynamic evaluation of the degree of satisfaction for quality of service (QoS). Experiments show that the provided semantic matching model is efficient.

  13. Fragment approaches in structure-based drug discovery

    International Nuclear Information System (INIS)

    Fragment-based methods are successfully generating novel and selective drug-like inhibitors of protein targets, with a number of groups reporting compounds entering clinical trials. This paper summarizes the key features of the approach as one of the tools in structure-guided drug discovery. There has been considerable interest recently in what is known as 'fragment-based lead discovery'. The novel feature of the approach is to begin with small low-affinity compounds. The main advantage is that a larger potential chemical diversity can be sampled with fewer compounds, which is particularly important for new target classes. The approach relies on careful design of the fragment library, a method that can detect binding of the fragment to the protein target, determination of the structure of the fragment bound to the target, and the conventional use of structural information to guide compound optimization. In this article the methods are reviewed, and experiences in fragment-based discovery of lead series of compounds against kinases such as PDK1 and ATPases such as Hsp90 are discussed. The examples illustrate some of the key benefits and issues of the approach and also provide anecdotal examples of the patterns seen in selectivity and the binding mode of fragments across different protein targets

  14. Cross-pollination of research findings, although uncommon, may accelerate discovery of human disease genes

    Directory of Open Access Journals (Sweden)

    Duda Marlena

    2012-11-01

    Full Text Available Abstract Background Technological leaps in genome sequencing have resulted in a surge in discovery of human disease genes. These discoveries have led to increased clarity on the molecular pathology of disease and have also demonstrated considerable overlap in the genetic roots of human diseases. In light of this large genetic overlap, we tested whether cross-disease research approaches lead to faster, more impactful discoveries. Methods We leveraged several gene-disease association databases to calculate a Mutual Citation Score (MCS) for 10,853 pairs of genetically related diseases to measure the frequency of cross-citation between research fields. To assess the importance of cooperative research, we computed an Individual Disease Cooperation Score (ICS) and the average publication rate for each disease. Results For all disease pairs with one gene in common, we found that the degree of genetic overlap was a poor predictor of cooperation (r^2 = 0.3198) and that the vast majority of disease pairs (89.56%) never cited previous discoveries of the same gene in a different disease, irrespective of the level of genetic similarity between the diseases. A fraction (0.25%) of the pairs demonstrated cross-citation in greater than 5% of their published genetic discoveries and 0.037% cross-referenced discoveries more than 10% of the time. We found strong positive correlations between ICS and publication rate (r^2 = 0.7931), and an even stronger correlation between the publication rate and the number of cross-referenced diseases (r^2 = 0.8585). These results suggested that cross-disease research may have the potential to yield novel discoveries at a faster pace than singular disease research. Conclusions Our findings suggest that the frequency of cross-disease study is low despite the high level of genetic similarity among many human diseases, and that collaborative methods may accelerate and increase the impact of new genetic discoveries. Until we have a better

  15. Pine Gene Discovery Project - Final Report - 08/31/1997 - 02/28/2001; FINAL

    International Nuclear Information System (INIS)

    Integration of pines into the large scope of plant biology research depends on study of pines in parallel with study of annual plants, and on availability of research materials from pine to plant biologists interested in comparing pine with annual plant systems. The objectives of the Pine Gene Discovery Project were to obtain 10,000 partial DNA sequences of genes expressed in loblolly pine, to determine which of those pine genes were similar to known genes from other organisms, and to make the DNA sequences and isolated pine genes available to plant researchers to stimulate integration of pines into the wider scope of plant biology research. Those objectives have been completed, and the results are available to the public. Requests for pine genes have been received from a number of laboratories that would otherwise not have included pine in their research, indicating that progress is being made toward the goal of integrating pine research into the larger molecular biology research community

  16. Pine Gene Discovery Project - Final Report - 08/31/1997 - 02/28/2001

    Energy Technology Data Exchange (ETDEWEB)

    Whetten, R. W.; Sederoff, R. R.; Kinlaw, C.; Retzel, E.

    2001-04-30

    Integration of pines into the large scope of plant biology research depends on study of pines in parallel with study of annual plants, and on availability of research materials from pine to plant biologists interested in comparing pine with annual plant systems. The objectives of the Pine Gene Discovery Project were to obtain 10,000 partial DNA sequences of genes expressed in loblolly pine, to determine which of those pine genes were similar to known genes from other organisms, and to make the DNA sequences and isolated pine genes available to plant researchers to stimulate integration of pines into the wider scope of plant biology research. Those objectives have been completed, and the results are available to the public. Requests for pine genes have been received from a number of laboratories that would otherwise not have included pine in their research, indicating that progress is being made toward the goal of integrating pine research into the larger molecular biology research community.

  17. The Alveolate Perkinsus marinus: Biological Insights from EST Gene Discovery

    Directory of Open Access Journals (Sweden)

    El-Sayed Najib M

    2010-04-01

    Full Text Available Abstract Background Perkinsus marinus, a protozoan parasite of the eastern oyster Crassostrea virginica, has devastated natural and farmed oyster populations along the Atlantic and Gulf coasts of the United States. It is classified as a member of the Perkinsozoa, a recently established phylum considered close to the ancestor of ciliates, dinoflagellates, and apicomplexans, and a key taxon for understanding unique adaptations (e.g. parasitism) within the Alveolata. Despite intense parasite pressure, no disease-resistant oysters have been identified and no effective therapies have been developed to date. Results To gain insight into the biological basis of the parasite's virulence and pathogenesis mechanisms, and to identify genes encoding potential targets for intervention, we generated >31,000 5' expressed sequence tags (ESTs) derived from four trophozoite libraries generated from two P. marinus strains. Trimming and clustering of the sequence tags yielded 7,863 unique sequences, some of which carry a spliced leader. Similarity searches revealed that 55% of these had hits in protein sequence databases, of which 1,729 had their best hit with proteins from the chromalveolates (E-value ≤ 1e-5). Some sequences are similar to those proven to be targets for effective intervention in other protozoan parasites, and include not only proteases, antioxidant enzymes, and heat shock proteins, but also those associated with relict plastids, such as acetyl-CoA carboxylase and methyl erythritol phosphate pathway components, and those involved in glycan assembly, protein folding/secretion, and parasite-host interactions. Conclusions Our transcriptome analysis of P. marinus, the first for any member of the Perkinsozoa, contributes new insight into its biology and taxonomic position. It provides a very informative, albeit preliminary, glimpse into the expression of genes encoding functionally relevant proteins as potential targets for chemotherapy, and evidence

  18. Literature-Based Knowledge Discovery using Natural Language Processing

    Science.gov (United States)

    Hristovski, D.; Friedman, C.; Rindflesch, T. C.; Peterlin, B.

    Literature-based discovery (LBD) is an emerging methodology for uncovering nonovert relationships in the online research literature. Making such relationships explicit supports hypothesis generation and discovery. Currently LBD systems depend exclusively on co-occurrence of words or concepts in target documents, regardless of whether relations actually exist between the words or concepts. We describe a method to enhance LBD through capture of semantic relations from the literature via use of natural language processing (NLP). This paper reports on an application of LBD that combines two NLP systems: BioMedLEE and SemRep, which are coupled with an LBD system called BITOLA. The two NLP systems complement each other to increase the types of information utilized by BITOLA. We also discuss issues associated with combining heterogeneous systems. Initial experiments suggest this approach can uncover new associations that were not possible using previous methods.

  19. Marfan Syndrome and Related Disorders: 25 Years of Gene Discovery.

    Science.gov (United States)

    Verstraeten, Aline; Alaerts, Maaike; Van Laer, Lut; Loeys, Bart

    2016-06-01

    Marfan syndrome (MFS) is a rare, autosomal-dominant, multisystem disorder, presenting with skeletal, ocular, skin, and cardiovascular symptoms. Significant clinical overlap with other systemic connective tissue diseases, including Loeys-Dietz syndrome (LDS), Shprintzen-Goldberg syndrome (SGS), and the MASS phenotype, has been documented. In MFS and LDS, the cardiovascular manifestations account for the major cause of patient morbidity and mortality, rendering them the main target for therapeutic intervention. Over the past decades, gene identification studies confidently linked the aforementioned syndromes, as well as nonsyndromic aneurysmal disease, to genetic defects in proteins related to the transforming growth factor (TGF)-β pathway, greatly expanding our knowledge on the disease mechanisms and providing us with novel therapeutic targets. As a result, the focus of the developing pharmacological treatment strategies is shifting from hemodynamic stress management to TGF-β antagonism. In this review, we discuss the insights that have been gained in the molecular biology of MFS and related disorders over the past 25 years. PMID:26919284

  20. Data Mining and Knowledge Discovery via Logic-Based Methods

    CERN Document Server

    Triantaphyllou, Evangelos

    2010-01-01

    There are many approaches to data mining and knowledge discovery (DM&KD), including neural networks, closest neighbor methods, and various statistical methods. This monograph, however, focuses on the development and use of a novel approach, based on mathematical logic, that the author and his research associates have worked on over the last 20 years. The methods presented in the book deal with key DM&KD issues in an intuitive manner and in a natural sequence. Compared to other DM&KD methods, those based on mathematical logic offer a direct and often intuitive approach for extracting easily int

  1. ACFIS: a web server for fragment-based drug discovery.

    Science.gov (United States)

    Hao, Ge-Fei; Jiang, Wen; Ye, Yuan-Nong; Wu, Feng-Xu; Zhu, Xiao-Lei; Guo, Feng-Biao; Yang, Guang-Fu

    2016-07-01

    In order to foster innovation and improve the effectiveness of drug discovery, there is a considerable interest in exploring unknown 'chemical space' to identify new bioactive compounds with novel and diverse scaffolds. Hence, fragment-based drug discovery (FBDD) was developed rapidly due to its advanced expansive search for 'chemical space', which can lead to a higher hit rate and ligand efficiency (LE). However, computational screening of fragments is always hampered by the promiscuous binding model. In this study, we developed a new web server Auto Core Fragment in silico Screening (ACFIS). It includes three computational modules, PARA_GEN, CORE_GEN and CAND_GEN. ACFIS can generate core fragment structure from the active molecule using fragment deconstruction analysis and perform in silico screening by growing fragments to the junction of core fragment structure. An integrated energy calculation rapidly identifies which fragments fit the binding site of a protein. We constructed a simple interface to enable users to view top-ranking molecules in 2D and the binding mode in 3D for further experimental exploration. This makes the ACFIS a highly valuable tool for drug discovery. The ACFIS web server is free and open to all users at http://chemyang.ccnu.edu.cn/ccb/server/ACFIS/. PMID:27150808

  2. Tales of one gene discovery of a novel candidate receptor in mammalian taste

    OpenAIRE

    Huang, Angela Lilly

    2007-01-01

    There are five basic taste modalities in mammals: bitter, sweet, sour, salty, and Umami (taste of MSG and L-amino acids). Receptors for bitter, sweet, and Umami were previously discovered. Identities of receptors for salty and sour taste modalities remained elusive. In this dissertation, I will present: 1) development of a novel bioinformatics screen to discover candidate receptors; 2) discovery of a novel gene, PKD2L1, in taste receptor cells; 3) evidence demonstrating PKD2L1-expressing tast...

  3. Melody-based knowledge discovery in musical pieces

    Science.gov (United States)

    Rybnik, Mariusz; Jastrzebska, Agnieszka

    2016-06-01

    The paper focuses on automated knowledge discovery in musical pieces, based on transformations of digital musical notation. Usually a single musical piece is analyzed to discover its structure as well as the traits of separate voices. Melody and rhythm are processed with the use of three proposed operators that serve as meta-data. In this work we focus on melody, so the processed data is labeled using fuzzy labels created for detecting various voice characteristics. A comparative analysis of two musical pieces may be performed as well, comparing them in terms of various rhythmic or melodic traits (as a whole or with voice separation).

  4. Theme discovery from gene lists for identification and viewing of multiple functional groups

    Directory of Open Access Journals (Sweden)

    Wong Garry

    2005-06-01

    Full Text Available Abstract Background High throughput methods of the genome era produce vast amounts of data in the form of gene lists. These lists are large and difficult to interpret without advanced computational or bioinformatic tools. Most existing methods analyse a gene list as a single entity although it is comprised of multiple gene groups associated with separate biological functions. Therefore it is imperative to define and visualize gene groups with unique functionality within gene lists. Results In order to analyse the functional heterogeneity within a gene list, we have developed a method that clusters genes into groups with homogeneous functionalities. The method uses Non-negative Matrix Factorization (NMF) to create several clustering results with varying numbers of clusters. The obtained clustering results are combined into a simple graphical presentation showing the functional groups over-represented in the analyzed gene list. We demonstrate its performance on two data sets and show results that improve upon existing methods. The comparison also shows that our method creates a more simplified view that aids in discovery of biological themes within the list and discards less informative classes from the results. Conclusion The presented method and associated software are useful for the identification and interpretation of biological functions associated with gene lists and are especially useful for the analysis of large lists.

  5. Mutagenesis as a Functional Genomics Platform for Pharmaceutical Alkaloid Biosynthetic Gene Discovery in Opium Poppy

    International Nuclear Information System (INIS)

    Opium poppy (Papaver somniferum) accumulates the analgesic benzyl-isoquinoline alkaloids morphine, codeine and thebaine, and remains one of the world's most important medicinal plants. The development of varieties that accumulate valuable compounds, such as thebaine and codeine, but not morphine precludes the illicit synthesis of heroin (O,O-diacetylmorphine) and has led to the establishment of alternative cash crops. Novel cDNAs encoding a growing number of biosynthetic enzymes have been isolated, and various -omics resources including EST databases and DNA microarray chips have been established. However, the full potential of functional genomics as a tool for gene discovery in opium poppy remains limited by the relative inefficiency of genetic transformation protocols, which also restricts the application of metabolic engineering for both experimental and commercial purposes. We are establishing an effective functional genomics initiative based on induced mutagenesis and recently developed reverse genetics methodology, such as TILLING (Targeting Induced Local Lesions IN Genomes), with the aim of identifying biosynthetic genes that can be used to engineer opium poppy for the production of copious levels of high-value pharmaceutical alkaloids. Mutagenesis involves the treatment of seeds with ethyl methane sulfonate (EMS) or by fast-neutron bombardment (FNB). In preliminary experiments with EMS-treated seeds, the screening of 1,250 independent M2 plants led to the isolation of four mutants that displayed two distinctly altered alkaloid profiles. Two lines accumulated the central pathway intermediate reticuline and relatively low levels of morphine, codeine and thebaine compared to wild-type plants. Two other lines showed the unusual accumulation in the latex of the antimicrobial alkaloid sanguinarine, which is the product of a branch pathway distinct from that leading to morphine. The present status of -omics resources and functional genomics platforms available to

  6. Mutagenesis as a functional genomics platform for pharmaceutical alkaloid biosynthetic gene discovery in opium poppy

    International Nuclear Information System (INIS)

    Opium poppy (Papaver somniferum) accumulates the analgesic alkaloids morphine, codeine and thebaine, and remains one of the world's most important medicinal plants. The development of varieties that accumulate valuable compounds, such as thebaine and codeine, but not morphine precludes the illicit synthesis of heroin (O,O-diacetylmorphine) and has created opportunities to establish alternative cash crops. Novel cDNAs encoding more than a dozen biosynthetic enzymes have been isolated, and substantial EST databases and DNA microarray chips have been established. The full potential of functional genomics as a tool for gene discovery in opium poppy remains limited by the relative inefficiency of genetic transformation protocols, which also restricts the application of metabolic engineering for both experimental and commercial purposes. We are establishing an effective functional genomics initiative based on induced mutagenesis and TILLING (Targeting Induced Local Lesions IN Genomes), with the aim of identifying biosynthetic genes that can be used to engineer opium poppy to produce copious levels of high-value pharmaceutical alkaloids. Mutagenesis involves the treatment of seeds by fast-neutron bombardment (FNB) or with ethyl methane sulfonate (EMS). Mutagenized opium poppy plants are cultivated in a secure underground growth facility in partnership with a Canadian biotechnology company. In preliminary experiments with EMS-treated seeds, the screening of 1,250 independent M2 plants led to the isolation of four mutants that displayed two distinctly altered alkaloid profiles. Two lines accumulated the central pathway intermediate (S)-reticuline and only low levels of morphine, codeine and thebaine. Two other lines showed the unusual accumulation in the latex of the antimicrobial alkaloid sanguinarine, which is the product of a branch pathway distinct from that leading to morphine. The present status of -omics resources and functional genomics platforms available to

  7. Transcriptome profiling for discovery of genes involved in shoot apical meristem and flower development

    Directory of Open Access Journals (Sweden)

    Vikash K. Singh

    2014-12-01

    Full Text Available Flower development is one of the major developmental processes that governs seed setting in angiosperms. However, little is known about the molecular mechanisms underlying flower development in legumes. Employing RNA-seq for various stages of flower development and a few vegetative tissues in chickpea, we identified differentially expressed genes in flower tissues/stages in comparison to vegetative tissues, which are related to various biological processes and molecular functions during flower development. Here, we provide details of experimental methods, RNA-seq data (available at the Gene Expression Omnibus database under GSE42679) and the analysis pipeline published by Singh and colleagues in the Plant Biotechnology Journal (Singh et al., 2013), along with additional analysis for discovery of genes involved in shoot apical meristem (SAM) development. Our data provide a resource for exploring the complex molecular mechanisms underlying SAM and flower development and identification of gene targets for functional and applied genomics in legumes.

  8. A Wavelet-Based Approach to Pattern Discovery in Melodies

    DEFF Research Database (Denmark)

    Velarde, Gissel; Meredith, David; Weyde, Tillman

    2016-01-01

    We present a computational method for pattern discovery based on the application of the wavelet transform to symbolic representations of melodies or monophonic voices. We model the importance of a discovered pattern in terms of the compression ratio that can be achieved by using it to describe that part of the melody covered by its occurrences. The proposed method resembles that of paradigmatic analysis developed by Ruwet (1966) and Nattiez (1975). In our approach, melodies are represented either as ‘raw’ 1-dimensional pitch signals or as these signals filtered with the continuous wavelet transform (CWT) at a single scale using the Haar wavelet. These representations are segmented using various approaches and the segments are then concatenated based on their similarity. The concatenated segments are compared, clustered and ranked. The method was evaluated on two musicological tasks...
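
    The filtering step mentioned in this abstract can be illustrated with a small sketch. It is not the authors' code: the function name `haar_filter`, the edge clamping, and the sign convention (+1 on the first half of the window, -1 on the second) are assumptions made for illustration. It approximates a single-scale continuous wavelet transform with the Haar wavelet by correlating a list of MIDI-style pitch numbers with a discrete Haar window.

```python
from math import sqrt


def haar_filter(pitches, scale):
    """Approximate a single-scale Haar CWT of a 1-D pitch signal.

    `scale` is the window length in samples (assumed even). Samples in the
    first half of the window get weight +1 and samples in the second half
    get weight -1, so a rising melody gives negative coefficients under
    this (assumed) sign convention.
    """
    half = scale // 2
    n = len(pitches)
    out = []
    for t in range(n):
        acc = 0.0
        for k in range(-half, half):
            # Clamp indices at the signal edges (an illustrative choice).
            idx = min(max(t + k, 0), n - 1)
            weight = 1.0 if k < 0 else -1.0
            acc += weight * pitches[idx]
        out.append(acc / sqrt(scale))  # usual 1/sqrt(scale) normalisation
    return out


if __name__ == "__main__":
    melody = [60, 60, 60, 60, 72, 72, 72, 72]  # one octave leap
    print(haar_filter(melody, scale=4))
```

    A constant melody yields zeros everywhere, and the coefficient of largest magnitude sits at the pitch leap, which is the kind of feature a segmentation step could key on.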

  9. Systematic discovery of unannotated genes in 11 yeast species using a database of orthologous genomic segments

    LENUS (Irish Health Repository)

    OhEigeartaigh, Sean S

    2011-07-26

    Abstract Background In standard BLAST searches, no information other than the sequences of the query and the database entries is considered. However, in situations where two genes from different species have only borderline similarity in a BLAST search, the discovery that the genes are located within a region of conserved gene order (synteny) can provide additional evidence that they are orthologs. Thus, for interpreting borderline search results, it would be useful to know whether the syntenic context of a database hit is similar to that of the query. This principle has often been used in investigations of particular genes or genomic regions, but to our knowledge it has never been implemented systematically. Results We made use of the synteny information contained in the Yeast Gene Order Browser database for 11 yeast species to carry out a systematic search for protein-coding genes that were overlooked in the original annotations of one or more yeast genomes but which are syntenic with their orthologs. Such genes tend to have been overlooked because they are short, highly divergent, or contain introns. The key features of our software - called SearchDOGS - are that the database entries are classified into sets of genomic segments that are already known to be orthologous, and that very weak BLAST hits are retained for further analysis if their genomic location is similar to that of the query. Using SearchDOGS we identified 595 additional protein-coding genes among the 11 yeast species, including two new genes in Saccharomyces cerevisiae. We found additional genes for the mating pheromone a-factor in six species including Kluyveromyces lactis. Conclusions SearchDOGS has proven highly successful for identifying overlooked genes in the yeast genomes. We anticipate that our approach can be adapted for study of further groups of species, such as bacterial genomes. More generally, the concept of doing sequence similarity searches against databases to which external

  10. Parallel Density-Based Clustering for Discovery of Ionospheric Phenomena

    Science.gov (United States)

    Pankratius, V.; Gowanlock, M.; Blair, D. M.

    2015-12-01

    Ionospheric total electron content maps derived from global networks of dual-frequency GPS receivers can reveal a plethora of ionospheric features in real-time and are key to space weather studies and natural hazard monitoring. However, growing data volumes from expanding sensor networks are making manual exploratory studies challenging. As the community is heading towards Big Data ionospheric science, automation and Computer-Aided Discovery become indispensable tools for scientists. One problem of machine learning methods is that they require domain-specific adaptations in order to be effective and useful for scientists. Addressing this problem, our Computer-Aided Discovery approach allows scientists to express various physical models as well as perturbation ranges for parameters. The search space is explored through an automated system and parallel processing of batched workloads, which finds corresponding matches and similarities in empirical data. We discuss density-based clustering as a particular method we employ in this process. Specifically, we adapt Density-Based Spatial Clustering of Applications with Noise (DBSCAN). This algorithm groups geospatial data points based on density. Clusters of points can be of arbitrary shape, and the number of clusters is not predetermined by the algorithm; only two input parameters need to be specified: (1) a distance threshold, (2) a minimum number of points within that threshold. We discuss an implementation of DBSCAN for batched workloads that is amenable to parallelization on manycore architectures such as Intel's Xeon Phi accelerator with 60+ general-purpose cores. This manycore parallelization can cluster large volumes of ionospheric total electron content data quickly. Potential applications for cluster detection include the visualization, tracing, and examination of traveling ionospheric disturbances or other propagating phenomena. Acknowledgments. We acknowledge support from NSF ACI-1442997 (PI V. Pankratius).
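
    To make the two DBSCAN parameters concrete, here is a minimal serial sketch in plain Python. It is not the parallel, batched implementation the abstract describes: the function name `dbscan`, the brute-force neighbour search, and the toy coordinates are assumptions for illustration only. `eps` plays the role of the distance threshold and `min_pts` the minimum number of points within that threshold.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Brute-force range query; the result includes point i itself.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, is not expanded
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point: keep growing
                queue.extend(reachable)
    return labels


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1),
           (10, 10), (10, 11), (11, 10), (11, 11),
           (50, 50)]
    print(dbscan(pts, eps=1.5, min_pts=3))
```

    The number of clusters is never passed in; it emerges from the density of the data, and the isolated point is reported as noise.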

  11. Discovery of dominant and dormant genes from expression data using a novel generalization of SNR for multi-class problems

    Directory of Open Access Journals (Sweden)

    Chung I-Fang

    2008-10-01

    Full Text Available Abstract Background The Signal-to-Noise-Ratio (SNR) is often used for identification of biomarkers for two-class problems and no formal and useful generalization of SNR is available for multiclass problems. We propose innovative generalizations of SNR for multiclass cancer discrimination through introduction of two indices, Gene Dominant Index and Gene Dormant Index (GDIs). These two indices lead to the concepts of dominant and dormant genes with biological significance. We use these indices to develop methodologies for discovery of dominant and dormant biomarkers with interesting biological significance. The dominancy and dormancy of the identified biomarkers and their excellent discriminating power are also demonstrated pictorially using the scatterplot of individual gene and 2-D Sammon's projection of the selected set of genes. Using information from the literature we have shown that the GDI based method can identify dominant and dormant genes that play significant roles in cancer biology. These biomarkers are also used to design diagnostic prediction systems. Results and discussion To evaluate the effectiveness of the GDIs, we have used four multiclass cancer data sets (Small Round Blue Cell Tumors, Leukemia, Central Nervous System Tumors, and Lung Cancer). For each data set we demonstrate that the new indices can find biologically meaningful genes that can act as biomarkers. We then use six machine learning tools: Nearest Neighbor Classifier (NNC), Nearest Mean Classifier (NMC), Support Vector Machine (SVM) classifier with linear kernel, and SVM classifier with Gaussian kernel, where both SVMs are used in conjunction with one-vs-all (OVA) and one-vs-one (OVO) strategies. We found GDIs to be very effective in identifying biomarkers with strong class specific signatures. With all six tools and for all data sets we could achieve better or comparable prediction accuracies usually with fewer marker genes than results reported in the literature using the
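
    The abstract does not spell out the two-class SNR or the GDI formulas. A form commonly used in the microarray literature is the difference of the class means divided by the sum of the class standard deviations; the sketch below assumes that form and does not attempt the multiclass indices. The function names, the use of the sample standard deviation, and the toy expression values are all illustrative assumptions.

```python
from statistics import mean, stdev


def snr(class_a, class_b):
    """Two-class signal-to-noise ratio for one gene (assumed form).

    (mean_a - mean_b) / (sd_a + sd_b), with sample standard deviations.
    Raises ZeroDivisionError if both classes have zero spread.
    """
    return (mean(class_a) - mean(class_b)) / (stdev(class_a) + stdev(class_b))


def rank_genes(expression, labels):
    """Order gene names by decreasing |SNR|.

    `expression` maps gene name -> list of values, one per sample;
    `labels` holds 0 or 1 for each sample, in the same order.
    """
    scores = {}
    for gene, values in expression.items():
        a = [v for v, lab in zip(values, labels) if lab == 0]
        b = [v for v, lab in zip(values, labels) if lab == 1]
        scores[gene] = snr(a, b)
    return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)


if __name__ == "__main__":
    labels = [0, 0, 0, 0, 1, 1, 1, 1]
    expression = {
        "g1": [10, 11, 9, 10, 2, 3, 1, 2],   # high in class 0
        "g2": [5, 6, 5, 6, 5, 6, 5, 6],      # uninformative
        "g3": [1, 2, 1, 2, 8, 9, 8, 9],      # high in class 1
    }
    print(rank_genes(expression, labels))
```

    Ranking by absolute value treats genes that are high in either class as candidates; a multiclass generalization would need a rule for comparing one class against several, which is the gap the abstract says the GDIs address.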

  12. Evolutionary signatures amongst disease genes permit novel methods for gene prioritization and construction of informative gene-based networks.

    Directory of Open Access Journals (Sweden)

    Nolan Priedigkeit

    2015-02-01

    Full Text Available Genes involved in the same function tend to have similar evolutionary histories, in that their rates of evolution covary over time. This coevolutionary signature, termed Evolutionary Rate Covariation (ERC), is calculated using only gene sequences from a set of closely related species and has demonstrated potential as a computational tool for inferring functional relationships between genes. To further define applications of ERC, we first established that roughly 55% of genetic diseases possess an ERC signature between their contributing genes. At a false discovery rate of 5% we report 40 such diseases including cancers, developmental disorders and mitochondrial diseases. Given these coevolutionary signatures between disease genes, we then assessed ERC's ability to prioritize known disease genes out of a list of unrelated candidates. We found that in the presence of an ERC signature, the true disease gene is effectively prioritized to the top 6% of candidates on average. We then apply this strategy to a melanoma-associated region on chromosome 1 and identify MCL1 as a potential causative gene. Furthermore, to gain global insight into disease mechanisms, we used ERC to predict molecular connections between 310 nominally distinct diseases. The resulting "disease map" network associates several diseases with related pathogenic mechanisms and unveils many novel relationships between clinically distinct diseases, such as between Hirschsprung's disease and melanoma. Taken together, these results demonstrate the utility of molecular evolution as a gene discovery platform and show that evolutionary signatures can be used to build informative gene-based networks.

  13. Synthetic time series resembling human (HeLa) cell-cycle gene expression data and application to gene regulatory network discovery

    OpenAIRE

    Tam, GHF; Hung, YS; Chang, C.

    2013-01-01

    Evaluation of gene regulatory network (GRN) discovery methods relies heavily on synthetic time series. However, synthetic data generated by traditional methods deviate substantially from real data, making such evaluation questionable. Guided by decaying sinusoids, we propose a new method that generates synthetic data resembling human (HeLa) cell-cycle gene expression data. Using the new synthetic data, a simple comparison between four GRN discovery methods reveals that Granger causality (GC) methods ...

  14. MAGIC Database and Interfaces: An Integrated Package for Gene Discovery and Expression

    Directory of Open Access Journals (Sweden)

    Lee H. Pratt

    2006-03-01

    Full Text Available The rapidly increasing rate at which biological data is being produced requires a corresponding growth in relational databases and associated tools that can help laboratories contend with that data. With this need in mind, we describe here a Modular Approach to a Genomic, Integrated and Comprehensive (MAGIC) Database. This Oracle 9i database derives from an initial focus in our laboratory on gene discovery via production and analysis of expressed sequence tags (ESTs), and subsequently on gene expression as assessed by both EST clustering and microarrays. The MAGIC Gene Discovery portion of the database focuses on information derived from DNA sequences and on its biological relevance. In addition to MAGIC SEQ-LIMS, which is designed to support activities in the laboratory, it contains several additional subschemas. The latter include MAGIC Admin for database administration, MAGIC Sequence for sequence processing as well as sequence and clone attributes, MAGIC Cluster for the results of EST clustering, MAGIC Polymorphism in support of microsatellite and single-nucleotide-polymorphism discovery, and MAGIC Annotation for electronic annotation by BLAST and BLAT. The MAGIC Microarray portion is a MIAME-compliant database with two components at present. These are MAGIC Array-LIMS, which makes possible remote entry of all information into the database, and MAGIC Array Analysis, which provides data mining and visualization. Because all aspects of interaction with the MAGIC Database are via a web browser, it is ideally suited not only for individual research laboratories but also for core facilities that serve clients at any distance.

  15. Evaluation of gene association methods for coexpression network construction and biological knowledge discovery.

    Directory of Open Access Journals (Sweden)

    Sapna Kumari

    Full Text Available BACKGROUND: Constructing coexpression networks and performing network analysis using large-scale gene expression data sets is an effective way to uncover new biological knowledge; however, the methods used for gene association in constructing these coexpression networks have not been thoroughly evaluated. Since different methods lead to structurally different coexpression networks and provide different information, selecting the optimal gene association method is critical. METHODS AND RESULTS: In this study, we compared eight gene association methods - Spearman rank correlation, Weighted Rank Correlation, Kendall, Hoeffding's D measure, Theil-Sen, Rank Theil-Sen, Distance Covariance, and Pearson - and focused on their true knowledge discovery rates in associating pathway genes and constructing coordination networks of regulatory genes. We also examined the behaviors of different methods on microarray data with different properties, and whether the biological processes affect the efficiency of different methods. CONCLUSIONS: We found that the Spearman, Hoeffding and Kendall methods are effective in identifying coexpressed pathway genes, whereas the Theil-Sen, Rank Theil-Sen, Spearman, and Weighted Rank methods perform well in identifying coordinated transcription factors that control the same biological processes and traits. Surprisingly, the widely used Pearson method is generally less efficient, and so is the Distance Covariance method that can find gene pairs of multiple relationships. Our analyses clearly show that the Pearson and Distance Covariance methods behave differently from the other six methods. The efficiencies of different methods vary with the data properties to some degree and are largely contingent upon the biological processes, which necessitates a pre-analysis to identify the best performing method for gene association and coexpression network construction.
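
    One reason rank-based and linear measures can disagree is easy to show on a toy example. The sketch below is not taken from the study: the helper names and the made-up expression values are assumptions for illustration. It computes Pearson and Spearman correlation for two hypothetical genes whose expression is related monotonically but not linearly.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    gene_a = [1, 2, 3, 4, 5, 6]            # hypothetical expression levels
    gene_b = [v ** 5 for v in gene_a]      # monotone but strongly non-linear
    print("Pearson :", round(pearson(gene_a, gene_b), 3))
    print("Spearman:", round(spearman(gene_a, gene_b), 3))
```

    Spearman sees only the ordering and so scores this pair as perfectly associated, while Pearson is pulled down by the curvature. Whether that difference helps or hurts on real expression data is the empirical question the abstract addresses.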

  16. An improved procedure for gene selection from microarray experiments using false discovery rate criterion

    Directory of Open Access Journals (Sweden)

    Yang Mark CK

    2006-01-01

    Full Text Available Abstract Background A large number of genes usually show differential expression in a microarray experiment with two types of tissues, and the p-values of a proper statistical test are often used to quantify the significance of these differences. The genes with small p-values are then picked as the genes responsible for the differences in the tissue RNA expressions. One key question is what the threshold should be for considering a p-value small. There is always a trade-off between this threshold and the rate of false claims. Recent statistical literature shows that the false discovery rate (FDR) criterion is a powerful and reasonable criterion to pick those genes with differential expression. Moreover, the power of detection can be increased by knowing the number of non-differential expression genes. While this number is unknown in practice, there are methods to estimate it from data. The purpose of this paper is to present a new method of estimating this number and use it for the FDR procedure construction. Results A combination of test functions is used to estimate the number of differentially expressed genes. A simulation study shows that the proposed method has a higher power to detect these genes than other existing methods, while still keeping the FDR under control. The improvement can be substantial if the proportion of true differentially expressed genes is large. This procedure has also been tested with good results using a real dataset. Conclusion For a given expected FDR, the method proposed in this paper has better power to pick genes that show differentiation in their expression than two other well known methods.
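
    The idea that an estimate of the number of non-differential genes can raise power may be easier to see in code. The sketch below is not the estimator proposed in the paper, which the abstract does not specify. As stand-ins it uses the standard Benjamini-Hochberg step-up rule and a simple Storey-type estimate of the number of null genes; the function names, the tuning value `lam`, and the p-values are illustrative assumptions.

```python
def estimate_m0(pvalues, lam=0.5):
    """Storey-type estimate of the number of true null hypotheses.

    Null p-values are roughly uniform, so about m0 * (1 - lam) of them
    should exceed lam. The result is capped to the range [1, m].
    """
    m = len(pvalues)
    above = sum(1 for p in pvalues if p > lam)
    return max(1.0, min(float(m), above / (1.0 - lam)))


def benjamini_hochberg(pvalues, q=0.05, m0=None):
    """Indices rejected by the step-up rule at FDR level q.

    With m0=None this is the ordinary Benjamini-Hochberg procedure.
    Passing an estimate of the number of nulls gives the adaptive
    variant, which compares p_(k) with k * q / m0 instead of k * q / m.
    """
    m = len(pvalues)
    denom = m if m0 is None else m0
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * q / denom:
            k = rank  # keep the largest rank that passes
    return sorted(order[:k])


if __name__ == "__main__":
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print("plain BH      :", benjamini_hochberg(p, q=0.05))
    print("with m0 = 4   :", benjamini_hochberg(p, q=0.05, m0=4))
```

    On these made-up p-values the plain procedure rejects two hypotheses, while supplying a smaller null count relaxes every threshold and rejects seven. The gain depends entirely on how good the estimate of the null count is.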

  17. Cancer Biomarker Discovery: Lectin-Based Strategies Targeting Glycoproteins

    Directory of Open Access Journals (Sweden)

    David Clark

    2012-01-01

    Full Text Available Biomarker discovery can identify molecular markers in various cancers that can be used for detection, screening, diagnosis, and monitoring of disease progression. Lectin-affinity is a technique that can be used for the enrichment of glycoproteins from a complex sample, facilitating the discovery of novel cancer biomarkers associated with a disease state.

  18. PiggyBac transposon mutagenesis: a tool for cancer gene discovery in mice.

    Science.gov (United States)

    Rad, Roland; Rad, Lena; Wang, Wei; Cadinanos, Juan; Vassiliou, George; Rice, Stephen; Campos, Lia S; Yusa, Kosuke; Banerjee, Ruby; Li, Meng Amy; de la Rosa, Jorge; Strong, Alexander; Lu, Dong; Ellis, Peter; Conte, Nathalie; Yang, Fang Tang; Liu, Pentao; Bradley, Allan

    2010-11-19

    Transposons are mobile DNA segments that can disrupt gene function by inserting in or near genes. Here, we show that insertional mutagenesis by the PiggyBac transposon can be used for cancer gene discovery in mice. PiggyBac transposition in genetically engineered transposon-transposase mice induced cancers whose type (hematopoietic versus solid) and latency were dependent on the regulatory elements introduced into transposons. Analysis of 63 hematopoietic tumors revealed that PiggyBac is capable of genome-wide mutagenesis. The PiggyBac screen uncovered many cancer genes not identified in previous retroviral or Sleeping Beauty transposon screens, including Spic, which encodes a PU.1-related transcription factor, and Hdac7, a histone deacetylase gene. PiggyBac and Sleeping Beauty have different integration preferences. To maximize the utility of the tool, we engineered 21 mouse lines to be compatible with both transposon systems in constitutive, tissue- or temporal-specific mutagenesis. Mice with different transposon types, copy numbers, and chromosomal locations support wide applicability. PMID:20947725

  19. Proxy-Based IPv6 Neighbor Discovery Scheme for Wireless LAN Based Mesh Networks

    Science.gov (United States)

    Lee, Jihoon; Jeon, Seungwoo; Kim, Jaehoon

    A multi-hop wireless LAN-based mesh network (WMN) provides high capacity and self-configuring capabilities. Because data forwarding and path selection are based on MAC addresses, a WMN requires additional operations to achieve global connectivity using IPv6 addresses. The neighbor discovery operation over WLAN mesh networks requires repeated all-node broadcasting, which imposes a heavy burden on the entire mesh network. In this letter, we propose a proxy neighbor discovery scheme for optimized IPv6 communication over WMNs to reduce network overhead and communication latency. Using simulation experiments, we show that the control overhead and communication setup latency can be significantly reduced using the proxy-based neighbor discovery mechanism.

  20. TargetMine, an integrated data warehouse for candidate gene prioritisation and target discovery.

    Directory of Open Access Journals (Sweden)

    Yi-An Chen

    Full Text Available Prioritising candidate genes for further experimental characterisation is a non-trivial challenge in drug discovery and biomedical research in general. An integrated approach that combines results from multiple data types is best suited for optimal target selection. We developed TargetMine, a data warehouse for efficient target prioritisation. TargetMine utilises the InterMine framework, with new data models such as protein-DNA interactions integrated in a novel way. It enables complicated searches that are difficult to perform with existing tools and it also offers integration of custom annotations and in-house experimental data. We proposed an objective protocol for target prioritisation using TargetMine and set up a benchmarking procedure to evaluate its performance. The results show that the protocol can identify known disease-associated genes with high precision and coverage. A demonstration version of TargetMine is available at http://targetmine.nibio.go.jp/.

  1. How might we increase success in marine-based drug discovery?

    Science.gov (United States)

    Desbois, Andrew P

    2014-09-01

    Drug discovery from marine organisms has been underway for > 60 years and there have been notable successes in discovering, developing and introducing clinical agents derived from marine sources. Examples include the analgesic ziconotide and the anticancer compound trabectedin. However, in light of the pressing need for new drugs, particularly those with anti-infective and anticancer properties, there is strong justification for increased exploration of marine organisms as sources of novel compounds. This article considers approaches that might enhance our chances of delivering new medicines from marine-based drug discovery efforts. Consideration is given to the organisms and habitats deserving of more attention and how we might make best use of these marine genetic resources. In particular, the opportunities offered by synthetic biology are highlighted because these methods allow drug discoverers to explore pathways in 'non-culturable' species and turn on natural product biosynthesis genes that are difficult to activate under laboratory conditions (so-called 'silent' gene clusters). PMID:24909595

  2. Intelligent Agent Based Model for Auction Service Discovery in Mobile E-Commerce

    OpenAIRE

    Nandini S Sidnal; Manvi, Sunilkumar S

    2012-01-01

    Internet-enabled auctions are among the popular applications that require a web service discovery mechanism that is efficient in all respects. This paper focuses on auction service discovery and building a repository of services for the use of E-customers. The auction service directory (repository) is developed based on the customer’s desires. Agent based Belief Desire Intention (BDI) architecture is used in this model, not only to support the service discovery process in spott...

  3. A new evaluation methodology for literature-based discovery systems.

    Science.gov (United States)

    Yetisgen-Yildiz, Meliha; Pratt, Wanda

    2009-08-01

    While medical researchers formulate new hypotheses to test, they need to identify connections to their work from other parts of the medical literature. However, the current volume of information has become a great barrier for this task. Recently, many literature-based discovery (LBD) systems have been developed to help researchers identify new knowledge that bridges gaps across distinct sections of the medical literature. Each LBD system uses different methods for mining the connections from text and ranking the identified connections, but none of the currently available LBD evaluation approaches can be used to compare the effectiveness of these methods. In this paper, we present an evaluation methodology for LBD systems that allows comparisons across different systems. We demonstrate the abilities of our evaluation methodology by using it to compare the performance of different correlation-mining and ranking approaches used by existing LBD systems. This evaluation methodology should help other researchers compare approaches, make informed algorithm choices, and ultimately help to improve the performance of LBD systems overall. PMID:19124086

  4. Neural network-based QSAR and insecticide discovery: spinetoram.

    Science.gov (United States)

    Sparks, Thomas C; Crouse, Gary D; Dripps, James E; Anzeveno, Peter; Martynow, Jacek; Deamicis, Carl V; Gifford, James

    2008-01-01

    Improving the efficacy and spectrum of the spinosyns, novel fermentation-derived insecticides, has long been a goal within Dow AgroSciences. Because the spinosyns are large and complex fermentation products, identifying specific modifications likely to result in improved activity was a difficult process, since most modifications decreased the activity. A variety of approaches were investigated to identify new synthetic directions for the spinosyn chemistry, including several explorations of the quantitative structure activity relationships (QSAR) of spinosyns, which initially were unsuccessful. However, application of artificial neural networks (ANN) to the spinosyn QSAR problem identified new directions for improved activity in the chemistry, which subsequent synthesis and testing confirmed. The ANN-based analogs, coupled with other information on substitution effects resulting from spinosyn structure activity relationships, led to the discovery of spinetoram (XDE-175). Launched in late 2007, spinetoram provides both improved efficacy and an expanded spectrum while maintaining the exceptional environmental and toxicological profile already established for the spinosyn chemistry. PMID:18344004

  5. Metabolomics-based discovery of diagnostic biomarkers for onchocerciasis.

    Directory of Open Access Journals (Sweden)

    Judith R Denery

    Full Text Available BACKGROUND: Development of robust, sensitive, and reproducible diagnostic tests for understanding the epidemiology of neglected tropical diseases is an integral aspect of the success of worldwide control and elimination programs. In the treatment of onchocerciasis, clinical diagnostics that can function in an elimination scenario are non-existent and desperately needed. Due to its sensitivity and quantitative reproducibility, liquid chromatography-mass spectrometry (LC-MS) based metabolomics is a powerful approach to this problem. METHODOLOGY/PRINCIPAL FINDINGS: Analysis of an African sample set comprised of 73 serum and plasma samples revealed a set of 14 biomarkers that showed excellent discrimination between Onchocerca volvulus-positive and negative individuals by multivariate statistical analysis. Application of this biomarker set to an additional sample set from onchocerciasis endemic areas where long-term ivermectin treatment has been successful revealed that the biomarker set may also distinguish individuals with worms of compromised viability from those with active infection. Machine learning extended the utility of the biomarker set from a complex multivariate analysis to a binary format applicable for adaptation to a field-based diagnostic, validating the use of complex data mining tools applied to infectious disease biomarker discovery and diagnostic development. CONCLUSIONS/SIGNIFICANCE: An LC-MS metabolomics-based diagnostic has the potential to monitor the progression of onchocerciasis in both endemic and non-endemic geographic areas, as well as provide an essential tool to multinational programs in the ongoing fight against this neglected tropical disease. Ultimately this technology can be expanded for the diagnosis of other filarial and/or neglected tropical diseases.

  6. GalenOWL: Ontology-based drug recommendations discovery

    Directory of Open Access Journals (Sweden)

    Doulaverakis Charalampos

    2012-12-01

    Full Text Available Abstract Background Identification of drug-drug and drug-diseases interactions can pose a difficult problem to cope with, as the increasingly large number of available drugs coupled with the ongoing research activities in the pharmaceutical domain, make the task of discovering relevant information difficult. Although international standards, such as the ICD-10 classification and the UNII registration, have been developed in order to enable efficient knowledge sharing, medical staff needs to be constantly updated in order to effectively discover drug interactions before prescription. The use of Semantic Web technologies has been proposed in earlier works, in order to tackle this problem. Results This work presents a semantic-enabled online service, named GalenOWL, capable of offering real time drug-drug and drug-diseases interaction discovery. For enabling this kind of service, medical information and terminology had to be translated to ontological terms and be appropriately coupled with medical knowledge of the field. International standards such as the aforementioned ICD-10 and UNII, provide the backbone of the common representation of medical data, while the medical knowledge of drug interactions is represented by a rule base which makes use of the aforementioned standards. Details of the system architecture are presented while also giving an outline of the difficulties that had to be overcome. A comparison of the developed ontology-based system with a similar system developed using a traditional business logic rule engine is performed, giving insights on the advantages and drawbacks of both implementations. Conclusions The use of Semantic Web technologies has been found to be a good match for developing drug recommendation systems. Ontologies can effectively encapsulate medical knowledge and rule-based reasoning can capture and encode the drug interactions knowledge.

  7. INTELLIGENT SEARCH ENGINE-BASED UNIVERSAL DESCRIPTION, DISCOVERY AND INTEGRATION FOR WEB SERVICE DISCOVERY

    Directory of Open Access Journals (Sweden)

    Tamilarasi Karuppiah

    2014-01-01

    Full Text Available The Web Services standard has been broadly acknowledged by industry and academic research along with the progress of web technology and e-business. An increasing number of web applications have been bundled as web services that can be published, located and invoked across the web. The issues regarding their publication and innovation become most important as web services multiply and become more advanced and mutually dependent. With the intention of discovering web services effectively and within a minimum time, this study proposes a Universal Description, Discovery and Integration (UDDI) registry with an intelligent search engine. Web services are first published in the UDDI registry, and the published web services are then indexed. To improve the efficiency of web service discovery, the indexed web services are saved as an index database. A search query is compared with the index database to discover web services, and the discovered web services are returned to the service customer. The way the web services are accessed is stored in a log file, which is then used to provide personalized web services to the user. The efficient search capability of the proposed system significantly improves web service discovery, and the system is capable of returning the most appropriate web service.
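
    The publish, index, and query steps described in this abstract can be pictured with a small inverted index. The sketch below is only a toy illustration under stated assumptions: the service names, descriptions, and the functions `build_index` and `search` are invented for the example, and it does not use or model any real UDDI registry API or the paper's search engine.

```python
from collections import defaultdict


def build_index(services):
    """Map each description token to the set of services that mention it."""
    index = defaultdict(set)
    for name, description in services.items():
        for token in description.lower().split():
            index[token].add(name)
    return index


def search(index, query):
    """Rank services by how many query tokens their description matched."""
    scores = defaultdict(int)
    for token in query.lower().split():
        for name in index.get(token, ()):
            scores[name] += 1
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda name: (-scores[name], name))


# Hypothetical published services and their descriptions.
published = {
    "WeatherLookup": "current weather forecast by city",
    "CurrencyConvert": "convert currency exchange rates",
    "CityGeocoder": "geocode city name to coordinates",
}

index_db = build_index(published)
print(search(index_db, "weather by city"))
```

    A real registry would add persistence for the index database and the access log mentioned in the abstract; both are left out here to keep the sketch short.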

  8. Genome-wide target profiling of piggyBac and Tol2 in HEK 293: pros and cons for gene discovery and gene therapy

    OpenAIRE

    Meir, Yaa-Jyuhn J; Weirauch, Matthew T.; Yang, Herng-Shing; Chung, Pei-Cheng; Yu, Robert K.; Wu, Sareina C-Y

    2011-01-01

    Background DNA transposons have emerged as indispensable tools for manipulating vertebrate genomes with applications ranging from insertional mutagenesis and transgenesis to gene therapy. To fully explore the potential of two highly active DNA transposons, piggyBac and Tol2, as mammalian genetic tools, we have conducted a side-by-side comparison of the two transposon systems in the same setting to evaluate their advantages and disadvantages for use in gene therapy and gene discovery. Results ...

  9. Discovery of core biotic stress responsive genes in Arabidopsis by weighted gene co-expression network analysis.

    Science.gov (United States)

    Amrine, Katherine C H; Blanco-Ulate, Barbara; Cantu, Dario

    2015-01-01

    Intricate signal networks and transcriptional regulators translate the recognition of pathogens into defense responses. In this study, we carried out a gene co-expression analysis of all currently publicly available microarray data, which were generated in experiments that studied the interaction of the model plant Arabidopsis thaliana with microbial pathogens. This work was conducted to identify (i) modules of functionally related co-expressed genes that are differentially expressed in response to multiple biotic stresses, and (ii) hub genes that may function as core regulators of disease responses. Using Weighted Gene Co-expression Network Analysis (WGCNA) we constructed an undirected network leveraging a rich curated expression dataset comprising 272 microarrays that involved microbial infections of Arabidopsis plants with a wide array of fungal and bacterial pathogens with biotrophic, hemibiotrophic, and necrotrophic lifestyles. WGCNA produced a network with scale-free and small-world properties composed of 205 distinct clusters of co-expressed genes. Modules of functionally related co-expressed genes that are differentially regulated in response to multiple pathogens were identified by integrating differential gene expression testing with functional enrichment analyses of gene ontology terms, known disease associated genes, transcriptional regulators, and cis-regulatory elements. The significance of functional enrichments was validated by comparisons with randomly generated networks. Network topology was then analyzed to identify intra- and inter-modular gene hubs. Based on high connectivity, and centrality in meta-modules that are clearly enriched in defense responses, we propose a list of 66 target genes for reverse genetic experiments to further dissect the Arabidopsis immune system. Our results show that statistical-based data trimming prior to network analysis allows the integration of expression datasets generated by different groups, under different

  10. Discovery of core biotic stress responsive genes in Arabidopsis by weighted gene co-expression network analysis.

    Directory of Open Access Journals (Sweden)

    Katherine C H Amrine

    Full Text Available Intricate signal networks and transcriptional regulators translate the recognition of pathogens into defense responses. In this study, we carried out a gene co-expression analysis of all currently publicly available microarray data, which were generated in experiments that studied the interaction of the model plant Arabidopsis thaliana with microbial pathogens. This work was conducted to identify (i) modules of functionally related co-expressed genes that are differentially expressed in response to multiple biotic stresses, and (ii) hub genes that may function as core regulators of disease responses. Using Weighted Gene Co-expression Network Analysis (WGCNA) we constructed an undirected network leveraging a rich curated expression dataset comprising 272 microarrays that involved microbial infections of Arabidopsis plants with a wide array of fungal and bacterial pathogens with biotrophic, hemibiotrophic, and necrotrophic lifestyles. WGCNA produced a network with scale-free and small-world properties composed of 205 distinct clusters of co-expressed genes. Modules of functionally related co-expressed genes that are differentially regulated in response to multiple pathogens were identified by integrating differential gene expression testing with functional enrichment analyses of gene ontology terms, known disease associated genes, transcriptional regulators, and cis-regulatory elements. The significance of functional enrichments was validated by comparisons with randomly generated networks. Network topology was then analyzed to identify intra- and inter-modular gene hubs. Based on high connectivity, and centrality in meta-modules that are clearly enriched in defense responses, we propose a list of 66 target genes for reverse genetic experiments to further dissect the Arabidopsis immune system. Our results show that statistical-based data trimming prior to network analysis allows the integration of expression datasets generated by different groups

  11. The Increasing Importance of Gene-Based Analyses

    Science.gov (United States)

    Cirulli, Elizabeth T.

    2016-01-01

    In recent years, genome and exome sequencing studies have implicated a plethora of new disease genes with rare causal variants. Here, I review 150 exome sequencing studies that claim to have discovered that a disease can be caused by different rare variants in the same gene, and I determine whether their methods followed the current best-practice guidelines in the interpretation of their data. Specifically, I assess whether studies appropriately assess controls for rare variants throughout the entire gene or implicated region as opposed to only investigating the specific rare variants identified in the cases, and I assess whether studies present sufficient co-segregation data for statistically significant linkage. I find that the proportion of studies performing gene-based analyses has increased with time, but that even in 2015 fewer than 40% of the reviewed studies used this method, and only 10% presented statistically significant co-segregation data. Furthermore, I find that the genes reported in these papers are explaining a decreasing proportion of cases as the field moves past most of the low-hanging fruit, with 50% of the genes from studies in 2014 and 2015 having variants in fewer than 5% of cases. As more studies focus on genes explaining relatively few cases, the importance of performing appropriate gene-based analyses is increasing. It is becoming increasingly important for journal editors and reviewers to require stringent gene-based evidence to avoid an avalanche of misleading disease gene discovery papers. PMID:27055023

  12. Reconstructing Sessions from Data Discovery and Access Logs to Build a Semantic Knowledge Base for Improving Data Discovery

    Directory of Open Access Journals (Sweden)

    Yongyao Jiang

    2016-04-01

    Full Text Available Big geospatial data are archived and made available through online web discovery and access. However, finding the right data for scientific research and application development is still a challenge. This paper aims to improve data discovery by mining user knowledge from log files. Specifically, the paper focuses on user web session reconstruction as a critical step for extracting usage patterns. However, reconstructing user sessions from raw web logs has always been difficult, as a session identifier tends to be missing in most data portals. To address this problem, we propose two session identification methods: a time-clustering-based method and a time-referrer-based method. We also present the workflow of session reconstruction and discuss the approach of selecting appropriate thresholds for relevant steps in the workflow. The proposed session identification methods and workflow are shown to be able to extract data access patterns for further pattern analyses of user behavior and for improving data discovery through more relevant data ranking, suggestion, and navigation.
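
    Session reconstruction without a session identifier is often approximated by splitting each user's requests wherever the gap between consecutive hits exceeds a threshold. The sketch below shows that basic inactivity-timeout heuristic as a simplified stand-in; it is not the time-clustering-based or time-referrer-based method proposed in the paper. The record layout `(user, timestamp_seconds, url)`, the 1800-second default, and the function name `reconstruct_sessions` are assumptions made for the example.

```python
from collections import defaultdict


def reconstruct_sessions(records, timeout=1800):
    """Group (user, timestamp, url) hits into per-user sessions.

    A new session starts whenever two consecutive hits from the same
    user are more than `timeout` seconds apart.
    """
    by_user = defaultdict(list)
    for user, timestamp, url in records:
        by_user[user].append((timestamp, url))

    sessions = []
    for user, hits in by_user.items():
        hits.sort()  # chronological order within each user
        current = [hits[0]]
        for previous, hit in zip(hits, hits[1:]):
            if hit[0] - previous[0] > timeout:
                sessions.append((user, current))
                current = []
            current.append(hit)
        sessions.append((user, current))
    return sessions


# Hypothetical log records: (user or IP, seconds since start, requested URL).
log = [
    ("10.0.0.1", 0, "/search?q=sst"),
    ("10.0.0.1", 120, "/dataset/42"),
    ("10.0.0.2", 60, "/dataset/7"),
    ("10.0.0.1", 5000, "/search?q=wind"),
]

for user, hits in reconstruct_sessions(log):
    print(user, [url for _, url in hits])
```

    The choice of `timeout` corresponds to the threshold-selection question raised in the abstract: too small a value fragments one visit into several sessions, and too large a value merges unrelated visits.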

  13. A comprehensive resource of drought- and salinity- responsive ESTs for gene discovery and marker development in chickpea (Cicer arietinum L.)

    Directory of Open Access Journals (Sweden)

    Srinivasan Ramamurthy

    2009-11-01

    candidate genes and their expression profile showed predominance in specific stress-challenged libraries. Conclusion Generated set of chickpea ESTs serves as a resource of high quality transcripts for gene discovery and development of functional markers associated with abiotic stress tolerance that will be helpful to facilitate chickpea breeding. Mapping of gene-based markers in chickpea will also add more anchoring points to align genomes of chickpea and other legume species.

  14. Phylogenomic Analysis of Natural Products Biosynthetic Gene Clusters Allows Discovery of Arseno-Organic Metabolites in Model Streptomycetes

    Science.gov (United States)

    Cruz-Morales, Pablo; Kopp, Johannes Florian; Martínez-Guerrero, Christian; Yáñez-Guerra, Luis Alfonso; Selem-Mojica, Nelly; Ramos-Aboites, Hilda; Feldmann, Jörg; Barona-Gómez, Francisco

    2016-01-01

    Natural products from microbes have provided humans with beneficial antibiotics for millennia. However, a decline in the pace of antibiotic discovery exerts pressure on human health as antibiotic resistance spreads, a challenge that may be better faced by unveiling chemical diversity produced by microbes. Current microbial genome mining approaches have revitalized research into antibiotics, but the empirical nature of these methods limits the chemical space that is explored. Here, we address the problem of finding novel pathways by incorporating evolutionary principles into genome mining. We recapitulated the evolutionary history of twenty-three enzyme families previously uninvestigated in the context of natural product biosynthesis in Actinobacteria, the most proficient producers of natural products. Our genome evolutionary analyses were based on the assumption that expanded, repurposed enzyme families from central metabolism occur frequently and thus have the potential to catalyze new conversions in the context of natural products biosynthesis. Our analyses led to the discovery of biosynthetic gene clusters coding for hidden chemical diversity, as validated by comparing our predictions with those from state-of-the-art genome mining tools, as well as by experimentally demonstrating the existence of a biosynthetic pathway for arseno-organic metabolites in Streptomyces coelicolor and Streptomyces lividans, using a combined gene knockout and metabolite profile strategy. As our approach does not rely solely on sequence similarity searches of previously identified biosynthetic enzymes, these results establish the basis for the development of an evolutionary-driven genome mining tool termed EvoMining that complements current platforms. We anticipate that by doing so real ‘chemical dark matter’ will be unveiled. PMID:27289100

  15. Phylogenomic Analysis of Natural Products Biosynthetic Gene Clusters Allows Discovery of Arseno-Organic Metabolites in Model Streptomycetes.

    Science.gov (United States)

    Cruz-Morales, Pablo; Kopp, Johannes Florian; Martínez-Guerrero, Christian; Yáñez-Guerra, Luis Alfonso; Selem-Mojica, Nelly; Ramos-Aboites, Hilda; Feldmann, Jörg; Barona-Gómez, Francisco

    2016-01-01

    Natural products from microbes have provided humans with beneficial antibiotics for millennia. However, a decline in the pace of antibiotic discovery exerts pressure on human health as antibiotic resistance spreads, a challenge that may be better faced by unveiling chemical diversity produced by microbes. Current microbial genome mining approaches have revitalized research into antibiotics, but the empirical nature of these methods limits the chemical space that is explored. Here, we address the problem of finding novel pathways by incorporating evolutionary principles into genome mining. We recapitulated the evolutionary history of twenty-three enzyme families previously uninvestigated in the context of natural product biosynthesis in Actinobacteria, the most proficient producers of natural products. Our genome evolutionary analyses were based on the assumption that expanded, repurposed enzyme families from central metabolism occur frequently and thus have the potential to catalyze new conversions in the context of natural products biosynthesis. Our analyses led to the discovery of biosynthetic gene clusters coding for hidden chemical diversity, as validated by comparing our predictions with those from state-of-the-art genome mining tools, as well as by experimentally demonstrating the existence of a biosynthetic pathway for arseno-organic metabolites in Streptomyces coelicolor and Streptomyces lividans, using a combined gene knockout and metabolite profile strategy. As our approach does not rely solely on sequence similarity searches of previously identified biosynthetic enzymes, these results establish the basis for the development of an evolutionary-driven genome mining tool termed EvoMining that complements current platforms. We anticipate that by doing so real 'chemical dark matter' will be unveiled. PMID:27289100

  16. Gene discovery in the hamster: a comparative genomics approach for gene annotation by sequencing of hamster testis cDNAs

    Directory of Open Access Journals (Sweden)

    Khan Shafiq A

    2003-06-01

    Full Text Available Abstract Background Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences combined with direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish) genomes. Results 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes and expression of these genes in testis was confirmed by Northern blotting. The genomic locations of each sequence were mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells.

  17. Pattern Discovery using Fuzzy FP-growth Algorithm from Gene Expression Data

    OpenAIRE

    Sabita Barik; Debahuti Mishra; Shruti Mishra; Sandeep Ku. Satapathy; Amiya Ku. Rath; Milu Acharya

    2010-01-01

    Abstract: The goal of microarray experiments is to identify genes that are differentially transcribed with respect to different biological conditions of cell cultures and samples. Hence, methods of data analysis such as clustering, classification and prediction need to be carefully evaluated. In this paper, we have proposed an efficient frequent pattern based clustering to find the gene which forms frequent patterns showing similar phenotypes leading to specific symptoms for specific disease...

  18. ESTs from a wild Arachis species for gene discovery and marker development

    Directory of Open Access Journals (Sweden)

    da Silva Felipe R

    2007-02-01

    Full Text Available Abstract Background Due to its origin, peanut has a very narrow genetic background. Wild relatives can be a source of genetic variability for cultivated peanut. In this study, the transcriptome of the wild species Arachis stenosperma accession V10309 was analyzed. Results ESTs were produced from four cDNA libraries of RNAs extracted from leaves and roots of A. stenosperma. Randomly selected cDNA clones were sequenced to generate 8,785 ESTs, of which 6,264 (71.3%) had high quality, with 3,500 clusters: 963 contigs and 2,537 singlets. Only 55.9% matched homologous sequences of known genes. ESTs were classified into 23 different categories according to putative protein functions. Numerous sequences related to disease resistance, drought tolerance and human health were identified. Two hundred and six microsatellites were found and markers have been developed for 188 of these. The microsatellite profile was analyzed and compared to other transcribed and genomic sequence data. Conclusion This is, to date, the first report on the analysis of transcriptome of a wild relative of peanut. The ESTs produced in this study are a valuable resource for gene discovery, the characterization of new wild alleles, and for marker development. The ESTs were released in the [GenBank:EH041934 to EH048197].

  19. Wi-Fi Protocol Vulnerability Discovery Based on Fuzzy Testing

    Directory of Open Access Journals (Sweden)

    Kunhua Zhu

    2013-08-01

    Full Text Available To detect whether wireless network equipment has protocol vulnerabilities, a new fuzzy test framework suitable for Wi-Fi protocol vulnerability discovery was designed and implemented using a modular approach. The framework is independent of the transmission medium; it produces malformed packets and carries out the attack on the target system. The author first describes wireless network protocol vulnerability discovery and fuzzy testing, then focuses on the technical scheme of the test framework and the details of its technical realization, and analyzes its application. In the experimental stage, the fuzzy test was applied to a wireless network gateway; the results show that the fuzzy test framework can be applied well to vulnerability mining in the protocols of wireless network equipment.
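
    The idea of producing malformed packets independently of the transmission medium can be sketched as follows. This is a minimal illustration, not the framework described in the abstract: the mutation operators, the `mutate` and `fuzz` names, and the `send` callback (which stands in for whatever transport delivers a frame to the target) are all assumptions made for the example.

```python
import random

# Boundary byte values that are commonly tried in fuzzing (an assumption here).
INTERESTING_BYTES = (0x00, 0x7F, 0x80, 0xFF)


def mutate(frame, rng, max_mutations=4):
    """Return a malformed copy of `frame` after a few random byte-level mutations."""
    data = bytearray(frame)
    for _ in range(rng.randint(1, max_mutations)):
        op = rng.choice(("flip", "insert", "delete", "overwrite"))
        pos = rng.randrange(len(data)) if data else 0
        if op == "flip" and data:
            data[pos] ^= 1 << rng.randrange(8)       # flip one bit
        elif op == "insert":
            junk = bytes(rng.randrange(256) for _ in range(rng.randint(1, 8)))
            data[pos:pos] = junk                      # insert random bytes
        elif op == "delete" and len(data) > 1:
            del data[pos]                             # drop one byte
        elif op == "overwrite" and data:
            data[pos] = rng.choice(INTERESTING_BYTES)
    return bytes(data)


def fuzz(seed_frame, send, rounds=100, seed=0):
    """Generate `rounds` malformed frames and hand each one to `send`.

    `send` is any callable taking bytes; keeping it abstract is what makes
    the generator independent of the transmission medium.
    """
    rng = random.Random(seed)
    sent = []
    for _ in range(rounds):
        packet = mutate(seed_frame, rng)
        send(packet)
        sent.append(packet)
    return sent
```

    Passing a list's `append` method as `send` records the generated frames without touching any network, which is a convenient way to inspect what would be transmitted.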

  20. Mobility Prediction Based Neighborhood Discovery in Mobile Ad Hoc Networks.

    OpenAIRE

    Li, Xu; Mitton, Nathalie; Simplot-Ryl, David

    2011-01-01

    Hello protocol is the basic technique for neighborhood discovery in wireless ad hoc networks. It requires nodes to claim their existence/ aliveness by periodic 'hello' messages. Central to a hello protocol is the determination of 'hello' message transmission rate. No fixed optimal rate exists in the presence of node mobility. The rate should in fact adapt to it, high for high mobility and low for low mobility. In this paper, we propose a novel mobility prediction bas...

  1. Location-based Service Discovery and Delivery in Opportunistic Networks

    OpenAIRE

    Le Sommer, Nicolas; Ben Sassi, Salma

    2010-01-01

    Opportunistic networks are usually formed spontaneously by mobile devices equipped with short range wireless communication interfaces. Designing and implementing a routing protocol to support both service discovery and delivery in such kinds of networks is a challenging problem on account of frequent disconnections and topology changes. In these networks one of the most important issues is the selection of the best intermediate node(s) to forward the messages towards their destination(...

  2. Estimation of false discovery rates in multiple testing: application to gene microarray data.

    Science.gov (United States)

    Tsai, Chen-An; Hsueh, Huey-miin; Chen, James J

    2003-12-01

    Testing for significance with gene expression data from DNA microarray experiments involves simultaneous comparisons of hundreds or thousands of genes. If R denotes the number of rejections (declared significant genes) and V denotes the number of false rejections, then V/R, if R > 0, is the proportion of false rejected hypotheses. This paper proposes a model for the distribution of the number of rejections and the conditional distribution of V given R, V | R. Under the independence assumption, the distribution of R is a convolution of two binomials and the distribution of V | R has a noncentral hypergeometric distribution. Under an equicorrelated model, the distributions are more complex and are also derived. Five false discovery rate probability error measures are considered: FDR = E(V/R), pFDR = E(V/R | R > 0) (positive FDR), cFDR = E(V/R | R = r) (conditional FDR), mFDR = E(V)/E(R) (marginal FDR), and eFDR = E(V)/r (empirical FDR). The pFDR, cFDR, and mFDR are shown to be equivalent under the Bayesian framework, in which the number of true null hypotheses is modeled as a random variable. We present a parametric and a bootstrap procedure to estimate the FDRs. Monte Carlo simulations were conducted to evaluate the performance of these two methods. The bootstrap procedure appears to perform reasonably well, even when the alternative hypotheses are correlated (rho = .25). An example from a toxicogenomic microarray experiment is presented for illustration. PMID:14969487
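
    The error measures defined above can be illustrated with a small Monte Carlo sketch under the independence assumption, where V and the number of true rejections are independent binomials. This is not the paper's parametric or bootstrap procedure; the function name, the default values of `m`, `m0`, `alpha`, and `power`, and the convention V/R = 0 when R = 0 for the unconditional FDR are assumptions chosen for the example.

```python
import random


def simulate_fdr(m=200, m0=160, alpha=0.05, power=0.6, n_sim=2000, seed=0):
    """Monte Carlo estimates of FDR, pFDR and mFDR under independence.

    Each of the m0 true nulls is rejected with probability `alpha`
    (these are the false rejections, V); each of the m - m0 true
    alternatives is rejected with probability `power`. R is their sum,
    i.e. a convolution of two binomials.
    """
    rng = random.Random(seed)
    sum_ratio = 0.0   # sum of V/R, taking V/R = 0 when R = 0
    sum_v = 0
    sum_r = 0
    n_positive = 0    # number of simulations with R > 0
    for _ in range(n_sim):
        v = sum(rng.random() < alpha for _ in range(m0))
        s = sum(rng.random() < power for _ in range(m - m0))
        r = v + s
        sum_v += v
        sum_r += r
        if r > 0:
            sum_ratio += v / r
            n_positive += 1
    return {
        "FDR": sum_ratio / n_sim,                                          # E(V/R)
        "pFDR": sum_ratio / n_positive if n_positive else float("nan"),   # E(V/R | R > 0)
        "mFDR": sum_v / sum_r if sum_r else float("nan"),                 # E(V)/E(R)
    }
```

    With the default settings E(V) = 8 and E(R) = 32, so the mFDR estimate should be close to 0.25, and FDR can never exceed pFDR because FDR = pFDR x P(R > 0).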

  3. Topological and functional discovery in a gene coexpression meta-network of gastric cancer.

    Science.gov (United States)

    Aggarwal, Amit; Guo, Dong Li; Hoshida, Yujin; Yuen, Siu Tsan; Chu, Kent-Man; So, Samuel; Boussioutas, Alex; Chen, Xin; Bowtell, David; Aburatani, Hiroyuki; Leung, Suet Yi; Tan, Patrick

    2006-01-01

    Gastric cancer is a leading cause of global cancer mortality, but comparatively little is known about the cellular pathways regulating different aspects of the gastric cancer phenotype. To achieve a better understanding of gastric cancer at the levels of systems topology, functional modules, and constituent genes, we assembled and systematically analyzed a consensus gene coexpression meta-network of gastric cancer incorporating >300 tissue samples from four independent patient populations (the "gastrome"). We find that the gastrome exhibits a hierarchical scale-free architecture, with an internal structure comprising multiple deeply embedded modules associated with diverse cellular functions. Individual modules display distinct subtopologies, with some (cellular proliferation) being integrated within the primary network, and others (ribosomal biosynthesis) being relatively isolated. One module associated with intestinal differentiation exhibited a remarkably high degree of autonomy, raising the possibility that its specific topological features may contribute towards the frequent occurrence of intestinal metaplasia in gastric cancer. At the single-gene level, we discovered a novel conserved interaction between the PLA2G2A prognostic marker and the EphB2 receptor, and used tissue microarrays to validate the PLA2G2A/EphB2 association. Finally, because EphB2 is a known target of the Wnt signaling pathway, we tested and provide evidence that the Wnt pathway may also similarly regulate PLA2G2A. Many of these findings were not discernible by studying the single patient populations in isolation. Thus, besides enhancing our knowledge of gastric cancer, our results show the broad utility of applying meta-analytic approaches to genome-wide data for the purposes of biological discovery. PMID:16397236

  4. Biochemical genomics for gene discovery in benzylisoquinoline alkaloid biosynthesis in opium poppy and related species.

    Science.gov (United States)

    Dang, Thu Thuy T; Onoyovwi, Akpevwe; Farrow, Scott C; Facchini, Peter J

    2012-01-01

    Benzylisoquinoline alkaloids (BIAs) are a large, diverse group of ∼2500 specialized plant metabolites. Many BIAs display potent pharmacological activities, including the narcotic analgesics codeine and morphine, the vasodilator papaverine, the cough suppressant and potential anticancer drug noscapine, the antimicrobial agents sanguinarine and berberine, and the muscle relaxant (+)-tubocurarine. Opium poppy remains the sole commercial source for codeine, morphine, and a variety of semisynthetic drugs, including oxycodone and buprenorphine, derived primarily from the biosynthetic pathway intermediate thebaine. Recent advances in transcriptomics, proteomics, and metabolomics have created unprecedented opportunities for isolating and characterizing novel BIA biosynthetic genes. Here, we describe the application of next-generation sequencing and cDNA microarrays for selecting gene candidates based on comparative transcriptome analysis. We outline the basic mass spectrometric techniques to perform deep proteome and targeted metabolite analyses on BIA-producing plant tissues and provide methodologies for functionally characterizing biosynthetic gene candidates through in vitro enzyme assays and transient gene silencing in planta. PMID:22999177

  5. Mass Spectrometry-Based Biomarker Discovery: Toward a Global Proteome Index of Individuality

    Science.gov (United States)

    Hawkridge, Adam M.; Muddiman, David C.

    2009-07-01

    Biomarker discovery and proteomics have become synonymous with mass spectrometry in recent years. Although this conflation is an injustice to the many essential biomolecular techniques widely used in biomarker-discovery platforms, it underscores the power and potential of contemporary mass spectrometry. Numerous novel and powerful technologies have been developed around mass spectrometry, proteomics, and biomarker discovery over the past 20 years to globally study complex proteomes (e.g., plasma). However, very few large-scale longitudinal studies have been carried out using these platforms to establish the analytical variability relative to true biological variability. The purpose of this review is not to cover exhaustively the applications of mass spectrometry to biomarker discovery, but rather to discuss the analytical methods and strategies that have been developed for mass spectrometry-based biomarker-discovery platforms and to place them in the context of the many challenges and opportunities yet to be addressed.

  6. Natural genetic variation in cassava (Manihot esculenta Crantz) landraces as a tool for gene discovery

    International Nuclear Information System (INIS)

    Cassava landraces are the earliest form of the modern cultivars and represent the first step in cassava domestication. Our forward genetic analysis uses this resource to discover spontaneous mutations in sucrose/starch and carotenoid synthesis/accumulation and to develop both evolutionary and breeding perspectives on gene function related to those traits. Biochemical phenotype variants for the synthesis and accumulation of carotenoids, free sugar and starch were identified. Six subtractive cDNA libraries were prepared to construct a high-quality (phred > 20) EST database with 1645 entries. Macroarray analysis was performed to identify differentially expressed genes, with the aim of identifying candidate genes related to the sugary phenotype. cDNA sequences of genes coding for specific enzymes in the two pathways were obtained. Expression of the genes coding for specific enzymes was analyzed by RNA blot and real-time PCR. Chromoplast-associated proteins of yellow storage roots were fractionated and a peptide sequence database with 906 entries (MASCOT validated) was constructed. For sucrose/starch metabolism, a sugary class of cassava was identified carrying mutations in BEI and GBSS. For pigmented cassava, a pink color phenotype showed absence of expression of the gene CasLYB, while an intense yellow phenotype showed down-regulation of the gene CasHYb. Heat shock proteins were identified as the major proteins associated with the chromoplast. Analysis of genetic diversity for the GBSS gene in the natural population identified 22 haplotypes and large nucleotide diversity in four subsets of the population. Single segregating populations derived from F2, half-sib and S1 populations showed segregation for the sugary phenotype (93% of the individuals), waxy phenotype (38% of the individuals) and glycogen-like starch (2% of the individuals). Here we summarize our current results for the genetic analysis of these variants and recent progress in the direction of mapping of

  7. PaGenBase: a pattern gene database for the global and dynamic understanding of gene function.

    Directory of Open Access Journals (Sweden)

    Jian-Bo Pan

    Full Text Available Pattern genes are a group of genes that have a modularized expression behavior under serial physiological conditions. The identification of pattern genes will provide a path toward a global and dynamic understanding of gene functions and their roles in particular biological processes or events, such as development and pathogenesis. In this study, we present PaGenBase, a novel repository for the collection of tissue- and time-specific pattern genes, including specific genes, selective genes, housekeeping genes and repressed genes. The PaGenBase database is now freely accessible at http://bioinf.xmu.edu.cn/PaGenBase/. In the current version (PaGenBase 1.0), the database contains 906,599 pattern genes derived from the literature or from data mining of more than 1,145,277 gene expression profiles in 1,062 distinct samples collected from 11 model organisms. Four statistical parameters were used to quantitatively evaluate the pattern genes. Moreover, three methods (quick search, advanced search and browse) were designed for rapid and customized data retrieval. The potential applications of PaGenBase are also briefly described. In summary, PaGenBase will serve as a resource for the global and dynamic understanding of gene function and will facilitate high-level investigations in a variety of fields, including the study of development, pathogenesis and novel drug discovery.

  8. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    Directory of Open Access Journals (Sweden)

    Landfors Mattias

    2010-10-01

    background correction is preferable, in particular if the gene selection is successful. However, this is an area that needs to be studied further in order to draw any general conclusions. Conclusions The choice of cluster analysis, and in particular gene selection, has a large impact on the ability to cluster individuals correctly based on expression profiles. Normalization has a positive effect, but the relative performance of different normalizations is an area that needs more research. In summary, although clustering, gene selection and normalization are considered standard methods in bioinformatics, our comprehensive analysis shows that selecting the right methods, and the right combinations of methods, is far from trivial and that much is still unexplored in what is considered to be the most basic analysis of genomic data.

  9. Dispom: a discriminative de-novo motif discovery tool based on the jstacs library.

    Science.gov (United States)

    Grau, Jan; Keilwagen, Jens; Gohr, André; Paponov, Ivan A; Posch, Stefan; Seifert, Michael; Strickert, Marc; Grosse, Ivo

    2013-02-01

    DNA-binding proteins are a main component of gene regulation as they activate or repress gene expression by binding to specific binding sites in target regions of genomic DNA. However, de-novo discovery of these binding sites in target regions obtained by wet-lab experiments is a challenging problem in computational biology, which has not yet been solved satisfactorily. Here, we present a detailed description and analysis of the de-novo motif discovery tool Dispom, which has been developed for finding binding sites of DNA-binding proteins that are differentially abundant in a set of target regions compared to a set of control regions. Two additional features of Dispom are its capability of modeling positional preferences of binding sites and adjusting the length of the motif in the learning process. Dispom yields an increased prediction accuracy compared to existing tools for de-novo motif discovery, suggesting that the combination of searching for differentially abundant motifs, inferring their positional distributions, and adjusting the motif lengths is beneficial for de-novo motif discovery. When applying Dispom to promoters of auxin-responsive genes and those of ABI3 target genes from Arabidopsis thaliana, we identify relevant binding motifs with pronounced positional distributions. These results suggest that learning motifs, their positional distributions, and their lengths by a discriminative learning principle may aid motif discovery from ChIP-chip and gene expression data. We make Dispom freely available as part of Jstacs, an open-source Java library that is tailored to statistical sequence analysis. To facilitate extensions of Dispom, we describe its implementation using Jstacs in this manuscript. In addition, we provide a stand-alone application of Dispom at http://www.jstacs.de/index.php/Dispom for instant use. PMID:23427988

  10. Human Gene Discovery Laboratory: A Problem-Based Learning Experience

    Science.gov (United States)

    Bonds, Wesley D., Sr.; Paolella, Mary Jane

    2006-01-01

    A single-semester elective combines Mendelian and molecular genetics in a problem-solving format. Students encounter a genetic disease scenario, construct a family pedigree, and try to confirm their medical diagnoses through laboratory experiences. Encouraged to generate ideas as they test their hypotheses, students realize the importance of data…

  11. Paradigm of tunable clustering using Binarization of Consensus Partition Matrices (Bi-CoPaM) for gene discovery.

    Directory of Open Access Journals (Sweden)

    Basel Abu-Jamous

    Full Text Available Clustering analysis has a growing role in the study of co-expressed genes for gene discovery. Conventional binary and fuzzy clustering do not embrace the biological reality that some genes may be irrelevant for a problem and not be assigned to a cluster, while other genes may participate in several biological functions and should simultaneously belong to multiple clusters. Also, these algorithms cannot generate tight clusters that focus on their cores or wide clusters that overlap and contain all possibly relevant genes. In this paper, a new clustering paradigm is proposed. In this paradigm, all three eventualities of a gene being exclusively assigned to a single cluster, being assigned to multiple clusters, and being not assigned to any cluster are possible. These possibilities are realised through the primary novelty of the introduction of tunable binarization techniques. Results from multiple clustering experiments are aggregated to generate one fuzzy consensus partition matrix (CoPaM), which is then binarized to obtain the final binary partitions. This is referred to as Binarization of Consensus Partition Matrices (Bi-CoPaM). The method has been tested with a set of synthetic datasets and a set of five real yeast cell-cycle datasets. The results demonstrate its validity in generating relevant tight, wide, and complementary clusters that can meet requirements of different gene discovery studies.
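
    The aggregate-then-binarize idea can be sketched as follows. This is only an illustration under stated assumptions: the input partitions are taken to be already aligned (row c means the same cluster in every experiment), the consensus is a plain element-wise average, and a single value threshold stands in for the tunable binarization techniques, whose exact definitions the abstract does not give. The function names are hypothetical.

```python
def consensus_partition_matrix(partitions):
    """Average aligned partition matrices (clusters x genes) element-wise.

    Each partition is a list of rows, one per cluster; entry [c][j] is the
    membership of gene j in cluster c (0/1 for binary clusterings).
    """
    n = len(partitions)
    n_clusters = len(partitions[0])
    n_genes = len(partitions[0][0])
    return [
        [sum(p[c][j] for p in partitions) / n for j in range(n_genes)]
        for c in range(n_clusters)
    ]


def binarize_by_threshold(copam, threshold):
    """Assign gene j to cluster c when its consensus membership >= threshold.

    A high threshold gives tight clusters and may leave genes unassigned;
    a low threshold gives wide clusters that may overlap.
    """
    return [[1 if u >= threshold else 0 for u in row] for row in copam]
```

    For three clusterings of four genes into two clusters, a threshold of 1.0 keeps only genes on which all clusterings agree, 0.5 gives a majority vote, and a low value such as 0.3 lets disputed genes belong to both clusters at once.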

  12. Gene invasion in distant eukaryotic lineages: discovery of mutually exclusive genetic elements reveals marine biodiversity.

    Science.gov (United States)

    Monier, Adam; Sudek, Sebastian; Fast, Naomi M; Worden, Alexandra Z

    2013-09-01

    Inteins are rare, translated genetic parasites mainly found in bacteria and archaea, while spliceosomal introns are distinctly eukaryotic features abundant in most nuclear genomes. Using targeted metagenomics, we discovered an intein in an Atlantic population of the photosynthetic eukaryote, Bathycoccus, harbored by the essential spliceosomal protein PRP8 (processing factor 8 protein). Although previously thought exclusive to fungi, we also identified PRP8 inteins in parasitic (Capsaspora) and predatory (Salpingoeca) protists. Most new PRP8 inteins were at novel insertion sites that, surprisingly, were not in the most conserved regions of the gene. Evolutionarily, Dikarya fungal inteins at PRP8 insertion site a appeared more related to the Bathycoccus intein at a unique insertion site, than to other fungal and opisthokont inteins. Strikingly, independent analyses of Pacific and Atlantic samples revealed an intron at the same codon as the Bathycoccus PRP8 intein. The two elements are mutually exclusive and neither was found in cultured Bathycoccus or other picoprasinophyte genomes. Thus, wild Bathycoccus contain one of few non-fungal eukaryotic inteins known and a rare polymorphic intron. Our data indicate at least two Bathycoccus ecotypes exist, associated respectively with oceanic or mesotrophic environments. We hypothesize that intein propagation is facilitated by marine viruses; and, while intron gain is still poorly understood, presence of a spliceosomal intron where a locus lacks an intein raises the possibility of new, intein-primed mechanisms for intron gain. The discovery of nucleus-encoded inteins and associated sequence polymorphisms in uncultivated marine eukaryotes highlights their diversity and reveals potential sexual boundaries between populations indistinguishable by common marker genes. PMID:23635865

  13. Natural and man-made V-gene repertoires for antibody discovery.

    Science.gov (United States)

    Finlay, William J J; Almagro, Juan C

    2012-01-01

    Antibodies are the fastest-growing segment of the biologics market. The success of antibody-based drugs resides in their exquisite specificity, high potency, stability, solubility, safety, and relatively inexpensive manufacturing process in comparison with other biologics. We outline here the structural studies and fundamental principles that define how antibodies interact with diverse targets. We also describe the antibody repertoires and affinity maturation mechanisms of humans, mice, and chickens, plus the use of novel single-domain antibodies in camelids and sharks. These species all utilize diverse evolutionary solutions to generate specific and high affinity antibodies and illustrate the plasticity of natural antibody repertoires. In addition, we discuss the multiple variations of man-made antibody repertoires designed and validated in the last two decades, which have served as tools to explore how the size, diversity, and composition of a repertoire impact the antibody discovery process. PMID:23162556

  14. Discovery and characterization of nutritionally regulated genes associated with muscle growth in Atlantic salmon.

    Science.gov (United States)

    Bower, Neil I; Johnston, Ian A

    2010-10-01

    A genomics approach was used to identify nutritionally regulated genes involved in growth of fast skeletal muscle in Atlantic salmon (Salmo salar L.). Forward and reverse subtractive cDNA libraries were prepared comparing fish with zero growth rates to fish growing rapidly. We produced 7,420 ESTs and assembled them into nonredundant clusters prior to annotation. Contigs representing 40 potentially unrecognized nutritionally responsive candidate genes were identified. Twenty-three of the subtractive library candidates were also differentially regulated by nutritional state in an independent fasting-refeeding experiment and their expression placed in the context of 26 genes with established roles in muscle growth regulation. The expression of these genes was also determined during the maturation of a primary myocyte culture, identifying 13 candidates from the subtractive cDNA libraries with putative roles in the myogenic program. During early stages of refeeding DNAJA4, HSPA1B, HSP90A, and CHAC1 expression increased, indicating activation of unfolded protein response pathways. Four genes were considered inhibitory to myogenesis based on their in vivo and in vitro expression profiles (CEBPD, ASB2, HSP30, novel transcript GE623928). Other genes showed increased expression with feeding and highest in vitro expression during the proliferative phase of the culture (FOXD1, DRG1) or as cells differentiated (SMYD1, RTN1, MID1IP1, HSP90A, novel transcript GE617747). The genes identified were associated with chromatin modification (SMYD1, RTN1), microtubule stabilization (MID1IP1), cell cycle regulation (FOXD1, CEBPD, DRG1), and negative regulation of signaling (ASB2) and may play a role in the stimulation of myogenesis during the transition from a catabolic to anabolic state in skeletal muscle. PMID:20663983

  15. Location Discovery Based on Fuzzy Geometry in Passive Sensor Networks

    Directory of Open Access Journals (Sweden)

    Rui Wang

    2011-01-01

    Full Text Available Location discovery with uncertainty using passive sensor networks in the nation's power grid is known to be challenging, due to the massive scale and inherent complexity. For bearings-only target localization in passive sensor networks, the approach of fuzzy geometry is introduced to investigate the fuzzy measurability for a moving target in R^2 space. The fuzzy analytical bias expressions and the geometrical constraints are derived for bearings-only target localization. The interplay between the fuzzy geometry of target localization and the fuzzy estimation bias for the case of a fuzzy linear observer trajectory is analyzed in detail in sensor networks, which can realize 3-dimensional localization, including fuzzy estimates of the position and velocity of the target, by measuring the fuzzy azimuth angles at fixed time intervals. Simulation results show that the resulting position estimate outperforms the traditional least squares approach for localization with uncertainty.

  16. A Middleware Support for Location-Based Service Discovery and Invocation in Disconnected MANETs

    OpenAIRE

    Le Sommer, Nicolas; Ben Sassi, Salma; Guidec, Frédéric; Mahéo, Yves

    2010-01-01

    Disconnected MANETs show a changing topology and a fragmentation in distinct communication islands. In such networks, service discovery and invocation rely preferably on protocols that can support connectivity disruptions such as opportunistic protocols based on the store, carry, and forward principle. In this paper we present a middleware platform for geolocated services in disconnected MANETs. We detail in particular the location methods we used, the facilities for service discovery, select...

  17. Identification of fever and vaccine-associated gene interaction networks using ontology-based literature mining

    OpenAIRE

    Hur, Junguk; Özgür, Arzucan; Xiang, Zuoshuang; He, Yongqun

    2012-01-01

    Background Fever is one of the most common adverse events of vaccines. The detailed mechanisms of fever and vaccine-associated gene interaction networks are not fully understood. In the present study, we employed a genome-wide, Centrality and Ontology-based Network Discovery using Literature data (CONDL) approach to analyse the genes and gene interaction networks associated with fever or vaccine-related fever responses. Results Over 170,000 fever-related articles from PubMed abstracts and tit...

  18. De novo Assembly and Characterization of the Transcriptome of Broomcorn Millet (Panicum miliaceum L.) for Gene Discovery and Marker Development.

    Science.gov (United States)

    Yue, Hong; Wang, Le; Liu, Hui; Yue, Wenjie; Du, Xianghong; Song, Weining; Nie, Xiaojun

    2016-01-01

    Broomcorn millet (Panicum miliaceum L.) is one of the world's oldest cultivated cereals, which is well-adapted to extreme environments such as drought, heat, and salinity with an efficient C4 carbon fixation. Discovery and identification of genes involved in these processes will provide valuable information to improve the crop for meeting the challenge of global climate change. However, the lack of genetic resources and genomic information makes gene discovery and molecular mechanism studies very difficult. Here, we sequenced and assembled the transcriptome of broomcorn millet using Illumina sequencing technology. After sequencing, a total of 45,406,730 and 51,160,820 clean paired-end reads were obtained for two genotypes Yumi No. 2 and Yumi No. 3. These reads were mixed and then assembled into 113,643 unigenes, with the length ranging from 351 to 15,691 bp, of which 62,543 contigs could be assigned to 315 gene ontology (GO) categories. Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses mapped 51,020 unigenes to 25 COG categories and 15,514 unigenes into 202 KEGG pathways, respectively. Furthermore, 35,216 simple sequence repeats (SSRs) were identified in 27,055 unigene sequences, of which trinucleotides were the most abundant repeat unit, accounting for 66.72% of SSRs. In addition, 292 differentially expressed genes were identified between the two genotypes, which were significantly enriched in 88 GO terms and 12 KEGG pathways. Finally, the expression patterns of four selected transcripts were validated through quantitative reverse transcription polymerase chain reaction analysis. Our study for the first time sequenced and assembled the transcriptome of broomcorn millet, which not only provided a rich sequence resource for gene discovery and marker development in this important crop, but will also facilitate the further investigation of the molecular mechanism of its favored agronomic traits and beyond. PMID
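
    Identifying simple sequence repeats in assembled sequences can be sketched with a regular expression scan. The abstract does not state which SSR tool or thresholds were used, so the function name and the minimum repeat counts per motif length below are hypothetical, and only perfect (uninterrupted) repeats with 2 to 6 bp motifs are reported.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    if min_repeats is None:
        # Hypothetical minimum repeat counts, keyed by motif length in bp.
        min_repeats = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A k-mer followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit ("AAA", "ATAT"),
            # so that each repeat is reported under its shortest motif only.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)
```

    Counting the hits by motif length then gives the kind of breakdown quoted above, for example the share of trinucleotide repeats among all SSRs found.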

  19. An Integrated Approach to Gene Discovery and Marker Development in Atlantic Cod (Gadus morhua)

    OpenAIRE

    Bowman, Sharen; Hubert, Sophie; Higgins, Brent; Stone, Cynthia; Kimball, Jennifer; Borza, Tudor; Bussey, Jillian Tarrant; Simpson, Gary; Kozera, Catherine; Curtis, Bruce A.; Hall, Jennifer R.; Hori, Tiago S.; Feng, Charles Y.; Rise, Marlies; Booman, Marije

    2010-01-01

    Atlantic cod is a species that has been overexploited by the capture fishery. Programs to domesticate this species are underway in several countries, including Canada, to provide an alternative route for production. Selective breeding programs have been successfully applied in the domestication of other species, with genomics-based approaches used to augment conventional methods of animal production in recent years. Genomics tools, such as gene sequences and sets of variable markers, also hav...

  20. De novo transcriptomic analysis of peripheral blood lymphocytes from the Chinese goose: gene discovery and immune system pathway description.

    Directory of Open Access Journals (Sweden)

    Mansoor Tariq

    Full Text Available The Chinese goose is one of the most economically important poultry birds and is a natural reservoir for many avian viruses. However, the nature and regulation of the innate and adaptive immune systems of this waterfowl species are not completely understood due to limited information on the goose genome. Recently, transcriptome sequencing technology was applied in genomic studies focused on novel gene discovery. Thus, this study described the transcriptome of the goose peripheral blood lymphocytes to identify immunity relevant genes. The transcriptome of goose peripheral blood lymphocytes was sequenced by Illumina-Solexa technology and assembled de novo. In total, 211,198 unigenes were assembled from the 69.36 million cleaned reads. The average length, N50 size and the maximum length of the assembled unigenes were 687 bp, 1,298 bp and 18,992 bp, respectively. A total of 36,854 unigenes showed similarity by BLAST search against the NCBI non-redundant (Nr) protein database. For functional classification, 163,161 unigenes were assigned to three Gene Ontology (GO) categories and 67 subcategories. A total of 15,334 unigenes were annotated into 25 eukaryotic orthologous groups (KOG) categories. The Kyoto Encyclopedia of Genes and Genomes (KEGG) database annotated 39,585 unigenes into six biological functional groups and 308 pathways. Among the 2,757 unigenes that participated in the 15 immune system KEGG pathways, 125 of the most important immune relevant genes were summarized and analyzed by STRING analysis to identify gene interactions and relationships. Moreover, 10 genes were confirmed by PCR and analyzed. Of these 125 unigenes, 109 unigenes, approximately 87%, were not previously identified in the goose. This de novo transcriptome analysis could provide important Chinese goose sequence information and highlights the value of new gene discovery, pathways investigation and immune system gene identification, and comparison with other avian species as useful
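
    The N50 size reported above alongside the average and maximum unigene lengths is an assembly statistic that is easy to compute from a list of sequence lengths. The sketch below uses the common definition (the length of the shortest sequence in the smallest set of longest sequences that together hold at least half of all assembled bases); the function name is an assumption, and the abstract does not say which tool produced its value.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half of the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")
```

    For lengths 2 through 10 the mean is 6 but the N50 is 8, which shows why N50 is usually larger than the average length when a few long sequences carry much of the assembly.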

  1. Estimating the False Discovery Rate Using Mixed Normal Distribution for Identifying Differentially Expressed Genes in Microarray Data Analysis

    Directory of Open Access Journals (Sweden)

    Chikuma Hamada

    2007-01-01

    Full Text Available The recent development of DNA microarray technology allows us to measure simultaneously the expression levels of thousands of genes and to identify genes truly correlated with anticancer drug response (differentially expressed genes) from many candidate genes. Significance Analysis of Microarray (SAM) is often used to estimate the false discovery rate (FDR), which is an index for optimizing the identifiability of differentially expressed genes, while the accuracy of the estimated FDR by SAM is not necessarily confirmed. We propose a new method for estimating the FDR assuming a mixed normal distribution on the test statistic and examine the performance of the proposed method and SAM using simulated data. The simulation results indicate that the accuracy of the estimated FDR by the proposed method and SAM varied depending on the experimental conditions. We applied both methods to actual data comprised of expression levels of 12,625 genes of 10 responders and 14 non-responders to docetaxel for breast cancer. The proposed method identified 280 differentially expressed genes correlated with docetaxel response using a cut-off value for achieving FDR <0.01 to prevent false-positive genes, although 92 genes were previously thought to be correlated with docetaxel response.
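
    How a mixed normal model on the test statistic yields an FDR for a given cut-off can be sketched as follows. This is an illustration, not the authors' estimator: it assumes a two-component mixture with a standard normal null component, one-sided rejection above the cut-off, and mixture parameters (`pi0`, `mu1`, `sigma1`) that are already known rather than estimated from data. All names are hypothetical.

```python
import math


def upper_tail(x, mu=0.0, sigma=1.0):
    """P(Z > x) for Z ~ Normal(mu, sigma^2), via erfc for far-tail accuracy."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))


def mixture_fdr(cutoff, pi0, mu1, sigma1):
    """FDR when rejecting every gene whose statistic exceeds `cutoff`.

    The statistic is modelled as pi0 * N(0, 1) + (1 - pi0) * N(mu1, sigma1^2);
    the FDR is the null share of the probability mass above the cut-off.
    """
    null_mass = pi0 * upper_tail(cutoff)
    alt_mass = (1.0 - pi0) * upper_tail(cutoff, mu1, sigma1)
    return null_mass / (null_mass + alt_mass)


def cutoff_for_fdr(target, pi0, mu1, sigma1, step=0.01, max_cutoff=10.0):
    """Smallest cut-off on a grid whose model FDR is below `target`, else None."""
    for i in range(int(max_cutoff / step) + 1):
        c = i * step
        if mixture_fdr(c, pi0, mu1, sigma1) < target:
            return c
    return None
```

    With 90% null genes and an alternative component centred at 3, a cut-off of 2 gives a model FDR of roughly 0.20, and the grid search shows how much stricter the cut-off must be to reach a target such as FDR < 0.01.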

  2. Discovery of tetrahydroisoquinoline-based CXCR4 antagonists.

    Science.gov (United States)

    Truax, Valarie M; Zhao, Huanyu; Katzman, Brooke M; Prosser, Anthony R; Alcaraz, Ana A; Saindane, Manohar T; Howard, Randy B; Culver, Deborah; Arrendale, Richard F; Gruddanti, Prahbakar R; Evers, Taylor J; Natchus, Michael G; Snyder, James P; Liotta, Dennis C; Wilson, Lawrence J

    2013-11-14

    A de novo hit-to-lead effort involving the redesign of benzimidazole-containing antagonists of the CXCR4 receptor resulted in the discovery of a novel series of 1,2,3,4-tetrahydroisoquinoline (TIQ) analogues. In general, this series of compounds show good potencies (3-650 nM) in assays involving CXCR4 function, including both inhibition of attachment of X4 HIV-1IIIB virus in MAGI-CCR5/CXCR4 cells and inhibition of calcium release in Chem-1 cells. Series profiling permitted the identification of TIQ-(R)-stereoisomer 15 as a potent and selective CXCR4 antagonist lead candidate with a promising in vitro profile. The drug-like properties of 15 were determined in ADME in vitro studies, revealing low metabolic liability potential. Further in vivo evaluations included pharmacokinetic experiments in rats and mice, where 15 was shown to have oral bioavailability (F = 63%) and resulted in the mobilization of white blood cells (WBCs) in a dose-dependent manner. PMID:24936240

  3. Human transporter database: comprehensive knowledge and discovery tools in the human transporter genes.

    Directory of Open Access Journals (Sweden)

    Adam Y Ye

    Full Text Available Transporters are essential in homeostatic exchange of endogenous and exogenous substances at the systemic, organ, cellular, and subcellular levels. Gene mutations of transporters are often related to pharmacogenetic traits. Recent developments in high-throughput technologies in genomics, transcriptomics and proteomics allow in-depth studies of transporter genes in normal cellular processes and diverse disease conditions. The flood of high-throughput data has resulted in an urgent need for an updated knowledgebase with curated, organized, and annotated human transporters in an easily accessible form. Using a pipeline that combines automated keyword queries, sequence similarity search and manual curation of transporters, we collected 1,555 human non-redundant transporter genes to develop the Human Transporter Database (HTD) (http://htd.cbi.pku.edu.cn). Based on the extensive annotations, global properties of the transporter genes were illustrated, such as expression patterns and polymorphisms in relation to their ligands. We noted that the human transporters were enriched in many fundamental biological processes such as oxidative phosphorylation and cardiac muscle contraction, and significantly associated with Mendelian and complex diseases such as epilepsy and sudden infant death syndrome. Overall, HTD provides a well-organized interface that helps research communities search detailed molecular and genetic information on transporters for the development of personalized medicine.

  4. GOrilla: a tool for discovery and visualization of enriched GO terms in ranked gene lists

    Directory of Open Access Journals (Sweden)

    Steinfeld Israel

    2009-02-01

    Full Text Available Abstract Background Since the inception of the GO annotation project, a variety of tools have been developed that support exploring and searching the GO database. In particular, a variety of tools that perform GO enrichment analysis are currently available. Most of these tools require as input a target set of genes and a background set and seek enrichment in the target set compared to the background set. A few tools also exist that support analyzing ranked lists. The latter typically rely on simulations or on union-bound correction for assigning statistical significance to the results. Results GOrilla is a web-based application that identifies enriched GO terms in ranked lists of genes, without requiring the user to provide explicit target and background sets. This is particularly useful in many typical cases where genomic data may be naturally represented as a ranked list of genes (e.g. by level of expression or of differential expression). GOrilla employs a flexible threshold statistical approach to discover GO terms that are significantly enriched at the top of a ranked gene list. Building on a complete theoretical characterization of the underlying distribution, called mHG, GOrilla computes an exact p-value for the observed enrichment, taking threshold multiple testing into account without the need for simulations. This enables rigorous statistical analysis of thousands of genes and thousands of GO terms on the order of seconds. The output of the enrichment analysis is visualized as a hierarchical structure, providing a clear view of the relations between enriched GO terms. Conclusion GOrilla is an efficient GO analysis tool with unique features that make it a useful addition to the existing repertoire of GO enrichment tools. GOrilla's unique features and advantages over other threshold-free enrichment tools include rigorous statistics, fast running time and an effective graphical representation. 
GOrilla is publicly available at: http://cbl-gorilla.cs.technion.ac.il
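
    The flexible-threshold idea behind the mHG statistic can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not GOrilla's code: the ranked list is represented as a 0/1 vector (1 means the gene carries the GO term), and the function names are hypothetical. The sketch computes only the minimum hypergeometric score, that is, the smallest hypergeometric tail probability over all possible cut-offs. It does not compute the exact p-value that corrects for trying every cut-off, which is the part the abstract attributes to GOrilla.

```python
from math import comb


def hypergeom_tail(N, B, n, b):
    """P(X >= b) for X ~ Hypergeometric(population N, B successes, n draws)."""
    upper = min(n, B)
    favourable = sum(comb(B, i) * comb(N - B, n - i) for i in range(b, upper + 1))
    return favourable / comb(N, n)


def mhg_score(labels):
    """Minimum hypergeometric score of a ranked 0/1 label vector.

    labels[0] belongs to the top-ranked gene. For every cut-off n the
    enrichment of ones among the top n genes is scored with a
    hypergeometric tail, and the smallest tail probability is returned.
    """
    N = len(labels)
    B = sum(labels)
    best = 1.0
    ones_so_far = 0
    for n, flag in enumerate(labels, start=1):
        ones_so_far += flag
        best = min(best, hypergeom_tail(N, B, n, ones_so_far))
    return best


# Toy ranked list: all three annotated genes sit at the very top.
print(mhg_score([1, 1, 1, 0, 0, 0, 0, 0, 0, 0]))
# Same annotations pushed to the bottom of the ranking: no enrichment.
print(mhg_score([0, 0, 0, 0, 0, 0, 0, 1, 1, 1]))
```

    Because the score is a minimum over many cut-offs, it is smaller than an honest p-value, so it should be read only as a ranking statistic until a multiple-threshold correction is applied.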

  5. Using Concepts in Literature-based Discovery: Simulating Swanson's Raynaud-Fish Oil and Migraine-Magnesium Discoveries.

    Science.gov (United States)

    Weeber, Marc; Klein, Henny; de Jong-van den Berg, Lolkje T. W.; Vos, Rein

    2001-01-01

    Proposes a two-step model of discovery in which new scientific hypotheses can be generated and subsequently tested. Applying advanced natural language processing techniques to find biomedical concepts in text, the model is implemented in a versatile interactive discovery support tool. This tool is used to successfully simulate Don R. Swanson's…

  6. Genome wide prediction of protein function via a generic knowledge discovery approach based on evidence integration

    Directory of Open Access Journals (Sweden)

    Li Yinghui

    2006-05-01

    Full Text Available Abstract Background The automation of many common molecular biology techniques has resulted in the accumulation of vast quantities of experimental data. One of the major challenges now facing researchers is how to process this data to yield useful information about a biological system (e.g. knowledge of genes and their products, and the biological roles of proteins, their molecular functions, localizations and interaction networks). We present a technique called Global Mapping of Unknown Proteins (GMUP) which uses the Gene Ontology Index to relate diverse sources of experimental data by creation of an abstraction layer of evidence data. This abstraction layer is used as input to a neural network which, once trained, can be used to predict function from the evidence data of unannotated proteins. The method allows us to include in our evidence data almost any experimental data set related to protein function that incorporates the Gene Ontology, in order to seek relationships between the different sets. Results We have demonstrated the capabilities of this method in two ways. We first collected various experimental datasets associated with yeast (Saccharomyces cerevisiae) and applied the technique to a set of previously annotated open reading frames (ORFs). These ORFs were divided into training and test sets and were used to examine the accuracy of the predictions made by our method. Then we applied GMUP to previously un-annotated ORFs and made 1980, 836 and 1969 predictions corresponding to the GO Biological Process, Molecular Function and Cellular Component sub-categories respectively. We found that GMUP was particularly successful at predicting ORFs with functions associated with the ribonucleoprotein complex, protein metabolism and transportation. Conclusion This study presents a global and generic gene knowledge discovery approach based on evidence integration of various genome-scale data. It can be used to provide insight as to how certain

  7. Discovery of second gene for solid dark green versus light green rind pattern in watermelon.

    Science.gov (United States)

    Kumar, Rakesh; Wehner, Todd C

    2011-01-01

    The watermelon (Citrullus lanatus (Thunb.) Matsum. & Nakai var. lanatus) has high variability for fruit size, shape, rind pattern, and flesh color. This study was designed to measure the qualitative inheritance of rind phenotypes (solid dark green vs. light green). For each of the 2 families, "Mountain Hoosier" × "Minilee" and "Early Arizona" × "Minilee," 6 generations (P(a)S(1), P(b)S(1), F(1), F(2), BC(1)P(a), BC(1)P(b)) were developed. Each family was tested in summer 2008 in 3 environments in North Carolina. Phenotypic data were analyzed with the χ(2) method to test the segregation of Mendelian genes. Deviations from the expected segregation ratios based on hypothesized single dominant gene for solid dark green versus light green rind pattern were recorded, raising questions on the inheritance of this trait. Inheritance of solid dark green rind versus light (gray) rind showed duplicate dominant epistasis. Duplicate dominant epistasis gives rise to a 15:1 ratio (solid dark green:light rind pattern) in F(2) generation. When both the loci are homozygous recessive, we observe light rind pattern. The g-1 and g-2 genes were identified to control light green rind when in homozygous recessive form. PMID:21566001
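
    The segregation test described in this record can be illustrated with a small chi-square goodness-of-fit sketch. The F2 counts below are hypothetical and are not the study's data. The 15:1 expectation follows from duplicate dominant epistasis: each locus segregates 3:1 in the F2, so only plants homozygous recessive at both loci (1/4 × 1/4 = 1/16) show the light rind.

```python
def chi_square_gof(observed, ratio):
    """Chi-square statistic and expected counts for a hypothesised ratio."""
    total = sum(observed)
    parts = sum(ratio)
    expected = [total * r / parts for r in ratio]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected


# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CRITICAL_DF1_ALPHA_05 = 3.841

# Hypothetical F2 counts: solid dark green vs. light rind.
f2_counts = [148, 12]

for label, ratio in (("15:1 (two genes)", [15, 1]), ("3:1 (one gene)", [3, 1])):
    statistic, expected = chi_square_gof(f2_counts, ratio)
    verdict = "rejected" if statistic > CRITICAL_DF1_ALPHA_05 else "not rejected"
    print(label, "expected:", expected, "chi-square:", round(statistic, 3), verdict)
```

    With counts like these the single-gene 3:1 hypothesis is rejected while the 15:1 hypothesis is not, which mirrors the reasoning in the abstract for proposing a second gene.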

  8. Affinity-Based Screening Technology and HCV Drug Discovery

    Institute of Scientific and Technical Information of China (English)

    LI Bin

    2003-01-01

    NS5A is one of the non-structural gene products encoded by Hepatitis C virus (HCV) and related viruses that are essential for viral replication. The amino acid sequence of NS5A is conserved between different HCV genotypes and the primary amino acid sequence of NS5A is unique to HCV and closely related viruses. Importantly, NS5A is unrelated to any human protein. This indicates that drugs designed to block the actions of NS5A could inhibit the replication of HCV without showing toxic side effects in human host cells, thus making NS5A inhibitors ideal anti-viral drugs. However, there are presently no functional assays for this essential viral protein. Therefore, conventional high throughput screening (HTS) approaches cannot be used to discover antiviral drugs against NS5A.

  9. IMG-ABC: An Atlas of Biosynthetic Gene Clusters to Fuel the Discovery of Novel Secondary Metabolites

    Energy Technology Data Exchange (ETDEWEB)

    Chen, I-Min; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Huang, Jinghua; Reddy, T. B.K.; Cimermancic, Peter; Fischbach, Michael; Ivanova, Natalia; Markowitz, Victor; Kyrpides, Nikos; Pati, Amrita

    2014-10-28

    In the discovery of secondary metabolites (SMs), large-scale analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of relevant computational resources. We present IMG-ABC (https://img.jgi.doe.gov/abc/) -- An Atlas of Biosynthetic gene Clusters within the Integrated Microbial Genomes (IMG) system. IMG-ABC is a rich repository of both validated and predicted biosynthetic clusters (BCs) in cultured isolates, single-cells and metagenomes linked with the SM chemicals they produce and enhanced with focused analysis tools within IMG. The underlying scalable framework enables traversal of phylogenetic dark matter and chemical structure space -- serving as a doorway to a new era in the discovery of novel molecules.

  10. Novel RNA-based Strategies for Therapeutic Gene Silencing

    OpenAIRE

    Sibley, Christopher R.; Seow, Yiqi; Wood, Matthew JA

    2010-01-01

    The past decade has seen intense scientific interest in non-coding RNAs. In particular, the discovery and subsequent exploitation of gene silencing via RNA interference (RNAi) has revolutionized the way in which gene expression is now studied and understood. It is now well established that post-transcriptional gene silencing (PTGS) by the microRNA (miRNA) and other RNAi-associated pathways represents an essential layer of complexity to gene regulation. Gene silencing using RNAi additionally d...

  11. New construction for expert system based on innovative knowledge discovery technology

    Institute of Scientific and Technical Information of China (English)

    YANG BingRu; SONG Wei; XU ZhangYan

    2007-01-01

    Knowledge acquisition is the bottleneck of expert systems. To solve this problem, KD (D&K), a comprehensive knowledge discovery process model in which a database and a knowledge base cooperate, and its related technology are proposed. Based on KD (D&K) and the related technology, a new construction of an Expert System based on Knowledge Discovery (ESKD) is then proposed. As the key knowledge acquisition component of ESKD, KD (D&K) is composed of KDD* and KDK*. KDD*, the new process model based on a double-bases cooperating mechanism, and KDK*, the new process model based on a double-basis fusion mechanism, are introduced in turn. The overall framework of ESKD is presented, and some sub-systems and the dynamic knowledge base system are discussed. Finally, the effectiveness and advantages of ESKD are tested on a real-world agriculture database. We hope that ESKD may be useful for the new generation of expert systems.

  12. Discovery of technical methanation catalysts based on computational screening

    DEFF Research Database (Denmark)

    Sehested, Jens; Larsen, Kasper Emil; Kustov, Arkadii;

    2007-01-01

    Methanation is a classical reaction in heterogeneous catalysis and significant effort has been put into improving the industrially preferred nickel-based catalysts. Recently, a computational screening study showed that nickel-iron alloys should be more active than the pure nickel catalyst and at ...

  13. Meiosis-specific gene discovery in plants: RNA-Seq applied to isolated Arabidopsis male meiocytes

    Directory of Open Access Journals (Sweden)

    May Gregory D

    2010-12-01

    Full Text Available Abstract Background Meiosis is a critical process in the reproduction and life cycle of flowering plants in which homologous chromosomes pair, synapse, recombine and segregate. Understanding meiosis will not only advance our knowledge of the mechanisms of genetic recombination, but also has substantial applications in crop improvement. Despite the tremendous progress in the past decade in other model organisms (e.g., Saccharomyces cerevisiae and Drosophila melanogaster), the global identification of meiotic genes in flowering plants has remained a challenge due to the lack of efficient methods to collect pure meiocytes for analyzing the temporal and spatial gene expression patterns during meiosis, and for the sensitive identification and quantitation of novel genes. Results A high-throughput approach to identify meiosis-specific genes by combining isolated meiocytes, RNA-Seq, bioinformatic and statistical analysis pipelines was developed. By analyzing the studied genes that have a meiosis function, a pipeline for identifying meiosis-specific genes has been defined. More than 1,000 genes that are specifically or preferentially expressed in meiocytes have been identified as candidate meiosis-specific genes. A group of 55 genes that have mitochondrial genome origins and a significant number of transposable element (TE) genes (1,036) were also found to have up-regulated expression levels in meiocytes. Conclusion These findings advance our understanding of meiotic genes, gene expression and regulation, especially the transcript profiles of MGI genes and TE genes, and provide a framework for functional analysis of genes in meiosis.

  14. Agent-based decision making through intelligent knowledge discovery

    OpenAIRE

    Fernández Caballero, Antonio; Sokolova, Marina

    2008-01-01

    Monitoring the negative effects of urban pollution and making decisions in real time make it possible to clarify the consequences for human health. Large amounts of raw data describe this situation, and we apply intelligent agents to extract knowledge from it. Further modeling and simulation give new knowledge about the tendencies of situation development and about its structure. An agent-based decision support system can help to foresee possible ways of situation development and contribute to effect...

  15. User Preferential Model for Semantic Web Service Discovery Based on Concept Lattices

    Directory of Open Access Journals (Sweden)

    R. Mohan

    2013-08-01

    Full Text Available Web service discovery is developing as a conspicuous technology in both technical and business domains. Because of the swift growth in the number of web service registries and repositories, discovering web services based on a user request is a major concern. Though progressive and cutting-edge techniques have been proposed to address these challenges in conservative environments, discovery is still at a critical stage in the semantic scenario, particularly when quality attributes are included. The retrieved web services may at times not be desirable for the user, and the objective of user-desirable service discovery fails. This paper addresses the issue of discovering user-relevant services, reducing the semantic gap between the services requested and the services provided by using a lattice-based clustering technique. A service discovery model with a novel similarity measure is proposed to discover user-relevant services. To further enhance the performance of the proposed discovery model, a 2-Tier User Preference Model (UPM) is proposed to address the qualitative requirements of the service request. Tier I of the UPM qualifies the QoS parameters, whereas Tier II quantifies the qualified parameters. A test bed comprising about a thousand services from different domains and a model UDDI are implemented to evaluate the proposed techniques in terms of precision, recall and f-measure. The merits of the proposed techniques are demonstrated by means of improved user desirability of the discovered services.
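
    The three evaluation measures named in this record have standard definitions, sketched below for a single discovery request. The service identifiers are hypothetical and the function name is an assumption; the abstract does not describe how the authors computed or averaged the measures across requests.

```python
def precision_recall_f1(retrieved, relevant):
    """Precision, recall and F-measure for one retrieval result.

    retrieved: services returned by the discovery model.
    relevant:  services the user actually considers desirable.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    # The F-measure is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical service identifiers for one request.
returned_services = ["s1", "s2", "s3", "s4"]
desired_services = ["s2", "s3", "s5"]
print(precision_recall_f1(returned_services, desired_services))
```

    Precision penalises returning undesired services, recall penalises missing desired ones, and the F-measure balances the two, which is why all three are usually reported together.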

  16. A Service Discovery and Automatic Deployment Component-Based Software Infrastructure for Ubiquitous Computing

    OpenAIRE

    FLISSI, A; GRANSART, C; Merle, P.

    2005-01-01

    Software applications running on mobile devices are more and more needed. These applications have strong requirements to address: device heterogeneity, limited resources, networked communications, and security. Moreover it is required to have appropriate application design, discovery, deployment, and execution paradigms. These requirements are similar to those of any ubiquitous computing application. In this paper, we present a component-based software infrastructure...

  17. Microwave-Assisted Esterification: A Discovery-Based Microscale Laboratory Experiment

    Science.gov (United States)

    Reilly, Maureen K.; King, Ryan P.; Wagner, Alexander J.; King, Susan M.

    2014-01-01

    An undergraduate organic chemistry laboratory experiment has been developed that features a discovery-based microscale Fischer esterification utilizing a microwave reactor. Students individually synthesize a unique ester from known sets of alcohols and carboxylic acids. Each student identifies the best reaction conditions given their particular…

  18. Prospects for SUSY discovery based on inclusive searches with the ATLAS detector at the LHC

    International Nuclear Information System (INIS)

    We present searches for generic SUSY models with R-parity conservation in the ATLAS detector at the LHC, based on signatures including missing transverse momentum from undetected neutralinos, multiple jets and leptons or b and tau jets. We show the corresponding discovery reach for early ATLAS data, including the effect of systematic uncertainties on the background estimate. (author)

  19. Fragment-Based Discovery of 6-Arylindazole JAK Inhibitors.

    Science.gov (United States)

    Ritzén, Andreas; Sørensen, Morten D; Dack, Kevin N; Greve, Daniel R; Jerre, Anders; Carnerup, Martin A; Rytved, Klaus A; Bagger-Bahnsen, Jesper

    2016-06-01

    Janus kinase (JAK) inhibitors are emerging as novel and efficacious drugs for treating psoriasis and other inflammatory skin disorders, but their full potential is hampered by systemic side effects. To overcome this limitation, we set out to discover soft drug JAK inhibitors for topical use. A fragment screen yielded an indazole hit that was elaborated into a potent JAK inhibitor using structure-based design. Growing the fragment by installing a phenol moiety in the 6-position afforded a greatly improved potency. Fine-tuning the substituents on the phenol and sulfonamide moieties afforded a set of compounds with lead-like properties, but they were found to be phototoxic and unstable in the presence of light. PMID:27326341

  20. Discovery of Frequent Itemsets: Frequent Item Tree-Based Approach

    Directory of Open Access Journals (Sweden)

    A.V. Senthil Kumar

    2007-05-01

    Full Text Available Mining frequent patterns in large transactional databases is a highly researched area in the field of data mining. Existing frequent pattern discovery algorithms suffer from several problems when mining large amounts of data: high memory dependency, computational cost and I/O cost. Additionally, the recursive mining process used to mine these structures is too voracious in memory resources. In this paper, we describe a more efficient algorithm for mining complete frequent itemsets from transactional databases. The suggested algorithm is partially based on the FP-tree hypothesis and extracts the frequent itemsets directly from the tree. Its memory requirement, which is independent of the number of processed transactions, is another benefit of the new method. We present performance comparisons of our algorithm against the Apriori algorithm and FP-growth.
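
    For readers unfamiliar with the task, the following minimal sketch shows frequent itemset mining with the classical Apriori algorithm, one of the two baselines named in the abstract. It is not the tree-based algorithm the paper proposes, whose details are not given here. The transactions, the support threshold and the function name are illustrative assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for all itemsets with count >= min_support."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        counts = {candidate: support(candidate) for candidate in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemset candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Toy transactional database.
baskets = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
for itemset, count in sorted(apriori(baskets, 3).items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), count)
```

    Apriori rescans the transactions at every level, which is the kind of repeated I/O and candidate generation that tree-based approaches such as FP-growth are designed to avoid.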

  1. Thesaurus-based disambiguation of gene symbols

    Directory of Open Access Journals (Sweden)

    Wain Hester M

    2005-06-01

    Full Text Available Abstract Background Massive text mining of the biological literature holds great promise of relating disparate information and discovering new knowledge. However, disambiguation of gene symbols is a major bottleneck. Results We developed a simple thesaurus-based disambiguation algorithm that can operate with very little training data. The thesaurus comprises the information from five human genetic databases and MeSH. The extent of the homonym problem for human gene symbols is shown to be substantial (33% of the genes in our combined thesaurus had one or more ambiguous symbols), not only because one symbol can refer to multiple genes, but also because a gene symbol can have many non-gene meanings. A test set of 52,529 Medline abstracts, containing 690 ambiguous human gene symbols taken from OMIM, was automatically generated. Overall accuracy of the disambiguation algorithm was up to 92.7% on the test set. Conclusion The ambiguity of human gene symbols is substantial, not only because one symbol may denote multiple genes but particularly because many symbols have other, non-gene meanings. The proposed disambiguation approach resolves most ambiguities in our test set with high accuracy, including the important gene/not a gene decisions. The algorithm is fast and scalable, enabling gene-symbol disambiguation in massive text mining applications.
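
    The abstract does not spell out the disambiguation algorithm, so the sketch below only illustrates the general idea of thesaurus-based sense selection and should not be read as the authors' method. It assumes each candidate sense of an ambiguous symbol comes with a bag of context words, and it picks the sense whose words overlap most with the surrounding text. The toy thesaurus entries, the sense labels and the function name are all hypothetical.

```python
import re

# Toy thesaurus (hypothetical): candidate senses of one ambiguous symbol,
# each with a handful of associated context words.
TOY_THESAURUS = {
    "PSA": {
        "gene:KLK3": {"prostate", "antigen", "kallikrein", "serum"},
        "gene:NPEPPS": {"aminopeptidase", "puromycin", "peptidase"},
        "not_a_gene:psoriatic arthritis": {"arthritis", "psoriasis", "joint", "skin"},
    }
}


def disambiguate(symbol, text, thesaurus=TOY_THESAURUS):
    """Pick the sense whose context words overlap most with the text.

    Returns (best_sense_or_None, scores); None means no sense had any
    supporting context word, so the symbol is left unresolved.
    """
    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    scores = {
        sense: len(words & tokens) for sense, words in thesaurus[symbol].items()
    }
    best = max(scores, key=scores.get)
    return (best if scores[best] > 0 else None), scores


print(disambiguate("PSA", "Serum PSA levels were elevated in prostate cancer patients."))
print(disambiguate("PSA", "PSA patients presented with joint and skin involvement."))
```

    Including explicit non-gene senses in the candidate list is what lets such a scheme make the gene/not a gene decision that the abstract highlights.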

  2. Future target-based drug discovery for tuberculosis?

    Science.gov (United States)

    Kana, Bavesh Davandra; Karakousis, Petros C; Parish, Tanya; Dick, Thomas

    2014-12-01

    New drugs that retain potency against multidrug/extensively drug-resistant strains of Mycobacterium tuberculosis, with the additional benefit of a shortened treatment duration and ease of administration, are urgently needed by tuberculosis (TB) control programs. Efforts to develop this new generation of treatment interventions have been plagued with numerous problems, the most significant being our insufficient understanding of mycobacterial metabolism during disease. This, combined with limited chemical diversity and poor entry of small molecules into the cell, has limited the number of new bioactive agents that result from drug screening efforts. The biochemical, target-driven approach to drug development has been largely abandoned in the TB field, to be replaced by whole-cell or target-based whole-cell screening approaches. In this context, the properties of a good drug target are unclear, since these are directly determined by the ability to find compounds, using current screening algorithms, which are able to kill M. tuberculosis. In this review, we discuss issues related to the identification and validation of drug targets and highlight some key properties for promising targets. Some of these include essentiality for growth, vulnerability, druggability, reduced propensity to evolve drug resistance and target location to facilitate ready access to drugs during chemotherapy. We present these in the context of recent drugs that have emerged through various approaches with the aim of consolidating the knowledge gained from these experiences to inform future efforts. PMID:25458615

  3. Climate Solutions based on advanced scientific discoveries of Allatra physics

    Science.gov (United States)

    Vershigora, Valery

    2016-05-01

    Global climate change is one of the most important international problems of the 21st century. The overall rapid increase in the dynamics of cataclysms, which have been observed in recent decades, is particularly alarming. How do modern scientists predict the occurrence of certain events? In meteorology, unusually powerful cumulonimbus clouds are one of the main conditions for the emergence of a tornado. The former, in their turn, are formed during the invasion of cold air on the overheated land surface. The satellite captures the cloud front, and, based on these pictures, scientists make assumptions about the possibility of occurrence of the respective natural phenomena. In fact, mankind visually observes and draws conclusions about the consequences of the physical phenomena which have already taken place in the invisible world, so the conclusions of scientists are assumptions by their nature, rather than precise knowledge of the causes of the origin of these phenomena in the physics of microcosm. The latest research in the field of the particle physics and neutrino astrophysics, which was conducted by a working team of scientists of ALLATRA International Public Movement (hereinafter ALLATRA SCIENCE group) allatra-science.org, last accessed 10 April 2016.

  4. Climate Solutions based on advanced scientific discoveries of Allatra physics

    Directory of Open Access Journals (Sweden)

    Vershigora Valery

    2016-05-01

    Full Text Available Global climate change is one of the most important international problems of the 21st century. The overall rapid increase in the dynamics of cataclysms, which have been observed in recent decades, is particularly alarming. How do modern scientists predict the occurrence of certain events? In meteorology, unusually powerful cumulonimbus clouds are one of the main conditions for the emergence of a tornado. The former, in their turn, are formed during the invasion of cold air on the overheated land surface. The satellite captures the cloud front, and, based on these pictures, scientists make assumptions about the possibility of occurrence of the respective natural phenomena. In fact, mankind visually observes and draws conclusions about the consequences of the physical phenomena which have already taken place in the invisible world, so the conclusions of scientists are assumptions by their nature, rather than precise knowledge of the causes of the origin of these phenomena in the physics of microcosm. The latest research in the field of the particle physics and neutrino astrophysics, which was conducted by a working team of scientists of ALLATRA International Public Movement (hereinafter ALLATRA SCIENCE group; allatra-science.org, last accessed 10 April 2016), offers increased opportunities for advanced fundamental and applied research in climatic engineering.

  5. Motif discovery in promoters of genes co-localized and co-expressed during myeloid cells differentiation

    OpenAIRE

    Coppe, Alessandro; Ferrari, Francesco; Bisognin, Andrea; Danieli, Gian Antonio; Ferrari, Sergio; Bicciato, Silvio; Bortoluzzi, Stefania

    2008-01-01

    Genes co-expressed may be under similar promoter-based and/or position-based regulation. Although data on expression, position and function of human genes are available, their true integration still represents a challenge for computational biology, hampering the identification of regulatory mechanisms. We carried out an integrative analysis of genomic position, functional annotation and promoters of genes expressed in myeloid cells. Promoter analysis was conducted by a novel multi-step method...

  6. Simulation-based Discovery of Cyclic Peptide Nanotubes

    Science.gov (United States)

    Ruiz Pestana, Luis A.

    Today, there is a growing need for environmentally friendly synthetic membranes with selective transport capabilities to address some of society's most pressing issues, such as carbon dioxide pollution, or access to clean water. While conventional membranes cannot stand up to the challenge, thin nanocomposite membranes, where vertically aligned subnanometer pores (e.g. nanotubes) are embedded in a thin polymeric film, promise to overcome some of the current limitations, namely, achieving a monodisperse distribution of subnanometer size pores, vertical pore alignment across the membrane thickness, and tunability of the pore surface chemistry. Self-assembled cyclic peptide nanotubes (CPNs) are particularly promising as selective nanopores because their pore size can be controlled at the subnanometer level, they exhibit high chemical design flexibility, and they display remarkable mechanical stability. In addition, when conjugated with polymer chains, the cyclic peptides can co-assemble in block copolymer domains to form nanoporous thin films. CPNs are thus well positioned to tackle persistent challenges in molecular separation applications. However, our poor understanding of the physics underlying their remarkable properties prevents the rational design and implementation of CPNs in technologically relevant membranes. In this dissertation, we use a simulation-based approach, in particular molecular dynamics (MD) simulations, to investigate the critical knowledge gaps hindering the implementation of CPNs. Computational mechanical tests show that, despite the weak nature of the stabilizing hydrogen bonds and the small cross section, CPNs display a Young's modulus of approximately 20 GPa and a maximum strength of around 1 GPa, placing them among the strongest proteinaceous materials known. 
Simulations of the self-assembly process reveal that CPNs grow by self-similar coarsening, contrary to other low-dimensional peptide systems, such as amyloids, that are believed to grow through

  7. Network-based discovery through mechanistic systems biology. Implications for applications--SMEs and drug discovery: where the action is.

    Science.gov (United States)

    Benson, Neil

    2015-08-01

    Phase II attrition remains the most important challenge for drug discovery. Tackling the problem requires improved understanding of the complexity of disease biology. Systems biology approaches to this problem can, in principle, deliver this. This article reviews the reports of the application of mechanistic systems models to drug discovery questions and discusses the added value. Although we are on the journey to the virtual human, the length, path and rate of learning from this remain an open question. Success will be dependent on the will to invest and make the most of the insight generated along the way. PMID:26464089

  8. An integrative data analysis platform for gene set analysis and knowledge discovery in a data warehouse framework.

    Science.gov (United States)

    Chen, Yi-An; Tripathi, Lokesh P; Mizuguchi, Kenji

    2016-01-01

    Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org. PMID:26989145

  9. An Agent-Based Focused Crawling Framework for Topic- and Genre-Related Web Document Discovery

    OpenAIRE

    Pappas, Nikolaos; Katsimpras, Georgios; Stamatatos, Efstathios

    2012-01-01

    The discovery of web documents about certain topics is an important task for web-based applications including web document retrieval, opinion mining and knowledge extraction. In this paper, we propose an agent-based focused crawling framework able to retrieve topic- and genre-related web documents. Starting from a simple topic query, a set of focused crawler agents explore in parallel topic-specific web paths using dynamic seed URLs that belong to certain web genres and are collected from web...

  10. The Discovery of Aurora Kinase Inhibitor by Multi-Docking-Based Virtual Screening

    Directory of Open Access Journals (Sweden)

    Jun-Tae Kim

    2014-11-01

    Full Text Available We report the discovery of an aurora kinase inhibitor using fragment-based virtual screening with a multi-docking strategy. Among a number of fragments collected from eMolecules, we found four fragment molecules showing potent activity (>50% at 100 μM) against aurora kinase. Based on the explored fragment scaffold, we selected two compounds in our synthesized library and validated their biological activity against Aurora kinase.

  11. The Discovery of Aurora Kinase Inhibitor by Multi-Docking-Based Virtual Screening

    OpenAIRE

    Jun-Tae Kim; Seo Hee Jung; Sun Young Kang; Chung-Kyu Ryu; Nam Sook Kang

    2014-01-01

    We report the discovery of an aurora kinase inhibitor using fragment-based virtual screening with a multi-docking strategy. Among a number of fragments collected from eMolecules, we found four fragment molecules showing potent activity (>50% at 100 μM) against aurora kinase. Based on the explored fragment scaffold, we selected two compounds in our synthesized library and validated their biological activity against Aurora kinase.

  12. Mass Spectrometry-Based Proteomics in Molecular Diagnostics: Discovery of Cancer Biomarkers Using Tissue Culture

    OpenAIRE

    Debasish Paul; Avinash Kumar; Akshada Gajbhiye; Santra, Manas K.; Rapole Srikanth

    2013-01-01

    Accurate diagnosis and proper monitoring of cancer patients remain a key obstacle for successful cancer treatment and prevention. Therein comes the need for biomarker discovery, which is crucial to the current oncological and other clinical practices having the potential to impact the diagnosis and prognosis. In fact, most of the biomarkers have been discovered utilizing the proteomics-based approaches. Although high-throughput mass spectrometry-based proteomic approaches like SILAC, 2D-DIGE,...

  13. Discovery of potent, reversible MetAP2 inhibitors via fragment based drug discovery and structure based drug design-Part 2.

    Science.gov (United States)

    McBride, Christopher; Cheruvallath, Zacharia; Komandla, Mallareddy; Tang, Mingnam; Farrell, Pamela; Lawson, J David; Vanderpool, Darin; Wu, Yiqin; Dougan, Douglas R; Plonowski, Artur; Holub, Corine; Larson, Chris

    2016-06-15

    Methionine aminopeptidase-2 (MetAP2) is an enzyme that cleaves an N-terminal methionine residue from a number of newly synthesized proteins. This step is required before they will fold or function correctly. Pre-clinical and clinical studies with a MetAP2 inhibitor suggest that they could be used as a novel treatment for obesity. Herein we describe the discovery of a series of pyrazolo[4,3-b]indoles as reversible MetAP2 inhibitors. A fragment-based drug discovery (FBDD) approach was used, beginning with the screening of fragment libraries to generate hits with high ligand-efficiency (LE). An indazole core was selected for further elaboration, guided by structural information. SAR from the indazole series led to the design of a pyrazolo[4,3-b]indole core and accelerated knowledge-based fragment growth resulted in potent and efficient MetAP2 inhibitors, which have shown robust and sustainable body weight loss in DIO mice when dosed orally. PMID:27136719
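
    As a hedged illustration of the ligand-efficiency (LE) criterion mentioned in this abstract, the sketch below uses the commonly cited definition LE = -ΔG / (number of heavy atoms), with ΔG = RT ln(Kd). The function name, the example Kd values, and the heavy-atom counts are hypothetical and are not taken from the paper.

```python
from math import log

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ligand_efficiency(kd_molar, heavy_atoms, temperature_k=298.15):
    """Return LE in kcal/mol per heavy atom from a dissociation constant."""
    delta_g = R_KCAL * temperature_k * log(kd_molar)  # negative when Kd < 1 M
    return -delta_g / heavy_atoms

# Hypothetical fragment hit: weak binder, but very few heavy atoms.
le_fragment = ligand_efficiency(kd_molar=1e-4, heavy_atoms=12)
# Hypothetical elaborated lead: much tighter binder, but larger.
le_lead = ligand_efficiency(kd_molar=1e-8, heavy_atoms=30)
```

    In this toy comparison the small fragment has the higher LE even though it binds more weakly, which is the usual reason fragment screens rank hits by LE and not by raw potency.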

  14. Generalization-based discovery of spatial association rules with linguistic cloud models

    Institute of Scientific and Technical Information of China (English)

    杨斌; 田永青; 朱仲英

    2004-01-01

    Extraction of interesting and general spatial association rules from large spatial databases is an important task in the development of spatial database systems. In this paper, we investigate the generalization-based knowledge discovery mechanism that integrates attribute-oriented induction on nonspatial data and spatial merging and generalization on spatial data. Furthermore, we present linguistic cloud models for knowledge representation and uncertainty handling to enhance current generalization-based method. With these models, spatial and nonspatial attribute values are well generalized at higher-concept levels, allowing discovery of strong spatial association rules. Combining the cloud model based generalization method with Apriori algorithm for mining association rules from a spatial database shows the benefits in effectiveness and flexibility.
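
    To make the final mining step concrete, here is a minimal sketch of Apriori-style frequent itemset mining over predicates that have already been generalized. It does not implement the cloud-model generalization described in the abstract; the predicate names and the support threshold are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    items = sorted({item for t in transactions for item in t})
    level = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
    frequent = {}
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset)
        size = len(level[0]) + 1
        # Join step: merge frequent itemsets into candidates one item larger.
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        # Prune step: every (size - 1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, size - 1))
        }
        level = [c for c in candidates if support(c) >= min_support]
    return frequent

# Hypothetical generalized spatial predicates, one set per spatial object.
objects = [
    {"near_water", "large_city"},
    {"near_water", "large_city", "near_highway"},
    {"near_water"},
    {"large_city", "near_highway"},
]
frequent = apriori(objects, min_support=0.5)

# Confidence of the rule near_highway -> large_city.
confidence = (
    frequent[frozenset({"near_highway", "large_city"})]
    / frequent[frozenset({"near_highway"})]
)
```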

  15. Discovery of error-tolerant biclusters from noisy gene expression data

    OpenAIRE

    Gupta Rohit; Rao Navneet; Kumar Vipin

    2011-01-01

    Abstract Background An important analysis performed on microarray gene-expression data is to discover biclusters, which denote groups of genes that are coherently expressed for a subset of conditions. Various biclustering algorithms have been proposed to find different types of biclusters from these real-valued gene-expression data sets. However, these algorithms suffer from several limitations such as inability to explicitly handle errors/noise in the data; difficulty in discovering small bi...

  16. Prior knowledge driven Granger causality analysis on gene regulatory network discovery

    OpenAIRE

    Yao, Shun; Yoo, Shinjae; Yu, Dantong

    2015-01-01

    Background Our study focuses on discovering gene regulatory networks from time series gene expression data using the Granger causality (GC) model. However, the number of available time points (T) usually is much smaller than the number of target genes (n) in biological datasets. The widely applied pairwise GC model (PGC) and other regularization strategies can lead to a significant number of false identifications when n>>T. Results In this study, we proposed a new method, viz., CGC-2SPR (CGC ...

  17. Mass Spectrometry-Based Proteomics in Molecular Diagnostics: Discovery of Cancer Biomarkers Using Tissue Culture

    Directory of Open Access Journals (Sweden)

    Debasish Paul

    2013-01-01

    Full Text Available Accurate diagnosis and proper monitoring of cancer patients remain a key obstacle for successful cancer treatment and prevention. Hence the need for biomarker discovery, which is crucial to current oncological and other clinical practice and has the potential to impact diagnosis and prognosis. In fact, most biomarkers have been discovered using proteomics-based approaches. Although high-throughput mass spectrometry-based proteomic approaches like SILAC, 2D-DIGE, and iTRAQ are overcoming the pitfalls of the conventional techniques, serum proteomics still faces the hurdle of a wide range of protein concentrations, and the limited availability of patient tissue samples also constrains biomarker discovery. Thus, researchers have looked for alternatives, and profiling of candidate biomarkers through tissue culture of tumor cell lines has emerged as a promising option. It is a rich source of tumor cell-derived proteins, thereby representing a wide array of potential biomarkers. Interestingly, most of the clinical biomarkers in use today (CA 125, CA 15.3, CA 19.9, and PSA) were discovered through tissue culture-based systems and tissue extracts. This paper emphasizes the tissue culture-based discovery of candidate biomarkers through various mass spectrometry-based proteomic approaches.

  18. Plant noncoding RNA gene discovery by “single-genome comparative genomics”

    OpenAIRE

    Chen, Chong-Jian; Zhou, Hui; Chen, Yue-Qin; Qu, Liang-Hu; Gautheret, Daniel

    2011-01-01

    Plant genomes have undergone multiple rounds of duplications that contributed massively to the growth of gene families. The structure of resulting families has been studied in depth for protein-coding genes. However, little is known about the impact of duplications on noncoding RNA (ncRNA) genes. Here we perform a systematic analysis of duplicated regions in the rice genome in search of such ncRNA repeats. We observe that, just like their protein counterparts, most ncRNA genes have undergone ...

  19. A multi-gene transcriptional profiling approach to the discovery of cell signature markers

    OpenAIRE

    Wada, Youichiro; Li, Dan; Merley, Anne; Zukauskas, Andrew; Aird, William C.; Dvorak, Harold F.; Shih, Shou-Ching

    2010-01-01

    A profile of transcript abundances from multiple genes constitutes a molecular signature if the expression pattern is unique to one cell type. Here we measure mRNA copy numbers per cell by normalizing per million copies of 18S rRNA and identify 6 genes (TIE1, KDR, CDH5, TIE2, EFNA1 and MYO5C) out of 79 genes tested as excellent molecular signature markers for endothelial cells (ECs) in vitro. The selected genes are uniformly expressed in ECs of 4 different origins but weakly or not expressed ...

  20. Exploring the Transcriptome Landscape of Pomegranate Fruit Peel for Natural Product Biosynthetic Gene and SSR Marker Discovery

    Institute of Scientific and Technical Information of China (English)

    Nadia Nicole Ono; Monica Therese Britton; Joseph Nathaniel Fass; Charles Meyer Nicolet; Dawei Lin; Li Tian

    2011-01-01

    Pomegranate fruit peel is rich in bioactive plant natural products, such as hydrolyzable tannins and anthocyanins. Despite their documented roles in human nutrition and fruit quality, genes involved in natural product biosynthesis have not been cloned from pomegranate and very little sequence information is available on pomegranate in the public domain. Shotgun transcriptome sequencing of pomegranate fruit peel cDNA was performed using RNA-Seq on the Illumina Genome Analyzer platform. Over 100 million raw sequence reads were obtained and assembled into 9,839 transcriptome assemblies (TAs) (>200 bp). Candidate genes for hydrolyzable tannin, anthocyanin, flavonoid, terpenoid and fatty acid biosynthesis and/or regulation were identified. Three lipid transfer proteins were obtained that may contribute to the previously reported IgE reactivity of pomegranate fruit extracts. In addition, 115 SSR markers were identified from the pomegranate fruit peel transcriptome and primers were designed for 77 SSR markers. The pomegranate fruit peel transcriptome set provides a valuable platform for natural product biosynthetic gene and SSR marker discovery in pomegranate. This work also demonstrates that next-generation transcriptome sequencing is an economical and effective approach for investigating natural product biosynthesis, identifying genes controlling important agronomic traits, and discovering molecular markers in non-model specialty crop species.

  1. Computational discovery of Epstein-Barr virus targeted human genes and signalling pathways.

    Science.gov (United States)

    Mei, Suyu; Zhang, Kun

    2016-01-01

    Epstein-Barr virus (EBV) plays important roles in the origin and the progression of human carcinomas, e.g. diffuse large B cell tumors, T cell lymphomas, etc. Discovering EBV targeted human genes and signaling pathways is vital to understand EBV tumorigenesis. In this study we propose a noise-tolerant homolog knowledge transfer method to reconstruct functional protein-protein interactions (PPI) networks between Epstein-Barr virus and Homo sapiens. The training set is augmented via homolog instances and the homolog noise is counteracted by support vector machine (SVM). Additionally we propose two methods to define subcellular co-localization (i.e. stringent and relaxed), based on which to further derive physical PPI networks. Computational results show that the proposed method achieves sound performance of cross validation and independent test. In the space of 648,672 EBV-human protein pairs, we obtain 51,485 functional interactions (7.94%), 869 stringent physical PPIs and 46,050 relaxed physical PPIs. Fifty-eight evidences are found from the latest database and recent literature to validate the model. This study reveals that Epstein-Barr virus interferes with normal human cell life, such as cholesterol homeostasis, blood coagulation, EGFR binding, p53 binding, Notch signaling, Hedgehog signaling, etc. The proteome-wide predictions are provided in the supplementary file for further biomedical research. PMID:27470517

  2. An integrated approach to gene discovery and marker development in Atlantic cod (Gadus morhua).

    Science.gov (United States)

    Bowman, Sharen; Hubert, Sophie; Higgins, Brent; Stone, Cynthia; Kimball, Jennifer; Borza, Tudor; Bussey, Jillian Tarrant; Simpson, Gary; Kozera, Catherine; Curtis, Bruce A; Hall, Jennifer R; Hori, Tiago S; Feng, Charles Y; Rise, Marlies; Booman, Marije; Gamperl, A Kurt; Trippel, Edward; Symonds, Jane; Johnson, Stewart C; Rise, Matthew L

    2011-04-01

    Atlantic cod is a species that has been overexploited by the capture fishery. Programs to domesticate this species are underway in several countries, including Canada, to provide an alternative route for production. Selective breeding programs have been successfully applied in the domestication of other species, with genomics-based approaches used to augment conventional methods of animal production in recent years. Genomics tools, such as gene sequences and sets of variable markers, also have the potential to enhance and accelerate selective breeding programs in aquaculture, and to provide better monitoring tools to ensure that wild cod populations are well managed. We describe the generation of significant genomics resources for Atlantic cod through an integrated genomics/selective breeding approach. These include 158,877 expressed sequence tags (ESTs), a set of annotated putative transcripts and several thousand single nucleotide polymorphism markers that were developed from, and have been shown to be highly variable in, fish enrolled in two selective breeding programs. Our EST collection was generated from various tissues and life cycle stages. In some cases, tissues from which libraries were generated were isolated from fish exposed to stressors, including elevated temperature, or antigen stimulation (bacterial and viral) to enrich for transcripts that are involved in these response pathways. The genomics resources described here support the developing aquaculture industry, enabling the application of molecular markers within selective breeding programs. Marker sets should also find widespread application in fisheries management. PMID:20396923

  3. Computational discovery of Epstein-Barr virus targeted human genes and signalling pathways

    Science.gov (United States)

    Mei, Suyu; Zhang, Kun

    2016-01-01

    Epstein-Barr virus (EBV) plays important roles in the origin and the progression of human carcinomas, e.g. diffuse large B cell tumors, T cell lymphomas, etc. Discovering EBV targeted human genes and signaling pathways is vital to understand EBV tumorigenesis. In this study we propose a noise-tolerant homolog knowledge transfer method to reconstruct functional protein-protein interactions (PPI) networks between Epstein-Barr virus and Homo sapiens. The training set is augmented via homolog instances and the homolog noise is counteracted by support vector machine (SVM). Additionally we propose two methods to define subcellular co-localization (i.e. stringent and relaxed), based on which to further derive physical PPI networks. Computational results show that the proposed method achieves sound performance of cross validation and independent test. In the space of 648,672 EBV-human protein pairs, we obtain 51,485 functional interactions (7.94%), 869 stringent physical PPIs and 46,050 relaxed physical PPIs. Fifty-eight evidences are found from the latest database and recent literature to validate the model. This study reveals that Epstein-Barr virus interferes with normal human cell life, such as cholesterol homeostasis, blood coagulation, EGFR binding, p53 binding, Notch signaling, Hedgehog signaling, etc. The proteome-wide predictions are provided in the supplementary file for further biomedical research. PMID:27470517

  4. Handling Neighbor Discovery and Rendezvous Consistency with Weighted Quorum-Based Approach

    Directory of Open Access Journals (Sweden)

    Chung-Ming Own

    2015-09-01

    Full Text Available Neighbor discovery and the power of sensors play an important role in the formation of Wireless Sensor Networks (WSNs) and mobile networks. Many asynchronous protocols based on wake-up time scheduling have been proposed to enable neighbor discovery among neighboring nodes while saving energy, especially where clock synchronization is difficult. Existing neighbor-discovery methods fall into two groups: quorum-based protocols and co-primality-based protocols. They differ in how time slots are arranged: the former select quorums in a matrix of slots, while the latter rely on numerical analysis. In our study, we propose the weighted heuristic quorum system (WQS), which is based on the quorum algorithm to eliminate redundant paths of active slots. We demonstrate the specification of our system: fewer active slots are required, the referring rate is balanced, and remaining power is considered particularly when a device maintains rendezvous with discovered neighbors. The evaluation results showed that our proposed method can effectively reschedule the active slots and save the computing time of the network system.
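
    For readers unfamiliar with quorum-based wake-up schedules, the sketch below shows the classic grid quorum that such protocols build on. It is not the WQS scheme proposed in the paper: each node simply stays awake for one row and one column of an n x n slot matrix, and the check confirms that two nodes share an active slot for every slot-aligned clock offset. The function names and the grid size are illustrative assumptions.

```python
def grid_quorum(n, row, col):
    """Active slots in a cycle of n*n slots: one full row plus one full column."""
    slots = {row * n + c for c in range(n)}
    slots |= {r * n + col for r in range(n)}
    return slots

def overlaps(q1, q2, shift, cycle):
    """True if q1 and q2 share a slot when q2's clock is offset by `shift` slots."""
    shifted = {(s + shift) % cycle for s in q2}
    return bool(q1 & shifted)

n = 4
cycle = n * n
node_a = grid_quorum(n, row=0, col=1)
node_b = grid_quorum(n, row=2, col=3)

# Rendezvous check over every slot-aligned clock offset.
always_meet = all(overlaps(node_a, node_b, shift, cycle) for shift in range(cycle))
# Fraction of slots in which a node must be awake.
duty_cycle = len(node_a) / cycle
```

    The guaranteed overlap comes at the cost of 2n - 1 active slots per cycle, which is the kind of overhead that schemes reducing the number of active slots aim to cut.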

  5. Correlating overrepresented upstream motifs to gene expression: a computational approach to regulatory element discovery in eukaryotes

    CERN Document Server

    Caselle, M; Provero, P

    2002-01-01

    Gene regulation in eukaryotes is mainly effected through transcription factors binding to rather short recognition motifs generally located upstream of the coding region. We present a novel computational method to identify regulatory elements in the upstream region of eukaryotic genes. The genes are grouped in sets sharing an overrepresented short motif in their upstream sequence. For each set, the average expression level from a microarray experiment is determined: if this level is significantly higher or lower than the average taken over the whole genome, then the overrepresented motif shared by the genes in the set is likely to play a role in their regulation. The method was tested by applying it to the genome of Saccharomyces cerevisiae, using the publicly available results of a DNA microarray experiment, in which expression levels for virtually all the genes were measured during the diauxic shift from fermentation to respiration. Several known motifs were correctly identified, and a new candidate regulat...
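
    The second half of the procedure described above (comparing the mean expression of a motif-sharing gene set with the genome-wide mean) can be sketched as follows. The gene data are made up, the search for overrepresented motifs is skipped, and the z-score used here is an assumed stand-in for whatever significance test the authors applied.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical toy data: gene -> (upstream sequence, log expression ratio).
GENES = {
    "g1": ("TTAGGGCA", 2.0),
    "g2": ("AGGGTTTC", 1.8),
    "g3": ("CCTAGGGA", 2.2),
    "g4": ("TTTTCCCA", -0.5),
    "g5": ("GATTACAG", 0.1),
    "g6": ("CCCGTTAA", -0.2),
}

def motif_set_score(genes, motif):
    """Compare mean expression of genes carrying `motif` with the genome mean."""
    expression = [expr for _, expr in genes.values()]
    in_set = [expr for seq, expr in genes.values() if motif in seq]
    if not in_set:
        return None
    genome_mean = mean(expression)
    genome_sd = pstdev(expression)
    # z-score of the set mean under the assumption of random set membership.
    z = (mean(in_set) - genome_mean) / (genome_sd / sqrt(len(in_set)))
    return len(in_set), mean(in_set), z

n_hits, set_mean, z_score = motif_set_score(GENES, "AGGG")
```

    A large positive or negative z-score flags the motif as a candidate regulatory element; in practice the threshold would need correction for the number of motifs tested.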

  6. Discovery of mitochondrial chimeric-gene associated with cytoplasmic male sterility of HL-rice

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The mitochondrial genome libraries of the HL-type sterile line (A) and maintainer line (B) have been constructed. The mitochondrial gene atp6 was used to screen the libraries because it gave different Southern and Northern blot results between the sterile and maintainer lines. Sequencing analysis of positive clones proved that there were two copies of the atp6 gene in the sterile line and only one in the maintainer line. One copy of atp6 in the sterile line was identical to that in the maintainer line; the other showed a different flanking sequence from the 49th nucleotide downstream of the termination codon of the atp6 gene. A new chimeric gene, orfH79, was found in the region. OrfH79 had homology to the mitochondrial genes coxⅡ and orf107, and was specific to the HL-sterile cytoplasm.

  7. Natural Genetic Variation in Cassava (Manihot esculenta Crantz) Landraces: A Tool for Gene Discovery

    International Nuclear Information System (INIS)

    Cassava landraces are the earliest form of the modern cultivars and represent the first step in cassava domestication. Our forward genetic analysis uses this resource to discover spontaneous mutations in sucrose/starch and carotenoid synthesis/accumulation and to develop both an evolutionary and breeding perspective of gene function related to those traits. Biochemical phenotype variants for the synthesis and accumulation of carotenoid, free sugar and starch were identified. Six subtractive cDNA libraries were prepared to construct a high quality (phred > 20) EST database with 1,645 entries. Macroarray and microarray analysis was performed to identify differentially expressed genes, aiming to identify candidate genes related to the sugary phenotype and carotenoid diversity. cDNA sequences of genes coding for specific enzymes in the two pathways were obtained. Gene expression analysis for genes coding specific enzymes was performed by RNA blot and Real Time PCR analysis. Chromoplast-associated proteins of yellow storage root were fractionated and a peptide sequence database with 906 entries (MASCOT validated) was constructed. For sucrose/starch metabolism, a sugary class of cassava was identified, carrying a mutation in the BEI and GBSS genes. For the pigmented cassava, a pink color phenotype showed absence of expression of the gene CasLYB, while an intense yellow phenotype showed down-regulation of the gene CasHYb. Heat shock proteins were identified as the major proteins associated with carotenoid. Genetic diversity for the GBSS gene in the natural population identified 22 haplotypes and a large nucleotide diversity in four subsets of population. Single segregating population derived from F2, half-sibling and S1 population showed segregation for sugary phenotype (93% of individuals), waxy phenotype (38% of individuals) and glycogen like starch (2% of individuals).
Here we summarize our current results for the genetic analysis of these variants and recent

  8. Building Viewpoints in an Object-Based Representation System for Knowledge Discovery in Databases

    OpenAIRE

    Simon, Arnaud; Napoli, Amedeo

    1999-01-01

    In this paper, we present an approach to knowledge discovery in databases in the context of object-based representation systems. The goal of this approach is to extract viewpoints and association rules from data represented by objects. A viewpoint is a hierarchy of classes (a kind of partial lattice) and an association rule can be defined within a viewpoint or between two classes lying in different viewpoints. The viewpoints construction algorithm allows to manipulate objects which are indiff...

  9. Gun Possession among American Youth: A Discovery-Based Approach to Understand Gun Violence

    OpenAIRE

    Kelly V Ruggles; Sonali Rajan

    2014-01-01

    OBJECTIVE: To apply discovery-based computational methods to nationally representative data from the Centers for Disease Control and Prevention's Youth Risk Behavior Surveillance System to better understand and visualize the behavioral factors associated with gun possession among adolescent youth. RESULTS: Our study uncovered the multidimensional nature of gun possession across nearly five million unique data points over a ten-year period (2001-2011). Specifically, we automated odds ratio cal...

  10. Gun Possession among American Youth: A Discovery-Based Approach to Understand Gun Violence

    OpenAIRE

    Ruggles, Kelly V.; Rajan, Sonali

    2014-01-01

    Objective To apply discovery-based computational methods to nationally representative data from the Centers for Disease Control and Prevention’s Youth Risk Behavior Surveillance System to better understand and visualize the behavioral factors associated with gun possession among adolescent youth. Results Our study uncovered the multidimensional nature of gun possession across nearly five million unique data points over a ten-year period (2001–2011). Specifically, we automated odds ratio calcu...
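
    The abstract mentions automated odds ratio calculations. As a rough sketch of what a single such calculation involves, the function below computes an odds ratio from a 2x2 table together with a Woolf (log-scale) 95% confidence interval. The counts are invented for illustration and do not come from the survey data, and the paper's actual pipeline is not shown in the abstract.

```python
from math import exp, log, sqrt

def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI for a 2x2 table.

    a: exposed, outcome present     b: exposed, outcome absent
    c: unexposed, outcome present   d: unexposed, outcome absent
    """
    estimate = (a * d) / (b * c)
    # Standard error of log(OR), Woolf's method; requires all cells > 0.
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(estimate) - z * se)
    upper = exp(log(estimate) + z * se)
    return estimate, lower, upper

# Hypothetical counts: a behavior (exposure) versus reported gun possession.
estimate, lower, upper = odds_ratio(20, 80, 10, 90)
```

    Repeating this over every pair of survey items and years is what turns a single calculation into a discovery-style scan; multiple-testing correction would be needed before interpreting any one ratio.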

  11. A P2P Service Discovery Strategy Based on Content Catalogues

    OpenAIRE

    Huang, Lican

    2007-01-01

    This paper presents a framework for distributed service discovery based on VIRGO P2P technologies. The services are classified as multi-layer, hierarchical catalogue domains according to their contents. The service providers, which have their own service registries such as UDDIs, register the services they provide and establish a virtual tree in a VIRGO network according to the domain of their service. The service location done by the proposed strategy is effective and guaranteed. This paper ...

  12. Application of mass spectrometry-based proteomics for biomarker discovery in neurological disorders

    OpenAIRE

    Venugopal Abhilash; Chaerkady Raghothama; Pandey Akhilesh

    2009-01-01

    Mass spectrometry-based quantitative proteomics has emerged as a powerful approach that has the potential to accelerate biomarker discovery, both for diagnostic as well as therapeutic purposes. Proteomics has traditionally been synonymous with 2D gels but is increasingly shifting to the use of gel-free systems and liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS). Quantitative proteomic approaches have already been applied to investigate various neurological disorders, espe...

  13. A multi-gene transcriptional profiling approach to the discovery of cell signature markers.

    Science.gov (United States)

    Wada, Youichiro; Li, Dan; Merley, Anne; Zukauskas, Andrew; Aird, William C; Dvorak, Harold F; Shih, Shou-Ching

    2011-01-01

    A profile of transcript abundances from multiple genes constitutes a molecular signature if the expression pattern is unique to one cell type. Here we measure mRNA copy numbers per cell by normalizing per million copies of 18S rRNA and identify 6 genes (TIE1, KDR, CDH5, TIE2, EFNA1 and MYO5C) out of 79 genes tested as excellent molecular signature markers for endothelial cells (ECs) in vitro. The selected genes are uniformly expressed in ECs of 4 different origins but weakly or not expressed in 4 non-EC cell lines. A multi-gene transcriptional profile of these 6 genes clearly distinguishes ECs from non-ECs in vitro. We conclude that (i) a profile of mRNA copy numbers per cell from a well-chosen multi-gene panel can act as a sensitive and accurate cell type signature marker, and (ii) the method described here can be applied to in vivo cell fingerprinting and molecular diagnosis. PMID:20972619
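
    A minimal sketch of the normalization and signature test described above, assuming raw copy counts for a gene and for 18S rRNA are available per sample. The copy numbers and the two thresholds are hypothetical; the abstract does not state the cut-offs the authors used.

```python
def per_million_18s(gene_copies, rrna_copies):
    """Express a transcript count per million copies of 18S rRNA."""
    return gene_copies / rrna_copies * 1_000_000

def is_signature(ec_values, non_ec_values, high=100.0, low=10.0):
    """True if the gene is uniformly high in ECs and low in all non-EC lines."""
    return min(ec_values) >= high and max(non_ec_values) <= low

# Hypothetical (gene copies, 18S rRNA copies) for 4 EC and 4 non-EC samples.
ec_raw = [(900, 3_000_000), (1200, 4_000_000), (700, 2_000_000), (1000, 2_500_000)]
non_ec_raw = [(10, 2_000_000), (0, 3_000_000), (6, 3_000_000), (4, 4_000_000)]

ec = [per_million_18s(g, r) for g, r in ec_raw]
non_ec = [per_million_18s(g, r) for g, r in non_ec_raw]
marker = is_signature(ec, non_ec)
```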

  14. Discovery of clubroot-resistant genes in Brassica napus by transcriptome sequencing.

    Science.gov (United States)

    Chen, S W; Liu, T; Gao, Y; Zhang, C; Peng, S D; Bai, M B; Li, S J; Xu, L; Zhou, X Y; Lin, L B

    2016-01-01

    Clubroot significantly affects plants of the Brassicaceae family and is one of the main diseases causing serious losses in B. napus yield. Few studies have investigated the clubroot-resistance mechanism in B. napus. Identification of clubroot-resistant genes may be used in clubroot-resistant breeding, as well as to elucidate the molecular mechanism behind B. napus clubroot-resistance. We used three B. napus transcriptome samples to construct a transcriptome sequencing library by using Illumina HiSeq™ 2000 sequencing and bioinformatic analysis. In total, 171 million high-quality reads were obtained, containing 96,149 unigenes of N50-value. We aligned the obtained unigenes with the Nr, Swiss-Prot, clusters of orthologous groups, and gene ontology databases and annotated their functions. In the Kyoto encyclopedia of genes and genomes database, 25,033 unigenes (26.04%) were assigned to 124 pathways. Many genes, including broad-spectrum disease-resistance genes, specific clubroot-resistant genes, and genes related to indole-3-acetic acid (IAA) signal transduction, cytokinin synthesis, and myrosinase synthesis in the Huashuang 3 variety of B. napus were found to be related to clubroot-resistance. The effective clubroot-resistance observed in this variety may be due to the induced increased expression of these disease-resistant genes and strong inhibition of the IAA signal transduction, cytokinin synthesis, and myrosinase synthesis. Unigenes 0048482 and 0061770 shared 94% nucleotide similarity with the Crr1 gene. Furthermore, unigene 0061770 could have originated from an inversion of the Crr1 5'-end sequence. PMID:27525940

  15. GeneMesh: a web-based microarray analysis tool for relating differentially expressed genes to MeSH terms

    Directory of Open Access Journals (Sweden)

    Argraves W Scott

    2010-04-01

    Full Text Available Abstract Background An important objective of DNA microarray-based gene expression experimentation is determining inter-relationships that exist between differentially expressed genes and biological processes, molecular functions, cellular components, signaling pathways, physiologic processes and diseases. Results Here we describe GeneMesh, a web-based program that facilitates analysis of DNA microarray gene expression data. GeneMesh relates genes in a query set to categories available in the Medical Subject Headings (MeSH) hierarchical index. The interface enables hypothesis driven relational analysis to a specific MeSH subcategory (e.g., Cardiovascular System, Genetic Processes, Immune System Diseases etc.) or unbiased relational analysis to broader MeSH categories (e.g., Anatomy, Biological Sciences, Disease etc.). Genes found associated with a given MeSH category are dynamically linked to facilitate tabular and graphical depiction of Entrez Gene information, Gene Ontology information, KEGG metabolic pathway diagrams and intermolecular interaction information. Expression intensity values of groups of genes that cluster in relation to a given MeSH category, gene ontology or pathway can be displayed as heat maps of Z score-normalized values. GeneMesh operates on gene expression data derived from a number of commercial microarray platforms including Affymetrix, Agilent and Illumina. Conclusions GeneMesh is a versatile web-based tool for testing and developing new hypotheses through relating genes in a query set (e.g., differentially expressed genes from a DNA microarray experiment) to descriptors making up the hierarchical structure of the National Library of Medicine controlled vocabulary thesaurus, MeSH. The system further enhances the discovery process by providing links between sets of genes associated with a given MeSH category to a rich set of html linked tabular and graphic information including Entrez Gene summaries, gene ontologies
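
    The Z score normalization used for the heat maps can be illustrated with a short sketch. It assumes normalization is done per gene across samples and uses the sample standard deviation; the abstract does not say which convention GeneMesh follows, and the gene names and intensities are invented.

```python
from statistics import mean, stdev

def zscore_row(values):
    """Z-score-normalize one gene's intensities across samples."""
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (assumed convention)
    return [(v - mu) / sd for v in values]

# Hypothetical expression intensities: gene -> values across three samples.
expression = {
    "GENE_A": [1.0, 2.0, 3.0],
    "GENE_B": [10.0, 10.0, 40.0],
}
heatmap = {gene: zscore_row(values) for gene, values in expression.items()}
```

    After this step every gene has mean 0 and unit spread, so genes with very different absolute intensities can share one color scale.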

  16. Comparison of seven methods for producing Affymetrix expression scores based on False Discovery Rates in disease profiling data

    Directory of Open Access Journals (Sweden)

    Gruber Stephen B

    2005-02-01

    Full Text Available Abstract Background A critical step in processing oligonucleotide microarray data is combining the information in multiple probes to produce a single number that best captures the expression level of an RNA transcript. Several systematic studies comparing multiple methods for array processing have used tightly controlled calibration data sets as the basis for comparison. Here we compare performances for seven processing methods using two data sets originally collected for disease profiling studies. An emphasis is placed on understanding sensitivity for detecting differentially expressed genes in terms of two key statistical determinants: test statistic variability for non-differentially expressed genes, and test statistic size for truly differentially expressed genes. Results In the two data sets considered here, up to seven-fold variation across the processing methods was found in the number of genes detected at a given false discovery rate (FDR). The best performing methods called up to 90% of the same genes differentially expressed, had less variable test statistics under randomization, and had a greater number of large test statistics in the experimental data. Poor performance of one method was directly tied to a tendency to produce highly variable test statistic values under randomization. Based on an overall measure of performance, two of the seven methods (Dchip and a trimmed mean approach) are superior in the two data sets considered here. Two other methods (MAS5 and GCRMA-EB) are inferior, while results for the other three methods are mixed. Conclusions Choice of processing method has a major impact on differential expression analysis of microarray data. Previously reported performance analyses using tightly controlled calibration data sets are not highly consistent with results reported here using data from human tissue samples. Performance of array processing methods in disease profiling and other realistic biological studies should be
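
    Counting "genes detected at a given FDR" can be sketched as follows. The abstract does not state which FDR procedure the authors used; this example swaps in the widely used Benjamini-Hochberg step-up procedure purely as an illustration, and the p-values are made up.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return sorted indices of hypotheses rejected by the BH step-up rule.

    Sort the p-values, find the largest rank k with p_(k) <= (k / m) * fdr,
    and reject the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])

# Hypothetical per-gene p-values from two different processing methods.
method_a = [0.2, 0.001, 0.04, 0.008]
method_b = [0.03, 0.04, 0.045, 0.049]
detected_a = benjamini_hochberg(method_a, fdr=0.05)
detected_b = benjamini_hochberg(method_b, fdr=0.05)
```

    Comparing `len(detected_a)` with `len(detected_b)` mirrors, in miniature, the comparison of detection counts across processing methods described in the record.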

  17. Discovery of diversity in xylan biosynthetic genes by transcriptional profiling of a heteroxylan containing mucilaginous tissue

    Directory of Open Access Journals (Sweden)

    Jacob Kruger Jensen

    2013-06-01

    Full Text Available The exact biochemical steps of xylan backbone synthesis remain elusive. In Arabidopsis, three non-redundant genes from two glycosyltransferase (GT) families, IRX9 and IRX14 from GT43 and IRX10 from GT47, are candidates for forming the xylan backbone. In other plants, evidence exists that different tissues express these three genes at widely different levels, which suggests that diversity in the makeup of the xylan synthase complex exists. Recently we have profiled the transcripts present in the developing mucilaginous tissue of psyllium (Plantago ovata Forsk). This tissue was found to have high expression levels of an IRX10 homolog, but very low levels of the two GT43 family members. This contrasts with recent wheat endosperm tissue profiling that found a relatively high abundance of the GT43 family members. We have performed an in-depth analysis of all GT genes expressed in four developmental stages of the psyllium mucilaginous layer and in a single stage of the psyllium stem using RNA-Seq. This analysis revealed several IRX10 homologs, an expansion in GT61 (homologs of At3g18170/At3g18180), and several GTs from other GT families that are highly abundant and specifically expressed in the mucilaginous tissue. Our current hypothesis is that the four IRX10 genes present in the mucilaginous tissues have evolved to function without the GT43 genes. These four genes represent some of the most divergent IRX10 genes identified to date. Conversely, those present in the psyllium stem are very similar to those in other eudicots. This suggests these genes are under selective pressure, likely due to the synthesis of the various xylan structures present in mucilage that has a different biochemical role than that present in secondary walls. The numerous GT61 family members also show a wide sequence diversity and may be responsible for the larger number of side chain structures present in the psyllium mucilage.

  18. Discovery of diversity in xylan biosynthetic genes by transcriptional profiling of a heteroxylan containing mucilaginous tissue.

    Science.gov (United States)

    Jensen, Jacob K; Johnson, Nathan; Wilkerson, Curtis G

    2013-01-01

    The exact biochemical steps of xylan backbone synthesis remain elusive. In Arabidopsis, three non-redundant genes from two glycosyltransferase (GT) families, IRX9 and IRX14 from GT43 and IRX10 from GT47, are candidates for forming the xylan backbone. In other plants, evidence exists that different tissues express these three genes at widely different levels, which suggests that diversity in the makeup of the xylan synthase complex exists. Recently we have profiled the transcripts present in the developing mucilaginous tissue of psyllium (Plantago ovata Forsk). This tissue was found to have high expression levels of an IRX10 homolog, but very low levels of the two GT43 family members. This contrasts with recent wheat endosperm tissue profiling that found a relatively high abundance of the GT43 family members. We have performed an in-depth analysis of all GT genes expressed in four developmental stages of the psyllium mucilaginous layer and in a single stage of the psyllium stem using RNA-Seq. This analysis revealed several IRX10 homologs, an expansion in GT61 (homologs of At3g18170/At3g18180), and several GTs from other GT families that are highly abundant and specifically expressed in the mucilaginous tissue. Our current hypothesis is that the four IRX10 genes present in the mucilaginous tissues have evolved to function without the GT43 genes. These four genes represent some of the most divergent IRX10 genes identified to date. Conversely, those present in the psyllium stem are very similar to those in other eudicots. This suggests these genes are under selective pressure, likely due to the synthesis of the various xylan structures present in mucilage that has a different biochemical role than that present in secondary walls. The numerous GT61 family members also show a wide sequence diversity and may be responsible for the larger number of side chain structures present in the psyllium mucilage. PMID:23761806

  19. Analysis of cassava (Manihot esculenta) ESTs: A tool for the discovery of genes

    International Nuclear Information System (INIS)

    Cassava (Manihot esculenta) is the main source of calories for more than 1,000 million people around the world and has been consolidated as the fourth most important crop after rice, corn and wheat. Cassava is considered tolerant to abiotic and biotic stress conditions; nevertheless, these characteristics are mainly present in non-commercial varieties. Genetic breeding strategies represent an alternative to introduce the desirable characteristics into commercial varieties. A fundamental step for accelerating the genetic breeding process in cassava is the identification of genes associated with these characteristics. One rapid strategy for the identification of genes is to assemble a large collection of ESTs (expressed sequence tags). In this study, a complete analysis of cassava ESTs was done. The cassava ESTs represent 80,459 sequences, which were assembled into a set of 29,231 unique genes (unigenes), comprising 10,945 contigs and 18,286 singletons. These 29,231 unique genes represent about 80% of the genes of the cassava genome. Between 5% and 10% of the cassava unigenes do not show similarity to any sequences present in the NCBI database and could be considered cassava-specific genes. A functional category was assigned to a group of sequences of the unigene set (29%) following the Gene Ontology vocabulary. The molecular function component was the best represented with 43% of the sequences, followed by the biological process component (38%) and finally the cellular component with 19%. In the cassava EST collection, 3,709 microsatellites were identified, and they could be used as molecular markers. This study represents an important contribution to the knowledge of the functional genomic structure of cassava and constitutes an important tool for the identification of genes associated with agricultural characteristics of interest that could be employed in cassava breeding programs.
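
    The counts reported in this record are internally consistent, which a short arithmetic check makes explicit. The sketch assumes that the 29% figure refers to the share of the 29,231 unigenes that received a functional category; the variable names are illustrative only.

```python
# Figures taken from the abstract above.
total_ests = 80_459
contigs = 10_945
singletons = 18_286
reported_unigenes = 29_231

# Contigs plus singletons should give the reported unigene set.
unigenes = contigs + singletons

# Average number of ESTs collapsed into one unigene.
ests_per_unigene = total_ests / unigenes

# Assumption: 29% of the unigene set received a Gene Ontology category.
annotated_unigenes = round(0.29 * unigenes)

# Shares of the three Gene Ontology components among annotated sequences.
go_shares = {
    "molecular function": 43,
    "biological process": 38,
    "cellular component": 19,
}
go_total = sum(go_shares.values())
```

    Under these assumptions the unigene total matches the reported 29,231, roughly 8,477 unigenes carry an annotation, and the three component shares sum to 100%.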

  20. Stability-based comparison of class discovery methods for DNA copy number profiles.

    Directory of Open Access Journals (Sweden)

    Isabel Brito

    Full Text Available MOTIVATION: Array-CGH can be used to determine DNA copy number, imbalances in which are a fundamental factor in the genesis and progression of tumors. The discovery of classes with similar patterns of array-CGH profiles therefore adds to our understanding of cancer and the treatment of patients. Various input data representations for array-CGH, dissimilarity measures between tumor samples and clustering algorithms may be used for this purpose. The choice between procedures is often difficult. An evaluation procedure is therefore required to select the best class discovery method (combination of one input data representation, one dissimilarity measure and one clustering algorithm) for array-CGH. Robustness of the resulting classes is a common requirement, but no stability-based comparison of class discovery methods for array-CGH profiles has ever been reported. RESULTS: We applied several class discovery methods and evaluated the stability of their solutions, with a modified version of Bertoni's [Formula: see text]-based test [1]. Our version relaxes the assumption of independence required by Bertoni's original [Formula: see text]-based test. We conclude that Minimal Regions of alteration (a concept introduced by [2]) for input data representation, sim [3] or agree [4] for dissimilarity measure and the use of average group distance in the clustering algorithm produce the most robust classes of array-CGH profiles. AVAILABILITY: The software is available from http://bioinfo.curie.fr/projects/cgh-clustering. It has also been partly integrated into "Visualization and analysis of array-CGH" (VAMP) [5]. The data sets used are publicly available from ACTuDB [6].

  1. Discovery and characterization of novel vascular and hematopoietic genes downstream of etsrp in zebrafish.

    Directory of Open Access Journals (Sweden)

    Gustavo A Gomez

    Full Text Available The transcription factor Etsrp is required for vasculogenesis and primitive myelopoiesis in zebrafish. When ectopically expressed, etsrp is sufficient to induce the expression of many vascular and myeloid genes in zebrafish. The mammalian homolog of etsrp, ER71/Etv2, is also essential for vascular and hematopoietic development. To identify genes downstream of etsrp, gain-of-function experiments were performed for etsrp in zebrafish embryos followed by transcription profile analysis by microarray. Subsequent in vivo expression studies resulted in the identification of fourteen genes with blood and/or vascular expression, six of these being completely novel. Regulation of these genes by etsrp was confirmed by ectopic induction in etsrp overexpressing embryos and decreased expression in etsrp deficient embryos. Additional functional analysis of two newly discovered genes, hapln1b and sh3gl3, demonstrates their importance in embryonic vascular development. The results described here identify a group of genes downstream of etsrp likely to be critical for vascular and/or myeloid development.

  2. Discovery and Characterization of Two Novel Salt-Tolerance Genes in Puccinellia tenuiflora

    Directory of Open Access Journals (Sweden)

    Ying Li

    2014-09-01

    Full Text Available Puccinellia tenuiflora is a monocotyledonous halophyte that is able to survive in extreme saline soil environments at an alkaline pH range of 9–10. In this study, we transformed full-length cDNAs of P. tenuiflora into Saccharomyces cerevisiae by using the full-length cDNA over-expressing gene-hunting system to identify novel salt-tolerance genes. In all, 32 yeast clones overexpressing P. tenuiflora cDNA were obtained by screening under NaCl stress conditions; of these, 31 clones showed stronger tolerance to NaCl and were amplified using polymerase chain reaction (PCR) and sequenced. Four novel genes encoding proteins with unknown function were identified; these genes had no homology with genes from higher plants. Of the four isolated genes, two that encoded proteins with two transmembrane domains showed the strongest resistance to 1.3 M NaCl. RT-PCR and northern blot analysis of P. tenuiflora cultured cells confirmed the endogenous NaCl-induced expression of the two proteins. Both of the proteins conferred better tolerance in yeasts to high salt, alkaline and osmotic conditions, some heavy metals and H2O2 stress. Thus, we inferred that the two novel proteins might alleviate oxidative and other stresses in P. tenuiflora.

  3. Transcriptome analysis and discovery of genes involved in immune pathways from hepatopancreas of microbial challenged mitten crab Eriocheir sinensis.

    Directory of Open Access Journals (Sweden)

    Xihong Li

    Full Text Available BACKGROUND: The Chinese mitten crab Eriocheir sinensis is an economically important crustacean that has been seriously affected by various diseases, which calls for more information on immune-relevant genes at the genome level. Recently, high-throughput RNA sequencing (RNA-seq) technology has provided a powerful and efficient method for transcript analysis and immune gene discovery. METHODS/PRINCIPAL FINDINGS: A cDNA library from hepatopancreas of E. sinensis challenged by a mixture of three pathogen strains (Gram-positive bacteria Micrococcus luteus, Gram-negative bacteria Vibrio alginolyticus and fungi Pichia pastoris; 10^8 cfu·mL^-1) was constructed and randomly sequenced using the Illumina technique. In total, 39.76 million clean reads were assembled into 70,300 unigenes. After ruling out short-length and low-quality sequences, 52,074 non-redundant unigenes were compared to public databases for homology searching, and 17,617 of them showed high similarity to sequences in the NCBI non-redundant protein (Nr) database. For function classification and pathway assignment, 18,734 (36.00%) unigenes were categorized into three Gene Ontology (GO) categories, 12,243 (23.51%) were classified into 25 Clusters of Orthologous Groups (COG), and 8,983 (17.25%) were assigned to six Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Potentially, 24, 14, 47 and 132 unigenes were characterized as being involved in the Toll, IMD, JAK-STAT and MAPK pathways, respectively. CONCLUSIONS/SIGNIFICANCE: This is the first systematic transcriptome analysis of components relating to innate immune pathways in E. sinensis. Functional genes and putative pathways identified here will contribute to a better understanding of the immune system and to the prevention of various diseases in crab.

  4. Discoidin domain receptor 1 (DDR1) kinase as target for structure-based drug discovery.

    Science.gov (United States)

    Kothiwale, Sandeepkumar; Borza, Corina M; Lowe, Edward W; Pozzi, Ambra; Meiler, Jens

    2015-02-01

    Discoidin domain receptor (DDR) 1 and 2 are transmembrane receptors that belong to the family of receptor tyrosine kinases (RTK). Upon collagen binding, DDRs transduce cellular signaling involved in various cell functions, including cell adhesion, proliferation, differentiation, migration, and matrix homeostasis. Altered DDR function resulting from either mutations or overexpression has been implicated in several types of disease, including atherosclerosis, inflammation, cancer, and tissue fibrosis. Several established inhibitors, such as imatinib, dasatinib, and nilotinib, originally developed as Abelson murine leukemia (Abl) kinase inhibitors, have been found to inhibit DDR kinase activity. As we review here, recent discoveries of novel inhibitors and their co-crystal structure with the DDR1 kinase domain have made structure-based drug discovery for DDR1 amenable. PMID:25284748

  5. Research on Hotspot Discovery in Internet Public Opinions Based on Improved K-Means

    Directory of Open Access Journals (Sweden)

    Gensheng Wang

    2013-01-01

    Full Text Available How to effectively discover hotspots in Internet public opinions is an active research topic, and it plays a key role in helping governments and corporations find useful information in the mass of data on the Internet. An improved K-means algorithm for hotspot discovery in Internet public opinions is presented, based on an analysis of the existing defects and the calculation principle of the original K-means algorithm. First, new methods are designed to preprocess website texts, to select and express the characteristics of website texts, and to define the similarity between two website texts. Second, the clustering principle and the method of selecting initial classification centers are analyzed and improved in order to overcome the limitations of the original K-means algorithm. Finally, the experimental results verify that the improved algorithm can improve the clustering stability and classification accuracy of hotspot discovery in Internet public opinions when used in practice.
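
    For orientation, the baseline that this record sets out to improve can be sketched as plain K-means (Lloyd's algorithm). This is not the authors' improved variant: the initial centers are simply the first k points, which is exactly the kind of naive initialization the abstract says it replaces, and the two-dimensional "text feature" vectors are made up.

```python
def kmeans(points, k, iterations=20):
    """Plain K-means with squared Euclidean distance.

    Assumption: the first k points serve as initial centers, a naive
    choice used here only to keep the sketch deterministic.
    """
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical two-dimensional feature vectors for six website texts.
texts = [(1, 1), (8, 8), (1, 2), (9, 8), (2, 1), (8, 9)]
labels, centers = kmeans(texts, k=2)
```

    Because the result depends on which points seed the centers, running the same function on a reordered input can yield different clusters, which is the instability the record's improved initialization is meant to address.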

  6. Use of model organism and disease databases to support matchmaking for human disease gene discovery.

    Science.gov (United States)

    Mungall, Christopher J; Washington, Nicole L; Nguyen-Xuan, Jeremy; Condit, Christopher; Smedley, Damian; Köhler, Sebastian; Groza, Tudor; Shefchek, Kent; Hochheiser, Harry; Robinson, Peter N; Lewis, Suzanna E; Haendel, Melissa A

    2015-10-01

    The Matchmaker Exchange application programming interface (API) allows searching a patient's genotypic or phenotypic profiles across clinical sites, for the purposes of cohort discovery and variant disease causal validation. This API can be used not only to search for matching patients, but also to match against public disease and model organism data. These public disease data enable matching known diseases and variant-phenotype associations using phenotype semantic similarity algorithms developed by the Monarch Initiative. The model data can provide additional evidence to aid diagnosis, suggest relevant models for disease mechanism and treatment exploration, and identify collaborators across the translational divide. The Monarch Initiative provides an implementation of this API for searching multiple integrated sources of data that contextualize the knowledge about any given patient or patient family into the greater biomedical knowledge landscape. While this corpus of data can aid diagnosis, it is also the beginning of research to improve understanding of rare human diseases. PMID:26269093

  7. Polymorphism discovery and association analyses of the interferon genes in type 1 diabetes

    Directory of Open Access Journals (Sweden)

    Lam Alex C

    2006-02-01

    Full Text Available Abstract Background The aetiology of the autoimmune disease type 1 diabetes (T1D) involves many genetic and environmental factors. Evidence suggests that innate immune responses, including the action of interferons, may also play a role in the initiation and/or pathogenic process of autoimmunity. In the present report, we have adopted a linkage disequilibrium (LD) mapping approach to test for an association between T1D and three regions encompassing 13 interferon alpha (IFNA) genes, interferon omega-1 (IFNW1), interferon beta-1 (IFNB1), interferon gamma (IFNG) and the interferon consensus-sequence binding protein 1 (ICSBP1). Results We identified 238 variants, most of them single nucleotide polymorphisms (SNPs), by sequencing IFNA, IFNB1, IFNW1 and ICSBP1, 98 of which were novel when compared to dbSNP build 124. We used polymorphisms identified in the SeattleSNP database for IFNG. A set of tag SNPs was selected for each of the interferon and interferon-related genes to test for an association between T1D and this complex gene family. A total of 45 tag SNPs were selected and genotyped in a collection of 472 multiplex families. Conclusion We have developed informative sets of SNPs for the interferon and interferon-related genes. No statistical evidence of a major association between T1D and any of the interferon and interferon-related genes tested was found.
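
    Tag SNP selection rests on pairwise linkage disequilibrium, which can be illustrated with the standard r² statistic. The sketch below assumes phased haplotype counts for two biallelic SNPs and is not taken from the study; the function name, the count layout and the numbers are hypothetical.

```python
def ld_r_squared(counts):
    """Compute r^2 between two biallelic SNPs from phased haplotype counts.

    `counts` maps the four haplotypes "AB", "Ab", "aB", "ab" to their counts
    (allele A/a at the first SNP, B/b at the second).
    r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)), with D = pAB - pA * pB.
    """
    total = sum(counts.values())
    p_ab_hap = counts["AB"] / total
    p_a = (counts["AB"] + counts["Ab"]) / total
    p_b = (counts["AB"] + counts["aB"]) / total
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    d = p_ab_hap - p_a * p_b
    return d * d / denominator

# Two made-up extremes: complete LD and complete independence.
complete_ld = ld_r_squared({"AB": 50, "Ab": 0, "aB": 0, "ab": 50})
no_ld = ld_r_squared({"AB": 25, "Ab": 25, "aB": 25, "ab": 25})
```

    When r² between two SNPs is close to 1, genotyping one of them captures nearly all the information in the other, which is why a small tag set can stand in for a much larger list of discovered variants.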

  8. Discovery of molecular mechanisms of traditional Chinese medicinal formula Si-Wu-Tang using gene expression microarray and connectivity map.

    Directory of Open Access Journals (Sweden)

    Zhining Wen

    Full Text Available To pursue a systematic approach to discovery of mechanisms of action of traditional Chinese medicine (TCM), we used microarrays, bioinformatics and the "Connectivity Map" (CMAP) to examine TCM-induced changes in gene expression. We demonstrated that this approach can be used to elucidate new molecular targets using a model TCM herbal formula Si-Wu-Tang (SWT), which is widely used for women's health. The human breast cancer MCF-7 cells treated with 0.1 µM estradiol or 2.56 mg/ml of SWT showed dramatic gene expression changes, while no significant change was detected for ferulic acid, a known bioactive compound of SWT. Pathway analysis using differentially expressed genes related to the treatment effect identified that expression of genes in the nuclear factor erythroid 2-related factor 2 (Nrf2) cytoprotective pathway was most significantly affected by SWT, but not by estradiol or ferulic acid. The Nrf2-regulated genes HMOX1, GCLC, GCLM, SLC7A11 and NQO1 were upregulated by SWT in a dose-dependent manner, which was validated by real-time RT-PCR. Consistently, treatment with SWT and its four herbal ingredients resulted in an increased antioxidant response element (ARE)-luciferase reporter activity in MCF-7 and HEK293 cells. Furthermore, the gene expression profile of differentially expressed genes related to SWT treatment was used to compare with those of 1,309 compounds in the CMAP database. The CMAP profiles of estradiol-treated MCF-7 cells showed an excellent match with SWT treatment, consistent with SWT's widely claimed use for women's diseases and indicating a phytoestrogenic effect. The CMAP profiles of chemopreventive agents withaferin A and resveratrol also showed high similarity to the profiles of SWT. This study identified SWT as an Nrf2 activator and phytoestrogen, suggesting its use as a nontoxic chemopreventive agent, and demonstrated the feasibility of combining microarray gene expression profiling with CMAP mining to discover mechanisms

  9. Yeast homologous recombination-based promoter engineering for the activation of silent natural product biosynthetic gene clusters.

    Science.gov (United States)

    Montiel, Daniel; Kang, Hahk-Soo; Chang, Fang-Yuan; Charlop-Powers, Zachary; Brady, Sean F

    2015-07-21

    Large-scale sequencing of prokaryotic (meta)genomic DNA suggests that most bacterial natural product gene clusters are not expressed under common laboratory culture conditions. Silent gene clusters represent a promising resource for natural product discovery and the development of a new generation of therapeutics. Unfortunately, the characterization of molecules encoded by these clusters is hampered owing to our inability to express these gene clusters in the laboratory. To address this bottleneck, we have developed a promoter-engineering platform to transcriptionally activate silent gene clusters in a model heterologous host. Our approach uses yeast homologous recombination, an auxotrophy complementation-based yeast selection system and sequence orthogonal promoter cassettes to exchange all native promoters in silent gene clusters with constitutively active promoters. As part of this platform, we constructed and validated a set of bidirectional promoter cassettes consisting of orthogonal promoter sequences, Streptomyces ribosome binding sites, and yeast selectable marker genes. Using these tools we demonstrate the ability to simultaneously insert multiple promoter cassettes into a gene cluster, thereby expediting the reengineering process. We apply this method to model active and silent gene clusters (rebeccamycin and tetarimycin) and to the silent, cryptic pseudogene-containing, environmental DNA-derived Lzr gene cluster. Complete promoter refactoring and targeted gene exchange in this "dead" cluster led to the discovery of potent indolotryptoline antiproliferative agents, lazarimides A and B. This potentially scalable and cost-effective promoter reengineering platform should streamline the discovery of natural products from silent natural product biosynthetic gene clusters. PMID:26150486

  10. Phenotype discovery by gene expression profiling: mapping of biological processes linked to BMP-2-mediated osteoblast differentiation.

    Science.gov (United States)

    Balint, Eva; Lapointe, David; Drissi, Hicham; van der Meijden, Caroline; Young, Daniel W; van Wijnen, Andre J; Stein, Janet L; Stein, Gary S; Lian, Jane B

    2003-05-15

    osteogenic phenotype is recognized by 8 h, reflected by downregulation of most myogenic-related genes and induction of a spectrum of signaling proteins and enzymes facilitating synthesis and assembly of an extracellular skeletal environment. These genes included collagens Type I and VI and the small leucine rich repeat family of proteoglycans (e.g., decorin, biglycan, osteomodulin, fibromodulin, and osteoadherin/osteoglycin) that reached peak expression at 24 h. With extracellular matrix development, the bone phenotype was further established from 16 to 24 h by induction of genes for cell adhesion and communication and enzymes that organize the bone ECM. Our microarray analysis resulted in the discovery of a class of genes, initially described in relation to differentiation of astrocytes and oligodendrocytes, that are functionally coupled to signals for cellular extensions. They include nexin, neuropilin, latexin, neuroglian, neuron specific gene 1, and Ulip, suggesting novel roles for these genes in the bone microenvironment. This global analysis identified a multistage molecular and cellular cascade that supports BMP-2-mediated osteoblast differentiation. PMID:12704803

  11. Using Osteoclast Differentiation as a Model for Gene Discovery in an Undergraduate Cell Biology Laboratory

    Science.gov (United States)

    Birnbaum, Mark J.; Picco, Jenna; Clements, Meghan; Witwicka, Hanna; Yang, Meiheng; Hoey, Margaret T.; Odgren, Paul R.

    2010-01-01

    A key goal of molecular/cell biology/biotechnology is to identify essential genes in virtually every physiological process to uncover basic mechanisms of cell function and to establish potential targets of drug therapy combating human disease. This article describes a semester-long, project-oriented molecular/cellular/biotechnology laboratory…

  12. Large-Scale Discovery of Disease-Disease and Disease-Gene Associations.

    Science.gov (United States)

    Gligorijevic, Djordje; Stojanovic, Jelena; Djuric, Nemanja; Radosavljevic, Vladan; Grbovic, Mihajlo; Kulathinal, Rob J; Obradovic, Zoran

    2016-01-01

    Data-driven phenotype analyses on Electronic Health Record (EHR) data have recently drawn benefits across many areas of clinical practice, uncovering new links in the medical sciences that can potentially affect the well-being of millions of patients. In this paper, EHR data is used to discover novel relationships between diseases by studying their comorbidities (co-occurrences in patients). A novel embedding model is designed to extract knowledge from disease comorbidities by learning from a large-scale EHR database comprising more than 35 million inpatient cases spanning nearly a decade, revealing significant improvements on disease phenotyping over current computational approaches. In addition, the use of the proposed methodology is extended to discover novel disease-gene associations by including valuable domain knowledge from genome-wide association studies. To evaluate our approach, its effectiveness is compared against a held-out set where, again, it revealed very compelling results. For selected diseases, we further identify candidate gene lists for which disease-gene associations were not studied previously. Our approach thus provides biomedical researchers with new tools to filter genes of interest, thereby reducing costly lab studies. PMID:27578529

  13. Transcriptome analysis of Catharanthus roseus for gene discovery and expression profiling.

    Science.gov (United States)

    Verma, Mohit; Ghangal, Rajesh; Sharma, Raghvendra; Sinha, Alok K; Jain, Mukesh

    2014-01-01

    The medicinal plant Catharanthus roseus accumulates a wide range of terpenoid indole alkaloids, which are well-documented therapeutic agents. In this study, deep transcriptome sequencing of C. roseus was carried out to identify the pathways and enzymes (genes) involved in biosynthesis of these compounds. About 343 million reads were generated from different tissues (leaf, flower and root) of C. roseus using the Illumina platform. Optimization of de novo assembly involving a two-step process resulted in a total of 59,220 unique transcripts with an average length of 1284 bp. Comprehensive functional annotation and gene ontology (GO) analysis revealed the representation of many genes involved in different biological processes and molecular functions. In total, 65% of C. roseus transcripts showed homology with sequences available in various public repositories, while the remaining 35% of unigenes may be considered C. roseus specific. In silico analysis revealed the presence of 11,620 genic simple sequence repeats (excluding mono-nucleotide repeats) and 1820 transcription factor encoding genes in the C. roseus transcriptome. Expression analysis showed roots and leaves to be actively participating in bisindole alkaloid production, with clear indication that enzymes involved in the pathway of vindoline and vinblastine biosynthesis are restricted to aerial tissues. Such a large-scale transcriptome study provides a rich source for understanding plant-specialized metabolism, and is expected to promote research towards production of plant-derived pharmaceuticals. PMID:25072156
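
    How simple sequence repeats can be located in transcript sequences while excluding mono-nucleotide runs is sketched below with a regular expression. The abstract does not name the tool or thresholds the authors used; the motif lengths (2 to 6 bases), the minimum of five repeats and the example sequence are all assumptions made for illustration.

```python
import re

def find_ssrs(seq, min_repeats=5):
    """Find di- to hexa-nucleotide tandem repeats in a DNA sequence.

    Returns (start, motif, repeat_count) tuples. Homopolymer motifs such as
    "AA" are skipped so that mono-nucleotide runs are not reported.
    Assumption: min_repeats applies uniformly to all motif lengths.
    """
    # Lazy motif match, then at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # mono-nucleotide run disguised as a longer motif
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits

# Made-up sequence: an (AT)6 repeat, a poly-A run, and an (AGC)5 repeat.
sequence = "TTGC" + "AT" * 6 + "GGC" + "A" * 12 + "CT" + "AGC" * 5 + "TT"
ssrs = find_ssrs(sequence)
```

    Dedicated SSR tools typically apply different minimum repeat counts per motif length, so counts obtained with this uniform threshold would not be directly comparable to the 11,620 reported in the record.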

  14. Transcriptome analysis of Catharanthus roseus for gene discovery and expression profiling.

    Directory of Open Access Journals (Sweden)

    Mohit Verma

    Full Text Available The medicinal plant Catharanthus roseus accumulates a wide range of terpenoid indole alkaloids, which are well-documented therapeutic agents. In this study, deep transcriptome sequencing of C. roseus was carried out to identify the pathways and enzymes (genes) involved in biosynthesis of these compounds. About 343 million reads were generated from different tissues (leaf, flower and root) of C. roseus using the Illumina platform. Optimization of de novo assembly involving a two-step process resulted in a total of 59,220 unique transcripts with an average length of 1284 bp. Comprehensive functional annotation and gene ontology (GO) analysis revealed the representation of many genes involved in different biological processes and molecular functions. In total, 65% of C. roseus transcripts showed homology with sequences available in various public repositories, while the remaining 35% of unigenes may be considered C. roseus specific. In silico analysis revealed the presence of 11,620 genic simple sequence repeats (excluding mono-nucleotide repeats) and 1820 transcription factor encoding genes in the C. roseus transcriptome. Expression analysis showed roots and leaves to be actively participating in bisindole alkaloid production, with clear indication that enzymes involved in the pathway of vindoline and vinblastine biosynthesis are restricted to aerial tissues. Such a large-scale transcriptome study provides a rich source for understanding plant-specialized metabolism, and is expected to promote research towards production of plant-derived pharmaceuticals.

  15. A Sorghum Mutant Resource as an Efficient Platform for Gene Discovery in Grasses.

    Science.gov (United States)

    Jiao, Yinping; Burke, John; Chopra, Ratan; Burow, Gloria; Chen, Junping; Wang, Bo; Hayes, Chad; Emendack, Yves; Ware, Doreen; Xin, Zhanguo

    2016-07-01

    Sorghum (Sorghum bicolor) is a versatile C4 crop and a model for research in family Poaceae. High-quality genome sequence is available for the elite inbred line BTx623, but functional validation of genes remains challenging due to the limited genomic and germplasm resources available for comprehensive analysis of induced mutations. In this study, we generated 6400 pedigreed M4 mutant pools from EMS-mutagenized BTx623 seeds through single-seed descent. Whole-genome sequencing of 256 phenotyped mutant lines revealed >1.8 million canonical EMS-induced mutations, affecting >95% of genes in the sorghum genome. The vast majority (97.5%) of the induced mutations were distinct from natural variations. To demonstrate the utility of the sequenced sorghum mutant resource, we performed reverse genetics to identify eight genes potentially affecting drought tolerance, three of which had allelic mutations and two of which exhibited exact cosegregation with the phenotype of interest. Our results establish that a large-scale resource of sequenced pedigreed mutants provides an efficient platform for functional validation of genes in sorghum, thereby accelerating sorghum breeding. Moreover, findings made in sorghum could be readily translated to other members of the Poaceae via integrated genomics approaches. PMID:27354556

  16. Discovery, linkage disequilibrium and association analyses of polymorphisms of the immune complement inhibitor, decay-accelerating factor gene (DAF/CD55) in type 1 diabetes

    Directory of Open Access Journals (Sweden)

    Smink Luc J

    2006-04-01

    Background: Type 1 diabetes (T1D) is a common autoimmune disease resulting from T-cell mediated destruction of pancreatic beta cells. Decay accelerating factor (DAF, CD55), a glycosylphosphatidylinositol-anchored membrane protein, is a candidate for autoimmune disease susceptibility based on its role in restricting complement activation and evidence that DAF expression modulates the phenotype of mouse models of autoimmune disease. In this study, we adopt a linkage disequilibrium (LD) mapping approach to test for an association between the DAF gene and T1D. Results: Initially, we used HapMap II genotype data to examine LD across the DAF region. Additional resequencing was required, identifying 16 novel polymorphisms. Combining both datasets, an LD mapping approach was adopted to test for association with T1D. Seven tag SNPs were selected and genotyped in case-control (3,523 cases and 3,817 controls) and family (725 families) collections. Conclusion: We obtained no evidence of association between T1D and the DAF region in two independent collections. In addition, we assessed the impact of using only HapMap II genotypes for the selection of tag SNPs and, based on this study, found that HapMap II genotypes may require additional SNP discovery for comprehensive LD mapping of some genes in common disease.

  17. Gene expression and epigenetic discovery screen reveal methylation of SFRP2 in prostate cancer.

    LENUS (Irish Health Repository)

    Perry, Antoinette S

    2013-04-15

    Aberrant activation of Wnts is common in human cancers, including prostate. Hypermethylation-associated transcriptional silencing of Wnt antagonist genes SFRPs (Secreted Frizzled-Related Proteins) is a frequent oncogenic event. The significance of this is not known in prostate cancer. The objectives of our study were to (i) profile Wnt signaling related gene expression and (ii) investigate methylation of Wnt antagonist genes in prostate cancer. Using TaqMan Low Density Arrays, we identified 15 Wnt signaling related genes with significantly altered expression in prostate cancer, the majority of which were upregulated in tumors. Notably, histologically benign tissue from men with prostate cancer appeared more similar to tumor (r = 0.76) than to benign prostatic hyperplasia (BPH; r = 0.57, p < 0.001). Overall, the expression profile was highly similar between tumors of high (≥ 7) and low (≤ 6) Gleason scores. Pharmacological demethylation of PC-3 cells with 5-Aza-CdR reactivated 39 genes (≥ 2-fold), 40% of which inhibit Wnt signaling. Methylation frequencies in prostate cancer were 10% (2/20) (SFRP1), 64.86% (48/74) (SFRP2), 0% (0/20) (SFRP4) and 60% (12/20) (SFRP5). SFRP2 methylation was detected at significantly lower frequencies in high-grade prostatic intraepithelial neoplasia (HGPIN; 30% (6/20), p = 0.0096), tumor adjacent benign areas (8.82% (7/69), p < 0.0001) and BPH (11.43% (4/35), p < 0.0001). The quantitative level of SFRP2 methylation (normalized index of methylation) was also significantly higher in tumors (116) than in the other samples (HGPIN = 7.45, HB = 0.47, and BPH = 0.12). We show that SFRP2 hypermethylation is a common event in prostate cancer. SFRP2 methylation in combination with other epigenetic markers may be a useful biomarker of prostate cancer.

  18. Gene discovery in the threatened elkhorn coral: 454 sequencing of the Acropora palmata transcriptome.

    Directory of Open Access Journals (Sweden)

    Nicholas R Polato

    BACKGROUND: Cnidarians, including corals and anemones, offer unique insights into metazoan evolution because they harbor genetic similarities with vertebrates beyond that found in model invertebrates and retain genes known only from non-metazoans. Cataloging genes expressed in Acropora palmata, a foundation species of reefs in the Caribbean and western Atlantic, will advance our understanding of the genetic basis of ecologically important traits in corals and comes at a time when sequencing efforts in other cnidarians allow for multi-species comparisons. RESULTS: A cDNA library from a sample enriched for symbiont-free larval tissue was sequenced on the 454 GS-FLX platform. Over 960,000 reads were obtained and assembled into 42,630 contigs. Annotation data was acquired for 57% of the assembled sequences. Analysis of the assembled sequences indicated that 83-100% of all A. palmata transcripts were tagged, and provided a rough estimate of the total number of genes expressed in our samples (~18,000-20,000). The coral annotation data contained many of the same molecular components as in the Bilateria, particularly in pathways associated with oxidative stress and DNA damage repair, and provided evidence that homologs of p53, a key player in DNA repair pathways, have experienced selection along the branch separating Cnidaria and Bilateria. Transcriptome-wide screens of paralog groups and transition/transversion ratios highlighted genes including green fluorescent proteins, carbonic anhydrase, and oxidative stress proteins, and functional groups involved in protein and nucleic acid metabolism and the formation of structural molecules. These results provide a starting point for study of adaptive evolution in corals. CONCLUSIONS: Currently available transcriptome data now make comparative studies of the mechanisms underlying coral's evolutionary success possible. Here we identified candidate genes that enable corals to maintain genomic integrity despite

  19. A REGISTRY BASED DISCOVERY MECHANISM FOR E-LEARNING WEB SERVICES

    Directory of Open Access Journals (Sweden)

    Demian Antony D’Mello

    2012-10-01

    E-learning is currently taking the shape of a Web Service in various applications, i.e. learners can search for suitable content, book it, pay for it and consume it. This paper shows how the search aspects for e-learning content can technically be combined with the recent standardization efforts that aim at content exchangeability and efficient reuse. A repository for learning object publication and search is proposed that essentially adapts the UDDI framework used in commercial Web Services to the e-learning context. To adopt Web Services technology towards the reusability and aggregation of e-learning services, the conceptual Web Services architecture and its building blocks need to be augmented. The objective of this research is to design a broker-based registry architecture for e-learning Web services which facilitates effective e-learning content/service discovery for consumption or composition. The implementation followed by experimentation showed that the proposed e-learning discovery architecture facilitates effective discovery with moderate performance in terms of overall response.

  20. Fuzzy-Based Knowledge Discovery from Heterogeneous Data in Planting Systems for Elderly LOHAS

    Institute of Scientific and Technical Information of China (English)

    Hung-Chih Hsueh; Jung-Yi Jiang; Jen-Sheng Tsai; Wen-Hao Tsai; Kuan-Rong Lee; Yau-Hwang Kuo

    2015-01-01

    Abstract: In this paper, we propose a knowledge discovery method based on fuzzy set theory to help elders with plant cultivation. Initially, the fuzzy sets are constructed by using feature selection and statistical interval estimation. Min-max inference and the center of gravity defuzzification method are then used to output a candidate pattern set. Finally, pattern discovery is adopted to obtain the patterns from the candidate set for the cultivation suggestions by considering the frequency weight and the user's experience. In order to demonstrate the performance of our method in planting systems, we construct a clicks-and-mortar cultivation platform, namely Eden Garden, for the elderly lifestyles of health and sustainability (LOHAS). The experimental results show that the accuracy rate of our knowledge discovery method can reach up to 85%. Moreover, the results of the LOHAS index scale table show that the happiness of the elders increases while they are using our proposed method.
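    The min-max inference and center of gravity defuzzification steps named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the input variables (soil moisture, light), the two rules, the triangular membership shapes and the 0-100 scales are all assumptions introduced here, since the abstract does not specify them.

```python
def tri(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def recommend_water(moisture, light):
    """Return a crisp watering level in [0, 100], or None if no rule fires."""
    # Hypothetical input fuzzy sets on a 0-100 scale (assumed, not from the paper).
    dry = tri(moisture, -1, 0, 60)
    wet = tri(moisture, 40, 100, 101)
    bright = tri(light, 40, 100, 101)

    # Rule strengths: AND is taken as min.
    fire_high = min(dry, bright)  # IF dry AND bright THEN watering is high
    fire_low = wet                # IF wet THEN watering is low

    num = den = 0.0
    for y in range(101):
        # Clip each output set by its rule strength (min), then aggregate (max).
        mu = max(min(fire_high, tri(y, 40, 100, 101)),
                 min(fire_low, tri(y, -1, 0, 60)))
        num += y * mu
        den += mu
    # Center of gravity of the aggregated output set.
    return num / den if den > 0 else None
```

    With these assumed sets, a dry and brightly lit pot yields a higher recommendation than a wet one, and an input that fires no rule returns None rather than a made-up value.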

  1. Context-aware computing-based reducing cost of service method in resource discovery and interaction

    Institute of Scientific and Technical Information of China (English)

    TANG Shan-cheng; HOU Yi-bin

    2004-01-01

    Reducing cost of service is an important goal for resource discovery and interaction technologies. The shortcomings of the transhipment-method and the hibernation-method are that they increase the holistic cost of service and slow resource discovery, respectively. To overcome these shortcomings, a context-aware computing-based method is developed. This method first analyzes the courses of devices using resource discovery and interaction technologies to identify some types of context related to reducing cost of service, then chooses effective methods such as stopping broadcast and hibernation to reduce cost of service according to information supplied by the context, rather than the transhipment-method's simple hibernations. The results of experiments indicate that under the worst condition this method overcomes the shortcomings of the transhipment-method, makes the "poor" devices hibernate longer than the hibernation-method to reduce cost of service more effectively, and discovers resources faster than the hibernation-method; under the best condition it is far better than the hibernation-method in all aspects.

  2. rVISTA for Comparative Sequence-Based Discovery of Functional Transcription Factor Binding Sites

    Energy Technology Data Exchange (ETDEWEB)

    Loots, Gabriela G.; Ovcharenko, Ivan; Pachter, Lior; Dubchak, Inna; Rubin, Edward M.

    2002-03-08

    Identifying transcriptional regulatory elements represents a significant challenge in annotating the genomes of higher vertebrates. We have developed a computational tool, rVISTA, for high-throughput discovery of cis-regulatory elements that combines transcription factor binding site prediction and the analysis of inter-species sequence conservation. Here, we illustrate the ability of rVISTA to identify true transcription factor binding sites through the analysis of AP-1 and NFAT binding sites in the 1 Mb well-annotated cytokine gene cluster (Hs5q31; Mm11). The exploitation of an orthologous human-mouse data set resulted in the elimination of 95 percent of the 38,000 binding sites predicted upon analysis of the human sequence alone, while it identified 87 percent of the experimentally verified binding sites in this region.
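    The idea of combining binding-site prediction with inter-species conservation, as described in the abstract above, can be illustrated with a small filter. This is a sketch under stated assumptions, not rVISTA's actual algorithm: the function names, the alignment-coordinate site representation and the 80% identity cutoff are hypothetical choices made for the example.

```python
def percent_identity(aln_a, aln_b, start, end):
    """Percent of alignment columns in [start, end) with identical, ungapped bases."""
    cols = list(zip(aln_a[start:end], aln_b[start:end]))
    if not cols:
        return 0.0
    matches = sum(1 for a, b in cols if a == b and a != "-")
    return 100.0 * matches / len(cols)


def conserved_sites(aln_a, aln_b, sites, min_identity=80.0):
    """Keep predicted sites (start, end) whose aligned region is conserved.

    The cutoff is an assumed illustrative value, not taken from the abstract.
    """
    return [(s, e) for s, e in sites
            if percent_identity(aln_a, aln_b, s, e) >= min_identity]


# Toy pairwise alignment (equal length, gaps would be written as "-").
human = "ACGTTGACTCAGGCA"
mouse = "ACGATGACTCAGCCT"
predicted = [(4, 11), (9, 15)]  # hypothetical predicted binding sites
kept = conserved_sites(human, mouse, predicted)
```

    In this toy input the first site is identical in both sequences and is kept, while the second falls to 4 matches out of 6 columns and is filtered out, which mirrors how conservation can prune predictions made on one sequence alone.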

  3. Strategies for enhancing the effectiveness of metagenomic-based enzyme discovery in lignocellulolytic microbial communities

    Energy Technology Data Exchange (ETDEWEB)

    DeAngelis, K.M.; Gladden, J.G.; Allgaier, M.; D' haeseleer, P.; Fortney, J.L.; Reddy, A.; Hugenholtz, P.; Singer, S.W.; Vander Gheynst, J.; Silver, W.L.; Simmons, B.; Hazen, T.C.

    2010-03-01

    Producing cellulosic biofuels from plant material has recently emerged as a key U.S. Department of Energy goal. For this technology to be commercially viable on a large scale, it is critical to make production cost efficient by streamlining both the deconstruction of lignocellulosic biomass and fuel production. Many natural ecosystems efficiently degrade lignocellulosic biomass and harbor enzymes that, when identified, could be used to increase the efficiency of commercial biomass deconstruction. However, ecosystems most likely to yield relevant enzymes, such as tropical rain forest soil in Puerto Rico, are often too complex for enzyme discovery using current metagenomic sequencing technologies. One potential strategy to overcome this problem is to selectively cultivate the microbial communities from these complex ecosystems on biomass under defined conditions, generating less complex biomass-degrading microbial populations. To test this premise, we cultivated microbes from Puerto Rican soil or green waste compost under precisely defined conditions in the presence of dried ground switchgrass (Panicum virgatum L.) or lignin, respectively, as the sole carbon source. Phylogenetic profiling of the two feedstock-adapted communities using SSU rRNA gene amplicon pyrosequencing or phylogenetic microarray analysis revealed that the adapted communities were significantly simplified compared to the natural communities from which they were derived. Several members of the lignin-adapted and switchgrass-adapted consortia are related to organisms previously characterized as biomass degraders, while others were from less well-characterized phyla. The decrease in complexity of these communities makes them good candidates for metagenomic sequencing and will likely enable the reconstruction of a greater number of full-length genes, leading to the discovery of novel lignocellulose-degrading enzymes adapted to feedstocks and conditions of interest.

  4. GOrilla: a tool for discovery and visualization of enriched GO terms in ranked gene lists

    OpenAIRE

    Steinfeld Israel; Navon Roy; Eden Eran; Lipson Doron; Yakhini Zohar

    2009-01-01

    Abstract Background Since the inception of the GO annotation project, a variety of tools have been developed that support exploring and searching the GO database. In particular, a variety of tools that perform GO enrichment analysis are currently available. Most of these tools require as input a target set of genes and a background set and seek enrichment in the target set compared to the background set. A few tools also exist that support analyzing ranked lists. The latter typically rely on ...
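    The target-set versus background-set comparison mentioned in the abstract above is commonly scored with a hypergeometric tail probability. The sketch below shows that standard test only; it is not the ranked-list method of GOrilla itself, which the truncated abstract does not describe. The function and parameter names are illustrative.

```python
from math import comb


def hypergeom_enrichment_p(N, B, n, b):
    """P(X >= b) for X ~ Hypergeometric(N, B, n).

    N: genes in the background set
    B: background genes annotated with the GO term
    n: genes in the target set
    b: target genes annotated with the GO term
    """
    upper = min(n, B)
    total = comb(N, n)
    # Sum the probabilities of observing b or more annotated genes by chance.
    tail = sum(comb(B, k) * comb(N - B, n - k) for k in range(b, upper + 1))
    return tail / total


# Toy numbers: 10 background genes, 4 annotated, target of 3 with 2 annotated.
p = hypergeom_enrichment_p(N=10, B=4, n=3, b=2)
```

    For the toy numbers, the tail is (6*6 + 4*1) / 120, i.e. one third. Real analyses test many GO terms at once, so a multiple-testing correction would normally follow.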

  5. Transcriptome profiling for discovery of genes involved in shoot apical meristem and flower development

    OpenAIRE

    Singh, Vikash K.; Mukesh Jain

    2014-01-01

    Flower development is one of the major developmental processes that governs seed setting in angiosperms. However, little is known about the molecular mechanisms underlying flower development in legumes. Employing RNA-seq for various stages of flower development and few vegetative tissues in chickpea, we identified differentially expressed genes in flower tissues/stages in comparison to vegetative tissues, which are related to various biological processes and molecular functions during flower ...

  6. Sequence-based genotyping for marker discovery and co-dominant scoring in germplasm and populations.

    Science.gov (United States)

    Truong, Hoa T; Ramos, A Marcos; Yalcin, Feyruz; de Ruiter, Marjo; van der Poel, Hein J A; Huvenaars, Koen H J; Hogers, René C J; van Enckevort, Leonora J G; Janssen, Antoine; van Orsouw, Nathalie J; van Eijk, Michiel J T

    2012-01-01

    Conventional marker-based genotyping platforms are widely available, but not without their limitations. In this context, we developed Sequence-Based Genotyping (SBG), a technology for simultaneous marker discovery and co-dominant scoring, using next-generation sequencing. SBG offers users several advantages including a generic sample preparation method, a highly robust genome complexity reduction strategy to facilitate de novo marker discovery across entire genomes, and a uniform bioinformatics workflow strategy to achieve genotyping goals tailored to individual species, regardless of the availability of a reference sequence. The most distinguishing features of this technology are the ability to genotype any population structure, regardless of whether parental data is included, and the ability to co-dominantly score SNP markers segregating in populations. To demonstrate the capabilities of SBG, we performed marker discovery and genotyping in Arabidopsis thaliana and lettuce, two plant species of diverse genetic complexity and backgrounds. Initially we obtained 1,409 SNPs for Arabidopsis, and 5,583 SNPs for lettuce. Further filtering of the SNP dataset produced over 1,000 high quality SNP markers for each species. We obtained a genotyping rate of 201.2 genotypes/SNP and 58.3 genotypes/SNP for Arabidopsis (n = 222 samples) and lettuce (n = 87 samples), respectively. Linkage mapping using these SNPs resulted in stable map configurations. We have therefore shown that the SBG approach presented provides users with the utmost flexibility in garnering high quality markers that can be directly used for genotyping and downstream applications. Until advances and costs allow for routine whole-genome sequencing of populations, we expect that sequence-based genotyping technologies such as SBG will be essential for genotyping of model and non-model genomes alike. PMID:22662172

  7. Kernel-based Conditional Independence Test and Application in Causal Discovery

    CERN Document Server

    Zhang, Kun; Janzing, Dominik; Schoelkopf, Bernhard

    2012-01-01

    Conditional independence testing is an important problem, especially in Bayesian network learning and causal discovery. Due to the curse of dimensionality, testing for conditional independence of continuous variables is particularly challenging. We propose a Kernel-based Conditional Independence test (KCI-test), by constructing an appropriate test statistic and deriving its asymptotic distribution under the null hypothesis of conditional independence. The proposed method is computationally efficient and easy to implement. Experimental results show that it outperforms other methods, especially when the conditioning set is large or the sample size is not very large, in which case other methods encounter difficulties.

  8. Novel Technology for Protein-Protein Interaction-based Targeted Drug Discovery

    Directory of Open Access Journals (Sweden)

    Jung Me Hwang

    2011-12-01

    We have developed a simple but highly efficient in-cell protein-protein interaction (PPI) discovery system based on the translocation properties of protein kinase C- and its C1a domain in live cells. This system allows the visual detection of trimeric and dimeric protein interactions including cytosolic, nuclear, and/or membrane proteins with their cognate ligands. In addition, this system can be used to identify pharmacological small compounds that inhibit specific PPIs. These properties make this PPI system an attractive tool for screening drug candidates and mapping the protein interactome.

  9. AutoDrug: fully automated macromolecular crystallography workflows for fragment-based drug discovery

    International Nuclear Information System (INIS)

    New software has been developed for automating the experimental and data-processing stages of fragment-based drug discovery at a macromolecular crystallography beamline. A new workflow-automation framework orchestrates beamline-control and data-analysis software while organizing results from multiple samples. AutoDrug is software based upon the scientific workflow paradigm that integrates the Stanford Synchrotron Radiation Lightsource macromolecular crystallography beamlines and third-party processing software to automate the crystallography steps of the fragment-based drug-discovery process. AutoDrug screens a cassette of fragment-soaked crystals, selects crystals for data collection based on screening results and user-specified criteria and determines optimal data-collection strategies. It then collects and processes diffraction data, performs molecular replacement using provided models and detects electron density that is likely to arise from bound fragments. All processes are fully automated, i.e. are performed without user interaction or supervision. Samples can be screened in groups corresponding to particular proteins, crystal forms and/or soaking conditions. A single AutoDrug run is only limited by the capacity of the sample-storage dewar at the beamline: currently 288 samples. AutoDrug was developed in conjunction with RestFlow, a new scientific workflow-automation framework. RestFlow simplifies the design of AutoDrug by managing the flow of data and the organization of results and by orchestrating the execution of computational pipeline steps. It also simplifies the execution and interaction of third-party programs and the beamline-control system. Modeling AutoDrug as a scientific workflow enables multiple variants that meet the requirements of different user groups to be developed and supported. A workflow tailored to mimic the crystallography stages comprising the drug-discovery pipeline of CoCrystal Discovery Inc. has been deployed and successfully

  10. Structure Based Discovery of Small Molecules to Regulate the Activity of Human Insulin Degrading Enzyme

    OpenAIRE

    Bilal Çakir; Onur Dağliyan; Ezgi Dağyildiz; İbrahim Bariş; Ibrahim Halil Kavakli; Seda Kizilel; Metin Türkay

    2012-01-01

    Structure Based Discovery of Small Molecules to Regulate the Activity of Human Insulin Degrading Enzyme. Bilal Çakir (1), Onur Dağliyan (1), Ezgi Dağyildiz (1), İbrahim Bariş (1), Ibrahim Halil Kavakli (1,2)*, Seda Kizilel (1)*, Metin Türkay (3)*. (1) Department of Chemical and Biological Engineering, Koç University, Sariyer, Istanbul, Turkey; (2) Department of Molecular Biology and Genetics, Koç University, Sariyer, Istanbul, Turkey; (3) Department of Industrial Engineering, Koç University...

  11. AutoDrug: fully automated macromolecular crystallography workflows for fragment-based drug discovery

    Energy Technology Data Exchange (ETDEWEB)

    Tsai, Yingssu [Stanford University, 2575 Sand Hill Road, Menlo Park, CA 94025 (United States); Stanford University, 333 Campus Drive, Mudd Building, Stanford, CA 94305-5080 (United States); McPhillips, Scott E.; González, Ana; McPhillips, Timothy M. [Stanford University, 2575 Sand Hill Road, Menlo Park, CA 94025 (United States); Zinn, Daniel [LogicBlox Inc., 1349 West Peachtree Street NW, Atlanta, GA 30309 (United States); Cohen, Aina E. [Stanford University, 2575 Sand Hill Road, Menlo Park, CA 94025 (United States); Feese, Michael D.; Bushnell, David [Cocrystal Discovery Inc., 19805 North Creek Parkway, Bothell, WA 98011 (United States); Tiefenbrunn, Theresa; Stout, C. David [The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037 (United States); Ludaescher, Bertram [University of California, One Shields Avenue, Davis, CA 95616 (United States); Hedman, Britt; Hodgson, Keith O. [Stanford University, 2575 Sand Hill Road, Menlo Park, CA 94025 (United States); Stanford University, 333 Campus Drive, Mudd Building, Stanford, CA 94305-5080 (United States); Soltis, S. Michael, E-mail: soltis@slac.stanford.edu [Stanford University, 2575 Sand Hill Road, Menlo Park, CA 94025 (United States)

    2013-05-01

    New software has been developed for automating the experimental and data-processing stages of fragment-based drug discovery at a macromolecular crystallography beamline. A new workflow-automation framework orchestrates beamline-control and data-analysis software while organizing results from multiple samples. AutoDrug is software based upon the scientific workflow paradigm that integrates the Stanford Synchrotron Radiation Lightsource macromolecular crystallography beamlines and third-party processing software to automate the crystallography steps of the fragment-based drug-discovery process. AutoDrug screens a cassette of fragment-soaked crystals, selects crystals for data collection based on screening results and user-specified criteria and determines optimal data-collection strategies. It then collects and processes diffraction data, performs molecular replacement using provided models and detects electron density that is likely to arise from bound fragments. All processes are fully automated, i.e. are performed without user interaction or supervision. Samples can be screened in groups corresponding to particular proteins, crystal forms and/or soaking conditions. A single AutoDrug run is only limited by the capacity of the sample-storage dewar at the beamline: currently 288 samples. AutoDrug was developed in conjunction with RestFlow, a new scientific workflow-automation framework. RestFlow simplifies the design of AutoDrug by managing the flow of data and the organization of results and by orchestrating the execution of computational pipeline steps. It also simplifies the execution and interaction of third-party programs and the beamline-control system. Modeling AutoDrug as a scientific workflow enables multiple variants that meet the requirements of different user groups to be developed and supported. A workflow tailored to mimic the crystallography stages comprising the drug-discovery pipeline of CoCrystal Discovery Inc. has been deployed and successfully

  12. Inside back cover: Biomarker discovery in mass spectrometry-based urinary proteomics.

    Science.gov (United States)

    Thomas, Samuel; Hao, Ling; Ricke, William A; Li, Lingjun

    2016-04-01

    DOI: 10.1002/prca.201500102 Urine is among the most valuable sample materials for studies of human diseases. These urine solutes are shown with increasing approximate diameter from metabolite at 1 nm to protein at 5 nm to a group of exosomes at 100 nm each to a cell at 10 000 nm. This article highlights promising technologies and strategies in the mass spectrometry-based urine proteomics and its application to disease biomarker discovery. Further details can be found in the article by Samuel Thomas et al. on page 358. PMID:27061328

  13. Functional Analysis and Discovery of Microbial Genes Transforming Metallic and Organic Pollutants: Database and Experimental Tools

    Energy Technology Data Exchange (ETDEWEB)

    Lawrence P. Wackett; Lynda B.M. Ellis

    2004-12-09

    Microbial functional genomics is faced with a burgeoning list of genes which are denoted as unknown or hypothetical for lack of any knowledge about their function. The majority of microbial genes encode enzymes. Enzymes are the catalysts of metabolism; catabolism, anabolism, stress responses, and many other cell functions. A major problem facing microbial functional genomics is proposed here to derive from the breadth of microbial metabolism, much of which remains undiscovered. The breadth of microbial metabolism has been surveyed by the PIs and represented according to reaction types on the University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD): http://umbbd.ahc.umn.edu/search/FuncGrps.html The database depicts metabolism of 49 chemical functional groups, representing most of current knowledge. Twice that number of chemical groups are proposed here to be metabolized by microbes. Thus, at least 50% of the unique biochemical reactions catalyzed by microbes remain undiscovered. This further suggests that many unknown and hypothetical genes encode functions yet undiscovered. This gap will be partly filled by the current proposal. The UM-BBD will be greatly expanded as a resource for microbial functional genomics. Computational methods will be developed to predict microbial metabolism which is not yet discovered. Moreover, a concentrated effort to discover new microbial metabolism will be conducted. The research will focus on metabolism of direct interest to DOE, dealing with the transformation of metals, metalloids, organometallics and toxic organics. This is precisely the type of metabolism which has been characterized most poorly to date. Moreover, these studies will directly impact functional genomic analysis of DOE-relevant genomes.

  14. Novel Gene Discovery of Crops in China: Status, Challenging, and Perspective

    Institute of Scientific and Technical Information of China (English)

    邱丽娟; 王建康; 万建民; 郭勇; 黎裕; 王晓波; 周国安; 刘章雄; 周时荣; 李新海; 马有志

    2011-01-01

    a gene level and hence for molecular breeding. This paper reviewed progress of novel gene discovery studies in major crops, such as rice, wheat, maize, soybean, cotton, and oilseed rape in China. In the last decade, Chinese scientists have achieved a number of breakthroughs on novel gene identification in crops, including: (1) Various distinctive materials for gene discovery were created, such as core collections of germplasms based on crop genetic diversity, establishment of genetic populations based on genetic resources with favorable traits, assessment of mutants derived from mutagenesis, and so on; (2) Technology and methods of gene discovery were further developed, especially the gene-based integration of various discovery technologies with combination of biometric algorithm improvement of gene/QTLs, and therefore the efficiency of gene discovery was increased; (3) Mapping genes/QTLs related to important agronomic traits of crops has become a common method for genetic studies. A number of genes/QTLs associated with disease and insect resistance, stress tolerance, good quality, nutrient use efficiency and high yield have been mapped, of which more than 500 genes have been positioned on chromosomes precisely by fine mapping; (4) Great progress in cloning and functional analysis of crop genes in China, particularly in rice, has drawn world-wide attention. More than 300 genes have been cloned in the main crops, among which more than 70 genes have been functionally validated in crops. While gene discovery in crops becomes more and more efficient, large-scale and towards utilization in the world, Chinese scientists are also making new findings in this field. However, the quality and quantity of crop gene discovery in China is still far from satisfying the needs for molecular breeding and the overall level of novel gene discovery is still behind top labs/institutions in the world. Gene discovery in different crops has developed unevenly, the number of genes discovered is not

  15. Automated conserved noncoding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses

    Directory of Open Access Journals (Sweden)

    Gina Turco

    2013-07-01

    Conserved noncoding sequences (CNSs) are islands of noncoding sequence that, like protein coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific functions of most CNSs remain unknown. Previous searches for CNSs in plants have either anchored on exons and only identified nearby sequences or required years of painstaking manual annotation. Here we present an open source tool that can accurately identify CNSs between any two related species with sequenced genomes, including both those immediately adjacent to exons and distal sequences separated by >12 kb of noncoding sequence. We have used this tool to characterize new motifs, associate CNSs with additional functions and identify previously undetected genes encoding RNA and protein in the genomes of five grass species. We provide a list of 15,363 orthologous CNSs conserved across all grasses tested. We were also able to identify regulatory sequences present in the common ancestor of grasses that have been lost in one or more extant grass lineages. Lists of orthologous gene pairs and associated CNSs are provided for reference inbred lines of Arabidopsis, Japonica rice, foxtail millet, sorghum, Brachypodium and maize.

  16. AMHC: Adaptive Multi-Hop Clustering based Resource Discovery Architecture for Large Scale MANETs

    Directory of Open Access Journals (Sweden)

    Saad Al-Ahmadi

    2014-05-01

    Full Text Available In this study we propose an efficient clustering protocol called AMHC used for resource discovery in large scale Mobile Ad hoc Networks (MANETs). AMHC is an Adaptive Multi-Hop Clustering protocol that generates several non-overlapping network localities (clusters) with explicitly elected cluster-heads. Every cluster member is on average d hops away from its cluster-head, where d is an integer parameter of the protocol. The generated set of clusters is highly stable and has a low restructuring frequency that takes into consideration the dynamic network topology due to node mobility and depleted energy. The head election process is a distributed process based on a node weight formula calculated by every node independently. The node's weight involves the current energy level, the current neighborhood degree and the distance (in number of hops) between the nominated head and the voting node. The cluster-head is responsible for coordinating intra-cluster and inter-cluster resource discovery activities. Inter-cluster communication is handled through gateway nodes, which hear from more than one cluster and are able to connect clusters with each other. The aim of AMHC is to identify all the possible gateways for creating a highly fault-tolerant architecture. AMHC is an asynchronous, scalable and robust architecture capable of handling a large number of resource queries with a high degree of power and communication efficiency. We conducted a comparative study using simulation to demonstrate AMHC's efficiency and superiority against other recently proposed clustering algorithms in the literature. The comparison is based on: number of generated clusters, average cluster size, cluster stability and node re-affiliation. These results show a lot of promise for AMHC as an efficient, energy-aware, load-balanced and fault-tolerant resource discovery architecture for large-scale MANETs.
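    The abstract names the three inputs to the node weight (energy level, neighborhood degree, hop distance) but does not give the formula. The following is only a minimal sketch of how such a vote could look, assuming a linear combination; the coefficients, the normalisation, the tie-break rule and all names (`Candidate`, `weight`, `vote`) are hypothetical and not taken from AMHC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    node_id: str
    energy: float  # remaining energy, normalised to [0, 1] (assumption)
    degree: int    # current number of one-hop neighbours
    hops: int      # hop distance between this nominee and the voting node


def weight(c, w_energy=0.5, w_degree=0.3, w_hops=0.2, max_degree=10):
    """Hypothetical linear weight: more energy, more neighbours and
    fewer hops all increase the score."""
    degree_term = min(c.degree, max_degree) / max_degree
    hop_term = 1.0 / (1 + c.hops)
    return w_energy * c.energy + w_degree * degree_term + w_hops * hop_term


def vote(candidates):
    """Each node evaluates the nominees independently and votes for the
    highest weight; ties are broken by the larger node id (assumption)."""
    return max(candidates, key=lambda c: (weight(c), c.node_id)).node_id


nominees = [
    Candidate("a", energy=0.9, degree=4, hops=1),
    Candidate("b", energy=0.4, degree=8, hops=0),
]
print(vote(nominees))
```

    With these illustrative coefficients the well-charged node "a" outranks the better-connected but depleted node "b"; a different weighting would change the outcome.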

  17. Gene discovery and transcript analyses in the corn smut pathogen Ustilago maydis: expressed sequence tag and genome sequence comparison

    Directory of Open Access Journals (Sweden)

    Saville Barry J

    2007-09-01

    Full Text Available Abstract Background Ustilago maydis is the basidiomycete fungus responsible for common smut of corn and is a model organism for the study of fungal phytopathogenesis. To aid in the annotation of the genome sequence of this organism, several expressed sequence tag (EST) libraries were generated from a variety of U. maydis cell types. In addition to utility in the context of gene identification and structure annotation, the ESTs were analyzed to identify differentially abundant transcripts and to detect evidence of alternative splicing and anti-sense transcription. Results Four cDNA libraries were constructed using RNA isolated from U. maydis diploid teliospores (U. maydis strains 518 × 521) and haploid cells of strain 521 grown under nutrient rich, carbon starved, and nitrogen starved conditions. Using the genome sequence as a scaffold, the 15,901 ESTs were assembled into 6,101 contiguous expressed sequences (contigs); among these, 5,482 corresponded to predicted genes in the MUMDB (MIPS Ustilago maydis database), while 619 aligned to regions of the genome not yet designated as genes in MUMDB. A comparison of EST abundance identified numerous genes that may be regulated in a cell type or starvation-specific manner. The transcriptional response to nitrogen starvation was assessed using RT-qPCR. The results of this suggest that there may be cross-talk between the nitrogen and carbon signalling pathways in U. maydis. Bioinformatic analysis identified numerous examples of alternative splicing and anti-sense transcription. While intron retention was the predominant form of alternative splicing in U. maydis, other varieties were also evident (e.g. exon skipping). Selected instances of both alternative splicing and anti-sense transcription were independently confirmed using RT-PCR. Conclusion Through this work: 1) substantial sequence information has been provided for U. maydis genome annotation; 2) new genes were identified through the discovery of 619

  18. Gene discovery using massively parallel pyrosequencing to develop ESTs for the flesh fly Sarcophaga crassipalpis

    Directory of Open Access Journals (Sweden)

    Hahn Daniel A

    2009-05-01

    Full Text Available Abstract Background Flesh flies in the genus Sarcophaga are important models for investigating endocrinology, diapause, cold hardiness, reproduction, and immunity. Despite the prominence of Sarcophaga flesh flies as models for insect physiology and biochemistry, and in forensic studies, little genomic or transcriptomic data are available for members of this genus. We used massively parallel pyrosequencing on the Roche 454-FLX platform to produce a substantial EST dataset for the flesh fly Sarcophaga crassipalpis. To maximize sequence diversity, we pooled RNA extracted from whole bodies of all life stages and normalized the cDNA pool after reverse transcription. Results We obtained 207,110 ESTs with an average read length of 241 bp. These reads assembled into 20,995 contigs and 31,056 singletons. Using BLAST searches of the NR and NT databases we were able to identify 11,757 unique gene elements (ES. crassipalpis unigenes among GO Biological Process functional groups with that of the Drosophila melanogaster transcriptome suggests that our ESTs are broadly representative of the flesh fly transcriptome. Insertion and deletion errors in 454 sequencing present a serious hurdle to comparative transcriptome analysis. Aided by a new approach to correcting for these errors, we performed a comparative analysis of genetic divergence across GO categories among S. crassipalpis, D. melanogaster, and Anopheles gambiae. The results suggest that non-synonymous substitutions occur at similar rates across categories, although genes related to response to stimuli may evolve slightly faster. In addition, we identified over 500 potential microsatellite loci and more than 12,000 SNPs among our ESTs. Conclusion Our data provides the first large-scale EST-project for flesh flies, a much-needed resource for exploring this model species. In addition, we identified a large number of potential microsatellite and SNP markers that could be used in population and systematic

  19. The Analysis of Image Segmentation Hierarchies with a Graph-based Knowledge Discovery System

    Science.gov (United States)

    Tilton, James C.; Cooke, Diane J.; Ketkar, Nikhil; Aksoy, Selim

    2008-01-01

    Currently available pixel-based analysis techniques do not effectively extract the information content from the increasingly available high spatial resolution remotely sensed imagery data. A general consensus is that object-based image analysis (OBIA) is required to effectively analyze this type of data. OBIA is usually a two-stage process: image segmentation followed by an analysis of the segmented objects. We are exploring an approach to OBIA in which hierarchical image segmentations provided by the Recursive Hierarchical Segmentation (RHSEG) software developed at NASA GSFC are analyzed by the Subdue graph-based knowledge discovery system developed by a team at Washington State University. In this paper we discuss our initial approach to representing the RHSEG-produced hierarchical image segmentations in a graphical form understandable by Subdue, and provide results on real and simulated data. We also discuss planned improvements designed to more effectively and completely convey the hierarchical segmentation information to Subdue and to improve processing efficiency.

  20. Application of SMILES Notation Based Optimal Descriptors in Drug Discovery and Design.

    Science.gov (United States)

    Veselinović, Aleksandar M; Veselinović, Jovana B; Živković, Jelena V; Nikolić, Goran M

    2015-01-01

    SMILES notation based optimal descriptors are presented as a universal tool for QSAR analysis with further application in drug discovery and design. The basis of this QSAR modeling, the Monte Carlo method, is discussed; it has important advantages over other methods, such as the possibility of analyzing a QSAR as a random event. The advantages of SMILES notation based optimal descriptors in comparison to commonly used descriptors are defined. The published results of QSAR modeling with SMILES notation based optimal descriptors applied for various pharmacologically important endpoints are listed. The presented QSAR modeling approach obeys OECD principles and has a mechanistic interpretation, with the possibility of identifying molecular fragments that contribute positively or negatively to the studied biological activity, which is of great importance in computer aided drug design of new compounds with desired activity. PMID:25961525

  1. Developing a distributed HTML5-based search engine for geospatial resource discovery

    Science.gov (United States)

    ZHOU, N.; XIA, J.; Nebert, D.; Yang, C.; Gui, Z.; Liu, K.

    2013-12-01

    With explosive growth of data, Geospatial Cyberinfrastructure (GCI) components are developed to manage geospatial resources, such as data discovery and data publishing. However, the efficiency of geospatial resource discovery is still challenging in that: (1) existing GCIs are usually developed for users of specific domains, so users may have to visit a number of GCIs to find appropriate resources; (2) the complexity of the decentralized network environment usually results in slow response and poor user experience; (3) users who use different browsers and devices may have very different user experiences because of the diversity of front-end platforms (e.g. Silverlight, Flash or HTML). To address these issues, we developed a distributed and HTML5-based search engine. Specifically, (1) the search engine adopts a brokering approach to retrieve geospatial metadata from various and distributed GCIs; (2) the asynchronous record retrieval mode enhances the search performance and user interactivity; (3) the search engine based on HTML5 is able to provide unified access capabilities for users with different devices (e.g. tablet and smartphone).
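    The brokering approach with asynchronous record retrieval can be illustrated with a small sketch. This is not the authors' implementation: the catalog names, delays and records below are made up, and `asyncio.sleep` stands in for a network request. It only shows the general idea of fanning a query out to several catalogs and handing records to the user interface as soon as each catalog answers, instead of waiting for the slowest one.

```python
import asyncio


async def query_catalog(name, delay, records):
    """Stand-in for one remote GCI catalog query (simulated latency)."""
    await asyncio.sleep(delay)
    return [(name, record) for record in records]


async def broker_search(catalogs, on_records):
    """Query every catalog concurrently and forward each batch of
    records to the callback in the order the catalogs respond."""
    tasks = [asyncio.create_task(query_catalog(*c)) for c in catalogs]
    total = 0
    for finished in asyncio.as_completed(tasks):
        batch = await finished
        on_records(batch)  # e.g. append rows to the result page
        total += len(batch)
    return total


def run_demo():
    seen = []
    catalogs = [
        ("slow-gci", 0.05, ["dem", "landcover"]),  # hypothetical catalogs
        ("fast-gci", 0.01, ["roads"]),
    ]
    total = asyncio.run(broker_search(catalogs, seen.extend))
    return total, seen


if __name__ == "__main__":
    print(run_demo())
```

    In this sketch the record from the faster catalog reaches the callback first, which is the behaviour the abstract credits with better interactivity.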

  2. AGENTS AND OWL-S BASED SEMANTIC WEB SERVICE DISCOVERY WITH USER PREFERENCE SUPPORT

    Directory of Open Access Journals (Sweden)

    Rohallah Benaboud

    2013-05-01

    Full Text Available Service-oriented computing (SOC) is an interdisciplinary paradigm that revolutionizes the very fabric of distributed software development. Applications that adopt service-oriented architectures (SOA) can evolve during their lifespan and adapt to changing or unpredictable environments more easily. SOA is built around the concept of Web Services. Although Web services constitute a revolution in the World Wide Web, they are always regarded as non-autonomous entities and can be exploited only after their discovery. With the help of software agents, Web services are becoming more efficient and more dynamic. The topic of this paper is the development of an agent based approach for Web services discovery and selection in which OWL-S is used to describe Web services, QoS and service customer requests. We develop an efficient semantic service matching which takes into account concept properties to match concepts in Web service and service customer request descriptions. Our approach is based on an architecture composed of four layers: Web service and Request description layer, Functional match layer, QoS computing layer and Reputation computing layer.

  3. Gene expression module-based chemical function similarity search

    OpenAIRE

    Li, Yun; Hao, Pei; Zheng, Siyuan; Tu, Kang; Fan, Haiwei; Zhu, Ruixin; Ding, Guohui; Dong, Changzheng; Wang, Chuan; Li, Xuan; Thiesen, H.-J.; Chen, Y. Eugene; Jiang, HuaLiang; Liu, Lei; Li, Yixue

    2008-01-01

    Investigation of biological processes using selective chemical interventions is generally applied in biomedical research and drug discovery. Many studies of this kind make use of gene expression experiments to explore cellular responses to chemical interventions. Recently, some research groups constructed libraries of chemical related expression profiles, and introduced similarity comparison into chemical induced transcriptome analysis. Resembling sequence similarity alignment, expression pat...

  4. Pigmentation in sand pear (Pyrus pyrifolia) fruit: biochemical characterization, gene discovery and expression analysis with exocarp pigmentation mutant.

    Science.gov (United States)

    Wang, Yue-zhi; Zhang, Shujun; Dai, Mei-song; Shi, Ze-bin

    2014-05-01

    -membrane transport of lignin, cutin, and suberin precursors suggests that the transport process could also affect the composition of exocarp and take a role in the regulation of exocarp pigmentation. Results from this study provide a base for the analysis of the molecular mechanism underlying sand pear russet/green exocarp mutation, and present a comprehensive list of candidate genes that could be used to further investigate the trait mutation at the molecular level. PMID:24445590

  5. The first set of EST resource for gene discovery and marker development in pigeonpea (Cajanus cajan L.)

    Directory of Open Access Journals (Sweden)

    Byregowda Munishamappa

    2010-03-01

    .8% in molecular function. Further, 19 genes were identified differentially expressed between FW- responsive genotypes and 20 between SMD- responsive genotypes. Generated ESTs were compiled together with 908 ESTs available in public domain, at the time of analysis, and a set of 5,085 unigenes were defined that were used for identification of molecular markers in pigeonpea. For instance, 3,583 simple sequence repeat (SSR) motifs were identified in 1,365 unigenes and 383 primer pairs were designed. Assessment of a set of 84 primer pairs on 40 elite pigeonpea lines showed polymorphism with 15 (28.8%) markers with an average of four alleles per marker and an average polymorphic information content (PIC) value of 0.40. Similarly, in silico mining of 133 contigs with ≥ 5 sequences detected 102 single nucleotide polymorphisms (SNPs) in 37 contigs. As an example, a set of 10 contigs were used for confirming in silico predicted SNPs in a set of four genotypes using wet lab experiments. Occurrence of SNPs was confirmed for all the 6 contigs for which scorable and sequenceable amplicons were generated. PCR amplicons were not obtained in case of 4 contigs. Recognition sites for restriction enzymes were identified for 102 SNPs in 37 contigs, which indicates the possibility of assaying SNPs in 37 genes using the cleaved amplified polymorphic sequences (CAPS) assay. Conclusion The pigeonpea EST dataset generated here provides a transcriptomic resource for gene discovery and development of functional markers associated with biotic stress resistance. Sequence analyses of this dataset have shown conservation of a considerable number of pigeonpea transcripts across legume and model plant species analysed as well as some putative pigeonpea specific genes. Validation of identified biotic stress responsive genes should provide candidate genes for allele mining as well as candidate markers for molecular breeding.
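    The abstract reports an average polymorphic information content (PIC) of 0.40 but does not state how PIC was computed. As a sketch, assuming the widely used definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, where p_i is the frequency of allele i at a marker, the value can be calculated as follows. The allele frequencies in the example are illustrative and are not data from the study.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphic information content of one marker from its allele
    frequencies (assumed definition; see the paragraph above)."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


# Two equally frequent alleles, then four equally frequent alleles.
print(pic([0.5, 0.5]))
print(pic([0.25, 0.25, 0.25, 0.25]))
```

    Under this definition a marker with skewed allele frequencies scores lower than one with evenly distributed alleles, so an average of four alleles per marker does not by itself fix the PIC value.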

  6. Developing computer-based training programs for basic mammalian histology: Didactic versus discovery-based design

    Science.gov (United States)

    Fabian, Henry Joel

    Educators have long tried to understand what stimulates students to learn. The Swiss psychologist and zoologist, Jean Piaget, suggested that students are stimulated to learn when they attempt to resolve confusion. He reasoned that students try to explain the world with the knowledge they have acquired in life. When they find their own explanations to be inadequate to explain phenomena, students find themselves in a temporary state of confusion. This prompts students to seek more plausible explanations. At this point, students are primed for learning (Piaget 1964). The Piagetian approach described above is called learning by discovery. To promote discovery learning, a teacher must first allow the student to recognize his misconception and then provide a plausible explanation to replace that misconception (Chinn and Brewer 1993). One application of this method is found in the various learning cycles, which have been demonstrated to be effective means for teaching science (Renner and Lawson 1973, Lawson 1986, Marek and Methven 1991, and Glasson & Lalik 1993). In contrast to the learning cycle, tutorial computer programs are generally not designed to correct student misconceptions, but rather follow a passive, didactic method of teaching. In the didactic or expositional method, the student is told about a phenomenon, but is neither encouraged to explore it, nor explain it in his own terms (Schneider and Renner 1980).

  7. Key Object Discovery and Tracking Based on Context-Aware Saliency

    Directory of Open Access Journals (Sweden)

    Geng Zhang

    2013-01-01

    Full Text Available In this paper, we propose an online key object discovery and tracking system based on visual saliency. We formulate the problem as a temporally consistent binary labelling task on a conditional random field and solve it by using a particle filter. We also propose a context‐aware saliency measurement, which can be used to improve the accuracy of any static or dynamic saliency maps. Our refined saliency maps provide clearer indications as to where the key object lies. Based on good saliency cues, we can further segment the key object inside the resulting bounding box, considering the spatial and temporal context. We tested our system extensively on different video clips. The results show that our method has significantly improved the saliency maps and tracks the key object accurately.

  8. A Chord-based resource scheduling approach in drug discovery grid

    Institute of Scientific and Technical Information of China (English)

    Chen Shudong; Zhang Wenju; Zhang Jun; Ma Fanyuan; Shen Jianhua

    2007-01-01

    This paper presents a resource scheduling approach for grid computing environments. Using P2P technology, this novel approach can schedule dynamic grid computing resources efficiently. Grid computing resources in different domains are organized into a structured P2P overlay network. Available resource information is published in the form of grid services. Task requests for computational resources are also presented as grid services. The problem of resource scheduling is translated into service discovery. Unlike central scheduling approaches that collect available resource information, this Chord-based approach forwards task requests in the overlay network and discovers resources that satisfy these tasks. Using this approach, the computational resources of a grid system can be scheduled dynamically according to the real-time workload on each peer. Furthermore, the approach is applied to DDG, a grid system for drug discovery and design, to evaluate its performance. Experimental results show that the computational resources of a grid system can be managed efficiently, and that the system can maintain a perfect load balance and robustness.
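    The core idea behind a Chord overlay, mapping both peers and service names onto one identifier ring and assigning each key to its successor peer, can be sketched briefly. This is a simplified illustration only: it keeps a global sorted list of peers instead of per-node finger tables, so it shows where a request ends up rather than how it is routed hop by hop. The identifier size, peer names and service name are assumptions, not details of DDG.

```python
import hashlib
from bisect import bisect_left

M = 16  # number of identifier bits (assumption for the sketch)


def chord_id(name):
    """Hash a peer or service name onto the identifier ring."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** M)


class Ring:
    def __init__(self, peers):
        self.nodes = sorted((chord_id(p), p) for p in peers)
        self.ids = [node_id for node_id, _ in self.nodes]

    def successor(self, key_id):
        """First peer whose identifier is >= key_id, wrapping around
        to the smallest identifier at the end of the ring."""
        idx = bisect_left(self.ids, key_id) % len(self.ids)
        return self.nodes[idx][1]

    def locate(self, service_name):
        """Peer responsible for a published service or task request."""
        return self.successor(chord_id(service_name))


ring = Ring(["peer-a", "peer-b", "peer-c"])  # hypothetical peers
print(ring.locate("docking-service"))        # hypothetical service name
```

    Because publishing a resource and requesting one hash to the same ring position, a task request is forwarded to the peer that holds the matching resource description, which is how scheduling reduces to service discovery in this kind of design.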

  9. Mass spectrometry based translational proteomics for biomarker discovery and application in colorectal cancer.

    Science.gov (United States)

    Ma, Hong; Chen, Guilin; Guo, Mingquan

    2016-04-01

    Colorectal cancer (CRC) is a leading cause of cancer-related death in the world. Clinically, early detection of the disease is the most effective approach to tackle this tough challenge. Discovery and development of reliable and effective diagnostic tools for the assessment of prognosis and prediction of response to drug therapy are urgently needed for personalized therapies and better treatment outcomes. Among many ongoing efforts in search for potential CRC biomarkers, MS-based translational proteomics provides a unique opportunity for the discovery and application of protein biomarkers toward better CRC early detection and treatment. This review covers the most recent studies that use preclinical models and clinical materials for the identification of CRC-related protein markers. Some new advances in the development of CRC protein markers such as CRC stem cell related protein markers, SRM/MRM-MS and MS cytometry approaches are also discussed in order to address future directions and challenges from bench translational research to bedside clinical application of CRC biomarkers. PMID:26616366

  10. A joint modeling approach for uncovering associations between gene expression, bioactivity and chemical structure in early drug discovery to guide lead selection and genomic biomarker development.

    Science.gov (United States)

    Perualila-Tan, Nolen; Kasim, Adetayo; Talloen, Willem; Verbist, Bie; Göhlmann, Hinrich W H; Shkedy, Ziv

    2016-08-01

    The modern drug discovery process involves multiple sources of high-dimensional data. This imposes the challenge of data integration. A typical example is the integration of chemical structure (fingerprint features), phenotypic bioactivity (bioassay read-outs) data for targets of interest, and transcriptomic (gene expression) data in early drug discovery to better understand the chemical and biological mechanisms of candidate drugs, and to facilitate early detection of safety issues prior to later and expensive phases of drug development cycles. In this paper, we discuss a joint model for the transcriptomic and the phenotypic variables conditioned on the chemical structure. This modeling approach can be used to uncover, for a given set of compounds, the association between gene expression and biological activity taking into account the influence of the chemical structure of the compound on both variables. The model allows the detection of genes that are associated with the bioactivity data, facilitating the identification of potential genomic biomarkers for compound efficacy. In addition, the effect of every structural feature on both genes and pIC50 and their associations can be simultaneously investigated. Two oncology projects are used to illustrate the applicability and usefulness of the joint model to integrate multi-source high-dimensional information to aid drug discovery. PMID:27269248

  11. Gene cloning based on long oligonucleotide probes

    International Nuclear Information System (INIS)

    The most commonly used technique for gene cloning has been to utilize oligonucleotide probes based on protein sequence data. Of course this approach requires characterized and purified protein so that at least a portion of the amino acid sequence can be determined and used to infer the corresponding DNA sequence. Based on the amino acid sequence information, either short or long oligonucleotide probes can be synthesized chemically. Long probes are typically 30-100 nucleotides long and are a single sequence based on a best guess for each codon. The long probe approach was first used to screen for three different genes: bovine trypsin inhibitor, human insulin-like growth factor I, and human factor IX. There are three advantages of long probes. (1) Any stretch of amino acid sequence 10 or longer can be used. (2) The amino acid sequence need not be absolutely correct. (3) These probes can be used to screen high-complexity libraries with fewer false positives. In spite of the uncertainties over codon selection, the long probe approach is currently the method of choice in screening for genes based on protein sequence data.
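    The "best guess for each codon" step can be sketched as a reverse translation with one chosen codon per amino acid. The codon table below is an illustrative assumption (each entry is a valid codon for its amino acid, but the choices are not taken from the source and real designs would use codon-usage data for the target organism). The function names are hypothetical.

```python
# One assumed "preferred" codon per amino acid (illustrative choices only).
PREFERRED_CODON = {
    "A": "GCC", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def best_guess_coding(peptide):
    """Single coding-strand sequence built from one codon per residue."""
    try:
        return "".join(PREFERRED_CODON[aa] for aa in peptide.upper())
    except KeyError as exc:
        raise ValueError(f"unknown amino acid: {exc.args[0]}") from None


def long_probe(peptide):
    """Reverse complement of the guessed coding sequence, i.e. an
    oligonucleotide that would hybridise to the coding strand."""
    return best_guess_coding(peptide).translate(COMPLEMENT)[::-1]


print(best_guess_coding("MKW"))  # ATGAAGTGG
print(long_probe("MKW"))         # CCACTTCAT
```

    A stretch of 10 amino acids yields a 30-nucleotide probe, which matches the lower end of the 30-100 nucleotide range given in the abstract.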

  12. Optimal search-based gene subset selection for gene array cancer classification.

    Science.gov (United States)

    Li, Jiexun; Su, Hua; Chen, Hsinchun; Futscher, Bernard W

    2007-07-01

    High dimensionality has been a major problem for gene array-based cancer classification. It is critical to identify marker genes for cancer diagnoses. We developed a framework of gene selection methods based on previous studies. This paper focuses on optimal search-based subset selection methods because they evaluate the group performance of genes and help to pinpoint globally optimal sets of marker genes. Notably, this paper is the first to introduce tabu search (TS) to gene selection from high-dimensional gene array data. Our comparative study of gene selection methods demonstrated the effectiveness of optimal search-based gene subset selection to identify cancer marker genes. TS was shown to be a promising tool for gene subset selection. PMID:17674622
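    The abstract does not describe how tabu search was configured, so the following is only a generic sketch of the technique applied to subset selection. Each move adds or removes one gene, recently flipped genes are tabu for a few iterations, and a tabu move is still allowed if it beats the best score so far (an aspiration rule). The scoring function is a toy stand-in for classifier performance, and the parameter values and names are assumptions.

```python
from collections import deque


def tabu_search(n_genes, score, iterations=50, tenure=3):
    """Search over gene subsets by single-gene flips with a tabu list."""
    current = frozenset()
    best, best_score = current, score(current)
    tabu = deque(maxlen=tenure)  # genes flipped in the last `tenure` moves
    for _ in range(iterations):
        candidates = []
        for g in range(n_genes):
            neighbour = current ^ {g}  # add g if absent, remove if present
            s = score(neighbour)
            if g not in tabu or s > best_score:  # aspiration criterion
                candidates.append((s, g, neighbour))
        if not candidates:
            break
        # Take the best admissible move even if it worsens the score;
        # ties go to the lowest gene index.
        s, g, current = max(candidates, key=lambda c: (c[0], -c[1]))
        tabu.append(g)
        if s > best_score:
            best, best_score = current, s
    return best, best_score


INFORMATIVE = {1, 4, 7}  # toy ground truth, not real marker genes


def toy_score(subset):
    """Reward informative genes, penalise subset size."""
    return len(subset & INFORMATIVE) - 0.1 * len(subset)


print(tabu_search(10, toy_score))
```

    Accepting the best admissible move even when it is worse than the current subset is what lets the search leave local optima, while the tabu list prevents it from immediately undoing that move.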

  13. Discovery of Potent Myeloid Cell Leukemia 1 (Mcl-1) Inhibitors Using Fragment-Based Methods and Structure-Based Design

    Energy Technology Data Exchange (ETDEWEB)

    Friberg, Anders [Vanderbilt Univ. School of Medicine, Nashville, TN (United States); Vigil, Dominico [Vanderbilt Univ. School of Medicine, Nashville, TN (United States); Zhao, Bin [Vanderbilt Univ. School of Medicine, Nashville, TN (United States); Daniels, R. Nathan [Vanderbilt Univ. School of Medicine, Nashville, TN (United States); Burke, Jason P. [Vanderbilt Univ. School of Medicine, Nashville, TN (United States); Garcia-Barrantes, Pedro M. [Vanderbilt Univ. School of Medicine, Nashville, TN (United States); Camper, DeMarco [Vanderbilt Univ. School of Medicine, Nashville, TN (United States); Chauder, Brian A. [Vanderbilt Univ. School of Medicine, Nashville, TN (United States); Lee, Taekyu [Vanderbilt Univ. School of Medicine, Nashville, TN (United States); Olejniczak, Edward T. [Vanderbilt Univ. School of Medicine, Nashville, TN (United States); Fesik, Stephen W. [Vanderbilt Univ. School of Medicine, Nashville, TN (United States)

    2012-12-17

    Myeloid cell leukemia 1 (Mcl-1), a member of the Bcl-2 family of proteins, is overexpressed and amplified in various cancers and promotes the aberrant survival of tumor cells that otherwise would undergo apoptosis. Here we describe the discovery of potent and selective Mcl-1 inhibitors using fragment-based methods and structure-based design. NMR-based screening of a large fragment library identified two chemically distinct hit series that bind to different sites on Mcl-1. Members of the two fragment classes were merged together to produce lead compounds that bind to Mcl-1 with a dissociation constant of <100 nM with selectivity for Mcl-1 over Bcl-xL and Bcl-2. Structures of merged compounds when complexed to Mcl-1 were obtained by X-ray crystallography and provide detailed information about the molecular recognition of small-molecule ligands binding Mcl-1. The compounds represent starting points for the discovery of clinically useful Mcl-1 inhibitors for the treatment of a wide variety of cancers.

  14. REALIZING THE NEED FOR SIMILARITY BASED REASONING OF CLOUD SERVICE DISCOVERY

    Directory of Open Access Journals (Sweden)

    S. BHAMA

    2011-12-01

    Full Text Available With the growing abundance of information on the web, it becomes the need of the hour to enrich data with semantics that can be understood and processed by machines. Currently, much of the effort in the area of semantics is focused on the representation of semantic data and its reasoning, which is the processing of semantic information associated with that data. This paper aims at realizing the need for similarity based reasoning of cloud service discovery. It forms a basic requirement of a cloud client to discover the most appropriate cloud service from the list of available services published by service providers. Cloud ontology provides a set of concepts, individuals and relationships among them. The similarity among cloud services can be determined from the semantic similarity of concepts and hence the relevant service can be retrieved.

  15. Discovery of pyrrole-based hepatoselective ligands as potent inhibitors of HMG-CoA reductase.

    Science.gov (United States)

    Bratton, Larry D; Auerbach, Bruce; Choi, Chulho; Dillon, Lisa; Hanselman, Jeffrey C; Larsen, Scott D; Lu, Gina; Olsen, Karl; Pfefferkorn, Jeffrey A; Robertson, Andrew; Sekerke, Catherine; Trivedi, Bharat K; Unangst, Paul C

    2007-08-15

    In an effort to identify hepatoselective inhibitors of HMG-CoA reductase, two series of pyrroles were synthesized and evaluated. Efforts were made to modify (3R,5R)-7-[3-(4-fluorophenyl)-1-isopropyl-4-phenyl-5-phenylcarbamoyl-1H-pyrrol-2-yl]-3,5-dihydroxy-heptanoic acid sodium salt 30 in order to reduce its lipophilicity and therefore increase hepatoselectivity. Two strategies that were explored were replacement of the lipophilic 3-phenyl substituent with either a polar function (pyridyl series) or with lower alkyl substituents (lower alkyl series) and attachment of additional polar moieties at the 2-position of the pyrrole ring. One compound was identified to be both highly hepatoselective and active in vivo. We report the discovery, synthesis, and optimization of substituted pyrrole-based hepatoselective ligands as potent inhibitors of HMG-CoA reductase for reducing low density lipoprotein cholesterol (LDL-c) in the treatment of hypercholesterolemia. PMID:17560788

  16. Droplet-based microfluidics in drug discovery, transcriptomics and high-throughput molecular genetics.

    Science.gov (United States)

    Shembekar, Nachiket; Chaipan, Chawaree; Utharala, Ramesh; Merten, Christoph A

    2016-04-12

    Droplet-based microfluidics enables assays to be carried out at very high throughput (up to thousands of samples per second) and enables researchers to work with very limited material, such as primary cells, patient's biopsies or expensive reagents. An additional strength of the technology is the possibility to perform large-scale genotypic or phenotypic screens at the single-cell level. Here we critically review the latest developments in antibody screening, drug discovery and highly multiplexed genomic applications such as targeted genetic workflows, single-cell RNAseq and single-cell ChIPseq. Starting with a comprehensive introduction for non-experts, we pinpoint current limitations, analyze how they might be overcome and give an outlook on exciting future applications. PMID:27025767

  17. A review of Fuzzy Based QoS Web Service Discovery

    Directory of Open Access Journals (Sweden)

    R.Buvanesvari

    2013-03-01

    Full Text Available Recently, web services have become an important issue for developers. Selecting a specific service is a crucial task. Some approaches develop extensive description and publication mechanisms while others use syntactic, semantic, and structural reviews of Web service specifications. Finding the most suitable web service from a large collection of web services is crucial for the successful execution of applications. In many cases, the value of a QoS property may not be precisely defined. Recently, approaches to Web services that can deal with fuzzy constraints have been proposed. Fuzzy logic can therefore be applied to represent such imprecise QoS constraints. In this paper, we present an overview that focuses on developing fuzzy-based approaches for Web service discovery. This paper also summarizes and analyzes the challenges of applying fuzzy mechanisms to web services in order to assess their benefits and limitations.

  18. Dynamic Structure-Based Pharmacophore Model Development: A New and Effective Addition in the Histone Deacetylase 8 (HDAC8) Inhibitor Discovery

    Directory of Open Access Journals (Sweden)

    Keun Woo Lee

    2011-12-01

    Full Text Available Histone deacetylase 8 (HDAC8) is an enzyme involved in deacetylating the amino groups of terminal lysine residues, thereby repressing the transcription of various genes including tumor suppressor genes. The overexpression of HDAC8 was observed in many cancers and thus inhibition of this enzyme has emerged as an efficient cancer therapeutic strategy. In an effort to facilitate the future discovery of HDAC8 inhibitors, we developed two pharmacophore models containing six and five pharmacophoric features, respectively, using the representative structures from two molecular dynamics (MD) simulations performed in the Gromacs 4.0.5 package. Various analyses of trajectories obtained from MD simulations have displayed the changes upon inhibitor binding. Thus utilization of the dynamically-responded protein structures in pharmacophore development has the added advantage of considering the conformational flexibility of the protein. The MD trajectories were clustered based on the single-linkage method and representative structures were taken to be used in the pharmacophore model development. Active site complementing structure-based pharmacophore models were developed using the Discovery Studio 2.5 program and validated using a dataset of known HDAC8 inhibitors. Virtual screening of a chemical database coupled with a drug-like filter has identified drug-like hit compounds that match the pharmacophore models. Molecular docking of these hits reduced the false positives and identified two potential compounds to be used in future HDAC8 inhibitor design.

  19. Evidence-based gene predictions in plant genomes

    Science.gov (United States)

    Automated evidence-based gene building is a rapid and cost-effective way to provide reliable gene annotations on newly sequenced genomes. One of the limitations of evidence-based gene builders, however, is their requirement for gene expression evidence—known proteins, full-length cDNAs, or expressed...

  20. Comparative transcriptome analysis of testes and ovaries for the discovery of novel genes from Amur sturgeon (Acipenser schrenckii).

    Science.gov (United States)

    Jin, S B; Zhang, Y; Dong, X L; Xi, Q K; Song, D; Fu, H T; Sun, D J

    2015-01-01

    Sturgeons (Acipenser schrenckii) are of high evolutionary, economic, and conservation value, and caviar is one of the most valuable animal food products in the world. The Illumina HiSeq2000 sequencing platform was used to construct testicular and ovarian transcriptomes to identify genes involved in reproduction and sex determination in A. schrenckii. A total of 122,381 and 114,527 unigenes were obtained in the testicular and ovarian transcriptomes, respectively, with average lengths of 748 and 697 bp. A total of 46,179 genes were matched to the non-redundant nr database. GO (31,266), KEGG (39,712), and COG analyses (20,126) were performed to identify potential genes and their functions. Twenty-six gene families involved in reproduction and sex determination were identified from the A. schrenckii testicular and ovarian transcriptomes based on functional annotation of non-redundant transcripts and comparisons with the published literature. Furthermore, 1309 unigenes showed significant differences between the testes and ovaries, including 782 genes that were up-regulated in the testes and 527 that were up-regulated in the ovaries. Eleven genes were involved in reproduction and sex determination mechanisms. Furthermore, 19,065 simple sequence repeats (SSRs) were identified in the expressed sequence tag dataset, and 190,863 and 193,258 single nucleotide polymorphisms (SNPs) were obtained from the testicular and ovarian transcriptomic databases, respectively. This study provides new sequence information about A. schrenckii, which will provide a basis for the further study of reproduction and sex determination mechanisms in Acipenser species. The potential SSR and SNP markers isolated from the transcriptome may shed light on the evolution and molecular ecology of Acipenser species. PMID:26782541

  1. KBERG: KnowledgeBase for Estrogen Responsive Genes

    DEFF Research Database (Denmark)

    Tang, Suisheng; Zhang, Zhuo; Tan, Sin Lam;

    2007-01-01

    Estrogen has a profound impact on human physiology affecting transcription of numerous genes. To decipher functional characteristics of estrogen responsive genes, we developed KnowledgeBase for Estrogen Responsive Genes (KBERG). Genes in KBERG were derived from Estrogen Responsive Gene Database...... user-friendly system that provides links to other relevant resources such as ERGDB, UniGene, Entrez Gene, HomoloGene, GO, eVOC and GenBank, and thus offers a platform for functional exploration and potential annotation of genes responsive to estrogen. KBERG database can be accessed at http...

  2. Evidence Based Selection of Housekeeping Genes

    OpenAIRE

    de Jonge, Hendrik J.M.; Fehrmann, Rudolf S. N.; Eveline S. J. M. de Bont; Hofstra, Robert M. W.; Gerbens, Frans; Kamps, Willem A.; Vries, Elisabeth G. E.; van der Zee, Ate G.J.; te Meerman, Gerard J.; ter Elst, Arja

    2007-01-01

    For accurate and reliable gene expression analysis, normalization of gene expression data against housekeeping genes (reference or internal control genes) is required. It is known that commonly used housekeeping genes (e.g. ACTB, GAPDH, HPRT1, and B2M) vary considerably under different experimental conditions and therefore their use for normalization is limited. We performed a meta-analysis of 13,629 human gene array samples in order to identify the most stable expressed genes. Here we show n...

  3. Discovery of a Series of Acridinones as Mechanism-Based Tubulin Assembly Inhibitors with Anticancer Activity

    Science.gov (United States)

    Magalhaes, Luma G.; Marques, Fernando B.; da Fonseca, Marina B.; Rogério, Kamilla R.; Graebin, Cedric S.; Andricopulo, Adriano D.

    2016-01-01

    Microtubules play critical roles in vital cell processes, including cell growth, division, and migration. Microtubule-targeting small molecules are chemotherapeutic agents that are widely used in the treatment of cancer. Many of these compounds are structurally complex natural products (e.g., paclitaxel, vinblastine, and vincristine) with multiple stereogenic centers. Because of the scarcity of their natural sources and the difficulty of their partial or total synthesis, as well as problems related to their bioavailability, toxicity, and resistance, there is an urgent need for novel microtubule binding agents that are effective for treating cancer but do not have these disadvantages. In the present work, our lead discovery effort toward less structurally complex synthetic compounds led to the discovery of a series of acridinones inspired by the structure of podophyllotoxin, a natural product with important microtubule assembly inhibitory activity, as novel mechanism-based tubulin assembly inhibitors with potent anticancer properties and low toxicity. The compounds were evaluated in vitro by wound healing assays employing the metastatic and triple negative breast cancer cell line MDA-MB-231. Four compounds with IC50 values between 0.294 and 1.7 μM were identified. These compounds showed selective cytotoxicity against MDA-MB-231 and DU-145 cancer cell lines and promoted cell cycle arrest in G2/M phase and apoptosis. Consistent with molecular modeling results, the acridinones inhibited tubulin assembly in in vitro polymerization assays with IC50 values between 0.9 and 13 μM. Their binding to the colchicine-binding site of tubulin was confirmed through competitive assays. PMID:27508497

  4. Discovery of a Series of Acridinones as Mechanism-Based Tubulin Assembly Inhibitors with Anticancer Activity.

    Science.gov (United States)

    Magalhaes, Luma G; Marques, Fernando B; da Fonseca, Marina B; Rogério, Kamilla R; Graebin, Cedric S; Andricopulo, Adriano D

    2016-01-01

    Microtubules play critical roles in vital cell processes, including cell growth, division, and migration. Microtubule-targeting small molecules are chemotherapeutic agents that are widely used in the treatment of cancer. Many of these compounds are structurally complex natural products (e.g., paclitaxel, vinblastine, and vincristine) with multiple stereogenic centers. Because of the scarcity of their natural sources and the difficulty of their partial or total synthesis, as well as problems related to their bioavailability, toxicity, and resistance, there is an urgent need for novel microtubule binding agents that are effective for treating cancer but do not have these disadvantages. In the present work, our lead discovery effort toward less structurally complex synthetic compounds led to the discovery of a series of acridinones inspired by the structure of podophyllotoxin, a natural product with important microtubule assembly inhibitory activity, as novel mechanism-based tubulin assembly inhibitors with potent anticancer properties and low toxicity. The compounds were evaluated in vitro by wound healing assays employing the metastatic and triple negative breast cancer cell line MDA-MB-231. Four compounds with IC50 values between 0.294 and 1.7 μM were identified. These compounds showed selective cytotoxicity against MDA-MB-231 and DU-145 cancer cell lines and promoted cell cycle arrest in G2/M phase and apoptosis. Consistent with molecular modeling results, the acridinones inhibited tubulin assembly in in vitro polymerization assays with IC50 values between 0.9 and 13 μM. Their binding to the colchicine-binding site of tubulin was confirmed through competitive assays. PMID:27508497

  5. Discovery of Subtype Selective Janus Kinase (JAK) Inhibitors by Structure-Based Virtual Screening.

    Science.gov (United States)

    Bajusz, Dávid; Ferenczy, György G; Keserű, György M

    2016-01-25

    Janus kinase inhibitors represent a promising opportunity for the pharmaceutical intervention of various inflammatory and oncological indications. Subtype selective inhibition of these enzymes, however, is still a very challenging goal. In this study, a novel, customized virtual screening protocol was developed with the intention of providing an efficient tool for the discovery of subtype selective JAK2 inhibitors. The screening protocol involves protein ensemble-based docking calculations combined with an Interaction Fingerprint (IFP) based scoring scheme for estimating ligand affinities and selectivities, respectively. The methodology was validated in retrospective studies and was applied prospectively to screen a large database of commercially available compounds. Six compounds were identified and confirmed in vitro, with an indazole-based hit exhibiting promising selectivity for JAK2 vs JAK1. Having demonstrated that the described methodology is capable of identifying subtype selective chemical starting points with a favorable hit rate (11%), we believe that the presented screening concept can be useful for other kinase targets with challenging selectivity profiles. PMID:26682735

  6. Sensor Network-Based and User-Friendly User Location Discovery for Future Smart Homes

    Science.gov (United States)

    Ahvar, Ehsan; Lee, Gyu Myoung; Han, Son N.; Crespi, Noel; Khan, Imran

    2016-01-01

    User location is crucial context information for future smart homes where many location based services will be proposed. This location necessarily means that User Location Discovery (ULD) will play an important role in future smart homes. Concerns about privacy and the need to carry a mobile or a tag device within a smart home currently make conventional ULD systems uncomfortable for users. Future smart homes will need a ULD system to consider these challenges. This paper addresses the design of such a ULD system for context-aware services in future smart homes stressing the following challenges: (i) users’ privacy; (ii) device-/tag-free; and (iii) fault tolerance and accuracy. On the other hand, emerging new technologies, such as the Internet of Things, embedded systems, intelligent devices and machine-to-machine communication, are penetrating into our daily life with more and more sensors available for use in our homes. Considering this opportunity, we propose a ULD system that is capitalizing on the prevalence of sensors for the home while satisfying the aforementioned challenges. The proposed sensor network-based and user-friendly ULD system relies on different types of inexpensive sensors, as well as a context broker with a fuzzy-based decision-maker. The context broker receives context information from different types of sensors and evaluates that data using the fuzzy set theory. We demonstrate the performance of the proposed system by illustrating a use case, utilizing both an analytical model and simulation. PMID:27355951
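
    The fuzzy evaluation step described above can be illustrated with a minimal sketch. This is not the authors' decision-maker: the sensor types (motion and sound), the thresholds, and the function names `ramp_up`, `room_presence`, and `locate_user` are all hypothetical, chosen only to show how readings from inexpensive sensors can be mapped to membership degrees and combined.

```python
def ramp_up(x, low, high):
    """Membership degree in [0, 1]: 0 at or below low, 1 at or above high."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def room_presence(motion_level, sound_db):
    """Degree to which a room counts as 'occupied' (hypothetical thresholds)."""
    motion = ramp_up(motion_level, 0.2, 0.8)   # assumed normalized motion reading
    sound = ramp_up(sound_db, 30.0, 60.0)      # assumed sound level in dB
    # Fuzzy OR (max): either sensor alone can indicate presence, so one
    # silent or failed sensor does not hide the user.
    return max(motion, sound)


def locate_user(readings):
    """readings maps room name -> (motion_level, sound_db)."""
    scores = {room: room_presence(m, s) for room, (m, s) in readings.items()}
    best_room = max(scores, key=scores.get)
    return best_room, scores
```

    Taking the maximum is only one possible combination rule; a fuzzy AND (minimum) would demand agreement between sensors and trade fault tolerance for fewer false detections.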

  7. Sensor Network-Based and User-Friendly User Location Discovery for Future Smart Homes

    Directory of Open Access Journals (Sweden)

    Ehsan Ahvar

    2016-06-01

    User location is crucial context information for future smart homes where many location based services will be proposed. This location necessarily means that User Location Discovery (ULD) will play an important role in future smart homes. Concerns about privacy and the need to carry a mobile or a tag device within a smart home currently make conventional ULD systems uncomfortable for users. Future smart homes will need a ULD system to consider these challenges. This paper addresses the design of such a ULD system for context-aware services in future smart homes stressing the following challenges: (i) users’ privacy; (ii) device-/tag-free; and (iii) fault tolerance and accuracy. On the other hand, emerging new technologies, such as the Internet of Things, embedded systems, intelligent devices and machine-to-machine communication, are penetrating into our daily life with more and more sensors available for use in our homes. Considering this opportunity, we propose a ULD system that is capitalizing on the prevalence of sensors for the home while satisfying the aforementioned challenges. The proposed sensor network-based and user-friendly ULD system relies on different types of inexpensive sensors, as well as a context broker with a fuzzy-based decision-maker. The context broker receives context information from different types of sensors and evaluates that data using the fuzzy set theory. We demonstrate the performance of the proposed system by illustrating a use case, utilizing both an analytical model and simulation.

  8. Sensor Network-Based and User-Friendly User Location Discovery for Future Smart Homes.

    Science.gov (United States)

    Ahvar, Ehsan; Lee, Gyu Myoung; Han, Son N; Crespi, Noel; Khan, Imran

    2016-01-01

    User location is crucial context information for future smart homes where many location based services will be proposed. This location necessarily means that User Location Discovery (ULD) will play an important role in future smart homes. Concerns about privacy and the need to carry a mobile or a tag device within a smart home currently make conventional ULD systems uncomfortable for users. Future smart homes will need a ULD system to consider these challenges. This paper addresses the design of such a ULD system for context-aware services in future smart homes stressing the following challenges: (i) users' privacy; (ii) device-/tag-free; and (iii) fault tolerance and accuracy. On the other hand, emerging new technologies, such as the Internet of Things, embedded systems, intelligent devices and machine-to-machine communication, are penetrating into our daily life with more and more sensors available for use in our homes. Considering this opportunity, we propose a ULD system that is capitalizing on the prevalence of sensors for the home while satisfying the aforementioned challenges. The proposed sensor network-based and user-friendly ULD system relies on different types of inexpensive sensors, as well as a context broker with a fuzzy-based decision-maker. The context broker receives context information from different types of sensors and evaluates that data using the fuzzy set theory. We demonstrate the performance of the proposed system by illustrating a use case, utilizing both an analytical model and simulation. PMID:27355951

  9. Design Process Optimization Based on Design Process Gene Mapping

    Institute of Scientific and Technical Information of China (English)

    LI Bo; TONG Shu-rong

    2011-01-01

    The idea of genetic engineering is introduced into the area of product design to improve the design efficiency. A method towards design process optimization based on the design process gene is proposed through analyzing the correlation between the design process gene and characteristics of the design process. The concept of the design process gene is analyzed and categorized into five categories that are the task specification gene, the concept design gene, the overall design gene, the detailed design gene and the processing design gene in the light of five design phases. The elements and their interactions involved in each kind of design process gene signprocess gene mapping is drawn with its structure disclosed based on its function that process gene.

  10. Genomic sequence-based discovery of novel angucyclinone antibiotics from marine Streptomyces sp. W007.

    Science.gov (United States)

    Zhang, Hongyu; Wang, Hongbo; Wang, Yipeng; Cui, Hongli; Xie, Zeping; Pu, Yang; Pei, Shiqian; Li, Fuchao; Qin, Song

    2012-07-01

    A large number of novel bioactive compounds have been discovered from microbial secondary metabolites through traditional bioactivity screening. Recent fermentation studies indicated that the crude extract of marine Streptomyces sp. W007 possessed great potential in agricultural fungal disease control against Phomopsis asparagi, Polystigma deformans, Cladosporium cucumerinum, Monilinia fructicola, and Colletotrichum lagenarium. To further evaluate the biosynthetic potential of secondary metabolites, we sequenced the genome of Streptomyces sp. W007 and analyzed the identifiable secondary metabolite gene clusters. Moreover, one gene cluster with type II PKS implied the possibility of Streptomyces sp. W007 to produce aromatic polyketide of angucyclinone antibiotics. Therefore, two novel compounds, 3-hydroxy-1-keto-3-methyl-8-methoxy-1,2,3,4-tetrahydro-benz[α]anthracene and kiamycin with potent cytotoxicities against human cancer cell lines, were isolated from the culture broth of Streptomyces sp. W007. In addition, four other known angucyclinone antibiotics were obtained. The gene cluster for these angucyclinone antibiotics could be assigned to 20 genes. This work provides powerful evidence for the interplay between genomic analysis and traditional natural product isolation research. PMID:22536997

  11. Linear Discriminant Analysis - Based Estimation of the False Discovery Rate for Phosphopeptide Identifications

    Energy Technology Data Exchange (ETDEWEB)

    Du, Xiuxia; Yang, Feng; Manes, Nathan P.; Stenoien, David L.; Monroe, Matthew E.; Adkins, Joshua N.; States, David J.; Purvine, Samuel O.; Camp, David G.; Smith, Richard D.

    2008-07-03

    This paper describes a method to estimate the False Discovery Rate (FDR) of phosphopeptide identifications. The method starts with a re-assignment of the phosphorylation site/sites to those phosphopeptides for which there exists an ambiguity in the original assignment of the phosphorylation site/sites. It then performs an online data training using Expectation Maximization to estimate the joint distribution underlying the observed search results of multiple parameters from search engines. A Linear Discriminant Analysis (LDA) is subsequently carried out to optimally combine the search results into a discriminant score that possesses the most discriminating power. Based on the discriminant score, the p-value and q-value for each identified phosphopeptide are calculated, and the FDR for the set of phosphopeptides that are claimed as correct identifications can then be rigorously estimated based on its definition. The approach can be easily extended to estimate the FDR of unmodified peptides. The proposed approach has been applied to datasets from a study of the effect of high-dose radiation on human skin fibroblast cells. The data analysis procedure has been coded into a software package which is freely available.
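
    The abstract does not say how the q-values are derived from the p-values. As an illustration only, the sketch below uses the standard Benjamini-Hochberg procedure, which is one common way to do this and is not necessarily the authors' method. The function names `benjamini_hochberg_qvalues` and `accepted_at_fdr` are hypothetical.

```python
def benjamini_hochberg_qvalues(p_values):
    """Return Benjamini-Hochberg q-values in the same order as p_values."""
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the identifications, sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each q-value is the minimum of
    # p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def accepted_at_fdr(p_values, threshold=0.05):
    """Indices of identifications whose q-value is at or below threshold."""
    q_values = benjamini_hochberg_qvalues(p_values)
    return [i for i, q in enumerate(q_values) if q <= threshold]
```

    Under the assumptions of the procedure, accepting every identification with a q-value at or below the threshold keeps the expected proportion of false identifications in the accepted set at or below that threshold.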

  12. Computational Materials Science and Chemistry: Accelerating Discovery and Innovation through Simulation-Based Engineering and Science

    Energy Technology Data Exchange (ETDEWEB)

    Crabtree, George [Argonne National Lab. (ANL), Argonne, IL (United States); Glotzer, Sharon [University of Michigan; McCurdy, Bill [University of California Davis; Roberto, Jim [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2010-07-26

    This report is based on a SC Workshop on Computational Materials Science and Chemistry for Innovation on July 26-27, 2010, to assess the potential of state-of-the-art computer simulations to accelerate understanding and discovery in materials science and chemistry, with a focus on potential impacts in energy technologies and innovation. The urgent demand for new energy technologies has greatly exceeded the capabilities of today's materials and chemical processes. To convert sunlight to fuel, efficiently store energy, or enable a new generation of energy production and utilization technologies requires the development of new materials and processes of unprecedented functionality and performance. New materials and processes are critical pacing elements for progress in advanced energy systems and virtually all industrial technologies. Over the past two decades, the United States has developed and deployed the world's most powerful collection of tools for the synthesis, processing, characterization, and simulation and modeling of materials and chemical systems at the nanoscale, dimensions of a few atoms to a few hundred atoms across. These tools, which include world-leading x-ray and neutron sources, nanoscale science facilities, and high-performance computers, provide an unprecedented view of the atomic-scale structure and dynamics of materials and the molecular-scale basis of chemical processes. For the first time in history, we are able to synthesize, characterize, and model materials and chemical behavior at the length scale where this behavior is controlled. This ability is transformational for the discovery process and, as a result, confers a significant competitive advantage. Perhaps the most spectacular increase in capability has been demonstrated in high performance computing. Over the past decade, computational power has increased by a factor of a million due to advances in hardware and software. This rate of improvement, which shows no sign of

  13. Accelerating Gene Discovery by Phenotyping Whole-Genome Sequenced Multi-mutation Strains and Using the Sequence Kernel Association Test (SKAT).

    Science.gov (United States)

    Timbers, Tiffany A; Garland, Stephanie J; Mohan, Swetha; Flibotte, Stephane; Edgley, Mark; Muncaster, Quintin; Au, Vinci; Li-Leger, Erica; Rosell, Federico I; Cai, Jerry; Rademakers, Suzanne; Jansen, Gert; Moerman, Donald G; Leroux, Michel R

    2016-08-01

    Forward genetic screens represent powerful, unbiased approaches to uncover novel components in any biological process. Such screens suffer from a major bottleneck, however, namely the cloning of corresponding genes causing the phenotypic variation. Reverse genetic screens have been employed as a way to circumvent this issue, but can often be limited in scope. Here we demonstrate an innovative approach to gene discovery. Using C. elegans as a model system, we used a whole-genome sequenced multi-mutation library, from the Million Mutation Project, together with the Sequence Kernel Association Test (SKAT), to rapidly screen for and identify genes associated with a phenotype of interest, namely defects in dye-filling of ciliated sensory neurons. Such anomalies in dye-filling are often associated with the disruption of cilia, organelles which in humans are implicated in sensory physiology (including vision, smell and hearing), development and disease. Beyond identifying several well characterised dye-filling genes, our approach uncovered three genes not previously linked to ciliated sensory neuron development or function. From these putative novel dye-filling genes, we confirmed the involvement of BGNT-1.1 in ciliated sensory neuron function and morphogenesis. BGNT-1.1 functions at the trans-Golgi network of sheath cells (glia) to influence dye-filling and cilium length, in a cell non-autonomous manner. Notably, BGNT-1.1 is the orthologue of human B3GNT1/B4GAT1, a glycosyltransferase associated with Walker-Warburg syndrome (WWS). WWS is a multigenic disorder characterised by muscular dystrophy as well as brain and eye anomalies. Together, our work unveils an effective and innovative approach to gene discovery, and provides the first evidence that B3GNT1-associated Walker-Warburg syndrome may be considered a ciliopathy. PMID:27508411

  14. Accelerating Gene Discovery by Phenotyping Whole-Genome Sequenced Multi-mutation Strains and Using the Sequence Kernel Association Test (SKAT)

    Science.gov (United States)

    Garland, Stephanie J.; Mohan, Swetha; Flibotte, Stephane; Muncaster, Quintin; Cai, Jerry; Rademakers, Suzanne; Moerman, Donald G.; Leroux, Michel R.

    2016-01-01

    Forward genetic screens represent powerful, unbiased approaches to uncover novel components in any biological process. Such screens suffer from a major bottleneck, however, namely the cloning of corresponding genes causing the phenotypic variation. Reverse genetic screens have been employed as a way to circumvent this issue, but can often be limited in scope. Here we demonstrate an innovative approach to gene discovery. Using C. elegans as a model system, we used a whole-genome sequenced multi-mutation library, from the Million Mutation Project, together with the Sequence Kernel Association Test (SKAT), to rapidly screen for and identify genes associated with a phenotype of interest, namely defects in dye-filling of ciliated sensory neurons. Such anomalies in dye-filling are often associated with the disruption of cilia, organelles which in humans are implicated in sensory physiology (including vision, smell and hearing), development and disease. Beyond identifying several well characterised dye-filling genes, our approach uncovered three genes not previously linked to ciliated sensory neuron development or function. From these putative novel dye-filling genes, we confirmed the involvement of BGNT-1.1 in ciliated sensory neuron function and morphogenesis. BGNT-1.1 functions at the trans-Golgi network of sheath cells (glia) to influence dye-filling and cilium length, in a cell non-autonomous manner. Notably, BGNT-1.1 is the orthologue of human B3GNT1/B4GAT1, a glycosyltransferase associated with Walker-Warburg syndrome (WWS). WWS is a multigenic disorder characterised by muscular dystrophy as well as brain and eye anomalies. Together, our work unveils an effective and innovative approach to gene discovery, and provides the first evidence that B3GNT1-associated Walker-Warburg syndrome may be considered a ciliopathy. PMID:27508411

  15. Progress in Chimeric Vector and Chimeric Gene Based Cardiovascular Gene Therapy

    Institute of Scientific and Technical Information of China (English)

    HU Chun-Song; YOON Young-sup; ISNER Jeffrey M.; LOSORDO Douglas W.

    2003-01-01

    Gene therapy for cardiovascular diseases has developed from preliminary animal experiments to clinical trials. However, the vectors and target genes currently used in gene therapy are mainly viral or nonviral vectors and single target genes (monogenes). Each vector system has a series of advantages and limitations. Chimeric vectors, which combine the advantages of viral and nonviral vectors, chimeric target genes, which combine two or more target genes, and novel gene delivery modes are being developed. In this article, we summarize the progress in cardiovascular gene therapy based on chimeric vectors and chimeric genes, covering proliferative or occlusive vascular diseases such as atherosclerosis and restenosis, hypertonic vascular diseases such as hypertension, and cardiac diseases such as myocardial ischemia, dilated cardiomyopathy, and heart failure, as well as heart transplantation. The development of chimeric vectors, chimeric genes, and cardiovascular gene therapy based on them is promising.

  16. Gene discovery in EST sequences from the wheat leaf rust fungus Puccinia triticina sexual spores, asexual spores and haustoria, compared to other rust and corn smut fungi

    Directory of Open Access Journals (Sweden)

    Wynhoven Brian

    2011-03-01

    Abstract Background Rust fungi are biotrophic basidiomycete plant pathogens that cause major diseases on plants and trees world-wide, affecting agriculture and forestry. Their biotrophic nature precludes many established molecular genetic manipulations and lines of research. The generation of genomic resources for these microbes is leading to novel insights into biology such as interactions with the hosts and guiding directions for breakthrough research in plant pathology. Results To support gene discovery and gene model verification in the genome of the wheat leaf rust fungus, Puccinia triticina (Pt), we have generated Expressed Sequence Tags (ESTs) by sampling several life cycle stages. We focused on several spore stages and isolated haustorial structures from infected wheat, generating 17,684 ESTs. We produced sequences from both the sexual (pycniospores, aeciospores and teliospores) and asexual (germinated urediniospores) stages of the life cycle. From pycniospores and aeciospores, produced by infecting the alternate host, meadow rue (Thalictrum speciosissimum), 4,869 and 1,292 reads were generated, respectively. We generated 3,703 ESTs from teliospores produced on the senescent primary wheat host. Finally, we generated 6,817 reads from haustoria isolated from infected wheat as well as 1,003 sequences from germinated urediniospores. Along with 25,558 previously generated ESTs, we compiled a database of 13,328 non-redundant sequences (4,506 singlets and 8,822 contigs). Fungal genes were predicted using the EST version of the self-training GeneMarkS algorithm. To refine the EST database, we compared EST sequences by BLASTN to a set of 454 pyrosequencing-generated contigs and Sanger BAC-end sequences derived both from the Pt genome, and to ESTs and genome reads from wheat. A collection of 6,308 fungal genes was identified and compared to sequences of the cereal rusts, Puccinia graminis f. sp. tritici (Pgt) and stripe rust, P. striiformis f. sp

  17. Structure Based Discovery of Small Molecules to Regulate the Activity of Human Insulin Degrading Enzyme

    Science.gov (United States)

    Çakir, Bilal; Dağliyan, Onur; Dağyildiz, Ezgi; Bariş, İbrahim; Kavakli, Ibrahim Halil; Kizilel, Seda; Türkay, Metin

    2012-01-01

    Background Insulin-degrading enzyme (IDE) is an allosteric Zn2+ metalloprotease involved in the degradation of many peptides, including amyloid-β and insulin, which play key roles in Alzheimer's disease (AD) and type 2 diabetes mellitus (T2DM), respectively. Therefore, the use of therapeutic agents that regulate the activity of IDE would be a viable approach towards generating pharmaceutical treatments for these diseases. The crystal structure of IDE revealed that the N-terminal region has an exosite which is ∼30 Å away from the catalytic region and serves as a regulation site by orienting the substrates of IDE to the catalytic site. It is possible to find small molecules that bind to the exosite of IDE and enhance its proteolytic activity towards different substrates. Methodology/Principal Findings In this study, we applied a structure-based drug design method combined with experimental methods to discover four novel molecules that enhance the activity of human IDE. The novel compounds, designated as D3, D4, D6, and D10, enhanced IDE-mediated proteolysis of substrate V, insulin and amyloid-β, while enhanced degradation profiles were obtained towards substrate V and insulin in the presence of D10 only. Conclusion/Significance This paper describes the first examples of a computer-aided discovery of IDE regulators, showing that in vitro and in vivo activation of this important enzyme with small molecules is possible. PMID:22355395

  18. Structure based discovery of small molecules to regulate the activity of human insulin degrading enzyme.

    Directory of Open Access Journals (Sweden)

    Bilal Çakir

    BACKGROUND: Insulin-degrading enzyme (IDE) is an allosteric Zn2+ metalloprotease involved in the degradation of many peptides, including amyloid-β and insulin, which play key roles in Alzheimer's disease (AD) and type 2 diabetes mellitus (T2DM), respectively. Therefore, the use of therapeutic agents that regulate the activity of IDE would be a viable approach towards generating pharmaceutical treatments for these diseases. The crystal structure of IDE revealed that the N-terminal region has an exosite which is ∼30 Å away from the catalytic region and serves as a regulation site by orienting the substrates of IDE to the catalytic site. It is possible to find small molecules that bind to the exosite of IDE and enhance its proteolytic activity towards different substrates. METHODOLOGY/PRINCIPAL FINDINGS: In this study, we applied a structure-based drug design method combined with experimental methods to discover four novel molecules that enhance the activity of human IDE. The novel compounds, designated as D3, D4, D6, and D10, enhanced IDE-mediated proteolysis of substrate V, insulin and amyloid-β, while enhanced degradation profiles were obtained towards substrate V and insulin in the presence of D10 only. CONCLUSION/SIGNIFICANCE: This paper describes the first examples of a computer-aided discovery of IDE regulators, showing that in vitro and in vivo activation of this important enzyme with small molecules is possible.

  19. A magnetic bead-based ligand binding assay to facilitate human kynurenine 3-monooxygenase drug discovery.

    Science.gov (United States)

    Wilson, Kris; Mole, Damian J; Homer, Natalie Z M; Iredale, John P; Auer, Manfred; Webster, Scott P

    2015-02-01

    Human kynurenine 3-monooxygenase (KMO) is emerging as an important drug target enzyme in a number of inflammatory and neurodegenerative disease states. Recombinant protein production of KMO, and therefore discovery of KMO ligands, is challenging due to a large membrane targeting domain at the C-terminus of the enzyme that causes stability, solubility, and purification difficulties. The purpose of our investigation was to develop a suitable screening method for targeting human KMO and other similarly challenging drug targets. Here, we report the development of a magnetic bead-based binding assay using mass spectrometry detection for human KMO protein. The assay incorporates isolation of FLAG-tagged KMO enzyme on protein A magnetic beads. The protein-bound beads are incubated with potential binding compounds before specific cleavage of the protein-compound complexes from the beads. Mass spectrometry analysis is used to identify the compounds that demonstrate specific binding affinity for the target protein. The technique was validated using known inhibitors of KMO. This assay is a robust alternative to traditional ligand-binding assays for challenging protein targets, and it overcomes specific difficulties associated with isolating human KMO. PMID:25296660

  20. Gun possession among American youth: a discovery-based approach to understand gun violence.

    Directory of Open Access Journals (Sweden)

    Kelly V Ruggles

    Full Text Available OBJECTIVE: To apply discovery-based computational methods to nationally representative data from the Centers for Disease Control and Prevention's Youth Risk Behavior Surveillance System to better understand and visualize the behavioral factors associated with gun possession among adolescent youth. RESULTS: Our study uncovered the multidimensional nature of gun possession across nearly five million unique data points over a ten year period (2001-2011). Specifically, we automated odds ratio calculations for 55 risk behaviors to assemble a comprehensive table of associations for every behavior combination. Downstream analyses included the hierarchical clustering of risk behaviors based on their association "fingerprint" to (1) visualize and assess which behaviors frequently co-occur and (2) evaluate which risk behaviors are consistently found to be associated with gun possession. From these analyses, we identified more than 40 behavioral factors, including heroin use, using snuff on school property, having been injured in a fight, and having been a victim of sexual violence, that have been and continue to be strongly associated with gun possession. Additionally, we identified six behavioral clusters based on association similarities: (1) physical activity and nutrition; (2) disordered eating, suicide and sexual violence; (3) weapon carrying and physical safety; (4) alcohol, marijuana and cigarette use; (5) drug use on school property and (6) overall drug use. CONCLUSIONS: Use of computational methodologies identified multiple risk behaviors, beyond more commonly discussed indicators of poor mental health, that are associated with gun possession among youth. Implications for prevention efforts and future interdisciplinary work applying computational methods to behavioral science data are described.
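
    The automated odds ratio table described in this record can be illustrated with a minimal sketch. It assumes each survey response is a dictionary of yes/no answers; the function names, behavior labels, and toy records are hypothetical and are not taken from the study.

```python
from itertools import combinations


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a: both behaviors present, b: first only,
    c: second only,           d: neither.
    """
    if b * c == 0:
        # Undefined or infinite when an off-diagonal cell is empty.
        return float("inf") if a * d > 0 else float("nan")
    return (a * d) / (b * c)


def pairwise_odds_ratios(records, behaviors):
    """Build a table of odds ratios for every pair of behaviors."""
    table = {}
    for x, y in combinations(behaviors, 2):
        a = sum(1 for r in records if r[x] and r[y])
        b = sum(1 for r in records if r[x] and not r[y])
        c = sum(1 for r in records if not r[x] and r[y])
        d = sum(1 for r in records if not r[x] and not r[y])
        table[(x, y)] = odds_ratio(a, b, c, d)
    return table


# Toy data (hypothetical): 3 both, 1 gun only, 2 fight only, 6 neither.
toy_records = (
    [{"gun": True, "fight": True}] * 3
    + [{"gun": True, "fight": False}] * 1
    + [{"gun": False, "fight": True}] * 2
    + [{"gun": False, "fight": False}] * 6
)
toy_table = pairwise_odds_ratios(toy_records, ["gun", "fight"])
```

    Each row of such a table could then serve as the association "fingerprint" that the abstract feeds into hierarchical clustering; the clustering step itself is not shown here.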

  1. Optimal design of cluster-based ad-hoc networks using probabilistic solution discovery

    International Nuclear Information System (INIS)

    The reliability of ad-hoc networks is gaining popularity in two areas: as a topic of academic interest and as a key performance parameter for defense systems employing this type of network. The ad-hoc network is dynamic and scalable and these descriptions are what attract its users. However, these descriptions are also synonymous with undefined and unpredictable when considering the impacts to the reliability of the system. The configuration of an ad-hoc network changes continuously and this fact implies that no single mathematical expression or graphical depiction can describe the system reliability-wise. Previous research has used mobility and stochastic models to address this challenge successfully. In this paper, the authors leverage the stochastic approach and build upon it a probabilistic solution discovery (PSD) algorithm to optimize the topology for a cluster-based mobile ad-hoc wireless network (MAWN). Specifically, the membership of nodes within the back-bone network or networks will be assigned in such a way as to maximize reliability subject to a constraint on cost. The constraint may also be considered as a non-monetary cost, such as weight, volume, power, or the like. When a cost is assigned to each component, a maximum cost threshold is assigned to the network, and the method is run; the result is an optimized allocation of the radios enabling back-bone network(s) to provide the most reliable network possible without exceeding the allowable cost. The method is intended for use directly as part of the architectural design process of a cluster-based MAWN to efficiently determine an optimal or near-optimal design solution. It is capable of optimizing the topology based upon all-terminal reliability (ATR), all-operating terminal reliability (AoTR), or two-terminal reliability (2TR)
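
    As a rough illustration of one of the reliability measures named in this record, the sketch below computes two-terminal reliability (2TR) for a very small network by exhaustive enumeration of link states. It is not the PSD algorithm from the paper; it assumes independent link failures, and the function names and toy networks are hypothetical.

```python
from itertools import product


def _connected(adjacency, source, target):
    """Depth-first search over the links that are currently up."""
    seen = {source}
    stack = [source]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def two_terminal_reliability(links, source, target):
    """Probability that source can reach target.

    links maps (u, v) -> probability that the undirected link works.
    Enumerates all 2**len(links) states, so only suitable for tiny networks.
    """
    names = list(links)
    total = 0.0
    for states in product([True, False], repeat=len(names)):
        prob = 1.0
        adjacency = {}
        for (u, v), up in zip(names, states):
            p = links[(u, v)]
            prob *= p if up else (1.0 - p)
            if up:
                adjacency.setdefault(u, set()).add(v)
                adjacency.setdefault(v, set()).add(u)
        if _connected(adjacency, source, target):
            total += prob
    return total


# Two links in series, and two redundant links in parallel (toy values).
series = two_terminal_reliability({("s", "m"): 0.9, ("m", "t"): 0.9}, "s", "t")
parallel = two_terminal_reliability({("s", "t"): 0.9, ("t", "s"): 0.9}, "s", "t")
```

    An optimizer of the kind the abstract describes would evaluate a measure like this for each candidate allocation and discard allocations whose total cost exceeds the threshold.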

  2. Immunophenotype Discovery, Hierarchical Organization, and Template-Based Classification of Flow Cytometry Samples

    Science.gov (United States)

    Azad, Ariful; Rajwa, Bartek; Pothen, Alex

    2016-01-01

    We describe algorithms for discovering immunophenotypes from large collections of flow cytometry samples and using them to organize the samples into a hierarchy based on phenotypic similarity. The hierarchical organization is helpful for effective and robust cytometry data mining, including the creation of collections of cell populations characteristic of different classes of samples, robust classification, and anomaly detection. We summarize a set of samples belonging to a biological class or category with a statistically derived template for the class. Whereas individual samples are represented in terms of their cell populations (clusters), a template consists of generic meta-populations (a group of homogeneous cell populations obtained from the samples in a class) that describe key phenotypes shared among all those samples. We organize an FC data collection in a hierarchical data structure that supports the identification of immunophenotypes relevant to clinical diagnosis. A robust template-based classification scheme is also developed, but our primary focus is on the discovery of phenotypic signatures and inter-sample relationships in an FC data collection. This collective analysis approach is more efficient and robust since templates describe phenotypic signatures common to cell populations in several samples while ignoring noise and small sample-specific variations. We have applied the template-based scheme to analyze several datasets, including one representing a healthy immune system and one of acute myeloid leukemia (AML) samples. The last task is challenging due to the phenotypic heterogeneity of the several subtypes of AML. However, we identified thirteen immunophenotypes corresponding to subtypes of AML and were able to distinguish acute promyelocytic leukemia (APL) samples with the markers provided. Clinically, this is helpful since APL has a different treatment regimen from other subtypes of AML. Core algorithms used in our data analysis are

  3. Recent progress in polymer-based gene delivery vectors

    Institute of Scientific and Technical Information of China (English)

    HUANG Shiwen; ZHUO Renxi

    2003-01-01

    The gene delivery system is one of the three components of a gene medicine and is the bottleneck of current gene therapy. Nonviral vectors offer advantages over viral systems in safety, ease of manufacturing, etc. As important nonviral vectors, polymer gene delivery systems have gained increasing attention and have begun to show increasing promise. In this review, the fundamentals of and recent progress in polymer-based gene delivery vectors are reviewed.

  4. Evidence based selection of housekeeping genes.

    Directory of Open Access Journals (Sweden)

    Hendrik J M de Jonge

    Full Text Available For accurate and reliable gene expression analysis, normalization of gene expression data against housekeeping genes (reference or internal control genes) is required. It is known that commonly used housekeeping genes (e.g. ACTB, GAPDH, HPRT1, and B2M) vary considerably under different experimental conditions and therefore their use for normalization is limited. We performed a meta-analysis of 13,629 human gene array samples in order to identify the most stable expressed genes. Here we show novel candidate housekeeping genes (e.g. RPS13, RPL27, RPS20 and OAZ1) with enhanced stability among a multitude of different cell types and varying experimental conditions. None of the commonly used housekeeping genes were present in the top 50 of the most stable expressed genes. In addition, using 2,543 diverse mouse gene array samples we were able to confirm the enhanced stability of the candidate novel housekeeping genes in another mammalian species. Therefore, the identified novel candidate housekeeping genes seem to be the most appropriate choice for normalizing gene expression data.
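
    The idea of ranking genes by expression stability can be sketched as follows. The abstract does not state which stability measure was used, so this example assumes the coefficient of variation (standard deviation divided by mean) across samples; the gene labels and expression values are hypothetical placeholders.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean expression is zero; CV is undefined")
    return pstdev(values) / m


def rank_by_stability(expression):
    """Return gene names ordered from most stable (lowest CV) to least."""
    return sorted(expression, key=lambda g: coefficient_of_variation(expression[g]))


# Toy expression values across four samples (hypothetical).
toy_expression = {
    "GENE_A": [10, 10, 10, 10],  # perfectly stable
    "GENE_B": [5, 15, 5, 15],    # highly variable
    "GENE_C": [9, 11, 9, 11],    # mildly variable
}
ranking = rank_by_stability(toy_expression)
```

    Under this assumption, the genes at the head of the ranking would be the candidates for normalization.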

  5. Target-based vs. phenotypic screenings in Leishmania drug discovery: A marriage of convenience or a dialogue of the deaf?

    OpenAIRE

    Reguera, Rosa M.; Estefanía Calvo-Álvarez; Raquel Álvarez-Velilla; Rafael Balaña-Fouce

    2014-01-01

    Drug discovery programs sponsored by public or private initiatives pursue the same ambitious goal: a crushing defeat of major Neglected Tropical Diseases (NTDs) during this decade. Both target-based and target-free screenings have pros and cons when it comes to finding potential small-molecule leads among chemical libraries consisting of myriads of compounds. Within the target-based strategy, crystals of pathogen recombinant-proteins are being used to obtain three-dimensional (3D) structures ...

  6. Ensemble-Based Virtual Screening Led to the Discovery of New Classes of Potent Tyrosinase Inhibitors.

    Science.gov (United States)

    Choi, Joonhyeok; Choi, Kwang-Eun; Park, Sung Jean; Kim, Sun Yeou; Jee, Jun-Goo

    2016-02-22

    In this study, we report new classes of potent tyrosinase inhibitors identified by enhanced structure-based virtual screening prediction; the enzyme and melanin content assays were also confirmed. Tyrosinase, a type-3 copper protein, participates in two distinct reactions, hydroxylation of tyrosine to DOPA and conversion of DOPA to dopaquinone, in melanin biosynthesis. Although numerous inhibitors of this reaction have been reported, there is a lag in the discovery of the new functional moieties. In order to improve the performance of virtual screening, we first produced an ensemble of 10,000 structures using molecular dynamics simulation. Quantum mechanical calculation was used to determine the partial charges of catalytic copper ions based on the met and deoxy states. Second, we selected a structure showing an optimal receiver operating characteristic (ROC) curve with known direct binders and their physicochemically matched decoys. The structure revealed more than 10-fold higher enrichment at 1% of the ROC curve than those observed in X-ray structures. Third, high-throughput virtual screening with DOCK 3.6 was performed using a library consisting of approximately 400,000 small molecules derived from the ZINC database. Fourth, we obtained the top 60 molecules and tested their inhibition of mushroom tyrosinase. The extended assays included 21 analogs of the 21 initial hits to test their inhibition properties. Here, the moieties of tetrazole and triazole were identified as new binding cores interacting with the dicopper catalytic center. All 42 inhibitors showed inhibitory constant (Ki) values ranging from 11.1 nM to 33.4 μM, with a tetrazole compound exhibiting the strongest activity. Among the 42 molecules, five displayed more than 30% reduction in melanin production when treated in B16F10 melanoma cells; cell viability was >90% at 20 μM. Particularly, a thiosemicarbazone-containing compound reduced melanin content by 55%. PMID:26750991

  7. Application of multiple statistical tests to enhance mass spectrometry-based biomarker discovery

    Directory of Open Access Journals (Sweden)

    Garner Harold R

    2009-05-01

    Full Text Available Abstract Background Mass spectrometry-based biomarker discovery has long been hampered by the difficulty in reconciling lists of discriminatory peaks identified by different laboratories for the same diseases studied. We describe a multi-statistical analysis procedure that combines several independent computational methods. This approach capitalizes on the strengths of each to analyze the same high-resolution mass spectral data set to discover consensus differential mass peaks that should be robust biomarkers for distinguishing between disease states. Results The proposed methodology was applied to a pilot narcolepsy study using logistic regression, hierarchical clustering, t-test, and CART. Consensus, differential mass peaks with high predictive power were identified across three of the four statistical platforms. Based on the diagnostic accuracy measures investigated, the performance of the consensus-peak model was a compromise between logistic regression and CART, which produced better models than hierarchical clustering and t-test. However, consensus peaks confer a higher level of confidence in their ability to distinguish between disease states since they do not represent peaks that are a result of biases to a particular statistical algorithm. Instead, they were selected as differential across differing data distribution assumptions, demonstrating their true discriminatory potential. Conclusion The methodology described here is applicable to any high-resolution MALDI mass spectrometry-derived data set with minimal mass drift which is essential for peak-to-peak comparison studies. Four statistical approaches with differing data distribution assumptions were applied to the same raw data set to obtain consensus peaks that were found to be statistically differential between the two groups compared. These consensus peaks demonstrated high diagnostic accuracy when used to form a predictive model as evaluated by receiver operating characteristics
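
    The consensus step in this record, keeping only peaks flagged by several independent methods, can be sketched as a simple vote count. The example assumes each method has already produced its own list of differential peaks; the peak labels (binned m/z values) and the function name are hypothetical.

```python
from collections import Counter


def consensus_peaks(selected_by_method, min_methods):
    """Return peaks flagged as differential by at least min_methods methods."""
    counts = Counter()
    for peaks in selected_by_method.values():
        # set() ensures a method votes at most once per peak.
        counts.update(set(peaks))
    return sorted(p for p, n in counts.items() if n >= min_methods)


# Toy selections from the four method families named in the abstract.
toy_selection = {
    "logistic_regression": [1012, 2045, 3300],
    "hierarchical_clustering": [1012, 4100],
    "t_test": [1012, 2045, 5000],
    "cart": [2045, 1012],
}
toy_consensus = consensus_peaks(toy_selection, min_methods=3)
```

    Requiring agreement from three of four methods mirrors the threshold reported in the abstract; a stricter or looser threshold is a design choice that trades sensitivity for confidence.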

  8. [Discovery-based teaching and learning strategies in health: problematization and problem-based learning].

    Science.gov (United States)

    Cyrino, Eliana Goldfarb; Toralles-Pereira, Maria Lúcia

    2004-01-01

    Considering the changes in teaching in the health field and the demand for new ways of dealing with knowledge in higher learning, the article discusses two innovative methodological approaches: problem-based learning (PBL) and problematization. Describing the two methods' theoretical roots, the article attempts to identify their main foundations. As distinct proposals, both contribute to a review of the teaching and learning process: problematization, focused on knowledge construction in the context of the formation of a critical awareness; PBL, focused on cognitive aspects in the construction of concepts and appropriation of basic mechanisms in science. Both problematization and PBL lead to breaks with the traditional way of teaching and learning, stimulating participatory management by actors in the experience and reorganization of the relationship between theory and practice. The critique of each proposal's possibilities and limits using the analysis of their theoretical and methodological foundations leads us to conclude that pedagogical experiences based on PBL and/or problematization can represent an innovative trend in the context of health education, fostering breaks and more sweeping changes. PMID:15263989

  9. Random SNPs Discovery from Genome and Target Gene of Turbot (Scophthalmus maximus) by Using Magnetic Beads

    Institute of Scientific and Technical Information of China (English)

    董晓丽; 徐建勇; 陈松林

    2013-01-01

    A magnetic bead enrichment method was used to randomly discover SNPs in turbot (Scophthalmus maximus) genomic DNA. Thirty-five SNPs were detected in 7820 bp of DNA sequence, a frequency of about 0.448%. More than 68% of the SNPs were caused by base transitions and fewer than 29% resulted from transversions. Seven SNPs were found in a 725 bp fragment of the KC70 gene. The findings suggest that the method is useful not only for random SNP discovery across the genome but also for SNP discovery in target genes.
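
    The SNP frequency quoted in this record follows directly from the counts it reports, as the short check below shows. The function name is hypothetical; the input numbers are the ones given in the abstract.

```python
def snp_density_percent(snp_count, sequence_length_bp):
    """SNPs per base pair, expressed as a percentage."""
    if sequence_length_bp <= 0:
        raise ValueError("sequence length must be positive")
    return 100.0 * snp_count / sequence_length_bp


# Genome-wide random screen: 35 SNPs in 7820 bp.
genome_density = snp_density_percent(35, 7820)

# Target gene KC70: 7 SNPs in a 725 bp fragment.
kc70_density = snp_density_percent(7, 725)
```

    The first value rounds to the 0.448% stated in the abstract; the second, roughly 0.97%, is derived here and is not a figure reported by the authors.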

  10. Automated discovery of tissue-targeting enhancers and transcription factors from binding motif and gene function data.

    Directory of Open Access Journals (Sweden)

    Geetu Tuteja

    2014-01-01

    Full Text Available Identifying enhancers regulating gene expression remains an important and challenging task. While recent sequencing-based methods provide epigenomic characteristics that correlate well with enhancer activity, it remains onerous to comprehensively identify all enhancers across development. Here we introduce a computational framework to identify tissue-specific enhancers evolving under purifying selection. First, we incorporate high-confidence binding site predictions with target gene functional enrichment analysis to identify transcription factors (TFs) likely functioning in a particular context. We then search the genome for clusters of binding sites for these TFs, overcoming previous constraints associated with biased manual curation of TFs or enhancers. Applying our method to the placenta, we find 33 known and implicate 17 novel TFs in placental function, and discover 2,216 putative placenta enhancers. Using luciferase reporter assays, 31/36 (86%) tested candidates drive activity in placental cells. Our predictions agree well with recent epigenomic data in human and mouse, yet over half our loci, including 7/8 (87%) tested regions, are novel. Finally, we establish that our method is generalizable by applying it to 5 additional tissues: heart, pancreas, blood vessel, bone marrow, and liver.

  11. New insight into genes in association with asthma: literature-based mining and network centrality analysis

    Institute of Scientific and Technical Information of China (English)

    LIANG Rui; WANG Lei; WANG Gang

    2013-01-01

    Background Asthma is a heterogeneous disease for which a strong genetic basis has been firmly established. Until now no studies have been undertaken to systemically explore the network of asthma-related genes using an internally developed literature-based discovery approach. This study was to explore asthma-related genes by using literature-based mining and network centrality analysis. Methods Literature involving asthma-related genes were searched in PubMed from 2001 to 2011. Integration of natural language processing with network centrality analysis was used to identify asthma susceptibility genes and their interaction network. Asthma susceptibility genes were classified into three functional groups by gene ontology (GO) analysis and the key genes were confirmed by establishing asthma-related networks and pathways. Results Three hundred and twenty-six genes related with asthma such as IGHE (IgE), interleukin (IL)-4, 5, 6, 10, 13, 17A, and tumor necrosis factor (TNF)-alpha were identified. GO analysis indicated some biological processes (developmental processes, signal transduction, death, etc.), cellular components (non-structural extracellular, plasma membrane and extracellular matrix), and molecular functions (signal transduction activity) that were involved in asthma. Furthermore, 22 asthma-related pathways such as the Toll-like receptor signaling pathway, hematopoietic cell lineage, JAK-STAT signaling pathway, chemokine signaling pathway, and cytokine-cytokine receptor interaction, and 17 hub genes, such as JAK3, CCR1-3, CCR5-7, CCR8, were found. Conclusions Our study provides a remarkably detailed and comprehensive picture of asthma susceptibility genes and their interacting network. Further identification of these genes and molecular pathways may play a prominent role in establishing rational therapeutic approaches for asthma.
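
    Hub genes of the kind reported in this record are typically found by scoring each node of an interaction network. The abstract does not say which centrality measure was used, so the sketch below assumes degree centrality (number of neighbors divided by the maximum possible); the gene labels G1 to G4 and the edge list are hypothetical placeholders, not interactions from the study.

```python
def degree_centrality(edges):
    """Degree centrality for an undirected network given as (u, v) pairs."""
    neighbors = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def top_hubs(edges, k):
    """Return the k highest-scoring genes, ties broken alphabetically."""
    scores = degree_centrality(edges)
    return sorted(scores, key=lambda g: (-scores[g], g))[:k]


# Toy network: G1 interacts with every other gene.
toy_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3")]
toy_scores = degree_centrality(toy_edges)
toy_hubs = top_hubs(toy_edges, 1)
```

    Other measures, such as betweenness or closeness centrality, could rank the same network differently, which is why the choice of measure matters when comparing hub lists.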

  12. Analysis of tumor suppressor genes based on gene ontology and the KEGG pathway.

    Directory of Open Access Journals (Sweden)

    Jing Yang

    Full Text Available Cancer is a serious disease that causes many deaths every year. We urgently need to design effective treatments to cure this disease. Tumor suppressor genes (TSGs) are a type of gene that can protect cells from becoming cancerous. In view of this, correct identification of TSGs is an alternative method for identifying effective cancer therapies. In this study, we performed gene ontology (GO) and pathway enrichment analysis of the TSGs and non-TSGs. Some popular feature selection methods, including minimum redundancy maximum relevance (mRMR) and incremental feature selection (IFS), were employed to analyze the enrichment features. Accordingly, some GO terms and KEGG pathways, such as biological adhesion, cell cycle control, genomic stability maintenance and cell death regulation, were extracted, which are important factors for identifying TSGs. We hope these findings can help in building effective prediction methods for identifying TSGs and thereby, promoting the discovery of effective cancer treatments.

  13. Cultivation of hard-to-culture subsurface mercury-resistant bacteria and discovery of new merA gene sequences

    DEFF Research Database (Denmark)

    Rasmussen, L D; Zawadsky, C; Binnerup, S J;

    2008-01-01

    sequencing of merA of selected isolates led to the discovery of new merA sequences. With phylum-specific merA primers, PCR products were obtained for Alpha- and Betaproteobacteria and Actinobacteria but not for Bacteroidetes and Firmicutes. The similarity to known sequences ranged between 89 and 95%. One of...

  14. A knowledge-based clustering algorithm driven by Gene Ontology.

    Science.gov (United States)

    Cheng, Jill; Cline, Melissa; Martin, John; Finkelstein, David; Awad, Tarif; Kulp, David; Siani-Rose, Michael A

    2004-08-01

    We have developed an algorithm for inferring the degree of similarity between genes by using the graph-based structure of Gene Ontology (GO). We applied this knowledge-based similarity metric to a clique-finding algorithm for detecting sets of related genes with biological classifications. We also combined it with an expression-based distance metric to produce a co-cluster analysis, which accentuates genes with both similar expression profiles and similar biological characteristics and identifies gene clusters that are more stable and biologically meaningful. These algorithms are demonstrated in the analysis of MPRO cell differentiation time series experiments. PMID:15468759

  15. An agent-based peer-to-peer architecture for semantic discovery of manufacturing services across virtual enterprises

    Science.gov (United States)

    Zhang, Wenyu; Zhang, Shuai; Cai, Ming; Jian, Wu

    2015-04-01

    With the development of the virtual enterprise (VE) paradigm, the usage of service-oriented architecture (SOA) is increasingly being considered for facilitating the integration and utilisation of distributed manufacturing resources. However, due to the heterogeneous nature among VEs, the dynamic nature of a VE and the autonomous nature of each VE member, the lack of both sophisticated coordination mechanism in the popular centralised infrastructure and semantic expressivity in the existing SOA standards make the current centralised, syntactic service discovery method undesirable. This motivates the proposed agent-based peer-to-peer (P2P) architecture for semantic discovery of manufacturing services across VEs. Multi-agent technology provides autonomous and flexible problem-solving capabilities in dynamic and adaptive VE environments. Peer-to-peer overlay provides highly scalable coupling across decentralised VEs, each of which acts as a peer composed of multiple agents dealing with manufacturing services. The proposed architecture utilises a novel, efficient, two-stage search strategy - semantic peer discovery and semantic service discovery - to handle the complex searches of manufacturing services across VEs through fast peer filtering. The operation and experimental evaluation of the prototype system are presented to validate the implementation of the proposed approach.

  16. Combining Metabolite-Based Pharmacophores with Bayesian Machine Learning Models for Mycobacterium tuberculosis Drug Discovery

    Science.gov (United States)

    Sarker, Malabika; Li, Shao-Gang; Mittal, Nisha; Kumar, Pradeep; Wang, Xin; Stratton, Thomas P.; Zimmerman, Matthew; Talcott, Carolyn; Bourbon, Pauline; Travers, Mike; Yadav, Maneesh

    2015-01-01

    Integrated computational approaches for Mycobacterium tuberculosis (Mtb) are useful to identify new molecules that could lead to future tuberculosis (TB) drugs. Our approach uses information derived from the TBCyc pathway and genome database, the Collaborative Drug Discovery TB database combined with 3D pharmacophores and dual event Bayesian models of whole-cell activity and lack of cytotoxicity. We have prioritized a large number of molecules that may act as mimics of substrates and metabolites in the TB metabolome. We computationally searched over 200,000 commercial molecules using 66 pharmacophores based on substrates and metabolites from Mtb and further filtering with Bayesian models. We ultimately tested 110 compounds in vitro that resulted in two compounds of interest, BAS 04912643 and BAS 00623753 (MIC of 2.5 and 5 μg/mL, respectively). These molecules were used as a starting point for hit-to-lead optimization. The most promising class proved to be the quinoxaline di-N-oxides, evidenced by transcriptional profiling to induce mRNA level perturbations most closely resembling known protonophores. One of these, SRI58 exhibited an MIC = 1.25 μg/mL versus Mtb and a CC50 in Vero cells of >40 μg/mL, while featuring fair Caco-2 A-B permeability (2.3 × 10⁻⁶ cm/s), kinetic solubility (125 μM at pH 7.4 in PBS) and mouse metabolic stability (63.6% remaining after 1 h incubation with mouse liver microsomes). Despite demonstration of how a combined bioinformatics/cheminformatics approach afforded a small molecule with promising in vitro profiles, we found that SRI58 did not exhibit quantifiable blood levels in mice. PMID:26517557

  17. Common minor histocompatibility antigen discovery based upon patient clinical outcomes and genomic data.

    Directory of Open Access Journals (Sweden)

    Paul M Armistead

    Full Text Available BACKGROUND: Minor histocompatibility antigens (mHA) mediate much of the graft vs. leukemia (GvL) effect and graft vs. host disease (GvHD) in patients who undergo allogeneic stem cell transplantation (SCT). Therapeutic decision making and treatments based upon mHAs will require the evaluation of multiple candidate mHAs and the selection of those with the potential to have the greatest impact on clinical outcomes. We hypothesized that common, immunodominant mHAs, which are presented by HLA-A, B, and C molecules, can mediate clinically significant GvL and/or GvHD, and that these mHAs can be identified through association of genomic data with clinical outcomes. METHODOLOGY/PRINCIPAL FINDINGS: Because most mHAs result from donor/recipient cSNP disparities, we genotyped 57 myeloid leukemia patients and their donors at 13,917 cSNPs. We correlated the frequency of genetically predicted mHA disparities with clinical evidence of an immune response and then computationally screened all peptides mapping to the highly associated cSNPs for their ability to bind to HLA molecules. As proof-of-concept, we analyzed one predicted antigen, T4A, whose mHA mismatch trended towards improved overall and disease free survival in our cohort. T4A mHA mismatches occurred at the maximum theoretical frequency for any given SCT. T4A-specific CD8+ T lymphocytes (CTLs) were detected in 3 of 4 evaluable post-transplant patients predicted to have a T4A mismatch. CONCLUSIONS/SIGNIFICANCE: Our method is the first to combine clinical outcomes data with genomics and bioinformatics methods to predict and confirm a mHA. Refinement of this method should enable the discovery of clinically relevant mHAs in the majority of transplant patients and possibly lead to novel immunotherapeutics.

  18. Antibody-Array-Based Proteomic Screening of Serum Markers in Systemic Lupus Erythematosus: A Discovery Study.

    Science.gov (United States)

    Wu, Tianfu; Ding, Huihua; Han, Jie; Arriens, Cristina; Wei, Chungwen; Han, Weilu; Pedroza, Claudia; Jiang, Shan; Anolik, Jennifer; Petri, Michelle; Sanz, Ignacio; Saxena, Ramesh; Mohan, Chandra

    2016-07-01

    A discovery study was carried out where serum samples from 22 systemic lupus erythematosus (SLE) patients and matched healthy controls were hybridized to antibody-coated glass slide arrays that interrogated the level of 274 human proteins. On the basis of these screens, 48 proteins were selected for ELISA-based validation in an independent cohort of 28 SLE patients. Whereas AXL, ferritin, and sTNFRII were significantly elevated in patients with active lupus nephritis (LN) relative to SLE patients who were quiescent, other molecules such as OPN, sTNFRI, sTNFRII, IGFBP2, SIGLEC5, FAS, and MMP10 exhibited the capacity to distinguish SLE from healthy controls with ROC AUC exceeding 90%, all with p serum markers were next tested in a cohort of 45 LN patients, where serum was obtained at the time of renal biopsy. In these patients, sTNFRII exhibited the strongest correlation with eGFR (r = -0.50, p = 0.0014) and serum creatinine (r = 0.57, p = 0.0001), although AXL, FAS, and IGFBP2 also correlated with these clinical measures of renal function. When concurrent renal biopsies from these patients were examined, serum FAS, IGFBP2, and TNFRII showed significant positive correlations with renal pathology activity index, while sTNFRII displayed the highest correlation with concurrently scored renal pathology chronicity index (r = 0.57, p = 0.001). Finally, in a longitudinal cohort of seven SLE patients examined at ∼3 month intervals, AXL, ICAM-1, IGFBP2, SIGLEC5, sTNFRII, and VCAM-1 demonstrated the ability to track with concurrent disease flare, with significant subject to subject variation. In summary, serum proteins have the capacity to identify patients with active nephritis, flares, and renal pathology activity or chronicity changes, although larger longitudinal cohort studies are warranted. PMID:27211902

  19. Recent developments in StemBase: a tool to study gene expression in human and murine stem cells

    OpenAIRE

    Krzyzanowski Paul M; Porter Christopher J; Huska Matthew R; Palidwor Gareth A; Sandie Reatha; Muro Enrique M; Perez-Iratxeta Carolina; Andrade-Navarro Miguel A

    2009-01-01

    Abstract Background Currently one of the largest online repositories for human and mouse stem cell gene expression data, StemBase was first designed as a simple web-interface to DNA microarray data generated by the Canadian Stem Cell Network to facilitate the discovery of gene functions relevant to stem cell control and differentiation. Findings Since its creation, StemBase has grown in both size and scope into a system with analysis tools that examine either the whole database at once, or sl...

  20. Target-based vs. phenotypic screenings in Leishmania drug discovery: A marriage of convenience or a dialogue of the deaf?

    Science.gov (United States)

    Reguera, Rosa M.; Calvo-Álvarez, Estefanía; Álvarez-Velilla, Raquel; Balaña-Fouce, Rafael

    2014-01-01

    Drug discovery programs sponsored by public or private initiatives pursue the same ambitious goal: a crushing defeat of major Neglected Tropical Diseases (NTDs) during this decade. Both target-based and target-free screenings have pros and cons when it comes to finding potential small-molecule leads among chemical libraries consisting of myriads of compounds. Within the target-based strategy, crystals of pathogen recombinant-proteins are being used to obtain three-dimensional (3D) structures in silico for the discovery of structure-based inhibitors. On the other hand, genetically modified parasites expressing easily detectable reporters are in the pipeline of target-free (phenotypic) screenings. Furthermore, lead compounds can be scaled up to in vivo preclinical trials using rodent models of infection monitoring parasite loads by means of cutting-edge bioimaging devices. As such, those preferred are fluorescent and bioluminescent readouts due to their reproducibility and rapidity, which reduces the number of animals used in the trials and allows for an earlier stage detection of the infective process as compared with classical methods. In this review, we focus on the current differences between target-based and phenotypic screenings in Leishmania, as an approach that leads to the discovery of new potential drugs against leishmaniasis. PMID:25516847

  1. Target-based vs. phenotypic screenings in Leishmania drug discovery: A marriage of convenience or a dialogue of the deaf?

    Directory of Open Access Journals (Sweden)

    Rosa M. Reguera

    2014-12-01

    Full Text Available Drug discovery programs sponsored by public or private initiatives pursue the same ambitious goal: a crushing defeat of major Neglected Tropical Diseases (NTDs) during this decade. Both target-based and target-free screenings have pros and cons when it comes to finding potential small-molecule leads among chemical libraries consisting of myriads of compounds. Within the target-based strategy, crystals of pathogen recombinant-proteins are being used to obtain three-dimensional (3D) structures in silico for the discovery of structure-based inhibitors. On the other hand, genetically modified parasites expressing easily detectable reporters are in the pipeline of target-free (phenotypic) screenings. Furthermore, lead compounds can be scaled up to in vivo preclinical trials using rodent models of infection monitoring parasite loads by means of cutting-edge bioimaging devices. As such, those preferred are fluorescent and bioluminescent readouts due to their reproducibility and rapidity, which reduces the number of animals used in the trials and allows for an earlier stage detection of the infective process as compared with classical methods. In this review, we focus on the current differences between target-based and phenotypic screenings in Leishmania, as an approach that leads to the discovery of new potential drugs against leishmaniasis.

  2. Target-based vs. phenotypic screenings in Leishmania drug discovery: A marriage of convenience or a dialogue of the deaf?

    Science.gov (United States)

    Reguera, Rosa M; Calvo-Álvarez, Estefanía; Alvarez-Velilla, Raquel; Balaña-Fouce, Rafael

    2014-12-01

    Drug discovery programs sponsored by public or private initiatives pursue the same ambitious goal: a crushing defeat of major Neglected Tropical Diseases (NTDs) during this decade. Both target-based and target-free screenings have pros and cons when it comes to finding potential small-molecule leads among chemical libraries consisting of myriads of compounds. Within the target-based strategy, crystals of pathogen recombinant-proteins are being used to obtain three-dimensional (3D) structures in silico for the discovery of structure-based inhibitors. On the other hand, genetically modified parasites expressing easily detectable reporters are in the pipeline of target-free (phenotypic) screenings. Furthermore, lead compounds can be scaled up to in vivo preclinical trials using rodent models of infection monitoring parasite loads by means of cutting-edge bioimaging devices. As such, those preferred are fluorescent and bioluminescent readouts due to their reproducibility and rapidity, which reduces the number of animals used in the trials and allows for an earlier stage detection of the infective process as compared with classical methods. In this review, we focus on the current differences between target-based and phenotypic screenings in Leishmania, as an approach that leads to the discovery of new potential drugs against leishmaniasis. PMID:25516847

  3. Cyclodextrin-based gene delivery systems

    OpenAIRE

    Ortiz-Mellet, Carmen; García Fernández, José M.; Benito, Juan M.

    2011-01-01

    Cyclodextrin (CD) history has been largely dominated by their unique ability to form inclusion complexes with guests fitting in their hydrophobic cavity. Chemical functionalization was soon recognized as a powerful means of improving CD applications in a wide range of fields, including drug delivery, sensing, and enzyme mimicking. However, 100 years after their discovery, CDs are still perceived as novel nanoobjects of undeveloped potential. This critical review provides an overview of different...

  4. The Analysis of Multiple Genome Comparisons in Genus Escherichia and Its Application to the Discovery of Uncharacterised Metabolic Genes in Uropathogenic Escherichia coli CFT073

    Directory of Open Access Journals (Sweden)

    William A. Bryant

    2009-01-01

    Full Text Available A survey of a complete gene synteny comparison has been carried out between twenty fully sequenced strains from the genus Escherichia with the aim of finding yet uncharacterised genes implicated in the metabolism of uropathogenic strains of E. coli (UPEC). Several sets of adjacent colinear genes have been identified which are present in all four UPEC included in this study (CFT073, F11, UTI89, and 536), annotated with putative metabolic functions, but are not found in any other strains considered. An operon closely homologous to that encoding the L-sorbose degradation pathway in Klebsiella pneumoniae has been identified in E. coli CFT073; this operon is present in all of the UPEC considered, but only in 7 of the other 16 strains. The operon's function has been confirmed by cloning the genes into E. coli DH5α and testing for growth on L-sorbose. The functional genomic approach combining in silico and in vitro work presented here can be used as a basis for the discovery of other uncharacterised genes contributing to bacterial survival in specific environments.

  5. AtomNet: A Deep Convolutional Neural Network for Bioactivity Prediction in Structure-based Drug Discovery

    OpenAIRE

    Wallach, Izhar; Dzamba, Michael; Heifets, Abraham

    2015-01-01

    Deep convolutional neural networks comprise a subclass of deep neural networks (DNN) with a constrained architecture that leverages the spatial and temporal structure of the domain they model. Convolutional networks achieve the best predictive performance in areas such as speech and image recognition by hierarchically composing simple local features into complex models. Although DNNs have been used in drug discovery for QSAR and ligand-based bioactivity predictions, none of these models have ...

  6. Milp-hyperbox classification for structure-based drug design in the discovery of small molecule inhibitors of SIRTUIN6

    OpenAIRE

    Tardu, Mehmet; Rahim, Fatih; Kavaklı, İbrahim Halil; Türkay, Metin

    2016-01-01

    Virtual screening of chemical libraries following experimental assays of drug candidates is a common procedure in structure-based drug discovery. However, virtual screening of chemical libraries with millions of compounds requires a lot of time for computing and data analysis. A priori classification of compounds in the libraries into low- and high-binding free energy sets decreases the number of compounds for virtual screening experiments. This classification also reduces the required computati...

  7. Ontology-Based Context-Aware Service Discovery for Pervasive Environments

    NARCIS (Netherlands)

    Pawar, P.; Tokmakoff, A.

    2006-01-01

    Existing service discovery protocols use a service matching process in order to offer services of interest to the clients. Potentially, the context information of the services and client can be used to improve the quality of service matching. To make use of context information in service matching, s

  8. Towards a goal-based service framework for dynamic service discovery and composition

    NARCIS (Netherlands)

    Bonino da Silva Santos, Luiz Olavo; Silva, Eduardo Goncalves; Ferreira Pires, Luis; Sinderen, van Marten

    2009-01-01

    Service-Oriented Computing allows new applications to be developed by using and/or combining services offered by different providers. Service discovery and composition are performed aiming to comply with the client’s request in terms of functionality and expected outcome. In this paper we present a

  9. Engineering Application Way of Faults Knowledge Discovery Based on Rough Set Theory

    International Nuclear Information System (INIS)

    To address the knowledge acquisition problem of intelligent decision-making technology in the mechanical industry, the use of Rough Set Theory (RST) as a tool was investigated, and a way to realize knowledge discovery in engineering applications is explored. A case study that extracts knowledge rules from a concise data table yields some important observations. Knowledge discovery of the kind needed for mechanical fault diagnosis is a complicated systems engineering project. The first and most important task is to record the fault knowledge in a data table. The data must be derived from the plant site and should be as concise as possible. Only on the basis of fault knowledge data obtained in this way can the methods and algorithms that process the data and extract knowledge rules from them by means of RST be applied. The conclusion is that fault knowledge discovery carried out in this way is a gradually ascending process, but developing advanced fault diagnosis technology in this way is a large-scale, long-term knowledge engineering project. Every step should first be designed carefully according to the tool's requirements. This is the basic guarantee that the knowledge rules obtained have engineering value and that the studies have scientific significance. Accordingly, a general framework is designed for engineering applications to guide the development of fault knowledge discovery technology.
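
    As a rough illustration of how RST can turn a fault data table into candidate rules, the sketch below computes the standard lower and upper approximations of the set of faulty machines. The table, attribute names, and machine ids are hypothetical and are not taken from the record above; this is a minimal sketch of textbook RST, not the authors' framework.

```python
from collections import defaultdict


def indiscernibility_classes(table, attributes):
    """Group object ids whose values agree on every listed attribute."""
    classes = defaultdict(set)
    for obj_id, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(obj_id)
    return list(classes.values())


def approximations(table, attributes, target):
    """Return (lower, upper) rough-set approximations of `target`."""
    lower, upper = set(), set()
    for block in indiscernibility_classes(table, attributes):
        if block <= target:   # every object in the block is in the target
            lower |= block
        if block & target:    # at least one object in the block is in the target
            upper |= block
    return lower, upper


# Hypothetical fault table: condition attributes plus a decision attribute.
FAULT_TABLE = {
    "m1": {"vibration": "high", "temperature": "high", "fault": "yes"},
    "m2": {"vibration": "high", "temperature": "normal", "fault": "yes"},
    "m3": {"vibration": "high", "temperature": "normal", "fault": "no"},
    "m4": {"vibration": "low", "temperature": "normal", "fault": "no"},
}

faulty = {m for m, row in FAULT_TABLE.items() if row["fault"] == "yes"}
lower, upper = approximations(FAULT_TABLE, ["vibration", "temperature"], faulty)
print("certainly faulty:", sorted(lower))
print("possibly faulty:", sorted(upper))
```

    Objects in the lower approximation support certain rules (here, high vibration with high temperature implies a fault), while objects that appear only in the upper approximation mark conflicting records that the condition attributes cannot separate.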

  10. Optimizing Neighbor Discovery for Ad hoc Networks based on the Bluetooth PAN Profile

    DEFF Research Database (Denmark)

    Kuijpers, Gerben; Nielsen, Thomas Toftegaard; Prasad, Ramjee

    2002-01-01

    This paper introduces a neighbor discovery mechanism that utilizes the resources in the Bluetooth PAN profile more efficiently. The performance of the new mechanism is investigated using an IPv6 network simulator and compared with emulated broadcasting. It is shown that the signaling overhead can...

  11. Engineering Application Way of Faults Knowledge Discovery Based on Rough Set Theory

    Energy Technology Data Exchange (ETDEWEB)

    Zhao Rongzhen; Deng Linfeng [Key Laboratory of Digital Manufacturing Technology and Application of the Ministry of Education, School of Mechanical and Electronical Engineering, Lanzhou University of Tech. Lanzhou, 730050 (China)]; Li Chao, E-mail: zhaorongzhen@lut.cn [College of Petrochemical Technology Lanzhou University of Tech. Lanzhou, 730050 (China)]

    2011-07-19

    To address the knowledge acquisition problem of intelligent decision-making technology in the mechanical industry, the use of Rough Set Theory (RST) as a tool was investigated, and a way to realize knowledge discovery in engineering applications is explored. A case study that extracts knowledge rules from a concise data table yields some important observations. Knowledge discovery of the kind needed for mechanical fault diagnosis is a complicated systems engineering project. The first and most important task is to record the fault knowledge in a data table. The data must be derived from the plant site and should be as concise as possible. Only on the basis of fault knowledge data obtained in this way can the methods and algorithms that process the data and extract knowledge rules from them by means of RST be applied. The conclusion is that fault knowledge discovery carried out in this way is a gradually ascending process, but developing advanced fault diagnosis technology in this way is a large-scale, long-term knowledge engineering project. Every step should first be designed carefully according to the tool's requirements. This is the basic guarantee that the knowledge rules obtained have engineering value and that the studies have scientific significance. Accordingly, a general framework is designed for engineering applications to guide the development of fault knowledge discovery technology.

  12. Endophytes : exploiting biodiversity for the improvement of natural product-based drug discovery

    NARCIS (Netherlands)

    Staniek, Agata; Woerdenbag, Herman J.; Kayser, Oliver

    2008-01-01

    Endophytes, microorganisms that colonize internal tissues of all plant species, create a huge biodiversity with yet unknown novel natural products, presumed to push forward the frontiers of drug discovery. Next to the clinically acknowledged antineoplastic agent, paclitaxel, endophyte research has y

  13. Use of arbitrary DNA primers, polyacrylamide gel electrophoresis and silver staining for identity testing, gene discovery and analysis of gene expression

    International Nuclear Information System (INIS)

    To understand chemically-induced genomic differences in soybean mutants differing in their ability to enter the nitrogen-fixing symbiosis involving Bradyrhizobium japonicum, molecular techniques were developed to aid map-based, or positional, cloning. DNA marker technology involving single arbitrary primers was used to enrich regional RFLP linkage data. Molecular techniques, including two-dimensional pulsed-field gel electrophoresis, were developed to obtain the first physical mapping in soybean, leading to the conclusion that in the region of marker pA-36 on linkage group H, 1 cM equals about 500 kb. High molecular weight DNA was isolated and cloned into yeast or bacterial artificial chromosomes (YACs/BACs). YACs were used to analyze soybean genome structure, revealing that over half of the genome contains repetitive DNA. Genetic and molecular tools are now available to facilitate the isolation of plant genes directly involved in symbiosis. The further characterization of these genes, along with the determination of the mechanisms that lead to the mutation, will be of value to other plants and induced mutation research. (author)

  14. Computational drug discovery

    Institute of Scientific and Technical Information of China (English)

    Si-sheng OU-YANG; Jun-yan LU; Xiang-qian KONG; Zhong-jie LIANG; Cheng LUO; Hualiang JIANG

    2012-01-01

    Computational drug discovery is an effective strategy for accelerating and economizing the drug discovery and development process. Because of the dramatic increase in the availability of biological macromolecule and small molecule information, the applicability of computational drug discovery has been extended and broadly applied to nearly every stage in the drug discovery and development workflow, including target identification and validation, lead discovery and optimization, and preclinical tests. Over the past decades, computational drug discovery methods such as molecular docking, pharmacophore modeling and mapping, de novo design, molecular similarity calculation, and sequence-based virtual screening have been greatly improved. In this review, we present an overview of these important computational methods, platforms, and successful applications in this field.

  15. In-depth cDNA Library Sequencing Provides Quantitative Gene Expression Profiling in Cancer Biomarker Discovery

    Institute of Scientific and Technical Information of China (English)

    Wanling Yang; Dingge Ying; Yu-Lung Lau

    2009-01-01

    procedures may allow detection of many expression features for less abundant gene variants. With the reduction of sequencing cost and the emergence of new-generation sequencing technology, in-depth sequencing of cDNA pools or libraries may represent a better and more powerful tool in gene expression profiling and cancer biomarker detection. We also propose using sequence-specific subtraction to remove hundreds of the most abundant housekeeping genes to increase sequencing depth without affecting the relative expression ratios of other genes, as transcripts from as few as the 300 most abundantly expressed genes constitute about 20% of the total transcriptome. In-depth sequencing also has the unique advantage of detecting unknown forms of transcripts, such as alternative splicing variants, fusion genes, and regulatory RNAs, as well as detecting mutations and polymorphisms that may play important roles in disease pathogenesis.
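
    The depth argument in this abstract can be checked with simple arithmetic. The sketch below assumes an idealized subtraction that removes the abundant transcripts completely and leaves all others untouched; the function name and the example gene fractions are illustrative assumptions, not values from the study (only the 20% figure comes from the abstract).

```python
def depth_gain(removed_fraction):
    """Fold increase in reads per remaining transcript at a fixed read budget.

    Assumes the subtracted transcripts vanish completely and every other
    transcript keeps its original abundance (an idealization).
    """
    if not 0 <= removed_fraction < 1:
        raise ValueError("removed_fraction must be in [0, 1)")
    return 1.0 / (1.0 - removed_fraction)


# About 20% of the transcriptome is removed, as stated in the abstract.
gain = depth_gain(0.20)

# Hypothetical shares of two remaining genes before subtraction.
gene_a, gene_b = 0.02, 0.01
ratio_before = gene_a / gene_b
ratio_after = (gene_a * gain) / (gene_b * gain)

print(f"depth gain: {gain:.2f}x")
print(f"expression ratio before/after: {ratio_before:.2f} / {ratio_after:.2f}")
```

    Because every remaining transcript is scaled by the same factor, the ratio between any two of them is unchanged, which is the property the abstract relies on.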

  16. HMM-Based Gene Annotation Methods

    Energy Technology Data Exchange (ETDEWEB)

    Haussler, David; Hughey, Richard; Karplus, Keven

    1999-09-20

    Development of new statistical methods and computational tools to identify genes in human genomic DNA, and to provide clues to their functions by identifying features such as transcription factor binding sites, tissue-specific expression and splicing patterns, and remote homologies at the protein level with genes of known function.

  17. Rapid countermeasure discovery against Francisella tularensis based on a metabolic network reconstruction.

    Directory of Open Access Journals (Sweden)

    Sidhartha Chaudhury

    Full Text Available In the future, we may be faced with the need to provide treatment for an emergent biological threat against which existing vaccines and drugs have limited efficacy or availability. To prepare for this eventuality, our objective was to use a metabolic network-based approach to rapidly identify potential drug targets and prospectively screen and validate novel small-molecule antimicrobials. Our target organism was the fully virulent Francisella tularensis subspecies tularensis Schu S4 strain, a highly infectious intracellular pathogen that is the causative agent of tularemia and is classified as a category A biological agent by the Centers for Disease Control and Prevention. We proceeded with a staggered computational and experimental workflow that used a strain-specific metabolic network model, homology modeling and X-ray crystallography of protein targets, and ligand- and structure-based drug design. Selected compounds were subsequently filtered based on physiological-based pharmacokinetic modeling, and we selected a final set of 40 compounds for experimental validation of antimicrobial activity. We began screening these compounds in whole bacterial cell-based assays in biosafety level 3 facilities in the 20th week of the study and completed the screens within 12 weeks. Six compounds showed significant growth inhibition of F. tularensis, and we determined their respective minimum inhibitory concentrations and mammalian cell cytotoxicities. The most promising compound had a low molecular weight, was non-toxic, and abolished bacterial growth at 13 µM, with putative activity against pantetheine-phosphate adenylyltransferase, an enzyme involved in the biosynthesis of coenzyme A, encoded by gene coaD. The novel antimicrobial compounds identified in this study serve as starting points for lead optimization, animal testing, and drug development against tularemia. Our integrated in silico/in vitro approach had an overall 15% success rate in terms of

  18. Identifying disease feature genes based on cellular localized gene functional modules and regulation networks

    Institute of Scientific and Technical Information of China (English)

    ZHANG Min; ZHU Jing; GUO Zheng; LI Xia; YANG Da; WANG Lei; RAO Shaoqi

    2006-01-01

    Identifying disease-relevant genes and functional modules, based on gene expression profiles and gene functional knowledge, is of high importance for studying disease mechanisms and subtyping disease phenotypes. Using gene categories of biological process and cellular component in Gene Ontology, we propose an approach to selecting functional modules enriched with differentially expressed genes, and identifying the feature functional modules of high disease discriminating abilities. Using the differentially expressed genes in each feature module as the feature genes, we reveal the relevance of the modules to the studied diseases. Using three datasets for prostate cancer, gastric cancer, and leukemia, we have demonstrated that the proposed modular approach is of high power in identifying functionally integrated feature gene subsets that are highly relevant to the disease mechanisms. Our analysis has also shown that the critical disease-relevant genes might be better recognized from the gene regulation network, which is constructed using the characterized functional modules, giving important clues to the concerted mechanisms of the modules responding to complex disease states. In addition, the proposed approach to selecting the disease-relevant genes by jointly considering the gene functional knowledge suggests a new way for precisely classifying disease samples with clear biological interpretations, which is critical for the clinical diagnosis and the elucidation of the pathogenic basis of complex diseases.

  19. A prerecognition model for hot topic discovery based on microblogging data.

    Science.gov (United States)

    Zhu, Tongyu; Yu, Jianjun

    2014-01-01

    Microblogging has become prevalent because it makes information sharing on the Internet easy and anonymous, which also brings the problem of spreading negative topics or even rumors. Many researchers have focused on how to find and trace emerging topics for analysis. Applying topic detection and tracking techniques to find hot topics in streamed microblogging data meets obstacles such as clustering the streamed data, defining topic hotness, and discovering emerging hot topics. This paper proposes a novel prerecognition model for hot topic discovery. In this model, the concepts of the topic life cycle, the hot velocity, and the hot acceleration are introduced to calculate the change of topic hotness, with the aim of discovering emerging hot topics before they surge and break out. Our experiments show that this new model helps to discover potential hot topics efficiently and achieves considerable performance. PMID:25254235
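
    The abstract does not define hot velocity or hot acceleration. One plausible reading, assumed here purely for illustration, treats hotness as the number of posts per time window, velocity as its first difference, and acceleration as its second difference. The function names, thresholds, and sample counts below are hypothetical.

```python
def differences(series):
    """First differences of a numeric sequence."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def is_emerging(hotness, min_velocity=0, min_acceleration=0):
    """Flag a topic whose hotness is still rising and rising faster.

    `hotness` holds one value per time window (assumed: post counts).
    At least three windows are needed to compute an acceleration.
    """
    velocity = differences(hotness)
    acceleration = differences(velocity)
    if not acceleration:
        return False
    return velocity[-1] > min_velocity and acceleration[-1] > min_acceleration


rising = [2, 3, 5, 9, 17]    # velocity 1, 2, 4, 8; acceleration 1, 2, 4
steady = [10, 10, 10, 10]    # velocity and acceleration are all zero

print(is_emerging(rising))
print(is_emerging(steady))
```

    Requiring positive acceleration as well as positive velocity is one way to catch a topic before its peak, since a topic that is already saturating still grows but no longer speeds up.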

  20. Combining SNP discovery from next-generation sequencing data with bulked segregant analysis (BSA) to fine-map genes in polyploid wheat

    Directory of Open Access Journals (Sweden)

    Trick Martin

    2012-01-01

    Full Text Available Abstract Background Next generation sequencing (NGS) technologies are providing new ways to accelerate fine-mapping and gene isolation in many species. To date, the majority of these efforts have focused on diploid organisms with readily available whole genome sequence information. In this study, as a proof of concept, we tested the use of NGS for SNP discovery in tetraploid wheat lines differing for the previously cloned grain protein content (GPC) gene GPC-B1. Bulked segregant analysis (BSA) was used to define a subset of putative SNPs within the candidate gene region, which were then used to fine-map GPC-B1. Results We used Illumina paired end technology to sequence mRNA (RNAseq) from near isogenic lines differing across a ~30-cM interval including the GPC-B1 locus. After discriminating for SNPs between the two homoeologous wheat genomes and additional quality filtering, we identified inter-varietal SNPs in wheat unigenes between the parental lines. The relative frequency of these SNPs was examined by RNAseq in two bulked samples made up of homozygous recombinant lines differing for their GPC phenotype. SNPs that were enriched at least 3-fold in the corresponding pool (6.5% of all SNPs) were further evaluated. Marker assays were designed for a subset of the enriched SNPs and mapped using DNA from individuals of each bulk. Thirty nine new SNP markers, corresponding to 67% of the validated SNPs, mapped across a 12.2-cM interval including GPC-B1. This translated to 1 SNP marker per 0.31 cM defining the GPC-B1 gene to within 13-18 genes in syntenic cereal genomes and to a 0.4 cM interval in wheat. Conclusions This study exemplifies the use of RNAseq for SNP discovery in polyploid species and supports the use of BSA as an effective way to target SNPs to specific genetic intervals to fine-map genes in unsequenced genomes.
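
    The 3-fold enrichment filter and the marker density quoted in this abstract can be sketched as follows. The SNP names and allele frequencies are made up for illustration, and the handling of SNPs absent from the opposite bulk is an assumption; only the 3-fold threshold, the 39 markers, and the 12.2-cM interval come from the abstract.

```python
def enriched_snps(pool_freqs, min_fold=3.0):
    """Select SNPs enriched at least `min_fold` in the matching bulk.

    `pool_freqs` maps a SNP id to (frequency in the matching bulk,
    frequency in the opposite bulk). A SNP absent from the opposite bulk
    but present in the matching bulk is treated as enriched (assumption).
    """
    selected = []
    for snp, (f_match, f_other) in pool_freqs.items():
        if f_other == 0:
            if f_match > 0:
                selected.append(snp)
        elif f_match / f_other >= min_fold:
            selected.append(snp)
    return selected


# Hypothetical allele frequencies from the two bulked RNAseq samples.
example = {
    "snp_a": (0.90, 0.10),   # 9-fold, kept
    "snp_b": (0.55, 0.45),   # about 1.2-fold, dropped
    "snp_c": (0.80, 0.00),   # absent from the opposite bulk, kept
}
kept = enriched_snps(example)

# Marker density reported in the abstract: 39 markers over 12.2 cM.
cm_per_marker = round(12.2 / 39, 2)

print(kept)
print(cm_per_marker)
```

    Dividing the 12.2-cM interval by the 39 mapped markers reproduces the stated density of about 0.31 cM per marker.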

  1. Linear Discriminant Analysis-Based Estimation of the False Discovery Rate for Phosphopeptide Identifications

    OpenAIRE

    Du, Xiuxia; Yang, Feng; Manes, Nathan P.; Stenoien, David L; Monroe, Matthew E.; Adkins, Joshua N; States, David J.; Purvine, Samuel O.; Camp, David G.; Smith, Richard D.

    2008-01-01

    The development of liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) has made it possible to measure phosphopeptides in an increasingly large-scale and high-throughput fashion. However, extracting confident phosphopeptide identifications from the resulting large dataset in a similar high-throughput fashion remains difficult, as does rigorously estimating the false discovery rate (FDR) of a set of phosphopeptide identifications. This article describes a data analysis pipel...

  2. NMR in drug discovery. From screening to structure-based design of antitumoral agents

    OpenAIRE

    Rodríguez Mías, Ricard Aleix

    2006-01-01

    Nuclear Magnetic Resonance has attracted increasing interest in the drug discovery field, which has led to its wide use at nearly every stage of drug development. For this reason, in the present thesis we propose to use some of the tools offered by NMR to target various systems related to cancer. Initially we intended to get acquainted with NMR's most outstanding methodologies for the detection and characterization of binding events; and for this goal various proteins involve...

  3. Efficient semantic-based IoT service discovery mechanism for dynamic environments

    OpenAIRE

    Ben Fredj, Sameh; Boussard, Mathieu; Kofman, Daniel; Noirie, Ludovic

    2014-01-01

    The adoption of Service Oriented Architecture (SOA) and Semantic Web technologies in the Internet of Things (IoT) makes it possible to enhance the interoperability of devices by abstracting their capabilities as services and enriching their descriptions with machine-interpretable semantics. This facilitates the discovery and composition of IoT services. The increasing number of IoT services, their dynamicity, and their geographical distribution call for mechanisms to enable scalable and effecti...

  4. Implementation of a deidentified federated data network for population-based cohort discovery

    OpenAIRE

    Anderson, Nicholas; Abend, Aaron; Mandel, Aaron; Geraghty, Estella; Gabriel, Davera; Wynden, Rob; Kamerick, Michael; Anderson, Kent; Rainwater, Julie; Tarczy-Hornoch, Peter

    2011-01-01

    Objective The Cross-Institutional Clinical Translational Research project explored a federated query tool and looked at how this tool can facilitate clinical trial cohort discovery by managing access to aggregate patient data located within unaffiliated academic medical centers. Methods The project adapted software from the Informatics for Integrating Biology and the Bedside (i2b2) program to connect three Clinical Translational Research Award sites: University of Washington, Seattle, Univers...

  5. Drugs, structures, fragments: substructure-based approaches to GPCR drug discovery and design

    OpenAIRE

    Horst, Eelke van der

    2012-01-01

    This thesis is all about cheminformatics, and its impact on drug discovery. A number of strategies are discussed that apply computational methods for the analysis and design of G protein-coupled receptor (GPCR) ligands. Frequent substructure mining is applied to find the common structural motifs that are discriminative for predefined classes of GPCR ligands. In addition, this approach is extended to cluster GPCRs to suggest a new classification for this receptor superfamily. Furthermore, subst...

  6. Stability-Based Comparison of Class Discovery Methods for DNA Copy Number Profiles

    OpenAIRE

    Brito, Isabel; Hupe, Philippe; Neuvial, Pierre; Barillot, Emmanuel

    2013-01-01

    Motivation: Array-CGH can be used to determine DNA copy number, imbalances in which are a fundamental factor in the genesis and progression of tumors. The discovery of classes with similar patterns of array-CGH profiles therefore adds to our understanding of cancer and the treatment of patients. Various input data representations for array-CGH, dissimilarity measures between tumor samples and clustering algorithms may be used for this purpose. The choice between procedures is often difficult....

  7. Discovery of Novel Human Epidermal Growth Factor Receptor-2 Inhibitors by Structure-based Virtual Screening

    OpenAIRE

    Shi, Zheng; Yu, Tian; Sun, Rong; Wang, Shan; Chen, Xiao-Qian; Cheng, Li-jia; Liu, Rong

    2016-01-01

    Background: Human epidermal growth factor receptor-2 (HER2) is a transmembrane receptor-like protein, and aberrant signaling of HER2 is implicated in many human cancers, such as ovarian cancer, gastric cancer, prostate cancer, and, most notably, breast cancer. Moreover, it has been in the spotlight in recent years as a promising new target for therapy of breast cancer. Objective: Since virtual screening has become an integral part of the drug discovery process, it is of great significant t...

  8. Automated Sample Preparation Platform for Mass Spectrometry-Based Plasma Proteomics and Biomarker Discovery

    OpenAIRE

    Vilém Guryča; Daniel Roeder; Paolo Piraino; Jens Lamerz; Axel Ducret; Hanno Langen; Paul Cutler

    2014-01-01

    The identification of novel biomarkers from human plasma remains a critical need in order to develop and monitor drug therapies for nearly all disease areas. The discovery of novel plasma biomarkers is, however, significantly hampered by the complexity and dynamic range of proteins within plasma, as well as the inherent variability in composition from patient to patient. In addition, it is widely accepted that most soluble plasma biomarkers for diseases such as cancer will be represented by t...

  9. SLiMPrints: conservation-based discovery of functional motif fingerprints in intrinsically disordered protein regions

    Science.gov (United States)

    Davey, Norman E.; Cowan, Joanne L.; Shields, Denis C.; Gibson, Toby J.; Coldwell, Mark J.; Edwards, Richard J.

    2012-01-01

    Large portions of higher eukaryotic proteomes are intrinsically disordered, and abundant evidence suggests that these unstructured regions of proteins are rich in regulatory interaction interfaces. A major class of disordered interaction interfaces are the compact and degenerate modules known as short linear motifs (SLiMs). As a result of the difficulties associated with the experimental identification and validation of SLiMs, our understanding of these modules is limited, advocating the use of computational methods to focus experimental discovery. This article evaluates the use of evolutionary conservation as a discriminatory technique for motif discovery. A statistical framework is introduced to assess the significance of relatively conserved residues, quantifying the likelihood a residue will have a particular level of conservation given the conservation of the surrounding residues. The framework is expanded to assess the significance of groupings of conserved residues, a metric that forms the basis of SLiMPrints (short linear motif fingerprints), a de novo motif discovery tool. SLiMPrints identifies relatively overconstrained proximal groupings of residues within intrinsically disordered regions, indicative of putatively functional motifs. Finally, the human proteome is analysed to create a set of highly conserved putative motif instances, including a novel site on translation initiation factor eIF2A that may regulate translation through binding of eIF4E. PMID:22977176

  10. Genes underlying altruism

    OpenAIRE

    Thompson, Graham J.; Hurd, Peter L.; Crespi, Bernard J.

    2013-01-01

    William D. Hamilton postulated the existence of ‘genes underlying altruism’, under the rubric of inclusive fitness theory, a half-century ago. Such genes are now poised for discovery. In this article, we develop a set of intuitive criteria for the recognition and analysis of genes for altruism and describe the first candidate genes affecting altruism from social insects and humans. We also provide evidence from a human population for genetically based trade-offs, underlain by oxytocin-system ...

  11. Discovery of genes related to witches broom disease in Paulownia tomentosa × Paulownia fortunei by a De Novo assembled transcriptome.

    Science.gov (United States)

    Liu, Rongning; Dong, Yanpeng; Fan, Guoqiang; Zhao, Zhenli; Deng, Minjie; Cao, Xibing; Niu, Suyan

    2013-01-01

    In spite of its economic importance, very little molecular genetics and genomic research has been targeted at the family Paulownia spp. The scarcity of genetic information on this plant is a major obstacle to studying the mechanisms of its ability to resist Paulownia Witches' Broom (PaWB) disease. Analysis of the Paulownia transcriptome and its expression profile data is essential to extending the genetic resources on this species and will greatly improve studies on Paulownia. In the current study, we performed the de novo assembly of a transcriptome of P. tomentosa × P. fortunei using short-read sequencing technology (Illumina). A total of 203,664 unigenes with a mean length of 1,328 bp were obtained. Of these unigenes, 32,976 (30% of all unigenes) containing complete structures were chosen. Eukaryotic clusters of orthologous groups, gene orthology, and Kyoto Encyclopedia of Genes and Genomes annotations were performed on these unigenes. Genes related to PaWB disease resistance were analyzed in detail. To our knowledge, this is the first study to elucidate the genetic makeup of Paulownia. This transcriptome provides a quick route to understanding Paulownia, increases the number of gene sequences available for further functional genomics studies, and provides clues to the identification of potential PaWB disease resistance genes. This study has provided a comprehensive insight into gene expression profiles at different states, which facilitates the study of each gene's role in the developmental process and in PaWB disease resistance. PMID:24278262

  12. Discovery of genes related to witches broom disease in Paulownia tomentosa × Paulownia fortunei by a De Novo assembled transcriptome.

    Directory of Open Access Journals (Sweden)

    Rongning Liu

    Full Text Available In spite of its economic importance, very little molecular genetics and genomic research has been targeted at the family Paulownia spp. The scarcity of genetic information on this plant is a major obstacle to studying the mechanisms of its ability to resist Paulownia Witches' Broom (PaWB) disease. Analysis of the Paulownia transcriptome and its expression profile data is essential to extending the genetic resources on this species and will greatly improve studies on Paulownia. In the current study, we performed the de novo assembly of a transcriptome of P. tomentosa × P. fortunei using short-read sequencing technology (Illumina). A total of 203,664 unigenes with a mean length of 1,328 bp were obtained. Of these unigenes, 32,976 (30% of all unigenes) containing complete structures were chosen. Eukaryotic clusters of orthologous groups, gene orthology, and Kyoto Encyclopedia of Genes and Genomes annotations were performed on these unigenes. Genes related to PaWB disease resistance were analyzed in detail. To our knowledge, this is the first study to elucidate the genetic makeup of Paulownia. This transcriptome provides a quick route to understanding Paulownia, increases the number of gene sequences available for further functional genomics studies, and provides clues to the identification of potential PaWB disease resistance genes. This study has provided a comprehensive insight into gene expression profiles at different states, which facilitates the study of each gene's role in the developmental process and in PaWB disease resistance.

  13. DNA-energetics-based analyses suggest additional genes in prokaryotes

    Indian Academy of Sciences (India)

    Garima Khandelwal; Jalaj Gupta; B Jayaram

    2012-07-01

    We present here a novel methodology for predicting new genes in prokaryotic genomes on the basis of inherent energetics of DNA. Regions of higher thermodynamic stability were identified, which were filtered based on already known annotations to yield a set of potentially new genes. These were then processed for their compatibility with the stereo-chemical properties of proteins and tripeptide frequencies of proteins in Swissprot data, which results in a reliable set of new genes in a genome. Quite surprisingly, the methodology identifies new genes even in well-annotated genomes. Also, the methodology can handle genomes of any GC-content, size and number of annotated genes.

  14. Gene Network Biological Validity Based on Gene-Gene Interaction Relevance

    OpenAIRE

    Francisco Gómez-Vela; Norberto Díaz-Díaz

    2014-01-01

    In recent years, gene networks have become one of the most useful tools for modeling biological processes. Many inference gene network algorithms have been developed as techniques for extracting knowledge from gene expression data. Ensuring the reliability of the inferred gene relationships is a crucial task in any study in order to prove that the algorithms used are precise. Usually, this validation process can be carried out using prior biological knowledge. The metabolic pathways stored in...

  15. HANDS: a tool for genome-wide discovery of subgenome-specific base-identity in polyploids.

    KAUST Repository

    Mithani, Aziz

    2013-09-24

    The analysis of polyploid genomes is problematic because homeologous subgenome sequences are closely related. This relatedness makes it difficult to assign individual sequences to the specific subgenome from which they are derived, and hinders the development of polyploid whole genome assemblies. We here present a next-generation sequencing (NGS)-based approach for assignment of subgenome-specific base-identity at sites containing homeolog-specific polymorphisms (HSPs): 'HSP base Assignment using NGS data through Diploid Similarity' (HANDS). We show that HANDS correctly predicts subgenome-specific base-identity at >90% of assayed HSPs in the hexaploid bread wheat (Triticum aestivum) transcriptome, thus providing a substantial increase in accuracy versus previous methods for homeolog-specific base assignment. We conclude that HANDS enables rapid and accurate genome-wide discovery of homeolog-specific base-identity, a capability having multiple applications in polyploid genomics.

  16. Inhibition of Shikimate Kinase and Type II Dehydroquinase for Antibiotic Discovery: Structure-Based Design and Simulation Studies.

    Science.gov (United States)

    Gonzalez-Bello, Concepcion

    2016-01-01

    The loss of effectiveness of current antibiotics caused by the development of drug resistance has become a severe threat to public health. Current widely used antibiotics are surprisingly targeted at a few bacterial functions - cell wall, DNA, RNA, and protein biosynthesis - and resistance to them is widespread and well identified. There is therefore great interest in the discovery of novel drugs and therapies to tackle antimicrobial resistance, in particular drugs that target other essential processes for bacterial survival. In the past few years a great deal of effort has been focused on the discovery of new inhibitors of the enzymes involved in the biosynthesis of aromatic amino acids, also known as the shikimic acid pathway, in which chorismic acid is synthesized. The latter compound is the synthetic precursor of L-Phe, L-Tyr, L-Trp, and other important aromatic metabolites. These enzymes are recognized as attractive targets for the development of new antibacterial agents because they are essential in important pathogenic bacteria, such as Mycobacterium tuberculosis and Helicobacter pylori, but do not have any counterpart in human cells. This review is focused on two key enzymes of this pathway, shikimate kinase and type II dehydroquinase. An overview of the use of structure-based design and computational studies for the discovery of selective inhibitors of these enzymes will be provided. A detailed view of the structural changes caused by these inhibitors in the catalytic arrangement of these enzymes, which are responsible for the inhibition of their activity, is described. PMID:26303426

  17. Gene-based and semantic structure of the Gene Ontology as a complex network

    Science.gov (United States)

    Coronnello, Claudia; Tumminello, Michele; Miccichè, Salvatore

    2016-09-01

    The last decade has seen the advent and consolidation of ontology based tools for the identification and biological interpretation of classes of genes, such as the Gene Ontology. The Gene Ontology (GO) is constantly evolving over time. The information accumulated time-by-time and included in the GO is encoded in the definition of terms and in the setting up of semantic relations amongst terms. Here we investigate the Gene Ontology from a complex network perspective. We consider the semantic network of terms naturally associated with the semantic relationships provided by the Gene Ontology consortium. Moreover, the GO is a natural example of bipartite network of terms and genes. Here we are interested in studying the properties of the projected network of terms, i.e. a gene-based weighted network of GO terms, in which a link between any two terms is set if at least one gene is annotated in both terms. One aim of the present paper is to compare the structural properties of the semantic and the gene-based network. The relative importance of terms is very similar in the two networks, but the community structure changes. We show that in some cases GO terms that appear to be distinct from a semantic point of view are instead connected, and appear in the same community when considering their gene content. The identification of such gene-based communities of terms might therefore be the basis of a simple protocol aiming at improving the semantic structure of GO. Information about terms that share large gene content might also be important from a biomedical point of view, as it might reveal how genes over-expressed in a certain term also affect other biological processes, molecular functions and cellular components not directly linked according to GO semantics.
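
    The projection described above, in which two terms are linked when at least one gene is annotated to both, can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the link weight is simply the number of shared genes, and the gene names and term identifiers are made up.

```python
from collections import defaultdict
from itertools import combinations

def project_terms(annotations):
    """Project a gene-to-term bipartite network onto terms.

    `annotations` maps each gene to the terms it is annotated with.
    Returns {(term_a, term_b): number of genes annotated to both}.
    The shared-gene count as weight is an assumption of this sketch.
    """
    weights = defaultdict(int)
    for terms in annotations.values():
        # Every pair of terms sharing this gene gains one unit of weight.
        for a, b in combinations(sorted(set(terms)), 2):
            weights[(a, b)] += 1
    return dict(weights)

# Hypothetical toy annotations (not real GO identifiers).
toy_annotations = {
    "geneA": ["GO:1", "GO:2"],
    "geneB": ["GO:2", "GO:3", "GO:1"],
    "geneC": ["GO:3"],
}
edges = project_terms(toy_annotations)
```

    Here GO:1 and GO:2 share two genes and so receive the heaviest link, while geneC, annotated to a single term, contributes no link at all.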

  18. Seed-based biclustering of gene expression data.

    Directory of Open Access Journals (Sweden)

    Jiyuan An

    Full Text Available BACKGROUND: Accumulated biological research outcomes show that biological functions do not depend on individual genes, but on complex gene networks. Microarray data are widely used to cluster genes according to their expression levels across experimental conditions. However, functionally related genes generally do not show coherent expression across all conditions since any given cellular process is active only under a subset of conditions. Biclustering finds gene clusters that have similar expression levels across a subset of conditions. This paper proposes a seed-based algorithm that identifies coherent genes in an exhaustive, but efficient manner. METHODS: In order to find the biclusters in a gene expression dataset, we exhaustively select combinations of genes and conditions as seeds to create candidate bicluster tables. The tables have two columns: (a) a gene set, and (b) the conditions on which the gene set has dissimilar expression levels to the seed. First, the genes with less than the maximum number of dissimilar conditions are identified and a table of these genes is created. Second, the rows that have the same dissimilar conditions are grouped together. Third, the table is sorted in ascending order based on the number of dissimilar conditions. Finally, beginning with the first row of the table, a test is run repeatedly to determine whether the cardinality of the gene set in the row is greater than the minimum threshold number of genes in a bicluster. If so, a bicluster is output and the corresponding row is removed from the table. Repeating this process, all biclusters in the table are systematically identified until the table becomes empty. CONCLUSIONS: This paper presents a novel biclustering algorithm for the identification of additive biclusters. Since it involves exhaustively testing combinations of genes and conditions, the additive biclusters can be found more readily.
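
    The table-building steps in the METHODS paragraph can be sketched for a single seed as follows. This is a simplified illustration under stated assumptions, not the published algorithm: the additive similarity test, the tolerance `eps`, the inclusive thresholds, and all names are hypothetical, and rows are not merged across seeds.

```python
from collections import defaultdict

def biclusters_for_seed(expr, seed_gene, seed_cond,
                        eps=0.5, max_dissimilar=1, min_genes=2):
    """Candidate additive biclusters for one (gene, condition) seed.

    `expr` maps gene -> list of expression values, one per condition.
    A gene counts as dissimilar to the seed on a condition when, after
    removing its additive offset at `seed_cond`, it differs by more than
    `eps` (an assumed test). Thresholds are inclusive here by assumption.
    """
    seed_row = expr[seed_gene]
    n_conds = len(seed_row)
    table = defaultdict(set)  # dissimilar conditions -> gene set
    for gene, row in expr.items():
        shift = row[seed_cond] - seed_row[seed_cond]
        dissimilar = frozenset(
            c for c in range(n_conds)
            if abs(row[c] - seed_row[c] - shift) > eps
        )
        # Steps 1 and 2: keep genes with few dissimilar conditions and
        # group those that share the same dissimilar conditions.
        if len(dissimilar) <= max_dissimilar:
            table[dissimilar].add(gene)
    # Step 3: sort rows by the number of dissimilar conditions.
    rows = sorted(table.items(), key=lambda kv: (len(kv[0]), sorted(kv[0])))
    # Step 4: report rows whose gene set is large enough.
    found = []
    for dissimilar, genes in rows:
        if len(genes) >= min_genes:
            conds = [c for c in range(n_conds) if c not in dissimilar]
            found.append((sorted(genes), conds))
    return found

# Toy data: g2 tracks g1 with a constant offset; g3 deviates on one condition.
toy_expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 3.0, 4.0, 5.0],
    "g3": [0.0, 1.0, 2.0, 9.0],
    "g4": [5.0, 1.0, 7.0, 0.0],
}
found = biclusters_for_seed(toy_expr, "g1", 0)
```

    With these toy values, g1 and g2 form a bicluster over all four conditions; g3 sits alone in its row and falls below the gene-count threshold, and g4 is dropped for having too many dissimilar conditions.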

  19. Discovery of genes related to insecticide resistance in Bactrocera dorsalis by functional genomic analysis of a de novo assembled transcriptome.

    Directory of Open Access Journals (Sweden)

    Ju-Chun Hsu

    Full Text Available Insecticide resistance has recently become a critical concern for control of many insect pest species. Genome sequencing and global quantization of gene expression through analysis of the transcriptome can provide useful information relevant to this challenging problem. The oriental fruit fly, Bactrocera dorsalis, is one of the world's most destructive agricultural pests, and recently it has been used as a target for studies of genetic mechanisms related to insecticide resistance. However, prior to this study, the molecular data available for this species was largely limited to genes identified through homology. To provide a broader pool of gene sequences of potential interest with regard to insecticide resistance, this study uses whole transcriptome analysis developed through de novo assembly of short reads generated by next-generation sequencing (NGS). The transcriptome of B. dorsalis was initially constructed using Illumina's Solexa sequencing technology. Qualified reads were assembled into contigs and potential splicing variants (isotigs). A total of 29,067 isotigs have putative homologues in the non-redundant (nr) protein database from NCBI, and 11,073 of these correspond to distinct D. melanogaster proteins in the RefSeq database. Approximately 5,546 isotigs contain coding sequences that are at least 80% complete and appear to represent B. dorsalis genes. We observed a strong correlation between the completeness of the assembled sequences and the expression intensity of the transcripts. The assembled sequences were also used to identify large numbers of genes potentially belonging to families related to insecticide resistance. A total of 90 P450-, 42 GST- and 37 COE-related genes, representing three major enzyme families involved in insecticide metabolism and resistance, were identified. In addition, 36 isotigs were discovered to contain target site sequences related to four classes of resistance genes. Identified sequence motifs were also

  20. Gene discovery from Jatropha curcas by sequencing of ESTs from normalized and full-length enriched cDNA library from developing seeds

    Directory of Open Access Journals (Sweden)

    Sugantham Priyanka Annabel

    2010-10-01

    Full Text Available Abstract Background Jatropha curcas L. is promoted as an important non-edible biodiesel crop worldwide. Jatropha oil, which is a triacylglycerol, can be directly blended with petro-diesel or transesterified with methanol and used as biodiesel. Genetic improvement in jatropha is needed to increase the seed yield, oil content, drought and pest resistance, and to modify oil composition so that it becomes a technically and economically preferred source for biodiesel production. However, genetic improvement efforts in jatropha could not take advantage of genetic engineering methods due to lack of cloned genes from this species. To overcome this hurdle, the current gene discovery project was initiated with an objective of isolating as many functional genes as possible from J. curcas by large scale sequencing of expressed sequence tags (ESTs). Results A normalized and full-length enriched cDNA library was constructed from developing seeds of J. curcas. The cDNA library contained about 1 × 10^6 clones and the average insert size of the clones was 2.1 kb. In total, 12,084 ESTs were sequenced to an average high quality read length of 576 bp. Contig analysis revealed 2258 contigs and 4751 singletons. Contig size ranged from 2 to 23 and there were 7333 ESTs in the contigs. This resulted in 7009 unigenes which were annotated by BLASTX. It showed 3982 unigenes with significant similarity to known genes and 2836 unigenes with significant similarity to genes of unknown, hypothetical and putative proteins. The remaining 191 unigenes which did not show similarity with any genes in the public database may encode for unique genes. Functional classification revealed unigenes related to a broad range of cellular, molecular and biological functions. Among the 7009 unigenes, 6233 unigenes were identified to be potential full-length genes. Conclusions The high quality normalized cDNA library was constructed from developing seeds of J. curcas for the first time and 7009 unigenes coding

  1. Proteogenomic-based discovery of minor histocompatibility antigens with suitable features for immunotherapy of hematologic cancers.

    Science.gov (United States)

    Granados, D P; Rodenbrock, A; Laverdure, J-P; Côté, C; Caron-Lizotte, O; Carli, C; Pearson, H; Janelle, V; Durette, C; Bonneil, E; Roy, D C; Delisle, J-S; Lemieux, S; Thibault, P; Perreault, C

    2016-06-01

    Pre-clinical studies have shown that injection of allogeneic T cells primed against a single minor histocompatibility antigen (MiHA) could cure hematologic cancers (HC) without causing any toxicity to the host. However, translation of this approach in humans has been hampered by the paucity of molecularly defined human MiHAs. Using a novel proteogenomic approach, we have analyzed cells from 13 volunteers and discovered a vast repertoire of MiHAs presented by the most common HLA haplotype in European Americans: HLA-A*02:01;B*44:03. Notably, out of >6000 MiHAs, we have identified a set of 39 MiHAs that share optimal features for immunotherapy of HCs. These 'optimal MiHAs' are coded by common alleles of genes that are preferentially expressed in hematopoietic cells. Bioinformatic modeling based on MiHA allelic frequencies showed that the 39 optimal MiHAs would enable MiHA-targeted immunotherapy of practically all HLA-A*02:01;B*44:03 patients. Further extension of this strategy to a few additional HLA haplotypes would allow treatment of almost all patients. PMID:26857467

  2. Sleeping Beauty Transposon Mutagenesis as a Tool for Gene Discovery in the NOD Mouse Model of Type 1 Diabetes.

    Science.gov (United States)

    Elso, Colleen M; Chu, Edward P F; Alsayb, May A; Mackin, Leanne; Ivory, Sean T; Ashton, Michelle P; Bröer, Stefan; Silveira, Pablo A; Brodnicki, Thomas C

    2015-12-01

    A number of different strategies have been used to identify genes for which genetic variation contributes to type 1 diabetes (T1D) pathogenesis. Genetic studies in humans have identified >40 loci that affect the risk for developing T1D, but the underlying causative alleles are often difficult to pinpoint or have subtle biological effects. A complementary strategy to identifying "natural" alleles in the human population is to engineer "artificial" alleles within inbred mouse strains and determine their effect on T1D incidence. We describe the use of the Sleeping Beauty (SB) transposon mutagenesis system in the nonobese diabetic (NOD) mouse strain, which harbors a genetic background predisposed to developing T1D. Mutagenesis in this system is random, but a green fluorescent protein (GFP)-polyA gene trap within the SB transposon enables early detection of mice harboring transposon-disrupted genes. The SB transposon also acts as a molecular tag to, without additional breeding, efficiently identify mutated genes and prioritize mutant mice for further characterization. We show here that the SB transposon is functional in NOD mice and can produce a null allele in a novel candidate gene that increases diabetes incidence. We propose that SB transposon mutagenesis could be used as a complementary strategy to traditional methods to help identify genes that, when disrupted, affect T1D pathogenesis. PMID:26438296

  3. Accelerating Novel Candidate Gene Discovery in Neurogenetic Disorders via Whole-Exome Sequencing of Prescreened Multiplex Consanguineous Families

    Directory of Open Access Journals (Sweden)

    Anas M. Alazami

    2015-01-01

    Full Text Available Our knowledge of disease genes in neurological disorders is incomplete. With the aim of closing this gap, we performed whole-exome sequencing on 143 multiplex consanguineous families in whom known disease genes had been excluded by autozygosity mapping and candidate gene analysis. This prescreening step led to the identification of 69 recessive genes not previously associated with disease, of which 33 are here described (SPDL1, TUBA3E, INO80, NID1, TSEN15, DMBX1, CLHC1, C12orf4, WDR93, ST7, MATN4, SEC24D, PCDHB4, PTPN23, TAF6, TBCK, FAM177A1, KIAA1109, MTSS1L, XIRP1, KCTD3, CHAF1B, ARV1, ISCA2, PTRH2, GEMIN4, MYOCD, PDPR, DPH1, NUP107, TMEM92, EPB41L4A, and FAM120AOS). We also encountered instances in which the phenotype departed significantly from the established clinical presentation of a known disease gene. Overall, a likely causal mutation was identified in >73% of our cases. This study contributes to the global effort toward a full compendium of disease genes affecting brain function.

  4. A roadmap for natural product discovery based on large-scale genomics and metabolomics

    Science.gov (United States)

    Actinobacteria encode a wealth of natural product biosynthetic gene clusters, whose systematic study is complicated by numerous repetitive motifs. By combining several metrics we developed a method for global classification of these gene clusters into families (GCFs) and analyzed the biosynthetic ca...

  5. Toward the Development of a Virus-Cell-Based Assay for the Discovery of Novel Compounds against Human Immunodeficiency Virus Type 1

    OpenAIRE

    Adelson, Martin E.; Pacchia, Annmarie L.; Kaul, Malvika; Rando, Robert F.; Ron, Yacov; Peltz, Stuart W.; Dougherty, Joseph P.

    2003-01-01

    The emergence of human immunodeficiency virus type 1 (HIV-1) strains resistant to highly active antiretroviral therapy necessitates continued drug discovery for the treatment of HIV-1 infection. Most current drug discovery strategies focus upon a single aspect of HIV-1 replication. A virus-cell-based assay, which can be adapted to high-throughput screening, would allow the screening of multiple targets simultaneously. HIV-1-based vector systems mimic the HIV-1 life cycle without yielding repl...

  6. Internal Ribosome Entry Site-Based Bicistronic In Situ Reporter Assays for Discovery of Transcription-Targeted Lead Compounds.

    Science.gov (United States)

    Lang, Liwei; Ding, Han-Fei; Chen, Xiaoguang; Sun, Shi-Yong; Liu, Gang; Yan, Chunhong

    2015-07-23

    Although transgene-based reporter gene assays have been used to discover small molecules targeting expression of cancer-driving genes, the success is limited due to the fact that reporter gene expression regulated by incomplete cis-acting elements and foreign epigenetic environments does not faithfully reproduce chemical responses of endogenous genes. Here, we present an internal ribosome entry site-based strategy for bicistronically co-expressing reporter genes with an endogenous gene in the native gene locus, yielding an in situ reporter assay closely mimicking endogenous gene expression without disintegrating its function. This strategy combines the CRISPR-Cas9-mediated genome-editing tool with the recombinase-mediated cassette-exchange technology, and allows for rapid development of orthogonal assays for excluding false hits generated from primary screens. We validated this strategy by developing a screening platform for identifying compounds targeting oncogenic eIF4E, and demonstrated that the novel reporter assays are powerful in searching for transcription-targeted lead compounds with high confidence. PMID:26144883

  7. Gene-Set Local Hierarchical Clustering (GSLHC)--A Gene Set-Based Approach for Characterizing Bioactive Compounds in Terms of Biological Functional Groups.

    Directory of Open Access Journals (Sweden)

    Feng-Hsiang Chung

    Full Text Available Gene-set-based analysis (GSA), which uses the relative importance of functional gene-sets, or molecular signatures, as units for analysis of genome-wide gene expression data, has exhibited major advantages with respect to greater accuracy, robustness, and biological relevance, over individual gene analysis (IGA), which uses log-ratios of individual genes for analysis. Yet IGA remains the dominant mode of analysis of gene expression data. The Connectivity Map (CMap), an extensive database on genomic profiles of effects of drugs and small molecules and widely used for studies related to repurposed drug discovery, has been mostly employed in IGA mode. Here, we constructed a GSA-based version of CMap, Gene-Set Connectivity Map (GSCMap), in which all the genomic profiles in CMap are converted, using gene-sets from the Molecular Signatures Database, to functional profiles. We showed that GSCMap essentially eliminated cell-type dependence, a weakness of CMap in IGA mode, and yielded significantly better performance on sample clustering and drug-target association. As a first application of GSCMap we constructed the platform Gene-Set Local Hierarchical Clustering (GSLHC) for discovering insights on coordinated actions of biological functions and facilitating classification of heterogeneous subtypes on drug-driven responses. GSLHC was shown to tightly cluster drugs of known similar properties. We used GSLHC to identify the therapeutic properties and putative targets of 18 compounds of previously unknown characteristics listed in CMap, eight of which suggest anti-cancer activities. The GSLHC website http://cloudr.ncu.edu.tw/gslhc/ contains 1,857 local hierarchical clusters accessible by querying 555 of the 1,309 drugs and small molecules listed in CMap. We expect GSCMap and GSLHC to be widely useful in providing new insights in the biological effect of bioactive compounds, in drug repurposing, and in function-based classification of complex diseases.

  8. Gene-Set Local Hierarchical Clustering (GSLHC)--A Gene Set-Based Approach for Characterizing Bioactive Compounds in Terms of Biological Functional Groups.

    Science.gov (United States)

    Chung, Feng-Hsiang; Jin, Zhen-Hua; Hsu, Tzu-Ting; Hsu, Chueh-Lin; Liu, Hsueh-Chuan; Lee, Hoong-Chien

    2015-01-01

    Gene-set-based analysis (GSA), which uses the relative importance of functional gene-sets, or molecular signatures, as units for analysis of genome-wide gene expression data, has exhibited major advantages with respect to greater accuracy, robustness, and biological relevance, over individual gene analysis (IGA), which uses log-ratios of individual genes for analysis. Yet IGA remains the dominant mode of analysis of gene expression data. The Connectivity Map (CMap), an extensive database on genomic profiles of effects of drugs and small molecules and widely used for studies related to repurposed drug discovery, has been mostly employed in IGA mode. Here, we constructed a GSA-based version of CMap, Gene-Set Connectivity Map (GSCMap), in which all the genomic profiles in CMap are converted, using gene-sets from the Molecular Signatures Database, to functional profiles. We showed that GSCMap essentially eliminated cell-type dependence, a weakness of CMap in IGA mode, and yielded significantly better performance on sample clustering and drug-target association. As a first application of GSCMap we constructed the platform Gene-Set Local Hierarchical Clustering (GSLHC) for discovering insights on coordinated actions of biological functions and facilitating classification of heterogeneous subtypes on drug-driven responses. GSLHC was shown to tightly cluster drugs of known similar properties. We used GSLHC to identify the therapeutic properties and putative targets of 18 compounds of previously unknown characteristics listed in CMap, eight of which suggest anti-cancer activities. The GSLHC website http://cloudr.ncu.edu.tw/gslhc/ contains 1,857 local hierarchical clusters accessible by querying 555 of the 1,309 drugs and small molecules listed in CMap. We expect GSCMap and GSLHC to be widely useful in providing new insights in the biological effect of bioactive compounds, in drug repurposing, and in function-based classification of complex diseases. PMID:26473729

  9. The Discovery of Quinoxaline-Based Metathesis Catalysts from Synthesis of Grazoprevir (MK-5172).

    Science.gov (United States)

    Williams, Michael J; Kong, Jongrock; Chung, Cheol K; Brunskill, Andrew; Campeau, Louis-Charles; McLaughlin, Mark

    2016-05-01

    Olefin metathesis (OM) is a reliable and practical synthetic methodology for challenging carbon-carbon bond formations. While existing catalysts can effect many of these transformations, the synthesis and development of new catalysts is essential to increase the application breadth of OM and to achieve improved catalyst activity. The unexpected initial discovery of a novel olefin metathesis catalyst derived from synthetic efforts toward the HCV therapeutic agent grazoprevir (MK-5172) is described. This initial finding has evolved into a class of tunable, shelf-stable ruthenium OM catalysts that are easily prepared and exhibit unique catalytic activity. PMID:27123552

  10. Combining knowledge discovery from databases (KDD) and case-based reasoning (CBR) to support diagnosis of medical images

    Science.gov (United States)

    Stranieri, Andrew; Yearwood, John; Pham, Binh

    1999-07-01

    The development of data warehouses for the storage and analysis of very large corpora of medical image data represents a significant trend in health care and research. Amongst other benefits, the trend toward warehousing enables the use of techniques for automatically discovering knowledge from large and distributed databases. In this paper, we present an application design for knowledge discovery from databases (KDD) techniques that enhance the performance of the problem solving strategy known as case-based reasoning (CBR) for the diagnosis of radiological images. The problem of diagnosing the abnormality of the cervical spine is used to illustrate the method. The design of a case-based medical image diagnostic support system has three essential characteristics. The first is a case representation that comprises textual descriptions of the image, visual features that are known to be useful for indexing images, and additional visual features to be discovered by data mining many existing images. The second characteristic of the approach presented here involves the development of a case base that comprises an optimal number and distribution of cases. The third characteristic involves the automatic discovery, using KDD techniques, of adaptation knowledge to enhance the performance of the case-based reasoner. Together, the three characteristics of our approach can overcome real-time efficiency obstacles that otherwise militate against the use of CBR in the domain of medical image analysis.

  11. AAV-Based Targeting Gene Therapy

    Directory of Open Access Journals (Sweden)

    Wenfang Shi

    2008-01-01

    Full Text Available Since the first parvovirus serotype, AAV2, was isolated from humans and used as a vector for gene therapy applications, there has been significant progress in AAV vector development. AAV vectors have been extensively investigated for a broad range of gene therapy applications. They have been considered a first-choice vector because of their efficient infectivity, stable expression, and non-pathogenicity. However, untoward events in AAV-mediated in vivo gene therapy studies have posed new challenges for their further application. A deep understanding of the viral life cycle, viral structure and replication, infection mechanism, and efficiency of AAV DNA integration, in terms of the contributing viral and host-cell factors and circumstances, would help in evaluating the advantages and disadvantages and provide more insight for possible clinical applications. This review focuses on recent progress in gene delivery to target cells via receptor-ligand interaction and in the regulation of specific DNA integration. AAV receptors and the intracellular trafficking of virus particles are also discussed.

  12. Discovery of gene-gene interactions across multiple independent data sets of late onset Alzheimer disease from the Alzheimer Disease Genetics Consortium.

    Science.gov (United States)

    Hohman, Timothy J; Bush, William S; Jiang, Lan; Brown-Gentry, Kristin D; Torstenson, Eric S; Dudek, Scott M; Mukherjee, Shubhabrata; Naj, Adam; Kunkle, Brian W; Ritchie, Marylyn D; Martin, Eden R; Schellenberg, Gerard D; Mayeux, Richard; Farrer, Lindsay A; Pericak-Vance, Margaret A; Haines, Jonathan L; Thornton-Wells, Tricia A

    2016-02-01

    Late-onset Alzheimer disease (AD) has a complex genetic etiology, involving locus heterogeneity, polygenic inheritance, and gene-gene interactions; however, the investigation of interactions in recent genome-wide association studies has been limited. We used a biological knowledge-driven approach to evaluate gene-gene interactions for consistency across 13 data sets from the Alzheimer Disease Genetics Consortium. Fifteen single nucleotide polymorphism (SNP)-SNP pairs within 3 gene-gene combinations were identified: SIRT1 × ABCB1, PSAP × PEBP4, and GRIN2B × ADRA1A. In addition, we extend a previously identified interaction from an endophenotype analysis between RYR3 × CACNA1C. Finally, post hoc gene expression analyses of the implicated SNPs further implicate SIRT1 and ABCB1, and implicate CDH23 which was most recently identified as an AD risk locus in an epigenetic analysis of AD. The observed interactions in this article highlight ways in which genotypic variation related to disease may depend on the genetic context in which it occurs. Further, our results highlight the utility of evaluating genetic interactions to explain additional variance in AD risk and identify novel molecular mechanisms of AD pathogenesis. PMID:26827652

  13. Perfused drop microfluidic device for brain slice culture-based drug discovery.

    Science.gov (United States)

    Liu, Jing; Pan, Liping; Cheng, Xuanhong; Berdichevsky, Yevgeny

    2016-06-01

    Living slices of brain tissue are widely used to model brain processes in vitro. In addition to basic neurophysiology studies, brain slices are also extensively used for pharmacology, toxicology, and drug discovery research. In these experiments, high parallelism and throughput are critical. Capability to conduct long-term electrical recording experiments may also be necessary to address disease processes that require protein synthesis and neural circuit rewiring. We developed a novel perfused drop microfluidic device for use with long term cultures of brain slices (organotypic cultures). Slices of hippocampus were placed into wells cut in polydimethylsiloxane (PDMS) film. Fluid level in the wells was hydrostatically controlled such that a drop was formed around each slice. The drops were continuously perfused with culture medium through microchannels. We found that viable organotypic hippocampal slice cultures could be maintained for at least 9 days in vitro. PDMS microfluidic network could be readily integrated with substrate-printed microelectrodes for parallel electrical recordings of multiple perfused organotypic cultures on a single MEA chip. We expect that this highly scalable perfused drop microfluidic device will facilitate high-throughput drug discovery and toxicology. PMID:27194028

  14. Increased complexity of gene structure and base composition in vertebrates

    Institute of Scientific and Technical Information of China (English)

    Ying Wu; Huizhong Yuan; Shengjun Tan; Jian-Qun Chen; Dacheng Tian; Haiwang Yang

    2011-01-01

    How the structure and base composition of genes changed with the evolution of vertebrates remains a puzzling question. Here we analyzed 895 orthologous protein-coding genes in six multicellular animals: human, chicken, zebrafish, sea squirt, fruit fly, and worm. Our analyses reveal that many gene regions, particularly intron and 3' UTR, gradually expanded throughout the evolution of vertebrates from their invertebrate ancestors, and that the number of exons per gene increased. Studies based on all protein-coding genes in each genome provide consistent results. We also find that GC-content increased in many gene regions (especially 5' UTR) in the evolution of endotherms, except in coding exons. Analysis of individual genomes shows that 3' UTR demonstrated a stronger length and GC-content correlation with intron than 5' UTR, and that genes with large introns in all six species demonstrated relatively similar GC-content. Our data indicate a great increase in complexity in vertebrate genes, and we propose that the requirement for morphological and functional changes is probably the driving force behind the evolution of structure and base composition complexity in multicellular animal genes.
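    The GC-content measure that this abstract compares across gene regions can be sketched as follows. This is only an illustrative calculation, not the authors' pipeline: the function name `gc_content` and the choice to ignore ambiguous bases such as N are assumptions made here.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous (A/C/G/T) bases of a sequence.

    Ambiguous symbols such as N are ignored (an assumption of this sketch).
    Returns 0.0 when the sequence has no unambiguous bases.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical toy regions, only to show the call pattern.
    regions = {"5' UTR": "GCGCCGAT", "intron": "ATATTTGCAA"}
    for name, seq in regions.items():
        print(name, round(gc_content(seq), 2))
```

    Applied per region (5' UTR, intron, 3' UTR, coding exon), values like these could then be compared between species or correlated with region length.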

  15. Discovery and investigation of anticancer ruthenium-arene Schiff-base complexes via water-promoted combinatorial three-component assembly.

    Science.gov (United States)

    Chow, Mun Juinn; Licona, Cynthia; Yuan Qiang Wong, Daniel; Pastorin, Giorgia; Gaiddon, Christian; Ang, Wee Han

    2014-07-24

    The structural diversity of metal scaffolds makes them a viable alternative to traditional organic scaffolds for drug design. Combinatorial chemistry and multicomponent reactions, coupled with high-throughput screening, are useful techniques in drug discovery, but they are rarely used in metal-based drug design. We report the optimization and validation of a new combinatorial, metal-based, three-component assembly reaction for the synthesis of a library of 442 Ru-arene Schiff-base (RAS) complexes. These RAS complexes were synthesized in a one-pot, on-a-plate format using commercially available starting materials under aqueous conditions. The library was screened for anticancer activity, and several cytotoxic lead compounds were identified. In particular, [(η6-1,3,5-triisopropylbenzene)RuCl(4-methoxy-N-(2-quinolinylmethylene)aniline)]Cl (4) displayed low micromolar IC50 values in ovarian cancers (A2780, A2780cisR), breast cancer (MCF7), and colorectal cancer (HCT116, SW480). The absence of p53 activation or changes in IC50 value between p53+/+ and p53-/- cells suggests that 4 and possibly the other lead compounds may act independently of the p53 tumor suppressor gene frequently mutated in cancer. PMID:25023617

  16. Gene expression profiling of coelomic cells and discovery of immune-related genes in the earthworm, Eisenia andrei, using expressed sequence tags.

    Science.gov (United States)

    Tak, Eun Sik; Cho, Sung-Jin; Park, Soon Cheol

    2015-01-01

    The coelomic cells of the earthworm consist of leukocytes, chloragocytes, and coelomocytes, which play an important role in innate immunity reactions. To gain insight into the expression profiles of coelomic cells of the earthworm, Eisenia andrei, we analyzed 1151 expressed sequence tags (ESTs) derived from the cDNA library of the coelomic cells. Among the 1151 ESTs analyzed, 493 ESTs (42.8%) showed a significant similarity to known genes and represented 164 unique genes, of which 93 ESTs were singletons and 71 ESTs manifested as two or more ESTs. From the 164 unique genes sequenced, we found 24 immune-related and cell defense genes. Furthermore, real-time PCR analysis showed that levels of lysenin-related protein mRNA in coelomic cells of E. andrei were upregulated after the injection of Bacillus subtilis bacteria. This EST data set should provide a valuable resource for future research on the earthworm immune system. PMID:25496401

  17. Using intron position conservation for homology-based gene prediction.

    Science.gov (United States)

    Keilwagen, Jens; Wenk, Michael; Erickson, Jessica L; Schattat, Martin H; Grau, Jan; Hartung, Frank

    2016-05-19

    Annotation of protein-coding genes is very important in bioinformatics and biology and has a decisive influence on many downstream analyses. Homology-based gene prediction programs allow for transferring knowledge about protein-coding genes from an annotated organism to an organism of interest. Here, we present a homology-based gene prediction program called GeMoMa. GeMoMa utilizes the conservation of intron positions within genes to predict related genes in other organisms. We assess the performance of GeMoMa and compare it with state-of-the-art competitors on plant and animal genomes using an extended best reciprocal hit approach. We find that GeMoMa often makes more precise predictions than its competitors, yielding a substantially increased number of correct transcripts. Subsequently, we exemplarily validate GeMoMa predictions using Sanger sequencing. Finally, we use RNA-seq data to compare the predictions of homology-based gene prediction programs, and find again that GeMoMa performs well. Hence, we conclude that exploiting intron position conservation improves homology-based gene prediction, and we make GeMoMa freely available as a command-line tool and Galaxy integration. PMID:26893356

  18. Transcriptome analysis of the white body of the squid Euprymna tasmanica with emphasis on immune and hematopoietic gene discovery.

    Directory of Open Access Journals (Sweden)

    Karla A Salazar

    Full Text Available In the mutualistic relationship between the squid Euprymna tasmanica and the bioluminescent bacterium Vibrio fischeri, several host factors, including immune-related proteins, are known to interact and respond specifically and exclusively to the presence of the symbiont. In squid and octopus, the white body is considered to be an immune organ mainly because blood cells, or hemocytes, are known to be present in high numbers and in different developmental stages. Hence, the white body has been described as the site of hematopoiesis in cephalopods. However, to our knowledge, there are no studies showing any molecular evidence of such functions. In this study, we performed a transcriptomic analysis of white body tissue of the Southern dumpling squid, E. tasmanica. Our primary goal was to gain insights into the functions of this tissue and to test for the presence of gene transcripts associated with hematopoietic and immune processes. Several hematopoiesis genes including CPSF1, GATA 2, TFIID, and FGFR2 were found to be expressed in the white body. In addition, transcripts associated with immune-related signal transduction pathways, such as the toll-like receptor/NF-κB and MAPK pathways, were also found, as well as other immune genes previously identified in E. tasmanica's sister species, E. scolopes. This study is the first to analyze an immune organ within cephalopods, and to provide gene expression data supporting the white body as a hematopoietic tissue.

  19. A Monte Carlo-based framework enhances the discovery and interpretation of regulatory sequence motifs

    Directory of Open Access Journals (Sweden)

    Seitzer Phillip

    2012-11-01

    Full Text Available Abstract Background Discovery of functionally significant short, statistically overrepresented subsequence patterns (motifs) in a set of sequences is a challenging problem in bioinformatics. Oftentimes, not all sequences in the set contain a motif. These non-motif-containing sequences complicate the algorithmic discovery of motifs. Filtering the non-motif-containing sequences from the larger set of sequences while simultaneously determining the identity of the motif is, therefore, desirable and a non-trivial problem in motif discovery research. Results We describe MotifCatcher, a framework that extends the sensitivity of existing motif-finding tools by employing random sampling to effectively remove non-motif-containing sequences from the motif search. We developed two implementations of our algorithm, each built around a commonly used motif-finding tool, and applied our algorithm to three diverse chromatin immunoprecipitation (ChIP) data sets. In each case, the motif finder with the MotifCatcher extension demonstrated improved sensitivity over the motif finder alone. Our approach organizes candidate functionally significant discovered motifs into a tree, which allowed us to gain additional insights. In all cases, we were able to support our findings with experimental work from the literature. Conclusions Our framework demonstrates that additional processing at the sequence entry level can significantly improve the performance of existing motif-finding tools. For each biological data set tested, we were able to propose novel biological hypotheses supported by experimental work from the literature. Specifically, in Escherichia coli, we suggested binding site motifs for 6 non-traditional LexA protein binding sites; in Saccharomyces cerevisiae, we hypothesize 2 disparate mechanisms for novel binding sites of the Cse4p protein; and in Halobacterium sp. NRC-1, we discovered subtle differences in a general transcription factor (GTF) binding site motif
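    The random-sampling idea in this abstract (run a motif finder on many random subsets, then see which sequences keep supporting the recovered motif) can be loosely illustrated as below. This is a toy sketch under stated assumptions, not MotifCatcher's actual algorithm: `toy_motif_finder` is a hypothetical stand-in for a real motif-finding tool, and the names `sample_and_score`, `subset_size` and `runs` are invented for illustration.

```python
import random
from collections import Counter


def toy_motif_finder(seqs, k):
    """Hypothetical stand-in for a real motif finder.

    Returns the k-mer that occurs in the largest number of sequences;
    ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter()
    for s in seqs:
        # A set, so each sequence votes at most once per k-mer.
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return min(counts, key=lambda m: (-counts[m], m))


def sample_and_score(seqs, k=6, subset_size=4, runs=200, seed=0):
    """For each sequence, the fraction of random-subset runs whose motif it contains."""
    rng = random.Random(seed)
    hits = Counter()
    for _ in range(runs):
        subset = rng.sample(seqs, subset_size)
        motif = toy_motif_finder(subset, k)
        for s in seqs:
            if motif in s:
                hits[s] += 1
    return {s: hits[s] / runs for s in seqs}


if __name__ == "__main__":
    # Four toy sequences carrying the planted motif TTGACA, two without it.
    with_motif = ["AAAAATTGACACCCCC", "GGGGGTTGACATATAT",
                  "CTCTCTTGACAGAGAG", "CGCGTTTGACAACGCG"]
    without = ["ACGTACGTAGCTAGCT", "GATCGATCCATGCATG"]
    scores = sample_and_score(with_motif + without)
    kept = [s for s, f in scores.items() if f >= 0.5]
    print(kept)
```

    Sequences that rarely contain the sampled motifs would be candidates for removal before a final motif search; the 0.5 cut-off used in the usage example is arbitrary.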

  20. Climate Discovery: Integrating Research With Exhibit, Public Tours, K-12, and Web-based EPO Resources

    Science.gov (United States)

    Foster, S. Q.; Carbone, L.; Gardiner, L.; Johnson, R.; Russell, R.; Advisory Committee, S.; Ammann, C.; Lu, G.; Richmond, A.; Maute, A.; Haller, D.; Conery, C.; Bintner, G.

    2005-12-01

    The Climate Discovery Exhibit at the National Center for Atmospheric Research (NCAR) Mesa Lab provides an exciting conceptual outline for the integration of several EPO activities with other well-established NCAR educational resources and programs. The exhibit is organized into four topic areas intended to build understanding among NCAR's 80,000 annual visitors, including 10,000 school children, about Earth system processes and scientific methods contributing to a growing body of knowledge about climate and global change. These topics include: 'Sun-Earth Connections,' 'Climate Now,' 'Climate Past,' and 'Climate Future.' Exhibit text, graphics, film and electronic media, and interactives are developed and updated through collaborations between NCAR's climate research scientists and staff in the Office of Education and Outreach (EO) at the University Corporation for Atmospheric Research (UCAR). With funding from NCAR, paleoclimatologists have contributed data and ideas for a new exhibit Teachers' Guide unit about 'Climate Past.' This collection of middle-school level, standards-aligned lessons is intended to help students gain understanding about how scientists use proxy data and direct observations to describe past climates. Two NASA EPOs have funded the development of 'Sun-Earth Connection' lessons, visual media, and tips for scientists and teachers. Integrated with related content and activities from the NASA-funded Windows to the Universe web site, these products have been adapted to form a second unit in the Climate Discovery Teachers' Guide about the Sun's influence on Earth's climate. Other lesson plans, previously developed by on-going efforts of EO staff and NSF's previously-funded Project Learn program, are providing content for a third Teachers' Guide unit on 'Climate Now' - the dynamic atmospheric and geological processes that regulate Earth's climate. EO has plans to collaborate with NCAR climatologists and computer modelers in the next year to develop

  1. Prediction of Tumor Outcome Based on Gene Expression Data

    Institute of Scientific and Technical Information of China (English)

    Liu Juan; Hitoshi Iba

    2004-01-01

    Gene expression microarray data can be used to classify tumor types. We proposed a new procedure to classify human tumor samples based on microarray gene expressions by using a hybrid supervised learning method called MOEA+WV (Multi-Objective Evolutionary Algorithm+Weighted Voting). MOEA is used to search for relatively few subsets of informative genes from the high-dimensional gene space, and WV is used as a classification tool. This new method has been applied to predict the subtypes of lymphoma and outcomes of medulloblastoma. The results are relatively accurate and meaningful compared to those from other methods.
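    The weighted-voting (WV) step named in this abstract can be sketched as follows. The abstract does not give the voting formula, so this sketch assumes the commonly used signal-to-noise form, in which each selected gene votes with weight (mean_A - mean_B) / (sd_A + sd_B) relative to the midpoint of the two class means. The function names and the toy data are hypothetical, and the MOEA gene-selection step is not shown.

```python
from statistics import mean, stdev


def train_weighted_voting(samples_a, samples_b):
    """Compute a (weight, boundary) pair for every gene.

    samples_a / samples_b: lists of expression vectors, one per sample,
    with the same genes in the same order (at least two samples per class).
    """
    n_genes = len(samples_a[0])
    params = []
    for g in range(n_genes):
        a = [s[g] for s in samples_a]
        b = [s[g] for s in samples_b]
        mu_a, mu_b = mean(a), mean(b)
        spread = stdev(a) + stdev(b)
        # Assumed signal-to-noise weight; a gene with zero spread gets no vote.
        weight = (mu_a - mu_b) / spread if spread > 0 else 0.0
        boundary = (mu_a + mu_b) / 2
        params.append((weight, boundary))
    return params


def predict(params, sample):
    """Sum the per-gene votes; a positive total means class 'A', otherwise 'B'."""
    total = sum(w * (x - b) for (w, b), x in zip(params, sample))
    return "A" if total > 0 else "B"


if __name__ == "__main__":
    # Toy two-gene data set, invented for illustration only.
    class_a = [[5.0, 1.0], [6.0, 1.2], [5.5, 0.8]]
    class_b = [[1.0, 4.0], [1.5, 5.0], [0.5, 4.5]]
    model = train_weighted_voting(class_a, class_b)
    print(predict(model, [5.2, 1.1]), predict(model, [0.8, 4.8]))
```

    In a hybrid scheme of the kind the abstract describes, a classifier like this would be trained only on the gene subset proposed by the evolutionary search, and its accuracy would feed back into that search as one of the objectives.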

  2. A Fluorescence Displacement Assay for Antidepressant Drug Discovery Based on Ligand-Conjugated Quantum Dots

    Energy Technology Data Exchange (ETDEWEB)

    Chang, Jerry [Vanderbilt University; Tomlinson, Ian [Oak Ridge National Laboratory (ORNL); Warnement, Michael [Vanderbilt University; Iwamoto, Hideki [Vanderbilt University

    2011-01-01

    The serotonin (5-hydroxytryptamine, 5-HT) transporter (SERT) protein plays a central role in terminating 5-HT neurotransmission and is the most important therapeutic target for the treatment of major depression and anxiety disorders. We report an innovative, versatile, and target-selective quantum dot (QD) labeling approach for SERT in single Xenopus oocytes that can be adopted as a drug-screening platform. Our labeling approach employs a custom-made, QD-tagged indoleamine derivative ligand, IDT318, that is structurally similar to 5-HT and accesses the primary binding site with enhanced human SERT selectivity. Incubating QD-labeled oocytes with paroxetine (Paxil), a high-affinity SERT-specific inhibitor, showed a concentration- and time-dependent decrease in QD fluorescence, demonstrating the utility of our approach for the identification of SERT modulators. Furthermore, with the development of ligands aimed at other pharmacologically relevant targets, our approach may potentially form the basis for a multitarget drug discovery platform.

  3. Connectivity Map-based discovery of parbendazole reveals targetable human osteogenic pathway.

    Science.gov (United States)

    Brum, Andrea M; van de Peppel, Jeroen; van der Leije, Cindy S; Schreuders-Koedam, Marijke; Eijken, Marco; van der Eerden, Bram C J; van Leeuwen, Johannes P T M

    2015-10-13

    Osteoporosis is a common skeletal disorder characterized by low bone mass leading to increased bone fragility and fracture susceptibility. In this study, we have identified pathways that stimulate differentiation of bone forming osteoblasts from human mesenchymal stromal cells (hMSCs). Gene expression profiling was performed in hMSCs differentiated toward osteoblasts (at 6 h). Significantly regulated genes were analyzed in silico, and the Connectivity Map (CMap) was used to identify candidate bone stimulatory compounds. The signature of parbendazole matches the expression changes observed for osteogenic hMSCs. Parbendazole stimulates osteoblast differentiation as indicated by increased alkaline phosphatase activity, mineralization, and up-regulation of bone marker genes (alkaline phosphatase/ALPL, osteopontin/SPP1, and bone sialoprotein II/IBSP) in a subset of the hMSC population resistant to the apoptotic effects of parbendazole. These osteogenic effects are independent of glucocorticoids because parbendazole does not up-regulate glucocorticoid receptor (GR) target genes and is not inhibited by the GR antagonist mifepristone. Parbendazole causes profound cytoskeletal changes including degradation of microtubules and increased focal adhesions. Stabilization of microtubules by pretreatment with Taxol inhibits osteoblast differentiation. Parbendazole up-regulates bone morphogenetic protein 2 (BMP-2) gene expression and activity. Cotreatment with the BMP-2 antagonist DMH1 limits, but does not block, parbendazole-induced mineralization. Using the CMap we have identified a previously unidentified lineage-specific, bone anabolic compound, parbendazole, which induces osteogenic differentiation through a combination of cytoskeletal changes and increased BMP-2 activity. PMID:26420877

  4. A Collaborative Self-Governing Privacy-Preserving Wireless Sensor Network Architecture Based on Location Optimization for Dynamic Service Discovery in MANET Environment

    OpenAIRE

    Cong Gao; Jianfeng Ma; Shangwei Zhang

    2015-01-01

    Due to the characteristics of a MANET, none of the existing solutions for service discovery work well in decentralized mobile environments. In this paper, we propose a collaborative self-governing privacy-preserving wireless sensor network architecture to address the issue of service discovery in MANET environment. The proposed architecture is able to dynamically adjust the working modes between directory-based and directory-less modes according to the network status of a MANET. The dynamic n...

  5. Discovery of new risk loci for IgA nephropathy implicates genes involved in immunity against intestinal pathogens.

    Science.gov (United States)

    Kiryluk, Krzysztof; Li, Yifu; Scolari, Francesco; Sanna-Cherchi, Simone; Choi, Murim; Verbitsky, Miguel; Fasel, David; Lata, Sneh; Prakash, Sindhuri; Shapiro, Samantha; Fischman, Clara; Snyder, Holly J; Appel, Gerald; Izzi, Claudia; Viola, Battista Fabio; Dallera, Nadia; Del Vecchio, Lucia; Barlassina, Cristina; Salvi, Erika; Bertinetto, Francesca Eleonora; Amoroso, Antonio; Savoldi, Silvana; Rocchietti, Marcella; Amore, Alessandro; Peruzzi, Licia; Coppo, Rosanna; Salvadori, Maurizio; Ravani, Pietro; Magistroni, Riccardo; Ghiggeri, Gian Marco; Caridi, Gianluca; Bodria, Monica; Lugani, Francesca; Allegri, Landino; Delsante, Marco; Maiorana, Mariarosa; Magnano, Andrea; Frasca, Giovanni; Boer, Emanuela; Boscutti, Giuliano; Ponticelli, Claudio; Mignani, Renzo; Marcantoni, Carmelita; Di Landro, Domenico; Santoro, Domenico; Pani, Antonello; Polci, Rosaria; Feriozzi, Sandro; Chicca, Silvana; Galliani, Marco; Gigante, Maddalena; Gesualdo, Loreto; Zamboli, Pasquale; Battaglia, Giovanni Giorgio; Garozzo, Maurizio; Maixnerová, Dita; Tesar, Vladimir; Eitner, Frank; Rauen, Thomas; Floege, Jürgen; Kovacs, Tibor; Nagy, Judit; Mucha, Krzysztof; Pączek, Leszek; Zaniew, Marcin; Mizerska-Wasiak, Małgorzata; Roszkowska-Blaim, Maria; Pawlaczyk, Krzysztof; Gale, Daniel; Barratt, Jonathan; Thibaudin, Lise; Berthoux, Francois; Canaud, Guillaume; Boland, Anne; Metzger, Marie; Panzer, Ulf; Suzuki, Hitoshi; Goto, Shin; Narita, Ichiei; Caliskan, Yasar; Xie, Jingyuan; Hou, Ping; Chen, Nan; Zhang, Hong; Wyatt, Robert J; Novak, Jan; Julian, Bruce A; Feehally, John; Stengel, Benedicte; Cusi, Daniele; Lifton, Richard P; Gharavi, Ali G

    2014-11-01

    We performed a genome-wide association study (GWAS) of IgA nephropathy (IgAN), the most common form of glomerulonephritis, with discovery and follow-up in 20,612 individuals of European and East Asian ancestry. We identified six new genome-wide significant associations, four in ITGAM-ITGAX, VAV3 and CARD9 and two new independent signals at HLA-DQB1 and DEFA. We replicated the nine previously reported signals, including known SNPs in the HLA-DQB1 and DEFA loci. The cumulative burden of risk alleles is strongly associated with age at disease onset. Most loci are either directly associated with risk of inflammatory bowel disease (IBD) or maintenance of the intestinal epithelial barrier and response to mucosal pathogens. The geospatial distribution of risk alleles is highly suggestive of multi-locus adaptation, and genetic risk correlates strongly with variation in local pathogens, particularly helminth diversity, suggesting a possible role for host-intestinal pathogen interactions in shaping the genetic landscape of IgAN. PMID:25305756

  6. The discovery of archaea origin phosphomannomutase in algae based on the algal transcriptome

    Institute of Scientific and Technical Information of China (English)

    FENG Yanjing; CHI Shan; LIU Cui; CHEN Shengping; YU Jun; WANG Xumin; LIU Tao

    2014-01-01

    Phosphomannomutase (PMM; EC 5.4.2.8) is an enzyme that catalyzes the interconversion reaction between mannose-6-phosphate and mannose-1-phosphate. However, its systematic molecular and functional investigations in algae have not hitherto been reported. In this work, with the accomplishment of the 1000 Plant Project (OneKP), in which more than 218 species of Chromista, including 19 marine phaeophytes, 22 marine rhodophytes, 171 chlorophytes, 5 cryptophytes, 4 haptophytes, and 5 glaucophytes, were sequenced, we used a gene analysis method to analyze the PMM gene sequences in algae and confirm the existence of the PMM gene in the transcriptomic sequencing data of Rhodophyta and Ochrophyta. Our results showed that only one type of PMM with four conserved motifs exists in Chromista, which is similar to human PMM. Moreover, the phylogenetic tree revealed that algal PMM possibly originated from archaea.

  7. Corra: Computational framework and tools for LC-MS discovery and targeted mass spectrometry-based proteomics

    Directory of Open Access Journals (Sweden)

    Mueller Lukas N

    2008-12-01

    Full Text Available Abstract Background Quantitative proteomics holds great promise for identifying proteins that are differentially abundant between populations representing different physiological or disease states. A range of computational tools is now available for both isotopically labeled and label-free liquid chromatography mass spectrometry (LC-MS) based quantitative proteomics. However, they are generally not comparable to each other in terms of functionality, user interfaces, information input/output, and do not readily facilitate appropriate statistical data analysis. These limitations, along with the array of choices, present a daunting prospect for biologists, and other researchers not trained in bioinformatics, who wish to use LC-MS-based quantitative proteomics. Results We have developed Corra, a computational framework and tools for discovery-based LC-MS proteomics. Corra extends and adapts existing algorithms used for LC-MS-based proteomics, and statistical algorithms, originally developed for microarray data analyses, appropriate for LC-MS data analysis. Corra also adapts software engineering technologies (e.g. Google Web Toolkit, distributed processing) so that computationally intense data processing and statistical analyses can run on a remote server, while the user controls and manages the process from their own computer via a simple web interface. Corra also allows the user to output significantly differentially abundant LC-MS-detected peptide features in a form compatible with subsequent sequence identification via tandem mass spectrometry (MS/MS). We present two case studies to illustrate the application of Corra to commonly performed LC-MS-based biological workflows: a pilot biomarker discovery study of glycoproteins isolated from human plasma samples relevant to type 2 diabetes, and a study in yeast to identify in vivo targets of the protein kinase Ark1 via phosphopeptide profiling. Conclusion The Corra computational framework leverages

  8. Tissue-specific laser microdissection of the Brassica napus funiculus improves gene discovery and spatial identification of biological processes

    Science.gov (United States)

    Chan, Ainsley C.; Khan, Deirdre; Girard, Ian J.; Becker, Michael G.; Millar, Jenna L.; Sytnik, David; Belmonte, Mark F.

    2016-01-01

    The three primary tissue systems of the funiculus each undergo unique developmental programs to support the growth and development of the filial seed. To understand the underlying transcriptional mechanisms that orchestrate development of the funiculus at the globular embryonic stage of seed development, we used laser microdissection coupled with RNA-sequencing to produce a high-resolution dataset of the mRNAs present in the epidermis, cortex, and vasculature of the Brassica napus (canola) funiculus. We identified 7761 additional genes in these tissues compared with the whole funiculus organ alone using this technology. Differential expression and enrichment analyses were used to identify several biological processes associated with each tissue system. Our data show that cell wall modification and lipid metabolism are prominent in the epidermis, cell growth and modification occur in the cortex, and vascular tissue proliferation and differentiation occur in the central vascular strand. We provide further evidence that each of the three tissue systems of the globular stage funiculus is involved in specific biological processes that together co-ordinate to support seed development. The identification of genes and gene regulators responsible for tissue-specific developmental processes of the canola funiculus now serves as a valuable resource for seed improvement research. PMID:27194740

  9. Gene expression analysis and SNP/InDel discovery to investigate yield heterosis of two rubber tree F1 hybrids.

    Science.gov (United States)

    Li, Dejun; Zeng, Rizhong; Li, Yan; Zhao, Manman; Chao, Jinquan; Li, Yu; Wang, Kai; Zhu, Lihuang; Tian, Wei-Min; Liang, Chengzhi

    2016-01-01

    As an important industrial material, natural rubber is mainly harvested from the rubber tree. Rubber tree breeding is inefficient, expensive and time-consuming, whereas marker-assisted selection is a feasible method for early selection of high-yield hybrids. We thus sequenced and analyzed the transcriptomes of two parent rubber trees (RRIM 600 and PR 107) and their most productive hybrids (RY 7-33-97 and RY 7-20-59) to understand their gene expression patterns and genetic variations including single nucleotide polymorphisms (SNPs) and small insertions/deletions (InDels). We discovered >31,000 genetic variations in 112,702 assembled unigenes. Our results showed that the higher yield in F1 hybrids was positively associated with their higher genome heterozygosity, which was further confirmed by genotyping 10 SNPs in 20 other varieties. We also showed that RY 7-33-97 and RY 7-20-59 were genetically closer to RRIM 600 and PR 107, respectively, in agreement with both their phenotypic similarities and gene expression profiles. After identifying ethylene- and jasmonic acid-responsive genes at the transcription level, we compared and analyzed the genetic variations underlying rubber biosynthesis and the jasmonic acid and ethylene pathways in detail. Our results suggest that genome-wide genetic variations play a substantive role in maintaining rubber tree heterosis. PMID:27108962

  11. A comprehensive resource of drought- and salinity- responsive ESTs for gene discovery and marker development in chickpea (Cicer arietinum L.)

    OpenAIRE

    Srinivasan Ramamurthy; Xiao Yongli; Vadez Vincent; Deokar Amit A; Balaji Jayashree; Kashiwagi Junichi; Lekha Pazhamala; Hiremath Pavana J; Varshney Rajeev K; Gaur Pooran M; Siddique Kadambot HM; Town Christopher D; Hoisington David A

    2009-01-01

    Background: Chickpea (Cicer arietinum L.), an important grain legume crop worldwide, is seriously challenged by terminal drought and salinity stresses. However, only a very limited number of molecular markers and candidate genes are available for molecular breeding in chickpea to tackle these stresses. This study reports the generation and analysis of a comprehensive resource of drought- and salinity-responsive expressed sequence tags (ESTs) and gene-based markers. Results: A total of...

  12. Core Collection Based Backcrossing: An Efficient Approach for Breeding, Germplasm Enhancement and Gene Discovery

    Institute of Scientific and Technical Information of China (English)

    J.Z. Jia; R.H. Zhou; X.Y. Zhang; L. Zhang; Y.L. Li; J. Wang; X.Z. Liu; L.F. Gao; S.B. Liu

    2007-01-01

    Plant germplasm underpins much of crop development. Millions of germplasm accessions have been collected and conserved ex situ, and the major challenge is now how to exploit and utilize this abundant resource.

  13. Higgs Discovery

    DEFF Research Database (Denmark)

    Sannino, Francesco

    2013-01-01

    via first principle lattice simulations with encouraging results. The new findings show that the recent naive claims made about new strong dynamics at the electroweak scale being disfavoured by the discovery of a not-so-heavy composite Higgs are unwarranted. I will then introduce the more speculative......I discuss the impact of the discovery of a Higgs-like state on composite dynamics starting by critically examining the reasons in favour of either an elementary or composite nature of this state. Accepting the standard model interpretation I re-address the standard model vacuum stability within a...... has been challenged by the discovery of a not-so-heavy Higgs-like state. I will therefore review the recent discovery \\cite{Foadi:2012bb} that the standard model top-induced radiative corrections naturally reduce the intrinsic non-perturbative mass of the composite Higgs state towards the desired...

  14. Volatility Discovery

    DEFF Research Database (Denmark)

    Dias, Gustavo Fruet; Scherrer, Cristina; Papailias, Fotis

    There is a large literature that investigates how homogenous securities traded on different markets incorporate new information (price discovery analysis). We extend this concept to the stochastic volatility process and investigate how markets contribute to the efficient stochastic volatility whi...

  15. Toxins and drug discovery.

    Science.gov (United States)

    Harvey, Alan L

    2014-12-15

    Components from venoms have stimulated many drug discovery projects, with some notable successes. These are briefly reviewed, from captopril to ziconotide. However, there have been many more disappointments on the road from toxin discovery to approval of a new medicine. Drug discovery and development is an inherently risky business, and the main causes of failure during development programmes are outlined in order to highlight steps that might be taken to increase the chances of success with toxin-based drug discovery. These include having a clear focus on unmet therapeutic needs, concentrating on targets that are well-validated in terms of their relevance to the disease in question, making use of phenotypic screening rather than molecular-based assays, and working with development partners with the resources required for the long and expensive development process. PMID:25448391

  16. An ensemble of SVM classifiers based on gene pairs.

    Science.gov (United States)

    Tong, Muchenxuan; Liu, Kun-Hong; Xu, Chungui; Ju, Wenbin

    2013-07-01

    In this paper, a genetic algorithm (GA) based ensemble support vector machine (SVM) classifier built on gene pairs (GA-ESP) is proposed. The SVMs (base classifiers of the ensemble system) are trained on different informative gene pairs. These gene pairs are selected by the top scoring pair (TSP) criterion. Each of these pairs projects the original microarray expression onto a 2-D space. Extensive permutation of gene pairs may reveal more useful information and potentially lead to an ensemble classifier with satisfactory accuracy and interpretability. GA is further applied to select an optimized combination of base classifiers. The effectiveness of the GA-ESP classifier is evaluated on both binary-class and multi-class datasets. PMID:23668348
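
The top scoring pair (TSP) criterion mentioned in this abstract can be illustrated with a minimal sketch. It assumes the usual two-class definition, in which a gene pair (i, j) is scored by how much the frequency of "gene i is expressed below gene j" differs between the classes. The function names and the toy expression matrix are hypothetical and are not taken from the GA-ESP paper, which also covers multi-class data and the GA-based ensemble step not shown here.

```python
from itertools import combinations


def tsp_scores(expr, labels):
    """Score every gene pair by the two-class TSP criterion.

    expr   : list of samples, each a list of expression values (one per gene)
    labels : list of class labels (0 or 1), one per sample
    Returns a dict mapping (i, j) -> |P(x_i < x_j | class 0) - P(x_i < x_j | class 1)|.
    """
    n_genes = len(expr[0])
    class0 = [s for s, y in zip(expr, labels) if y == 0]
    class1 = [s for s, y in zip(expr, labels) if y == 1]
    scores = {}
    for i, j in combinations(range(n_genes), 2):
        # Fraction of samples in each class where gene i is below gene j.
        p0 = sum(s[i] < s[j] for s in class0) / len(class0)
        p1 = sum(s[i] < s[j] for s in class1) / len(class1)
        scores[(i, j)] = abs(p0 - p1)
    return scores


def top_pairs(expr, labels, k=1):
    """Return the k highest-scoring gene pairs."""
    scores = tsp_scores(expr, labels)
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: 4 samples x 3 genes, two samples per class.
toy_expr = [
    [1.0, 2.0, 5.0],
    [1.5, 3.0, 4.0],
    [3.0, 1.0, 5.5],
    [4.0, 2.0, 4.5],
]
toy_labels = [0, 0, 1, 1]
best = top_pairs(toy_expr, toy_labels, k=1)
```

In the toy data the ordering of genes 0 and 1 flips between the classes, so that pair gets the maximum score of 1. Each selected pair defines the 2-D space on which one base classifier would be trained.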

  17. High-Throughput Discovery of Mutations in Tef Semi-Dwarfing Genes by Next-Generation Sequencing Analysis

    OpenAIRE

    Zhu, Qihui; Smith, Shavannor M; Ayele, Mulu; Yang, Lixing; Jogi, Ansuya; Chaluvadi, Srinivasa R.; Bennetzen, Jeffrey L

    2012-01-01

    Tef (Eragrostis tef) is a major cereal crop in Ethiopia. Lodging is the primary constraint to increasing productivity in this allotetraploid species, accounting for losses of ∼15–45% in yield each year. As a first step toward identifying semi-dwarf varieties that might have improved lodging resistance, an ∼6× fosmid library was constructed and used to identify both homeologues of the dw3 semi-dwarfing gene of Sorghum bicolor. An EMS mutagenized population, consisting of ∼21,210 tef plants, wa...

  18. Insights into shell deposition in the Antarctic bivalve Laternula elliptica: gene discovery in the mantle transcriptome using 454 pyrosequencing

    Directory of Open Access Journals (Sweden)

    Power Deborah M

    2010-06-01

    Background: The Antarctic clam, Laternula elliptica, is an infaunal stenothermal bivalve mollusc with a circumpolar distribution. It plays a significant role in bentho-pelagic coupling and hence has been proposed as a sentinel species for climate change monitoring. Previous studies have shown that this mollusc displays a high level of plasticity with regard to shell deposition and damage repair against a background of genetic homogeneity. The Southern Ocean has amongst the lowest present-day CaCO3 saturation rate of any ocean region, and is predicted to be among the first to become undersaturated under current ocean acidification scenarios. Hence, this species presents as an ideal candidate for studies into the processes of calcium regulation and shell deposition in our changing ocean environments. Results: 454 sequencing of L. elliptica mantle tissue generated 18,290 contigs with an average size of 535 bp (ranging from 142 bp to 5.591 kb). BLAST sequence similarity searching assigned putative function to 17% of the data set, with a significant proportion of these transcripts being involved in binding and potentially of a secretory nature, as defined by GO molecular function and biological process classifications. These results indicated that the mantle is a transcriptionally active tissue which is actively proliferating. All transcripts were screened against an in-house database of genes shown to be involved in extracellular matrix formation and calcium homeostasis in metazoans. Putative identifications were made for a number of classical shell deposition genes, such as tyrosinase, carbonic anhydrase and metalloprotease 1, along with novel members of the family 2 G-Protein Coupled Receptors (GPCRs). A membrane transport protein (SEC61) was also characterised and this demonstrated the utility of the clam sequence data as a resource for examining cold adapted amino acid substitutions.
The sequence data contained 46,235 microsatellites and 13

  19. SNP-based high density genetic map and mapping of btwd1 dwarfing gene in barley.

    Science.gov (United States)

    Ren, Xifeng; Wang, Jibin; Liu, Lipan; Sun, Genlou; Li, Chengdao; Luo, Hong; Sun, Dongfa

    2016-01-01

    A high-density linkage map is a valuable tool for functional genomics and breeding. A newly developed sequence-based marker technology, restriction site associated DNA (RAD) sequencing, has proven powerful for the rapid discovery and genotyping of genome-wide single nucleotide polymorphism (SNP) markers and for high-density genetic map construction. The objective of this research was to construct a high-density genetic map of barley using RAD sequencing. 1894 high-quality SNP markers were developed and mapped onto all seven chromosomes together with 68 SSR markers. These 1962 markers constituted a total genetic length of 1375.8 cM and an average of 0.7 cM between adjacent loci. The number of markers within each linkage group ranged from 209 to 396. The new recessive dwarfing gene btwd1 in Huaai 11 was mapped onto the high-density linkage map. The result showed that btwd1 is positioned between SNP markers 7HL_6335336 and 7_249275418, at genetic distances of 0.9 cM and 0.7 cM, respectively, on chromosome 7H. The SNP-based high-density genetic map developed and the dwarfing gene btwd1 mapped in this study provide critical information for positional cloning of the btwd1 gene and molecular breeding of barley. PMID:27530597
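
The marker-density figure quoted in this abstract can be checked with a short calculation. This is only a sketch of the arithmetic: the marker counts and map length come from the abstract, while the assumption of one linkage group per chromosome (seven in total) and the variable names are ours.

```python
# Figures reported in the abstract.
snp_markers = 1894
ssr_markers = 68
total_length_cm = 1375.8

# Assumption (not stated explicitly in the abstract): one linkage group
# per barley chromosome, i.e. seven linkage groups.
linkage_groups = 7

total_markers = snp_markers + ssr_markers

# Two common ways to express average spacing: map length per marker,
# or map length per interval between adjacent markers within a group.
cm_per_marker = total_length_cm / total_markers
cm_per_interval = total_length_cm / (total_markers - linkage_groups)

print(total_markers, round(cm_per_marker, 1), round(cm_per_interval, 1))
```

Both conventions round to the reported average of 0.7 cM between adjacent loci.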

  20. False-Positive Rate Determination of Protein Target Discovery using a Covalent Modification- and Mass Spectrometry-Based Proteomics Platform

    Science.gov (United States)

    Strickland, Erin C.; Geer, M. Ariel; Hong, Jiyong; Fitzgerald, Michael C.

    2014-01-01

    Detection and quantitation of protein-ligand binding interactions is important in many areas of biological research. Stability of proteins from rates of oxidation (SPROX) is an energetics-based technique for identifying the protein targets of ligands in complex biological mixtures. Knowing the false-positive rate of protein target discovery in proteome-wide SPROX experiments is important for the correct interpretation of results. Reported here are the results of a control SPROX experiment in which chemical denaturation data are obtained on the proteins in two samples that originated from the same yeast lysate, as would be done in a typical SPROX experiment except that one sample would be spiked with the test ligand. False-positive rates of 1.2-2.2 % and protein targets of the drug, manassantin A. The impact of ion purity in the tandem mass spectral analyses and of background oxidation on the false-positive rate of protein target discovery using SPROX is also discussed.

  1. Low-coverage, whole-genome sequencing of Artocarpus camansi (Moraceae) for phylogenetic marker development and gene discovery

    Science.gov (United States)

    Gardner, Elliot M.; Johnson, Matthew G.; Ragone, Diane; Wickett, Norman J.; Zerega, Nyree J. C.

    2016-01-01

    Premise of the study: We used moderately low-coverage (17×) whole-genome sequencing of Artocarpus camansi (Moraceae) to develop genomic resources for Artocarpus and Moraceae. Methods and Results: A de novo assembly of Illumina short reads (251,378,536 pairs, 2 × 100 bp) accounted for 93% of the predicted genome size. Predicted coding regions were used in a three-way orthology search with published genomes of Morus notabilis and Cannabis sativa. Phylogenetic markers for Moraceae were developed from 333 inferred single-copy exons. Ninety-eight putative MADS-box genes were identified. Analysis of all predicted coding regions resulted in preliminary annotation of 49,089 genes. An analysis of synonymous substitutions for pairs of orthologs (Ks analysis) in M. notabilis and A. camansi strongly suggested a lineage-specific whole-genome duplication in Artocarpus. Conclusions: This study substantially increases the genomic resources available for Artocarpus and Moraceae and demonstrates the value of low-coverage de novo assemblies for nonmodel organisms with moderately large genomes.

  2. High-throughput discovery of mutations in tef semi-dwarfing genes by next-generation sequencing analysis.

    Science.gov (United States)

    Zhu, Qihui; Smith, Shavannor M; Ayele, Mulu; Yang, Lixing; Jogi, Ansuya; Chaluvadi, Srinivasa R; Bennetzen, Jeffrey L

    2012-11-01

    Tef (Eragrostis tef) is a major cereal crop in Ethiopia. Lodging is the primary constraint to increasing productivity in this allotetraploid species, accounting for losses of ∼15-45% in yield each year. As a first step toward identifying semi-dwarf varieties that might have improved lodging resistance, an ∼6× fosmid library was constructed and used to identify both homeologues of the dw3 semi-dwarfing gene of Sorghum bicolor. An EMS mutagenized population, consisting of ∼21,210 tef plants, was planted and leaf materials were collected into 23 superpools. Two dwarfing candidate genes, homeologues of dw3 of sorghum and rht1 of wheat, were sequenced directly from each superpool with 454 technology, and 120 candidate mutations were identified. Out of 10 candidates tested, six independent mutations were validated by Sanger sequencing, including two predicted detrimental mutations in both dw3 homeologues with a potential to improve lodging resistance in tef through further breeding. This study demonstrates that high-throughput sequencing can identify potentially valuable mutations in under-studied plant species like tef and has provided mutant lines that can now be combined and tested in breeding programs for improved lodging resistance. PMID:22904035

  3. An Innovative Cell Microincubator for Drug Discovery Based on 3D Silicon Structures

    Directory of Open Access Journals (Sweden)

    Francesca Aredia

    2016-01-01

    We recently employed three-dimensional (3D) silicon microstructures (SMSs), consisting of arrays of 3 μm-thick silicon walls separated by 50 μm-deep, 5 μm-wide gaps, as microincubators for monitoring the biomechanical properties of tumor cells. They were here applied to investigate the in vitro behavior of HT1080 human fibrosarcoma cells driven to apoptosis by the chemotherapeutic drug Bleomycin. Our results, obtained by fluorescence microscopy, demonstrated that HT1080 cells exhibited a great ability to colonize the narrow gaps. Remarkably, HT1080 cells grown on 3D-SMS, when treated with the DNA damaging agent Bleomycin under conditions leading to apoptosis, tended to shrink, reducing their volume and mimicking the normal behavior of apoptotic cells, and were prone to leave the gaps. Finally, we performed label-free detection of cells adherent to the vertical silicon wall, inside the gap of 3D-SMS, by exploiting optical low coherence reflectometry using infrared, low power radiation. This kind of approach may become a new tool for increasing automation in the drug discovery area. Our results open new perspectives in view of future applications of the 3D-SMS as the core element of a lab-on-a-chip suitable for screening the effect of new molecules potentially able to kill tumor cells.

  4. Potential of Glutamate-Based Drug Discovery for Next Generation Antidepressants

    Directory of Open Access Journals (Sweden)

    Shigeyuki Chaki

    2015-09-01

    Recently, ketamine has been demonstrated to exert rapid-acting antidepressant effects in patients with depression, including those with treatment-resistant depression, and this discovery has been regarded as the most significant advance in drug development for the treatment of depression in over 50 years. To overcome unwanted side effects of ketamine, numerous approaches targeting glutamatergic systems have been vigorously investigated. For example, among agents targeting the NMDA receptor, the efficacies of selective GluN2B receptor antagonists and a low-trapping antagonist, as well as glycine site modulators such as GLYX-13 and sarcosine, have been demonstrated clinically. Moreover, agents acting on metabotropic glutamate receptors, such as mGlu2/3 and mGlu5 receptors, have been proposed as useful approaches to mimicking the antidepressant effects of ketamine. Neural and synaptic mechanisms mediating the antidepressant effects of ketamine are being delineated, most of which indicate that ketamine improves abnormalities in synaptic transmission and connectivity observed in depressive states via the AMPA receptor and brain-derived neurotrophic factor-dependent mechanisms. Interestingly, some of the above agents may share some neural and synaptic mechanisms with ketamine. These studies should provide important insights for the development of superior pharmacotherapies for depression with more potent effects and faster onset of action.

  5. Automated Sample Preparation Platform for Mass Spectrometry-Based Plasma Proteomics and Biomarker Discovery

    Directory of Open Access Journals (Sweden)

    Vilém Guryča

    2014-03-01

    The identification of novel biomarkers from human plasma remains a critical need in order to develop and monitor drug therapies for nearly all disease areas. The discovery of novel plasma biomarkers is, however, significantly hampered by the complexity and dynamic range of proteins within plasma, as well as the inherent variability in composition from patient to patient. In addition, it is widely accepted that most soluble plasma biomarkers for diseases such as cancer will be represented by tissue leakage products, circulating in plasma at low levels. It is therefore necessary to find approaches with the requisite level of sensitivity in such a complex biological matrix. Strategies for fractionating the plasma proteome have been suggested, but improvements in sensitivity are often negated by the resultant process variability. Here we describe an approach using multidimensional chromatography and on-line protein derivatization, which allows for higher sensitivity, whilst minimizing the process variability. In order to evaluate this automated process fully, we demonstrate three levels of processing and compare sensitivity, throughput and reproducibility. We demonstrate that high sensitivity analysis of the human plasma proteome is possible down to the low ng/mL or even high pg/mL level with a high degree of technical reproducibility.

  6. Characterization of Genes for Beef Marbling Based on Applying Gene Coexpression Network

    Directory of Open Access Journals (Sweden)

    Dajeong Lim

    2014-01-01

    Marbling is an important trait in characterizing beef quality and a major factor in determining the price of beef in the Korean beef market. In particular, marbling is a complex trait and needs a system-level approach for identifying candidate genes related to the trait. To find candidate genes associated with marbling, we used a weighted gene coexpression network analysis of the expression values of bovine genes. Hub genes were identified; they were topologically centered with large degree and BC values in the global network. We performed gene expression analysis to detect candidate genes in M. longissimus with divergent marbling phenotypes (marbling scores 2 to 7) using qRT-PCR. The results demonstrate that transmembrane protein 60 (TMEM60) and dihydropyrimidine dehydrogenase (DPYD) are associated with increasing marbling fat. We suggest that the network-based approach in livestock may be an important method for analyzing the complex effects of candidate genes associated with complex traits like marbling or tenderness.

  7. Discovery of miRNAs and Their Corresponding miRNA Genes in Atlantic Cod (Gadus morhua): Use of Stable miRNAs as Reference Genes Reveals Subgroups of miRNAs That Are Highly Expressed in Particular Organs

    Science.gov (United States)

    Andreassen, Rune; Rangnes, Fredrik; Sivertsen, Maria; Chiang, Michelle; Tran, Michelle; Worren, Merete Molton

    2016-01-01

    Background: Atlantic cod (Gadus morhua) is among the economically most important species in the northern Atlantic Ocean and a model species for studying development of the immune system in vertebrates. MicroRNAs (miRNAs) are an abundant class of small RNA molecules that regulate fundamental biological processes at the post-transcriptional level. Detailed knowledge about a species' miRNA repertoire is necessary to study how the miRNA transcriptome modulates gene expression. We have therefore discovered and characterized mature miRNAs and their corresponding miRNA genes in Atlantic cod. We have also performed a validation study to identify suitable reference genes for RT-qPCR analysis of miRNA expression in Atlantic cod. Finally, we utilized the newly characterized miRNA repertoire and the dedicated RT-qPCR method to reveal miRNAs that are highly expressed in certain organs. Results: The discovery analysis revealed 490 mature miRNAs (401 unique sequences) along with precursor sequences and genomic location of the miRNA genes. Twenty-six of these were novel miRNA genes. Validation studies ranked gmo-miR-17-1-5p or the two-gene combination gmo-miR25-3p and gmo-miR210-5p as most suitable qPCR reference genes. Analysis by RT-qPCR revealed 45 miRNAs with significantly higher expression in tissues from one or a few organs. Comparisons to other vertebrates indicate that some of these miRNAs may regulate processes like growth, lipid metabolism, immune response to microbial infections and scar damage repair. Three teleost-specific and three novel Atlantic cod miRNAs were among the differentially expressed miRNAs. Conclusions: The number of known mature miRNAs was considerably increased by our identification of miRNAs and miRNA genes in Atlantic cod. This will benefit further functional studies of miRNA expression using deep sequencing methods. The validation study showed that stable miRNAs are suitable reference genes for RT-qPCR analysis of miRNA expression.
Applying RT-qPCR we

  8. Case-Based Reasoning System Based on Knowledge Discovery

    Institute of Scientific and Technical Information of China (English)

    倪志伟; 蔡庆生

    2003-01-01

    Nowadays the research and exploitation of case-based systems are getting more and more attention. Case-Based Reasoning (CBR) is a strategy for solving the object cases based on the source cases that are prompted by the object ones. CBR is not only a psychological theory for human knowledge, but will be a new cornerstone of intelligent computer system technology. The case-based system is adopted in more and more application fields in order to obtain better results, especially in ill-defined fields that lack expert knowledge. But there is a lot of knowledge required in CBR, and we are also faced with the same knowledge acquisition bottleneck as in expert systems. Data Mining (DM) and Knowledge Discovery in Database (KDD) are the most useful means to solve this kind of problem and make knowledge acquisition more automated. In this paper, we discuss data mining technology in CBR; in particular, we raise the concept of knowledge discovery in case base (KDC) and discuss it in detail. Finally, the structure of CBR based on DM is put forward.

  9. Cynomolgus monkey testicular cDNAs for discovery of novel human genes in the human genome sequence

    Directory of Open Access Journals (Sweden)

    Terao Keiji

    2002-12-01

    Background: In order to contribute to the establishment of a complete map of transcribed regions of the human genome, we constructed a testicular cDNA library for the cynomolgus monkey, and attempted to find novel transcripts for identification of their human homologues. Results: The full-insert sequences of 512 cDNA clones were determined. Ultimately we found 302 non-redundant cDNAs carrying open reading frames of 300 bp or longer. Among them, 89 cDNAs were found not to be annotated previously in the Ensembl human database. After searching against the Ensembl mouse database, we also found that 69 putative coding sequences have no homologous cDNAs in the annotated human and mouse genome sequences in Ensembl. We subsequently designed a DNA microarray including 396 non-redundant cDNAs (with and without open reading frames) to examine the expression of the full-sequenced genes. With the testicular probe and a mixture of probes of 10 other tissues, 316 of 332 effective spots showed intense hybridized signals and 75 cDNAs were shown to be expressed very highly in the cynomolgus monkey testis, but not ubiquitously. Conclusions: In this report, we determined 302 full-insert sequences of cynomolgus monkey cDNAs with open reading frames long enough to discover novel transcripts as human homologues. Among the 302 cDNA sequences, human homologues of 89 cDNAs have not been predicted in the annotated human genome sequence in Ensembl. Additionally, we identified 75 dominantly expressed genes in testis among the full-sequenced clones by using a DNA microarray. Our cDNA clones and analytical results will be valuable resources for future functional genomic studies.

  10. Prediction on the inhibition ratio of pyrrolidine derivatives on matrix metalloproteinase based on gene expression programming.

    Science.gov (United States)

    Li, Yuqin; You, Guirong; Jia, Baoxiu; Si, Hongzong; Yao, Xiaojun

    2014-01-01

    Quantitative structure-activity relationships (QSAR) were developed to predict the inhibition ratio of pyrrolidine derivatives on matrix metalloproteinase via the heuristic method (HM) and gene expression programming (GEP). The descriptors of 33 pyrrolidine derivatives were calculated by the software CODESSA, which can calculate quantum chemical, topological, geometrical, constitutional, and electrostatic descriptors. HM was also used for the preselection of 5 appropriate molecular descriptors. Linear and nonlinear QSAR models were developed based on HM and GEP separately, and the two prediction models led to good correlation coefficients (R²) of 0.93 and 0.94. The two QSAR models are useful in predicting the inhibition ratio of pyrrolidine derivatives on matrix metalloproteinase during the discovery of new anticancer drugs and in providing theoretical information for studying the new drugs. PMID:24971318
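
The R² values quoted for the two QSAR models can be read against a small worked sketch. It assumes R² is meant as the coefficient of determination between observed and predicted inhibition ratios; the abstract calls it a correlation coefficient, so the squared Pearson correlation is also a possible reading. The function name and the toy numbers are hypothetical and do not come from the paper.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared error of the model predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: squared deviation of observations from their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical inhibition ratios (%) for four compounds.
toy_observed = [10.0, 20.0, 30.0, 40.0]
toy_predicted = [12.0, 18.0, 33.0, 39.0]
toy_r2 = r_squared(toy_observed, toy_predicted)
```

For the toy data the residual sum of squares is 18 and the total sum of squares is 500, giving an R² of about 0.964.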

  11. Prediction on the Inhibition Ratio of Pyrrolidine Derivatives on Matrix Metalloproteinase Based on Gene Expression Programming

    Directory of Open Access Journals (Sweden)

    Yuqin Li

    2014-01-01

    Quantitative structure-activity relationships (QSAR) were developed to predict the inhibition ratio of pyrrolidine derivatives on matrix metalloproteinase via the heuristic method (HM) and gene expression programming (GEP). The descriptors of 33 pyrrolidine derivatives were calculated by the software CODESSA, which can calculate quantum chemical, topological, geometrical, constitutional, and electrostatic descriptors. HM was also used for the preselection of 5 appropriate molecular descriptors. Linear and nonlinear QSAR models were developed based on HM and GEP separately, and the two prediction models led to good correlation coefficients (R²) of 0.93 and 0.94. The two QSAR models are useful in predicting the inhibition ratio of pyrrolidine derivatives on matrix metalloproteinase during the discovery of new anticancer drugs and in providing theoretical information for studying the new drugs.

  12. Hi-Fi SELEX: A High-Fidelity Digital-PCR Based Therapeutic Aptamer Discovery Platform.

    Science.gov (United States)

    Ouellet, Eric; Foley, Jonathan H; Conway, Edward M; Haynes, Charles

    2015-08-01

    Current technologies for aptamer discovery typically leverage the systematic evolution of ligands by exponential enrichment (SELEX) concept by recursively panning semi-combinatorial ssDNA or RNA libraries against a molecular target. The expectation is that this iterative selection process will be sufficiently stringent to identify a candidate pool of specific high-affinity aptamers. However, failure of this process to yield promising aptamers is common, due in part to (i) limitations in library designs, (ii) retention of non-specific aptamers during screening rounds, (iii) excessive accumulation of amplification artifacts, and (iv) the use of screening criteria (binding affinity) that do not reflect therapeutic activity. We report a new selection platform, High-Fidelity (Hi-Fi) SELEX, that introduces fixed-region blocking elements to safeguard the functional diversity of the library. The chemistry of the target-display surface and the composition of the equilibration solvent are engineered to strongly inhibit non-specific retention of aptamers. Partition efficiencies approaching 10^6 are thereby realized. Retained members are amplified in Hi-Fi SELEX by digital PCR in a manner that ensures both elimination of amplification artifacts and stoichiometric conversion of amplicons into the single-stranded library required for the next selection round. Improvements to aptamer selections are first demonstrated using human α-thrombin as the target. Three clinical targets (human factors IXa, X, and D) are then subjected to Hi-Fi SELEX. For each, rapid enrichment of ssDNA aptamers offering an order-nM mean equilibrium dissociation constant (Kd) is achieved within three selection rounds, as quantified by a new label-free qPCR assay reported here. Therapeutic candidates against factor D are identified. PMID:25727321

  13. Comparison of Sequencing Based CNV Discovery Methods Using Monozygotic Twin Quartets

    Science.gov (United States)

    Legault, Marc-André; Girard, Simon; Lemieux Perreault, Louis-Philippe; Rouleau, Guy A.; Dubé, Marie-Pierre

    2015-01-01

    Background: The advent of high throughput sequencing methods raises a considerable number of technical challenges. Among them is the discovery of copy-number variations (CNVs) using whole-genome sequencing data. CNVs are genomic structural variations defined as a variation in the number of copies of a large genomic fragment, usually more than one kilobase. Here, we aim to compare different CNV calling methods in order to assess their ability to consistently identify CNVs by comparison of the calls in 9 quartets of identical twin pairs. The use of monozygotic twins provides a means of estimating the error rate of each algorithm by observing CNVs that are inconsistently called when considering the rules of Mendelian inheritance and the assumption of an identical genome between twins. The similarity between the calls from the different tools and the advantage of combining call sets were also considered. Results: ERDS and CNVnator obtained the best performance when considering the inherited CNV rate, with a mean of 0.74 and 0.70, respectively. Venn diagrams were generated to show the agreement between the different algorithms, before and after filtering out familial inconsistencies. This filtering revealed a high number of false positives for CNVer and Breakdancer. A low overall agreement between the methods suggested a high complementarity of the different tools when calling CNVs. The breakpoint sensitivity analysis indicated that CNVnator and ERDS achieved better resolution of CNV borders than the other tools. The highest inherited CNV rate was achieved through the intersection of these two tools (81%). Conclusions: This study showed that ERDS and CNVnator provide good performance on whole genome sequencing data with respect to CNV consistency across families, CNV breakpoint resolution and CNV call specificity. The intersection of the calls from the two tools would be valuable for CNV genotyping pipelines. PMID:25812131
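
The inherited CNV rate and the intersection of call sets described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: two calls are treated as the same CNV when they share a chromosome and overlap reciprocally by at least 50% (a common convention, and our assumption here), and all function names and coordinates are hypothetical.

```python
def reciprocal_overlap(a, b, min_frac=0.5):
    """True if calls a and b, given as (chrom, start, end), overlap by at
    least min_frac of the length of each call."""
    if a[0] != b[0]:
        return False
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return False
    return (overlap >= min_frac * (a[2] - a[1])
            and overlap >= min_frac * (b[2] - b[1]))


def inherited_rate(child_calls, parent_calls, min_frac=0.5):
    """Fraction of a child's calls that match a call in either parent."""
    if not child_calls:
        return 0.0
    inherited = sum(
        any(reciprocal_overlap(c, p, min_frac) for p in parent_calls)
        for c in child_calls
    )
    return inherited / len(child_calls)


def intersect_calls(calls_a, calls_b, min_frac=0.5):
    """Calls from tool A that are also supported by a call from tool B."""
    return [a for a in calls_a
            if any(reciprocal_overlap(a, b, min_frac) for b in calls_b)]


# Hypothetical calls for one twin and the pooled calls of both parents.
twin_calls = [("chr1", 1000, 5000), ("chr1", 20000, 26000),
              ("chr2", 500, 3000), ("chr3", 100, 2100)]
parent_calls = [("chr1", 1200, 5200), ("chr2", 400, 2900),
                ("chr1", 25000, 40000)]
toy_rate = inherited_rate(twin_calls, parent_calls)
```

In the toy data two of the four twin calls have a matching parental call, so the inherited rate is 0.5. Calls without parental support are candidates for false positives, which is the logic behind using families to estimate each tool's error rate.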

  14. Discovery of Antischistosomal Drug Leads Based on Tetraazamacrocyclic Derivatives and Their Metal Complexes.

    Science.gov (United States)

    Khan, M O Faruk; Keiser, Jennifer; Amoyaw, P N A; Hossain, Mohammad F; Vargas, Mireille; Le, Justin G; Simpson, Natalie C; Roewe, Kimberly D; Freeman, TaRynn N Carder; Hasley, Travis R; Maples, Randall D; Archibald, Stephen J; Hubin, Timothy J

    2016-09-01

    Praziquantel (PZQ) is the only drug available for the treatment of schistosomiasis, and since its large-scale use might be associated with the onset of resistance, new antischistosomal drugs should be developed. A series of 26 synthetic tetraazamacrocyclic derivatives and their metal complexes were synthesized, characterized, and screened for antischistosomal activity by application of a phased screening program. The compounds were first screened against newly transformed schistosomula (NTS) of harvested Schistosoma mansoni cercariae, then against adult worms, and finally, in vivo using the mouse model of S. mansoni infection. At a concentration of 33 μM, incubation with a total of 12 compounds resulted in the mortality of NTS at the 62% to 100% level. Five of these showing 100% inhibition of viability of NTS at 10 μM were selected for further screening for determination of the 50% inhibitory concentrations (IC50s) against both NTS and adult worms. Against NTS, all 5 compounds showed IC50s comparable to the IC50 of the standard drug, PZQ (0.87 to 9.65 μM for the 5 compounds versus 2.20 μM for PZQ). Three of these, which are the bisquinoline derivative of cyclen and its Fe²⁺ and Mn²⁺ complexes, showed micromolar IC50s (1.62 μM, 1.34 μM, and 4.12 μM, respectively, versus 0.10 μM for PZQ) against adult worms. In vivo, the worm burden reductions were 12.3%, 88.4%, and 74.5%, respectively, at a single oral dose of 400 mg/kg of body weight. The Fe²⁺ complex exhibited activity in vivo comparable to that of PZQ, pointing to the discovery of a novel drug lead for schistosomiasis. PMID:27324765

  15. Patients, evidence and genes: an exploration of GPs' perspectives on gene-based personalized nutrition advice

    NARCIS (Netherlands)

    Bouwman, L.I.; Molder, te H.F.M.; Hiddink, G.J.

    2008-01-01

    Background. Nutrigenomics science examines the response of individuals to food compounds using post-genomics technology. It is expected that in the future, personalized nutrition advice can be provided based on information about genetic make-up. Objectives. Gene-based personalized nutrition advice e

  16. GOBO: gene expression-based outcome for breast cancer online.

    Directory of Open Access Journals (Sweden)

    Markus Ringnér

    Full Text Available Microarray-based gene expression analysis holds promise of improving prognostication and treatment decisions for breast cancer patients. However, the heterogeneity of breast cancer emphasizes the need for validation of prognostic gene signatures in larger sample sets stratified into relevant subgroups. Here, we describe a multifunctional user-friendly online tool, GOBO (http://co.bmc.lu.se/gobo), allowing a range of different analyses to be performed in an 1881-sample breast tumor data set, and a 51-sample breast cancer cell line set, both generated on Affymetrix U133A microarrays. GOBO supports a wide range of applications including: (1) rapid assessment of gene expression levels in subgroups of breast tumors and cell lines, (2) identification of co-expressed genes for creation of potential metagenes, (3) association with outcome for gene expression levels of single genes, sets of genes, or gene signatures in multiple subgroups of the 1881-sample breast cancer data set. The design and implementation of GOBO facilitate easy incorporation of additional query functions and applications, as well as additional data sets irrespective of tumor type and array platform.

  17. Transcriptome sequencing and annotation of the microalgae Dunaliella tertiolecta: Pathway description and gene discovery for production of next-generation biofuels

    Directory of Open Access Journals (Sweden)

    Bibby Kyle

    2011-03-01

    Full Text Available Abstract Background Biodiesel or ethanol derived from lipids or starch produced by microalgae may overcome many of the sustainability challenges previously ascribed to petroleum-based fuels and first generation plant-based biofuels. The paucity of microalgae genome sequences, however, limits gene-based biofuel feedstock optimization studies. Here we describe the sequencing and de novo transcriptome assembly for the non-model microalgae species, Dunaliella tertiolecta, and identify pathways and genes of importance related to biofuel production. Results Next generation DNA pyrosequencing technology applied to D. tertiolecta transcripts produced 1,363,336 high quality reads with an average length of 400 bases. Following quality and size trimming, ~45% of the high quality reads were assembled into 33,307 isotigs with a 31-fold coverage and 376,482 singletons. Assembled sequences and singletons were subjected to BLAST similarity searches and annotated with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology (KO) identifiers. These analyses identified the majority of lipid and starch biosynthesis and catabolism pathways in D. tertiolecta. Conclusions The construction of metabolic pathways involved in the biosynthesis and catabolism of fatty acids, triacylglycerols, and starch in D. tertiolecta as well as the assembled transcriptome provide a foundation for the molecular genetics and functional genomics required to direct metabolic engineering efforts that seek to enhance the quantity and character of microalgae-based biofuel feedstock.

  18. Discovery of biaryls as RORγ inverse agonists by using structure-based design.

    Science.gov (United States)

    Enyedy, Istvan J; Powell, Noel A; Caravella, Justin; van Vloten, Kurt; Chao, Jianhua; Banerjee, Daliya; Marcotte, Douglas; Silvian, Laura; McKenzie, Andres; Hong, Victor Sukbong; Fontenot, Jason D

    2016-05-15

    RORγ plays a critical role in controlling a pro-inflammatory gene expression program in several lymphocyte lineages including T cells, γδ T cells, and innate lymphoid cells. RORγ-mediated inflammation has been linked to susceptibility to Crohn's disease, arthritis, and psoriasis. Thus inverse agonists of RORγ have the potential of modulating inflammation. Our goal was to optimize two RORγ inverse agonists: T0901317 from literature and 1 that we obtained from internal screening. We used information from internal X-ray structures to design two libraries that led to a new biaryl series. PMID:27080181

  19. A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus

    Directory of Open Access Journals (Sweden)

    Alamar Santiago

    2009-09-01

    Full Text Available Abstract Background Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. Conclusion The new

  20. Integrative Genomics-Based Discovery of Novel Regulators of the Innate Antiviral Response.

    Directory of Open Access Journals (Sweden)

    Robin van der Lee

    2015-10-01

    Full Text Available The RIG-I-like receptor (RLR) pathway is essential for detecting cytosolic viral RNA to trigger the production of type I interferons (IFNα/β) that initiate an innate antiviral response. Through systematic assessment of a wide variety of genomics data, we discovered 10 molecular signatures of known RLR pathway components that collectively predict novel members. We demonstrate that RLR pathway genes, among others, tend to evolve rapidly, interact with viral proteins, contain a limited set of protein domains, are regulated by specific transcription factors, and form a tightly connected interaction network. Using a Bayesian approach to integrate these signatures, we propose likely novel RLR regulators. RNAi knockdown experiments revealed a high prediction accuracy, identifying 94 genes among 187 candidates tested (~50%) that affected viral RNA-induced production of IFNβ. The discovered antiviral regulators may participate in a wide range of processes that highlight the complexity of antiviral defense (e.g. MAP3K11, CDK11B, PSMA3, TRIM14, HSPA9B, CDC37, NUP98, G3BP1), and include uncharacterized factors (DDX17, C6orf58, C16orf57, PKN2, SNW1). Our validated RLR pathway list (http://rlr.cmbi.umcn.nl/), obtained using a combination of integrative genomics and experiments, is a new resource for innate antiviral immunity research.
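
    The abstract above says only that a Bayesian approach was used to integrate the signatures. One common way to do this, shown here purely as an illustrative sketch and not as the authors' method, is a naive Bayes score that adds per-signature log likelihood ratios to the log prior odds, assuming the signatures are conditionally independent. The function name, signature names, likelihood ratios, and prior odds below are all hypothetical.

```python
from math import log

def integrated_log_odds(features, likelihood_ratios, prior_odds):
    """Naive Bayes score: log prior odds plus the sum of the log
    likelihood ratios of the signatures a gene exhibits.

    Assumes signatures are conditionally independent (an assumption
    of this sketch, not something stated in the abstract).
    """
    score = log(prior_odds)
    for name in features:
        score += log(likelihood_ratios[name])
    return score

# Hypothetical likelihood ratios: P(signature | pathway gene) divided by
# P(signature | non-pathway gene). Not values from the study.
likelihood_ratios = {
    "rapid_evolution": 2.0,
    "viral_interaction": 4.0,
    "network_neighbor": 8.0,
}
prior_odds = 1 / 64  # hypothetical prior odds of pathway membership

# Hypothetical candidate genes and the signatures each one shows.
candidates = {
    "geneA": ["rapid_evolution", "viral_interaction", "network_neighbor"],
    "geneB": ["rapid_evolution"],
    "geneC": [],
}
ranked = sorted(
    candidates,
    key=lambda g: integrated_log_odds(candidates[g], likelihood_ratios, prior_odds),
    reverse=True,
)
print(ranked)
```

    Genes showing more, or more informative, signatures rise to the top of the ranking; a real analysis would estimate the likelihood ratios from known pathway components.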

  1. Toward Omics-Based, Systems Biomedicine, and Path and Drug Discovery Methodologies for Depression-Inflammation Research.

    Science.gov (United States)

    Maes, Michael; Nowak, Gabriel; Caso, Javier R; Leza, Juan Carlos; Song, Cai; Kubera, Marta; Klein, Hans; Galecki, Piotr; Noto, Cristiano; Glaab, Enrico; Balling, Rudi; Berk, Michael

    2016-07-01

    Meta-analyses confirm that depression is accompanied by signs of inflammation including increased levels of acute phase proteins, e.g., C-reactive protein, and pro-inflammatory cytokines, e.g., interleukin-6. Supporting the translational significance of this, a meta-analysis showed that anti-inflammatory drugs may have antidepressant effects. Here, we argue that inflammation and depression research needs to get onto a new track. Firstly, the choice of inflammatory biomarkers in depression research was often too selective and did not consider the broader pathways. Secondly, although mild inflammatory responses are present in depression, other immune-related pathways cannot be disregarded as new drug targets, e.g., activation of cell-mediated immunity, oxidative and nitrosative stress (O&NS) pathways, autoimmune responses, bacterial translocation, and activation of the toll-like receptor and neuroprogressive pathways. Thirdly, anti-inflammatory treatments are sometimes used without full understanding of their effects on the broader pathways underpinning depression. Since many of the activated immune-inflammatory pathways in depression actually confer protection against an overzealous inflammatory response, targeting these pathways may result in unpredictable and unwanted results. Furthermore, this paper discusses the required improvements in research strategy, i.e., path and drug discovery processes, omics-based techniques, and systems biomedicine methodologies. Firstly, novel methods should be employed to examine the intracellular networks that control and modulate the immune, O&NS and neuroprogressive pathways using omics-based assays, including genomics, transcriptomics, proteomics, metabolomics, epigenomics, immunoproteomics and metagenomics. Secondly, systems biomedicine analyses are essential to unravel the complex interactions between these cellular networks, pathways, and the multifactorial trigger factors and to delineate new drug targets in the cellular

  2. A contig-based strategy for the genome-wide discovery of microRNAs without complete genome resources.

    Directory of Open Access Journals (Sweden)

    Jun-Zhi Wen

    Full Text Available MicroRNAs (miRNAs) are important regulators of many cellular processes and exist in a wide range of eukaryotes. High-throughput sequencing is a mainstream method of miRNA identification through which it is possible to obtain the complete small RNA profile of an organism. Currently, most approaches to miRNA identification rely on a reference genome for the prediction of hairpin structures. However, many species of economic and phylogenetic importance are non-model organisms without complete genome sequences, and this limits miRNA discovery. Here, to overcome this limitation, we have developed a contig-based miRNA identification strategy. We applied this method to a triploid species of edible banana (GCTCV-119, Musa spp. AAA group) and identified 180 pre-miRNAs and 314 mature miRNAs, which is three times more than were predicted by the available dataset-based methods (represented by EST+GSS). Based on the recently published miRNA data set of Musa acuminata, the recall rate and precision of our strategy are estimated to be 70.6% and 92.2%, respectively, significantly better than those of the EST+GSS-based strategy (10.2% and 50.0%, respectively). Our novel, efficient and cost-effective strategy facilitates the study of the functional and evolutionary role of miRNAs, as well as miRNA-based molecular breeding, in non-model species of economic or evolutionary interest.
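
    For readers unfamiliar with the two metrics reported above, the following minimal sketch shows how recall and precision are conventionally computed when a predicted set is compared against a reference set. The function name and the identifiers are hypothetical and the numbers are not taken from the study.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) of a predicted set against a reference set.

    precision = true positives / number of predictions
    recall    = true positives / number of reference items
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall

# Hypothetical identifiers, for illustration only.
predicted = {"miR-a", "miR-b", "miR-c", "miR-d"}
reference = {"miR-a", "miR-b", "miR-c", "miR-e", "miR-f"}

precision, recall = precision_recall(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

    A high precision with a lower recall, as reported in the abstract, means that most predictions are correct while some reference entries are still missed.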

  3. Ethylene and fruit ripening: from illumination gas to the control of gene expression, more than a century of discoveries

    Directory of Open Access Journals (Sweden)

    Ana Lúcia Soares Chaves

    2006-01-01

    Full Text Available The effects of ethylene on plants have been recognized since the nineteenth century and it is widely known as the phytohormone responsible for fruit ripening and for its involvement in a number of plant growth and development processes. Elucidating the mechanisms involved in the ripening of climacteric fruit and the role that ethylene plays in this process have been central to fruit production and the improvement of fruit quality. The biochemistry, genetics and physiology of ripening have been extensively studied in economically important fruit crops and a considerable amount of information is available which ranges from the ethylene biosynthesis pathway to the mechanisms of perception, signaling and control of gene expression. However, there is still much to be discovered about these processes and the objective of this review is to present a brief historic account of how ethylene became the focus of fruit ripening research as well as the development and the state of the art of these studies at both biochemical and genetic levels.

  4. De Novo Deep Transcriptome Analysis of Medicinal Plants for Gene Discovery in Biosynthesis of Plant Natural Products.

    Science.gov (United States)

    Han, R; Rai, A; Nakamura, M; Suzuki, H; Takahashi, H; Yamazaki, M; Saito, K

    2016-01-01

    The study of the transcriptome, the entire pool of transcripts in an organism or in single cells at a certain physiological or pathological stage, is indispensable in unraveling the connection and regulation between DNA and protein. Before the advent of deep sequencing, microarrays were the main approach for handling transcripts. Despite obvious shortcomings, including limited dynamic range and difficulty in comparing results from distinct experiments, microarrays were widely applied. During the past decade, next-generation sequencing (NGS) has revolutionized our understanding of genomics in a fast, high-throughput, cost-effective, and tractable manner. By adopting NGS, the efficiency and outcomes of efforts to elucidate genes responsible for producing active compounds in medicinal plants were profoundly enhanced. The whole process involves several steps, from plant material sampling, to cDNA library preparation, to deep sequencing, after which bioinformatics takes over to assemble enormous yet fragmentary data from which to comb and extract information. The unprecedentedly rapid development of such technologies provides many choices to facilitate the task, which can cause confusion when choosing a suitable methodology for a specific purpose. Here, we review the general approaches for deep transcriptome analysis and then focus on their application in discovering biosynthetic pathways of medicinal plants that produce important secondary metabolites. PMID:27480681

  5. Sucrose ester based cationic liposomes as effective non-viral gene vectors for gene delivery.

    Science.gov (United States)

    Zhao, Yinan; Zhu, Jie; Zhou, Hengjun; Guo, Xin; Tian, Tian; Cui, Shaohui; Zhen, Yuhong; Zhang, Shubiao; Xu, Yuhong

    2016-09-01

    As sucrose esters (SEs) are natural and biodegradable excipients with excellent drug dissolution and drug absorption/permeation properties in controlled release systems, we incorporated SE into liposomes for gene delivery for the first time in this article. A peptide-based lipid (CDO14), a Gemini-based quaternary ammonium lipid (CTA14), and a mono-head quaternary ammonium lipid (CPA14), with SE as helper lipid, were prepared into liposomes, which could enhance the interactions between liposomes and pDNA. Most importantly, the liposomes with helper lipid SE showed higher transfection and lower cytotoxicity than those without SE in HeLa and A549 cells. It was also found that the transfection efficiency increased with the increase of SE content. The selected liposome, CDO14/SE, was able to deliver siRNA against luciferase for gene silencing in lung tumors of mice, with little in vivo toxicity. The results convincingly demonstrated that SEs could be highly desirable candidates for gene delivery systems. PMID:27232309

  6. PCR-based detection of gene transfer vectors: application to gene doping surveillance.

    Science.gov (United States)

    Perez, Irene C; Le Guiner, Caroline; Ni, Weiyi; Lyles, Jennifer; Moullier, Philippe; Snyder, Richard O

    2013-12-01

    Athletes who illicitly use drugs to enhance their athletic performance are at risk of being banned from sports competitions. Consequently, some athletes may seek new doping methods that they expect to be capable of circumventing detection. With advances in gene transfer vector design and therapeutic gene transfer, and demonstrations of safety and therapeutic benefit in humans, there is an increased probability of the pursuit of gene doping by athletes. In anticipation of the potential for gene doping, assays have been established to directly detect complementary DNA of genes that are top candidates for use in doping, as well as vector control elements. The development of molecular assays that are capable of exposing gene doping in sports can serve as a deterrent and may also identify athletes who have illicitly used gene transfer for performance enhancement. PCR-based methods to detect foreign DNA with high reliability, sensitivity, and specificity include TaqMan real-time PCR, nested PCR, and internal threshold control PCR. PMID:23912835

  7. Beyond Discovery

    DEFF Research Database (Denmark)

    Korsgaard, Steffen; Sassmannshausen, Sean Patrick

    2015-01-01

    In this chapter we explore four alternatives to the dominant discovery view of entrepreneurship; the development view, the construction view, the evolutionary view, and the Neo-Austrian view. We outline the main critique points of the discovery presented in these four alternatives, as well as their...... central concepts and conceptualization of the entrepreneurial function. On this basis we discuss three central themes that cut across the four alternatives: process, uncertainty, and agency. These themes provide new foci for entrepreneurship research and can help to generate new research questions and...

  8. Discovery of precursor and mature microRNAs and their putative gene targets using high-throughput sequencing in pineapple (Ananas comosus var. comosus).

    Science.gov (United States)

    Yusuf, Noor Hydayaty Md; Ong, Wen Dee; Redwan, Raimi Mohamed; Latip, Mariam Abd; Kumar, S Vijay

    2015-10-15

    MicroRNAs (miRNAs) are a class of small, endogenous non-coding RNAs that negatively regulate gene expression, resulting in the silencing of target mRNA transcripts through mRNA cleavage or translational inhibition. MiRNAs play significant roles in various biological and physiological processes in plants. However, the miRNA-mediated gene regulatory network in pineapple, the model tropical non-climacteric fruit, remains largely unexplored. Here, we report a complete list of pineapple mature miRNAs obtained from high-throughput small RNA sequencing and precursor miRNAs (pre-miRNAs) obtained from ESTs. Two small RNA libraries were constructed from pineapple fruits and leaves, respectively, using Illumina's Solexa technology. Sequence similarity analysis using miRBase revealed 579,179 reads homologous to 153 miRNAs from 41 miRNA families. In addition, a pineapple fruit transcriptome library consisting of approximately 30,000 EST contigs constructed using Solexa sequencing was used for the discovery of pre-miRNAs. In all, four pre-miRNAs were identified (MIR156, MIR399, MIR444 and MIR2673). Furthermore, the same pineapple transcriptome was used to dissect the function of the miRNAs in pineapple by predicting their putative targets in conjunction with their regulatory networks. In total, 23 metabolic pathways were found to be regulated by miRNAs in pineapple. The use of high-throughput sequencing in pineapples to unveil the presence of miRNAs and their regulatory pathways provides insight into the repertoire of miRNA regulation used exclusively in this non-climacteric model plant. PMID:26115767

  9. In Vitro Assessment of the Inflammatory Breast Cancer Cell Line SUM 149: Discovery of 2 Single Nucleotide Polymorphisms in the RNase L Gene

    Directory of Open Access Journals (Sweden)

    Brandon T. Nokes, Heather E. Cunliffe, Bonnie LaFleur, David W. Mount, Robert B. Livingston, Bernard W. Futscher, Julie E. Lang

    2013-01-01

    Full Text Available Background: Inflammatory breast cancer (IBC) is a rare, highly aggressive form of breast cancer. The mechanism of IBC carcinogenesis remains unknown. We sought to evaluate potential genetic risk factors for IBC and whether or not the IBC cell lines SUM149 and SUM190 demonstrated evidence of viral infection. Methods: We performed single nucleotide polymorphism (SNP) genotyping for 2 variants of the ribonuclease (RNase) L gene that have been correlated with the risk of prostate cancer due to a possible viral etiology. We evaluated dose-response to treatment with interferon-alpha (IFN-α) and assayed for evidence of the putative human mammary tumor virus (HMTV), which has been implicated in IBC, in SUM149 cells. A bioinformatic analysis was performed to evaluate expression of RNase L in IBC and non-IBC. Results: 2 of 2 IBC cell lines were homozygous for RNase L common missense variants 462 and 541, whereas 2 of 10 non-IBC cell lines were homozygous positive for the 462 variant (p = 0.09) and 0 of 10 non-IBC cell lines were homozygous positive for the 541 variant (p = 0.015). Our real-time polymerase chain reaction (RT-PCR) and Southern blot analysis for sequences of HMTV revealed no evidence of the putative viral genome. Conclusion: We discovered 2 SNPs in the RNase L gene that were homozygously present in IBC cell lines. The 462 variant was absent in non-IBC lines. Our discovery of these SNPs present in IBC cell lines suggests a possible biomarker for risk of IBC. We found no evidence of HMTV in SUM149 cells. A query of a panel of human IBC and non-IBC samples showed no difference in RNase L expression. Further studies of the RNase L 462 and 541 variants in IBC tissues are warranted to validate our in vitro findings.
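
    The abstract above does not name the statistical test behind its p-values. As a hedged illustration, the counts it reports (2 of 2 IBC lines versus 2 of 10 or 0 of 10 non-IBC lines) can be run through a Fisher's exact test, which is a common choice for small 2x2 tables; the use of this test is an assumption of the sketch, and the function name is hypothetical.

```python
from math import comb

def fisher_upper_tail(a, b, c, d):
    """One-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Returns the probability of observing at least `a` in the top-left
    cell when all row and column totals are held fixed.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    upper = min(row1, col1)  # largest possible top-left count
    tail = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(total, row1)

# Rows: IBC lines, non-IBC lines. Columns: homozygous, not homozygous.
p_462 = fisher_upper_tail(2, 0, 2, 8)   # 2 of 2 IBC vs 2 of 10 non-IBC
p_541 = fisher_upper_tail(2, 0, 0, 10)  # 2 of 2 IBC vs 0 of 10 non-IBC
print(f"462 variant: p = {p_462:.3f}")
print(f"541 variant: p = {p_541:.3f}")
```

    Under this assumption the sketch gives 6/66 and 1/66, which round to the 0.09 and 0.015 reported in the abstract.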

  10. Discovery of potential new gene variants and inflammatory cytokine associations with fibromyalgia syndrome by whole exome sequencing.

    Directory of Open Access Journals (Sweden)

    Jinong Feng

    Full Text Available Fibromyalgia syndrome (FMS) is a chronic musculoskeletal pain disorder affecting 2% to 5% of the general population. Both genetic and environmental factors may be involved. To ascertain in an unbiased manner which genes play a role in the disorder, we performed complete exome sequencing on a subset of FMS patients. Out of 150 nuclear families (trios), DNA from 19 probands was subjected to complete exome sequencing. Since >80,000 SNPs were found per proband, the data were further filtered, including analysis of those with stop codons, a rare frequency (<2.5% in the 1000 Genomes database), and presence in at least 2/19 probands sequenced. Two nonsense mutations, W32X in C11orf40 and Q100X in ZNF77, among 150 FMS trios had a significantly elevated frequency of transmission to affected probands (p = 0.026 and p = 0.032, respectively) and were present in a subset of 13% and 11% of FMS patients, respectively. Among 9 patients bearing more than one of the variants we have described, 4 had onset of symptoms between the ages of 10 and 18. The subset with the C11orf40 mutation had elevated plasma levels of the inflammatory cytokines, MCP-1 and IP-10, compared with unaffected controls or FMS patients with the wild-type allele. Similarly, patients with the ZNF77 mutation have elevated levels of the inflammatory cytokine, IL-12, compared with controls or patients with the wild-type allele. Our results strongly implicate an inflammatory basis for FMS, as well as specific cytokine dysregulation, in at least 35% of our FMS cohort.

  11. Syntenic relationships among legumes revealed using a gene-based genetic linkage map of common bean (Phaseolus vulgaris L.).

    Science.gov (United States)

    McConnell, Melody; Mamidi, Sujan; Lee, Rian; Chikara, Shireen; Rossi, Monica; Papa, Roberto; McClean, Phillip

    2010-10-01

    Molecular linkage maps are an important tool for gene discovery and cloning, crop improvement, further genetic studies, studies on diversity and evolutionary history, and cross-species comparisons. Linkage maps differ in both the type of marker and type of population used. In this study, gene-based markers were used for mapping in a recombinant inbred (RI) population of Phaseolus vulgaris L. P. vulgaris, common dry bean, is an important food source, economic product, and model organism for the legumes. Gene-based markers were developed that corresponded to genes controlling mutant phenotypes in Arabidopsis thaliana, genes undergoing selection during domestication in maize, and genes that function in a biochemical pathway in A. thaliana. Sequence information, including introns and 3' UTR, was generated for over 550 genes in the two genotypes of P. vulgaris. Over 1,800 single nucleotide polymorphisms and indels were found, 300 of which were screened in the RI population. The resulting LOD 2.0 map is 1,545 cM in length and consists of 275 gene-based and previously mapped core markers. An additional 153 markers that mapped at LOD <1.0 were placed in genetic bins. By screening the parents of other mapping populations, it was determined that the markers were useful for other common Mesoamerican × Andean mapping populations. The locations of the mapped genes relative to their homologs in Arabidopsis thaliana (At), Medicago truncatula (Mt), and Lotus japonicus (Lj) were determined by using a tblastx analysis with the current pseudochromosome builds for each of the species. While only short blocks of synteny were observed with At, large-scale macrosyntenic blocks were observed with Mt and Lj. By using Mt and Lj as bridging species, the syntenic relationship between the common bean and peanut was inferred. PMID:20607211

  12. Mining Individual Behavior Pattern Based on Semantic Knowledge Discovery of Trajectory

    OpenAIRE

    Ren, Min; Yang, Feng; Zhou, Guangchun; Wang, Haiping

    2015-01-01

    This paper attempts to mine the hidden individual behavior pattern from the raw users’ trajectory data. Based on DBSCAN, a novel spatio-temporal data clustering algorithm named Speed-based Clustering Algorithm was put forward to find slow-speed subtrajectories (i.e., stops) of the single trajectory that the user stopped for a longer time. The algorithm used maximal speed and minimal stopping time to compute the stops and introduced the quantile function to estimate the value of the parameter,...
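
    The abstract above describes stops as slow-speed subtrajectories found using a maximal speed and a minimal stopping time. The sketch below illustrates only that idea with a plain threshold scan over consecutive points; it is not the DBSCAN-based Speed-based Clustering Algorithm itself, and the function name, point layout, and thresholds are assumptions made for the example.

```python
from math import hypot

def find_stops(points, max_speed, min_duration):
    """Return (start_time, end_time) spans where the speed between
    consecutive points stays <= max_speed for at least min_duration.

    points: list of (t, x, y) tuples sorted by time, in planar units.
    """
    stops = []
    run_start = None  # time at which the current slow run began
    run_end = None    # time of the last point in the current slow run
    for (t0, x0, y0), (t1, x1, y1) in zip(points, points[1:]):
        dt = t1 - t0
        # Treat repeated timestamps as zero speed to avoid dividing by zero.
        speed = hypot(x1 - x0, y1 - y0) / dt if dt > 0 else 0.0
        if speed <= max_speed:
            if run_start is None:
                run_start = t0
            run_end = t1
        else:
            if run_start is not None and run_end - run_start >= min_duration:
                stops.append((run_start, run_end))
            run_start = None
    # Close a slow run that lasts until the end of the trajectory.
    if run_start is not None and run_end - run_start >= min_duration:
        stops.append((run_start, run_end))
    return stops

# Hypothetical track: fast, then nearly stationary from t=10 to t=40, then fast.
track = [(0, 0, 0), (10, 100, 0), (20, 101, 0), (30, 102, 0), (40, 103, 0), (50, 200, 0)]
print(find_stops(track, max_speed=1.0, min_duration=25))
```

    Raising min_duration above the 30 time units of the slow run in this toy track makes the stop disappear, which shows how the two parameters trade off.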

  13. Opposition-Based Discrete PSO Using Natural Encoding for Classification Rule Discovery

    OpenAIRE

    Naveed Kazim Khan; Abdul Rauf Baig; Muhammad Amjad Iqbal

    2012-01-01

    In this paper we present a new Discrete Particle Swarm Optimization approach to induce rules from discrete data. The proposed algorithm, called Opposition-based Natural Discrete PSO (ONDPSO), initializes its population by taking into account the discrete nature of the data. Particles are encoded using a Natural Encoding scheme. Each member of the population updates its position iteratively on the basis of a newly designed position update rule. Opposition-based learning is implemented in the ...

  14. Knowledge acquisition and discovery for the textual case-based cooking system WIKITAAABLE

    OpenAIRE

    Badra, Fadi; Cojan, Julien; Cordier, Amélie; Lieber, Jean; Meilender, Thomas; Mille, Alain; Molli, Pascal; Nauer, Emmanuel; Napoli, Amedeo; Skaf-Molli, Hala; Toussaint, Yannick

    2009-01-01

    The textual case-based cooking system WIKITAAABLE participates in the second Computer Cooking Contest (CCC). It is an extension of the TAAABLE system, which participated in the first CCC. WIKITAAABLE's architecture is composed of a semantic wiki used for the collaborative acquisition of knowledge (recipe, ontology, adaptation knowledge) and of a case-based inference engine using this knowledge for retrieving and adapting recipes. This architecture allows various mod...

  15. Discovery of single-nucleotide mutations in genes related to rice starch synthesis and herbicide resistance by using self-made CEL I extracts

    International Nuclear Information System (INIS)

    The discovery of CEL I, a specific nuclease isolated from celery, has made the detection of point mutations easy and robust, and the enzyme is now essential in TILLING. However, large amounts of CEL I are consumed in TILLING and its extraction process is time-consuming. Furthermore, the high cost of both isolating CEL I and using commercial CEL I kits is a burden for scientists in developing countries. Herein we present a rapid method for detecting single-nucleotide mutations in rice genes using self-made CEL I extracts. Tests of the mismatch cleavage activity of CEL I extracts at different extraction steps showed that extracts after clarification and dialysis are sufficiently enriched in mismatch cleavage activity. By optimizing factors related to mismatch cleavage activity, we found that our self-made CEL I extract cleaved mismatches as well as commercial CEL I, and we established a feasible and effective method for detecting point mutations. Understanding and manipulating genetic variation is paramount to elucidating gene function, identifying genes, breeding, and conserving natural diversity. The general applicability of CEL I gives it great potential for detecting and understanding genetic variation in rice. Using this mutation detection method, we found single-nucleotide mutations in several rice genes related to starch synthesis or herbicide resistance, namely waxy, SSIIa (starch synthase IIa) and als (acetolactate synthase). Single-base variations (T/G or A/G) were detected in both the first intron of the waxy gene and the 8th exon of the SSIIa gene. For the als gene, we found single-nucleotide mutations at positions of about 700 bp and 400 bp in the 1.5 kb fragment amplified from different varieties and M2 plants, respectively. (author)

  16. Analysis of expressed sequence tags from Actinidia: applications of a cross species EST database for gene discovery in the areas of flavor, health, color and ripening

    Directory of Open Access Journals (Sweden)

    Richardson Annette C

    2008-07-01

    Full Text Available Abstract Background Kiwifruit (Actinidia spp.) are a relatively new, but economically important crop grown in many different parts of the world. Commercial success is driven by the development of new cultivars with novel consumer traits including flavor, appearance, healthful components and convenience. To increase our understanding of the genetic diversity and gene-based control of these key traits in Actinidia, we have produced a collection of 132,577 expressed sequence tags (ESTs). Results The ESTs were derived mainly from four Actinidia species (A. chinensis, A. deliciosa, A. arguta and A. eriantha) and fell into 41,858 non redundant clusters (18,070 tentative consensus sequences and 23,788 EST singletons). Analysis of flavor and fragrance-related gene families (acyltransferases and carboxylesterases) and pathways (terpenoid biosynthesis) is presented in comparison with a chemical analysis of the compounds present in Actinidia including esters, acids, alcohols and terpenes. ESTs are identified for most genes in color pathways controlling chlorophyll degradation and carotenoid biosynthesis. In the health area, data are presented on the ESTs involved in ascorbic acid and quinic acid biosynthesis showing not only that genes for many of the steps in these pathways are represented in the database, but that genes encoding some critical steps are absent. In the convenience area, genes related to different stages of fruit softening are identified. Conclusion This large EST resource will allow researchers to undertake the tremendous challenge of understanding the molecular basis of genetic diversity in the Actinidia genus as well as provide an EST resource for comparative fruit genomics. The various bioinformatics analyses we have undertaken demonstrate the extent of coverage of ESTs for genes encoding different biochemical pathways in Actinidia.

  17. An Efficient Grid Service Discovery Mechanism Based on the Locality Principle

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    With the explosion of services in grid environments, it is necessary to develop a mechanism that can discover suitable grid services efficiently. This paper attempts to establish a layered resource management model based on the locality principle, which classifies services into different domains and virtual organizations (VOs) according to their shared purposes. We propose an ontology-based search method that applies ontology theory to characterize semantic information. In addition, we extend UDDI for querying, storing, and so on. Simulation experiments have shown that our mechanism achieves higher performance in precision, recall and query response time.

  18. Cytochrome P450-based cancer gene therapy: current status.

    Science.gov (United States)

    Kan, On; Kingsman, Susan; Naylor, Stuart

    2002-12-01

    Results from a number of preclinical studies have demonstrated that a P450-based gene-directed enzyme prodrug therapy (GDEPT) strategy for the treatment of cancer is both safe and efficacious. This strategy has now moved forward into the clinic. At least two different approaches using different delivery methods (retroviral vector MetXia [Oxford BioMedica] and encapsulated P450 expressing cells), different cytochrome P450 isoforms (human CYP2B6 versus rat CYP2B1) and different prodrugs (cyclophosphamide [CPA] versus ifosfamide [IFA]) have concluded Phase I/II clinical trials with encouraging results. In the future, P450-based GDEPT can potentially be further enhanced by improved vectors for P450 gene delivery and disease-targeted promoters for focused gene expression at the target site. In addition, there is scope for developing synthetic P450s and their respective prodrugs to improve both enzyme kinetics and the profile of the active moiety. PMID:12517265

  19. Identification of Gene Modules Associated with Drought Response in Rice by Network-Based Analysis

    OpenAIRE

    Lida Zhang; Shunwu Yu; Kaijing Zuo; Lijun Luo; Kexuan Tang

    2012-01-01

    Understanding the molecular mechanisms that underlie plant responses to drought stress is challenging due to the complex interplay of numerous different genes. Here, we used network-based gene clustering to uncover the relationships between drought-responsive genes from large microarray datasets. We identified 2,607 rice genes that showed significant changes in gene expression under drought stress; 1,392 genes were highly intercorrelated to form 15 gene modules. These drought-responsive gene ...

  20. Gene expression-based classifications of fibroadenomas and phyllodes tumours of the breast.

    Science.gov (United States)

    Vidal, Maria; Peg, Vicente; Galván, Patricia; Tres, Alejandro; Cortés, Javier; Ramón y Cajal, Santiago; Rubio, Isabel T; Prat, Aleix

    2015-06-01

    Fibroepithelial tumors (FTs) of the breast are a heterogeneous group of lesions ranging from fibroadenomas (FAD) to phyllodes tumors (PT) (benign, borderline, malignant). Further understanding of their molecular features and classification might be of clinical value. In this study, we analysed the expression of 105 breast cancer-related genes, including the 50 genes of the PAM50 intrinsic subtype predictor and 12 genes of the Claudin-low subtype predictor, in a panel of 75 FTs (34 FADs, 5 juvenile FADs, 20 benign PTs, 5 borderline PTs and 11 malignant PTs) with clinical follow-up. In addition, we compared the expression profiles of FTs with those of 14 normal breast tissues and 49 primary invasive ductal carcinomas (IDCs). Our results revealed that the levels of expression of all breast cancer-related genes can discriminate the various groups of FTs, together with normal breast tissues and IDCs (False Discovery Rate genes (e.g. CCNB1 and MKI67) and mesenchymal/epithelial-related (e.g. CLDN3 and EPCAM) genes were found to be most discriminative. As expected, FADs showed the highest and lowest expression of epithelial- and proliferation-related genes, respectively, whereas malignant PTs showed the opposite expression pattern. Interestingly, the overall profile of benign PTs was found more similar to FADs and normal breast tissues than the rest of tumours, including juvenile FADs. Within the dataset of IDCs and normal breast tissues, the vast majority of FADs, juvenile FADs, benign PTs and borderline PTs were identified as Normal-like by intrinsic breast cancer subtyping, whereas 7 (63.6%) and 3 (27.3%) malignant PTs were identified as Claudin-low and Basal-like, respectively. Finally, we observed that the previously described PAM50 risk of relapse prognostic score better predicted outcome in FTs than the morphological classification, even within PTs-only. Our results suggest that classification of FTs using gene expression-based data is feasible and might provide

  1. Discovery and characterization of the first non-coding RNA that regulates gene expression, micF RNA: A historical perspective

    Institute of Scientific and Technical Information of China (English)

    Nicholas Delihas

    2015-01-01

    The first evidence that RNA can function as a regulator of gene expression came from experiments with prokaryotes in the 1980s. It was shown that Escherichia coli micF is an independent gene, has its own promoter, and encodes a small non-coding RNA that base pairs with and inhibits translation of a target messenger RNA in response to environmental stress conditions. The micF RNA was isolated, sequenced and shown to be a primary transcript. In vitro experiments showed binding to the target ompF mRNA. Secondary structure probing revealed an imperfect micF RNA/ompF RNA duplex interaction and the presence of a non-canonical base pair. Several transcription factors, including OmpR, regulate micF transcription in response to environmental factors. micF has also been found in other bacterial species; however, recently Gerhart Wagner and Jörg Vogel showed pleiotropic effects and found that micF inhibits expression of multiple target mRNAs; importantly, one is the global regulatory gene lrp. In addition, micF RNA was found to interact with its targets in different ways; it either inhibits ribosome binding or induces degradation of the message. Thus the concept and initial experimental evidence that RNA can regulate gene expression was born with prokaryotes.

  2. Representation Discovery using Harmonic Analysis

    CERN Document Server

    Mahadevan, Sridhar

    2008-01-01

    Representations are at the heart of artificial intelligence (AI). This book is devoted to the problem of representation discovery: how can an intelligent system construct representations from its experience? Representation discovery re-parameterizes the state space - prior to the application of information retrieval, machine learning, or optimization techniques - facilitating later inference processes by constructing new task-specific bases adapted to the state space geometry. This book presents a general approach to representation discovery using the framework of harmonic analysis, in particu

  3. Detection of Gene Interactions Based on Syntactic Relations

    Directory of Open Access Journals (Sweden)

    Mi-Young Kim

    2008-01-01

    Full Text Available Interactions between proteins and genes are considered essential in the description of biomolecular phenomena, and networks of interactions are applied in a systems biology approach. Recently, many studies have sought to extract information from biomolecular text using natural language processing technology. Previous studies have asserted that linguistic information is useful for improving the detection of gene interactions. In particular, syntactic relations among linguistic information are good for detecting gene interactions. However, previous systems give reasonably good precision but poor recall. To improve recall without sacrificing precision, this paper proposes a three-phase method for detecting gene interactions based on syntactic relations. In the first phase, we retrieve syntactic encapsulation categories for each candidate agent and target. In the second phase, we construct a verb list that indicates the nature of the interaction between pairs of genes. In the last phase, we determine direction rules to detect which of two genes is the agent or target. Even without biomolecular knowledge, our method performs reasonably well using a small training dataset. While the first phase contributes to improving recall, the second and third phases contribute to improving precision. In the experimental results using ICML 05 Workshop on Learning Language in Logic (LLL05) data, our proposed method gave an F-measure of 67.2% for the test data, significantly outperforming previous methods. We also describe the contribution of each phase to the performance.
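    The abstract above reports precision, recall and an F-measure without spelling out how they relate. The following is a minimal sketch, assuming the balanced F-measure (the harmonic mean of precision and recall) and treating each interaction as a directed (agent, target) pair. The function name and the gene pairs are hypothetical and serve only as illustration; they are not taken from the LLL05 data.

```python
def precision_recall_f(predicted, gold):
    """Return (precision, recall, F-measure) for two collections of items."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical directed (agent, target) pairs, for illustration only.
gold = {("geneA", "geneB"), ("geneA", "geneC"),
        ("geneD", "geneB"), ("geneE", "geneF")}
predicted = {("geneA", "geneB"), ("geneD", "geneB"), ("geneB", "geneA")}

p, r, f = precision_recall_f(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

    Because the pairs are directed, the prediction ("geneB", "geneA") counts as an error even though the reverse pair is in the gold set, which is why a direction-detection step can affect precision.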

  4. Opposition-Based Discrete PSO Using Natural Encoding for Classification Rule Discovery

    Directory of Open Access Journals (Sweden)

    Naveed Kazim Khan

    2012-11-01

    Full Text Available In this paper we present a new Discrete Particle Swarm Optimization approach to induce rules from discrete data. The proposed algorithm, called Opposition-based Natural Discrete PSO (ONDPSO), initializes its population by taking into account the discrete nature of the data. Particles are encoded using a Natural Encoding scheme. Each member of the population updates its position iteratively on the basis of a newly designed position update rule. Opposition-based learning is implemented in the optimization process. The encoding scheme and position update rule used by the algorithm allow individual terms corresponding to different attributes within the rule's antecedent to be a disjunction of the values of those attributes. The performance of the proposed algorithm is evaluated against seven different datasets using a tenfold testing scheme. The achieved median accuracy is compared against various evolutionary and non-evolutionary classification techniques. The algorithm produces promising results by creating highly accurate and precise rules for each dataset.

  5. Structure-based discovery of inhibitors of the YycG histidine kinase

    DEFF Research Database (Denmark)

    Qin, X.; Zhang, J.; Xu, B.;

    2006-01-01

    inhibitors of YycG histidine kinase thus are of potential value as leads for developing new antibiotics against infecting staphylococci. The structure-based virtual screening (SBVS) technology can be widely used in screening potential inhibitors of other bacterial TCSs, since it is more rapid and efficacious...... resistance to many conventional antibiotics and often results in chronic infection. It has an urgent need to design novel antibiotics against staphylococci infections, especially those can kill cells embedded in biofilm. RESULTS: In this report, a series of novel inhibitors of the histidine kinase (HK) Yyc......G protein of S. epidermidis were discovered first using structure-based virtual screening (SBVS) from a small molecular lead-compound library, followed by experimental validation. Of the 76 candidates derived by SBVS targeting of the homolog model of the YycG HATPase_c domain of S. epidermidis, seven...

  6. Understanding Idiopathic Interstitial Pneumonia: A Gene-Based Review of Stressed Lungs

    OpenAIRE

    van Moorsel, Coline H. M.; Thijs W. Hoffman; van Batenburg, Aernoud A.; Dymph Klay; van der Vis, Joanne J.; Grutters, Jan C.

    2015-01-01

    Pulmonary fibrosis is the main cause of severe morbidity and mortality in idiopathic interstitial pneumonias (IIP). In the past years, there has been major progress in the discovery of genetic factors that contribute to disease. Genes with highly penetrant mutations or strongly predisposing common risk alleles have been identified in familial and sporadic IIP. This review summarizes genes harbouring causative rare mutations and replicated common predisposing alleles. To date, rare mutations i...

  7. Systematically characterizing and prioritizing chemosensitivity related gene based on Gene Ontology and protein interaction network

    Directory of Open Access Journals (Sweden)

    Chen Xin

    2012-10-01

    Full Text Available Abstract Background The identification of genes that predict in vitro cellular chemosensitivity of cancer cells is of great importance. Chemosensitivity related genes (CRGs) have been widely utilized to guide clinical and cancer chemotherapy decisions. In addition, CRGs potentially share functional characteristics and network features in protein interaction networks (PPIN). Methods In this study, we proposed a method to identify CRGs based on Gene Ontology (GO) and PPIN. Firstly, we documented 150 pairs of drug-CCRG (curated chemosensitivity related gene) from 492 published papers. Secondly, we characterized CCRGs from the perspective of GO and PPIN. Thirdly, we prioritized CRGs based on CCRGs' GO and network characteristics. Lastly, we evaluated the performance of the proposed method. Results We found that CCRG enriched GO terms were most often related to chemosensitivity and exhibited higher similarity scores compared to randomly selected genes. Moreover, CCRGs played key roles in maintaining the connectivity and controlling the information flow of PPINs. We then prioritized CRGs using CCRG enriched GO terms and CCRG network characteristics in order to obtain a database of predicted drug-CRGs that included 53 CRGs, 32 of which have been reported to affect susceptibility to drugs. Our proposed method identifies a greater number of drug-CCRGs, and drug-CCRGs are much more significantly enriched in predicted drug-CRGs, compared to a method based on the correlation of gene expression and drug activity. The mean area under ROC curve (AUC) for our method is 65.2%, whereas that for the traditional method is 55.2%. Conclusions Our method not only identifies CRGs with expression patterns strongly correlated with drug activity, but also identifies CRGs in which expression is weakly correlated with drug activity. This study provides the framework for the identification of signatures that predict in vitro cellular chemosensitivity and offers a valuable

  8. Pharmacophore-based discovery of FXR agonists. Part I: Model development and experimental validation

    OpenAIRE

    Schuster, Daniela; Markt, Patrick; Grienke, Ulrike; Mihaly-Bison, Judit; Binder, Markus; Noha, Stefan M.; Rollinger, Judith M.; Stuppner, Hermann; Bochkov, Valery N.; Wolber, Gerhard

    2011-01-01

    The farnesoid X receptor (FXR) is involved in glucose and lipid metabolism regulation, which makes it an attractive target for the metabolic syndrome, dyslipidemia, atherosclerosis, and type 2 diabetes. In order to find novel FXR agonists, a structure-based pharmacophore model collection was developed and theoretically evaluated against virtual databases including the ChEMBL database. The most suitable models were used to screen the National Cancer Institute (NCI) database. Biological evaluat...

  9. Structure-Based DNA-Targeting Strategies with Small Molecule Ligands for Drug Discovery

    OpenAIRE

    Sheng, Jia; Gan, Jianhua; Huang, Zhen

    2013-01-01

    Nucleic acids are the molecular targets of many clinical anticancer drugs. However, compared with proteins, nucleic acids have traditionally attracted much less attention as drug targets in structure-based drug design, partially because limited structural information of nucleic acids complexed with potential drugs is available. Over the past several years, enormous progress in nucleic acid crystallization, heavy-atom derivatization, phasing, and structural biology has been made. Many compl...

  10. An Application Layer Framework for Location-based Service Discovery and Provisioning for Mobile Devices

    OpenAIRE

    Gopinath, Sunil

    2001-01-01

    There has been a tremendous rise in the use of Wireless Application Protocol (WAP) services for cellular telephones. Such services include electronic mail, printing, fax delivery, and weather reports. But, current services are limited both in type and nature. Today, mobile telephone users need access to more dynamic, location-based, distributed services that include both hardware resources, like printers and computers, and software services, like application software. Problems due to mobil...

  11. Mass Spectrometry-based Proteomics and Peptidomics for Systems Biology and Biomarker Discovery

    OpenAIRE

    Cunningham, Robert; Ma, Di; Li, Lingjun

    2012-01-01

    The scientific community has shown great interest in the field of mass spectrometry-based proteomics and peptidomics for its applications in biology. Proteomics technologies have evolved to produce large datasets of proteins or peptides involved in various biological and disease progression processes, producing testable hypotheses for complex biological questions. This review provides an introduction and insight to relevant topics in proteomics and peptidomics including biological material sel...

  12. From Raw Data to Biological Discoveries: A Computational Analysis Pipeline for Mass Spectrometry-Based Proteomics

    Science.gov (United States)

    Lavallée-Adam, Mathieu; Park, Sung Kyu Robin; Martínez-Bartolomé, Salvador; He, Lin; Yates, John R.

    2015-11-01

    In the last two decades, computational tools for mass spectrometry-based proteomics data analysis have evolved from a few stand-alone software solutions serving specific goals, such as the identification of amino acid sequences based on mass spectrometry spectra, to large-scale complex pipelines integrating multiple computer programs to solve a collection of problems. This software evolution has been mostly driven by the appearance of novel technologies that allowed the community to tackle complex biological problems, such as the identification of proteins that are differentially expressed in two samples under different conditions. The achievement of such objectives requires a large suite of programs to analyze the intricate mass spectrometry data. Our laboratory addresses complex proteomics questions by producing and using algorithms and software packages. Our current computational pipeline includes, among other things, tools for mass spectrometry raw data processing, peptide and protein identification and quantification, post-translational modification analysis, and protein functional enrichment analysis. In this paper, we describe a suite of software packages we have developed to process mass spectrometry-based proteomics data and we highlight some of the new features of previously published programs as well as tools currently under development.

  13. Discovery of potent nitrotriazole-based antitrypanosomal agents: In vitro and in vivo evaluation.

    Science.gov (United States)

    Papadopoulou, Maria V; Bloomer, William D; Rosenzweig, Howard S; O'Shea, Ivan P; Wilkinson, Shane R; Kaiser, Marcel; Chatelain, Eric; Ioset, Jean-Robert

    2015-10-01

    3-Nitro-1H-1,2,4-triazole- and 2-nitro-1H-imidazole-based amides with an aryloxy-phenyl core were synthesized and evaluated as antitrypanosomal agents. All 3-nitrotriazole-based derivatives were extremely potent anti-Trypanosoma cruzi agents at sub nM concentrations and exhibited a high degree of selectivity for the parasite. The 2-nitroimidazole analogs were only moderately active against T. cruzi amastigotes and exhibited low selectivity. Both types of compound were active against Leishmania donovani axenic amastigotes with excellent selectivity for the parasite, whereas three 2-nitroimidazole-based analogs were also moderately active against infected macrophages. However, no compound demonstrated selective activity against Trypanosoma brucei rhodesiense. The most potent in vitro anti-T. cruzi compounds were tested in an acute murine model and reduced the parasites to an undetectable level after five days of treatment at 13 mg/kg/day. Such compounds are potential inhibitors of T. cruzi CYP51 and, being excellent substrates for the type I nitroreductase (NTR) which is specific to trypanosomatids, work as prodrugs and constitute a new generation of effective and more affordable antitrypanosomal agents. PMID:26344593

  14. Discovery and molecular mapping of a new gene conferring resistance to stem rust, Sr53, derived from Aegilops geniculata and characterization of spontaneous translocation stocks with reduced alien chromatin

    Science.gov (United States)

    This study reports the discovery and molecular mapping of a resistance gene effective against stem rust races RKQQC and TTKSK (Ug99) derived from Aegilops geniculata (2n=4x=28, UgUgMgMg). Two populations from the crosses TA5599 (T5DL-5MgL.5MgS)/TA3809 (ph1b mutant in Chinese Spring background) and T...

  15. Design and Implementation of Visual Dynamic Display Software of Gene Expression Based on GTK

    Institute of Scientific and Technical Information of China (English)

    JIANG Wei; MENG Fanjiang; LI Yong; YU Xiao

    2009-01-01

    The paper presents an implementation method for dynamic gene expression display software based on GTK. The method establishes a dynamic presentation system for gene expression from gene chip hybridization data collected at different times, adopting a linear combination model and the Pearson correlation coefficient algorithm. The system describes in graphic form the changes in gene expression over time, the changes in the characteristics of gene expression, and the changes in expression relations and regulatory relationships among genes. The system also provides an integrated platform for the analysis of gene chip data, especially for research on gene regulation networks.
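    The abstract above names the Pearson correlation coefficient as one of the algorithms applied to expression measurements taken at different times. The following is a minimal sketch of that coefficient for two expression time series. The function name and the expression values are hypothetical, and the paper's linear combination model and GTK display code are not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expression levels of three genes at four time points.
gene_1 = [1.0, 2.0, 3.0, 4.0]
gene_2 = [2.0, 4.0, 6.0, 8.0]   # rises together with gene_1
gene_3 = [4.0, 3.0, 2.0, 1.0]   # falls as gene_1 rises

r_12 = pearson(gene_1, gene_2)
r_13 = pearson(gene_1, gene_3)
print(r_12, r_13)
```

    A coefficient near +1 suggests two genes change together over time and a coefficient near -1 suggests opposite trends; either can be a starting point for proposing a regulatory relationship, though correlation alone does not establish one.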

  16. The Development of Learning Devices Based Guided Discovery Model to Improve Understanding Concept and Critical Thinking Mathematically Ability of Students at Islamic Junior High School of Medan

    Science.gov (United States)

    Yuliani, Kiki; Saragih, Sahat

    2015-01-01

    The purpose of this research was to: 1) develop learning devices based on a guided discovery model to improve the conceptual understanding and mathematical critical thinking ability of students at Islamic Junior High School; 2) describe the improvement in conceptual understanding and mathematical critical thinking ability of students at MTs by using…

  17. Fragment-hopping-based discovery of a novel chemical series of proto-oncogene PIM-1 kinase inhibitors.

    Directory of Open Access Journals (Sweden)

    Gustavo Saluste

    Full Text Available A new chemical series, triazolo[4,5-b]pyridines, has been identified as an inhibitor of PIM-1 by a chemotype hopping strategy based on a chemically feasible fragment database. In this case, structure-based virtual screening and in silico chemogenomics provide added value to the previously reported strategy of prioritizing among proposed novel scaffolds. Pairwise comparison between compound 3, recently discontinued from Phase I clinical trials, and molecule 8, bearing the selected novel scaffold, shows that the primary activities are similar (IC50 in the 20 to 150 nM range). At the same time, some ADME properties (for example, an increase of more than 45% in metabolic stability in human liver microsomes) and the off-target selectivity (for example, an increase of more than 2 log units in IC50 vs. FLT3) are improved, and the intellectual property (IP) position is enhanced. The discovery of a reliable starting point that fulfills critical criteria for a plausible medicinal chemistry project is demonstrated in this prospective study.

  18. Multiclass cancer classification based on gene expression comparison

    OpenAIRE

    Yang Sitan; Naiman Daniel Q.

    2014-01-01

    As the complexity and heterogeneity of cancer is being increasingly appreciated through genomic analyses, microarray-based cancer classification comprising multiple discriminatory molecular markers is an emerging trend. Such multiclass classification problems pose new methodological and computational challenges for developing novel and effective statistical approaches. In this paper, we introduce a new approach for classifying multiple disease states associated with cancer based on gene expre...

  19. Abiotic Methane in Land-Based Serpentinized Peridotites: New Discoveries and Isotope Surprises

    Science.gov (United States)

    Whiticar, M. J.; Etiope, G.

    2014-12-01

    Until 2008, abiotic methane in land-based serpentinized ultramafic rocks was documented (including gas and C- and H- isotope compositions) only at sites in Oman, Philippines, New Zealand and Turkey. Methane emanates from seeps and/or hyperalkaline water springs along faults and is associated with molecular hydrogen. These were considered to be very unusual and rare occurrences of gas. Now, methane is documented for peridotite-based springs or seeps (in ophiolites, orogenic massifs or intrusions) in US, Canada, Costa Rica, Greece, Italy, Japan, New Caledonia, Portugal, Spain and United Arab Emirates. Gas flux measurements indicate that methane can also flux as invisible microseepages from the ground, through fractured peridotites, even far removed from seeps and springs. Methane C-isotope ratios range from -6 to -37 permil (VPDB) for dominantly abiotic methane. The more 13-C depleted values, e.g., California, are likely mixed with biotic gas (microbial and thermogenic gas). Methane H-isotope ratios cover a wide range from -118 to -333 permil (VSMOW). The combination of C- and H-isotopes clearly distinguishes biotic from abiotic methane. Radiocarbon (14-C) analysis from bubbling seeps in Italian peridotites indicate that the methane is fossil (pMC old. The low temperatures of land-based peridotites (generally <100 °C at depths of 3-4 km) also constrain methane production. We discuss some hypotheses concerning gas generation in water vs. dry systems (i.e., in deeper, older waters or in unsaturated rocks). We also discuss low vs. high temperatures, i.e., at the present-day low T conditions or at higher temperatures eventually occurring in the early stages of peridotite emplacement on land.

  20. Discovery of Potent VEGFR-2 Inhibitors based on Furopyrimidine and Thienopyrimidine Scaffolds as Cancer Targeting Agents

    Science.gov (United States)

    Aziz, Marwa A.; Serya, Rabah A. T.; Lasheen, Deena S.; Abdel-Aziz, Amal Kamal; Esmat, Ahmed; Mansour, Ahmed M.; Singab, Abdel Nasser B.; Abouzid, Khaled A. M.

    2016-04-01

    Vascular endothelial growth factor receptor-2 (VEGFR-2) plays a crucial role in cancer angiogenesis. In this study, a series of novel furo[2,3-d]pyrimidine and thieno[2,3-d]pyrimidine based-derivatives were designed and synthesized as VEGFR-2 inhibitors, in accordance with the structure activity relationship (SAR) studies of known type II VEGFR-2 inhibitors. The synthesized compounds were evaluated for their ability to inhibit the VEGFR-2 kinase enzyme in vitro. Seven compounds (15b, 16c, 16e, 21a, 21b, 21c and 21e) demonstrated highly potent dose-related VEGFR-2 inhibition with IC50 values in the nanomolar range, of which the thieno[2,3-d]pyrimidine based-derivatives (21b, 21c and 21e) exhibited IC50 values of 33.4, 47.0 and 21 nM respectively. Moreover, the furo[2,3-d]pyrimidine-based derivative (15b) showed the strongest inhibition of human umbilical vein endothelial cells (HUVEC) proliferation with 99.5% inhibition at 10 μM concentration. Consistent with our in vitro findings, compounds (21b and 21e) orally administered at 5 and 10 mg/kg/day for 8 consecutive days demonstrated potent anticancer activity in the Ehrlich ascites carcinoma (EAC) solid tumor murine model. Such compounds blunted angiogenesis in EAC as evidenced by reduced percent microvessel via decreasing VEGFR-2 phosphorylation with subsequent induction of apoptotic machinery. Furthermore, Miles vascular permeability assay confirmed their antiangiogenic effects in vivo. Intriguingly, such compounds showed no obvious toxicity.

  1. Knowledge Discovery of Hydrocyclone's Circuit Based on SONFIS and SORST

    CERN Document Server

    Ghaffari, H O; Irannajad, M

    2009-01-01

    This study describes the application of some approximate reasoning methods to the analysis of hydrocyclone performance. By combining a Self Organizing Map (SOM) with a Neuro-Fuzzy Inference System (NFIS), giving SONFIS, and with Rough Set Theory (RST), giving SORST, crisp and fuzzy granules are obtained. Balancing of crisp granules and non-crisp granules can be implemented in close-open iteration. Using different criteria and based on the granulation level, a balance point (interval) or a pseudo-balance point is estimated. The proposed methods are validated on the hydrocyclone data set.

  2. Exposing USGS sample collections for broader discovery and access: collaboration between ScienceBase, IEDA:SESAR, and Paleobiology Database

    Science.gov (United States)

    Hsu, L.; Bristol, S.; Lehnert, K. A.; Arko, R. A.; Peters, S. E.; Uhen, M. D.; Song, L.

    2014-12-01

    The U.S. Geological Survey (USGS) is an exemplar of the need for improved cyberinfrastructure for its vast holdings of invaluable physical geoscience data. Millions of discrete paleobiological and geological specimens lie in USGS warehouses and at the Smithsonian Institution. These specimens serve as the basis for many geologic maps and geochemical databases, and are a potential treasure trove of new scientific knowledge. The extent of this treasure is virtually unknown and inaccessible outside a small group of paleogeoscientists and geochemists. A team from the USGS, the Integrated Earth Data Applications (IEDA) facility, and the Paleobiology Database (PBDB) is working to expose information on paleontological and geochemical specimens for discovery by scientists and citizens. This project uses the existing infrastructure of the System for Earth Sample Registration (SESAR) and PBDB, which already contains many of the fundamental data schemas that are necessary to accommodate USGS records. The project is also developing a new Linked Data interface for the USGS National Geochemical Database (NGDB). The International Geo Sample Number (IGSN) is the identifier that links samples between all systems. For paleontological specimens, SESAR and PBDB will be the primary repositories for USGS records, with a data syncing process to archive records within the USGS ScienceBase system. The process began with mapping the metadata fields necessary for USGS collections to the existing SESAR and PBDB data structures, while aligning them with the Observations & Measurements and Darwin Core standards. New functionality needed in SESAR included links to a USGS locality registry, fossil classifications, a spatial qualifier attribution for samples with sensitive locations, and acknowledgement of data and metadata licensing. The team is developing a harvesting mechanism to periodically transfer USGS records from within PBDB and SESAR to ScienceBase. For the NGDB, the samples are being

  3. Prediction and discovery of new geothermal resources in the Great Basin: Multiple evidence of a large undiscovered resource base

    Science.gov (United States)

    Coolbaugh, M.F.; Raines, G.L.; Zehner, R.E.; Shevenell, L.; Williams, C.F.

    2006-01-01

    Geothermal potential maps by themselves cannot directly be used to estimate undiscovered resources. To address the undiscovered resource base in the Great Basin, a new and relatively quantitative methodology is presented. The methodology involves three steps, the first being the construction of a data-driven probabilistic model of the location of known geothermal systems using weights of evidence. The second step is the construction of a degree-of-exploration model. This degree-of-exploration model uses expert judgment in a fuzzy logic context to estimate how well each spot in the state has been explored, using as constraints digital maps of the depth to the water table, presence of the carbonate aquifer, and the location, depth, and type of drill-holes. Finally, the exploration model and the data-driven occurrence model are combined together quantitatively using area-weighted modifications to the weights-of-evidence equations. Using this methodology in the state of Nevada, the number of undiscovered geothermal systems with reservoir temperatures ≥100 °C is estimated at 157, which is 3.2 times greater than the 69 known systems. Currently, nine of the 69 known systems are producing electricity. If it is conservatively assumed that an additional nine for a total of 18 of the known systems will eventually produce electricity, then the model predicts 59 known and undiscovered geothermal systems are capable of producing electricity under current economic conditions in the state, a figure that is more than six times higher than the current number. Many additional geothermal systems could potentially become economic under improved economic conditions or with improved methods of reservoir stimulation (Enhanced Geothermal Systems). This large predicted geothermal resource base appears corroborated by recent grass-roots geothermal discoveries in the state of Nevada. At least two and possibly three newly recognized geothermal systems with estimated reservoir temperatures

  4. Discovery of Opinion Leader Community Via Multilayer Structure based Time-dividing Approach

    Directory of Open Access Journals (Sweden)

    Yan Liu

    2013-07-01

    Full Text Available With the advent of Web 3.0, social networks have become an important way to disclose and spread public sentiment. Opinion leaders play an important role in steering public opinion. In this paper, exploiting the community structure of the network, we extract communities from the replies to each post in a BBS and propose an opinion leader community mining method based on a level structure. With this method the communities overlap better and can therefore have more relations. After obtaining the structure of the opinion leader communities, we analyze their evolution: we put forward a time-dividing method that divides the communities into pieces based on the character of the post and the duration of time, and we define a suitable measurement parameter to obtain the evolution result of the communities. Finally, experiments demonstrate the efficiency of the opinion leader community mining method, and we summarize the properties of the opinion leader community as it evolves.

  5. Comparative analysis of bacterial essential and nonessential genes with Hurst exponent based on chaos game representation

    International Nuclear Information System (INIS)

    Essential genes are indispensable for the survival of an organism. Investigating features associated with gene essentiality is fundamental to the prediction and identification of essential genes with computational techniques. We use a fractal theory approach to make a comparative analysis of essential and nonessential genes in bacteria. The Hurst exponents of essential genes and nonessential genes available in the DEG database for 27 bacteria are calculated based on their gene chaos game representations. It is found that for most analyzed bacteria, a weak negative correlation exists between Hurst exponent and gene length. Moreover, essential genes generally differ from nonessential genes in their Hurst exponent. For genes of similar length, the average Hurst exponent of essential genes is smaller than that of nonessential genes. The results of our work reveal that the gene Hurst exponent is very probably a useful gene feature for algorithms predicting essential genes.
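
    The two ingredients named in this abstract, the chaos game representation (CGR) and the Hurst exponent, can be sketched as follows. This is a minimal illustration, not the authors' procedure: the corner assignment for A, C, G and T, the rescaled-range (R/S) estimator, and the function names are all assumptions, since the abstract does not say how the CGR is turned into a series or which estimator is used.

```python
import math

# Assumed corner convention for the unit square (not specified in the abstract).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the chaos game representation of a DNA sequence as (x, y) points."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0  # move halfway to the base's corner
        points.append((x, y))
    return points


def rescaled_range(window):
    """R/S of one window: range of cumulative deviations divided by the std dev."""
    n = len(window)
    mean = sum(window) / n
    cum = lo = hi = sq = 0.0
    for value in window:
        cum += value - mean
        lo, hi = min(lo, cum), max(hi, cum)
        sq += (value - mean) ** 2
    std = math.sqrt(sq / n)
    return (hi - lo) / std if std > 0 else None


def hurst_exponent(series, min_window=8):
    """Estimate H as the least-squares slope of log(R/S) against log(window size)."""
    log_n, log_rs = [], []
    size = min_window
    while size <= len(series) // 2:
        values = []
        for start in range(0, len(series) - size + 1, size):
            rs = rescaled_range(series[start:start + size])
            if rs is not None and rs > 0:
                values.append(rs)
        if values:
            log_n.append(math.log(size))
            log_rs.append(math.log(sum(values) / len(values)))
        size *= 2
    if len(log_n) < 2:
        raise ValueError("series too short for an R/S estimate")
    mean_n = sum(log_n) / len(log_n)
    mean_rs = sum(log_rs) / len(log_rs)
    num = sum((a - mean_n) * (b - mean_rs) for a, b in zip(log_n, log_rs))
    den = sum((a - mean_n) ** 2 for a in log_n)
    return num / den
```

    One hypothetical way to connect the two is to feed a coordinate of the CGR walk, for example `[x for x, _ in cgr_points(gene)]`, into `hurst_exponent`; whether that matches the published analysis is not stated in the abstract.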

  6. Directionality based Location Discovery Scheme Using Beacon Nodes with Transmission Capabilities throughout Sensor Network

    Directory of Open Access Journals (Sweden)

    Qinli An

    2013-06-01

    Full Text Available In this paper, we propose a range-free localization scheme for wireless sensor networks (WSNs) using four beacon nodes (BNs) equipped with a directional antenna with special transmission capabilities for sending wireless beacon signals throughout the sensor network. Each beacon node rotates with a constant angular speed and broadcasts its angular bearings. A sensor node can determine its location by listening to wireless transmissions from the four fixed beacon nodes. The proposed method is based on an angle-of-arrival estimation technique that does not increase the complexity or cost of construction of the sensor nodes. We present error analysis and the best positions of beacon nodes in the proposed method. Numerical results, obtained by simulating several scenarios, show that the algorithm can reach a good level of convergence.
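
    The geometric idea behind bearing-based localization can be sketched as a least-squares intersection of bearing lines. This is an illustrative sketch under stated assumptions, not the scheme from the paper: the function name `locate`, the convention that bearings are measured counter-clockwise from the +x axis, and the least-squares formulation are all assumed here.

```python
import math


def locate(beacons, bearings):
    """Estimate a sensor position from beacon positions and broadcast bearings.

    beacons:  list of (x, y) beacon coordinates
    bearings: angle (radians, counter-clockwise from +x) from each beacon
              towards the sensor at the moment the sensor hears the beam
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (bx, by), theta in zip(beacons, bearings):
        # The sensor lies on the line through the beacon with direction theta.
        # With normal n = (sin theta, -cos theta) the line is n . p = n . b.
        nx, ny = math.sin(theta), -math.cos(theta)
        c = nx * bx + ny * by
        # Accumulate the 2x2 normal equations of the least-squares problem.
        a11 += nx * nx
        a12 += nx * ny
        a22 += ny * ny
        b1 += nx * c
        b2 += ny * c
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("bearing lines are (nearly) parallel")
    x = (a22 * b1 - a12 * b2) / det
    y = (a11 * b2 - a12 * b1) / det
    return x, y


# Four beacons at the corners of a 10 x 10 field (hypothetical layout).
beacons = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
true_position = (3.0, 4.0)
bearings = [math.atan2(true_position[1] - by, true_position[0] - bx)
            for bx, by in beacons]
estimate = locate(beacons, bearings)
```

    With noise-free bearings the estimate coincides with the true position; with noisy bearings the four lines no longer meet in one point, and the least-squares solution is one reasonable way to reconcile them.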

  7. Discovery of Anti-SARS Coronavirus Drug Based on Molecular Docking and Database Screening

    Institute of Scientific and Technical Information of China (English)

    CHEN,Hai-Feng(陈海峰); YAO,Jian-Hua(姚建华); SUN,Jing(孙晶); LI,Qiang(李强); LI,Feng(李丰); FAN,Bo-Tao(范波涛); YUAN,Shen-Gang(袁身刚)

    2004-01-01

    The active site of 3CL proteinase (3CLpro) for coronavirus was identified by comparing the crystal structures of human and porcine coronavirus. The inhibitor of the main protein of rhinovirus (Ag7088) could bind with 3CLpro of human coronavirus, then it was selected as the reference for molecular docking and database screening. The ligands from two databases were used to search potential lead structures with molecular docking. Several structures from natural products and ACD-SC databases were found to have lower binding free energy with 3CLpro than that of Ag7088. These structures have similar hydrophobicity to Ag7088. They have complementary electrostatic potential and hydrogen bond acceptor and donor with 3CLpro, showing that the strategy of anti-SARS drug design based on molecular docking and database screening is feasible.

  8. Discovery of a Novel 5-HT2A Inhibitor by Pharmacophore-based Virtual Screening

    Institute of Scientific and Technical Information of China (English)

    XIONG Zi-jun; DU Peng; LI Bian; XU Li-li; ZHEN Xue-chu; FU Wei

    2011-01-01

    The serotonin 2A (5-HT2A) receptor has been implicated in several neurological conditions, and potent 5-HT2A antagonists have therapeutic effects in the treatment of schizophrenia and depression. In this study, a potent novel 5-HT2A inhibitor 05245768 with a Ki value of (593.89±34.10) nmol/L was discovered by integrating a set of computational approaches and experiments (protein structure prediction, pharmacophore-based virtual screening, automated molecular docking and pharmacological bioassay). The 5-HT2A receptor showed a negatively charged binding pocket. The binding mode of compound 05245768 with 5-HT2A was obtained by the GOLD docking procedure, which revealed the conserved interaction between the protonated nitrogen in compound 05245768 and the carboxylate group of D3.32 at the active site of 5-HT2A.

  9. iPSCs-based anti-aging therapies: Recent discoveries and future challenges.

    Science.gov (United States)

    Pareja-Galeano, Helios; Sanchis-Gomar, Fabián; Pérez, Laura M; Emanuele, Enzo; Lucia, Alejandro; Gálvez, Beatriz G; Gallardo, María Esther

    2016-05-01

    The main biological hallmarks of the aging process include stem cell exhaustion and cellular senescence. Consequently, research efforts to treat age-related diseases as well as anti-aging therapies in general have recently focused on potential 'reprogramming' regenerative therapies. These new approaches are based on induced pluripotent stem cells (iPSCs), including potential in vivo reprogramming for tissue repair. Another possibility is targeting pathways of cellular senescence, e.g., through modulation of p16INK4a signaling and especially inhibition of the nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB). Here, we reviewed and discussed these recent developments together with their possible usefulness for future treatments against sarcopenia, a major age-related condition. PMID:26921478

  10. Discovery of Tyk2 inhibitors via the virtual site-directed fragment-based drug design.

    Science.gov (United States)

    Jang, Woo Dae; Kim, Jun-Tae; Son, Hoon Young; Park, Seung Yeon; Cho, Young Sik; Koo, Tae-sung; Lee, Hyuk; Kang, Nam Sook

    2015-09-15

    In this study, we synthesized compound 12 with potent Tyk2 inhibitory activity from an FBDD study and carried out a cell-based assay for Tyk2/STAT3 signaling activation upon IFNα5 stimulation. Compound 12 completely suppressed the IFNα5-mediated Tyk2/STAT3 signaling pathway as well as the basal levels of pSTAT3. Stimulation with IFNα/β leads to the tyrosine phosphorylation of the JAK1 and Tyk2 receptor-associated kinases with subsequent STAT activation, transmitting signals from the cell surface receptor to the nucleus. In conclusion, the potency of compound 12 to interrupt the signal transmission of Tyk2/STAT3 appeared to be equivalent or superior to that of the reference compound. PMID:26231159

  11. Polymerase chain reaction-based gene removal from plasmids

    Directory of Open Access Journals (Sweden)

    Vishnu Vardhan Krishnamurthy

    2015-09-01

    Full Text Available This data article contains supplementary figures and methods to the research article entitled “Multiplex gene removal by two-step polymerase chain reactions” (Krishnamurthy et al., Anal. Biochem., 2015, doi:http://dx.doi.org/10.1016/j.ab.2015.03.033), which presents a restriction-enzyme free method to remove multiple DNA segments from plasmids. Restriction-free cloning methods have dramatically improved the flexibility and speed of genetic manipulation compared to conventional assays based on restriction enzyme digestion (Lale and Valla, 2014. DNA Cloning and Assembly Methods, vol. 1116). Here, we show the basic scheme and characterize the success rate for single and multiplex gene removal from plasmids. In addition, we optimize experimental conditions, including the amount of template, multiple primers mixing, and buffers for DpnI treatment, used in the one-pot reaction for multiplex gene removal.

  12. New approaches to drug discovery and development: a mechanism-based approach to pharmaceutical research and its application to BNP7787, a novel chemoprotective agent.

    Science.gov (United States)

    Hausheer, Frederick H; Kochat, Harry; Parker, Aulma R; Ding, Daoyuan; Yao, Shije; Hamilton, Susan E; Petluru, Pavankumar N; Leverett, Betsy D; Bain, Stacey H; Saxe, Jeffrey D

    2003-07-01

    Any approach applied to drug discovery and development by the medical community and pharmaceutical industry has a direct impact on the future availability of improved, novel, and curative therapies for patients with cancer. By definition, drug discovery is a complex learning process whereby research efforts are directed toward uncovering and assimilating new knowledge to create and develop a drug for the purpose of providing benefit to a defined patient population. Accordingly, a highly desirable technology or approach to drug discovery should facilitate both effective learning and the application of newly discovered observations that can be exploited for therapeutic benefit. However, some believe that drug discovery is largely accomplished by serendipity and therefore appropriately addressed by screening a large number of compounds. Clearly, this approach has not generated an abundance of new drugs for cancer patients and suggests that a tangibly different approach in drug discovery is warranted. We employ an alternative approach to drug discovery, which is based on the elucidation and exploitation of biological, pharmacological, and biochemical mechanisms that have not been previously recognized or fully understood. Mechanism-based drug discovery involves the combined application of physics-based computer simulations and laboratory experimentation. There is increasing evidence that agreement between simulations based on the laws of physics and experimental observations results in a higher probability that such observations are more accurate and better understood as compared with either approach used alone. Physics-based computer simulation applied to drug discovery is now considered by experts in the field to be one of the ultimate methodologies for drug discovery. 
    However, the ability to perform truly comprehensive physics-based molecular simulations remains limited by several factors, including the enormous computer-processing power that is required to perform

  13. Use of Service Middleware Based on ECHO with CSW for Discovery and Registry of MODIS Data

    Institute of Scientific and Technical Information of China (English)

    CHEN Zeqiang; CHEN Nengcheng

    2010-01-01

    NASA now produces several terabytes of Moderate Resolution Imaging Spectroradiometer (MODIS) data every day, so finding data by criteria such as specific times, locations, and scales using an international standard is becoming increasingly important. In this paper, a service-oriented architecture for integrating the Earth Observation System ClearingHOuse (ECHO) with the Open Geospatial Consortium (OGC) Catalogue Service-Web profile (CSW) is put forward. The architecture consists of three roles: a service requester (the user), a service provider (the ECHO metadata server), and a service broker (the GeoNetwork CSW and MODIS registry service middleware). The core component, the MODIS registry service middleware, includes three components: a metadata fetcher, a metadata transformer, and a metadata register. The metadata fetcher fetches metadata from the ECHO metadata server; the metadata transformer transforms metadata from one form to another; the metadata register registers ISO19139-based metadata to the CSW. A prototype system is designed and implemented using the service middleware technology and a standard interface and protocol. The feasibility and the response time of registry and retrieval of MODIS data are evaluated by means of a realistic LPDAAC_ECS MODIS data center. The implementation of this prototype system and the experiment show that the architecture and method are feasible and effective.

  14. Discovery of YopE Inhibitors by Pharmacophore-Based Virtual Screening and Docking

    Science.gov (United States)

    Ozbuyukkaya, Gizem; Ozkirimli Olmez, Elif; Ulgen, Kutlu O.

    2013-01-01

    Gram-negative bacteria Yersinia secrete virulence factors that invade eukaryotic cells via type III secretion system. One particular virulence member, Yersinia outer protein E (YopE), targets Rho family of small GTPases by mimicking regulator GAP protein activity, and its secretion mainly induces cytoskeletal disruption and depolymerization of actin stress fibers within the host cell. In this work, potent drug-like inhibitors of YopE are investigated with virtual screening approaches. More than 500,000 unique small molecules from ZINC database were screened with a five-point pharmacophore, comprising three hydrogen acceptors, one hydrogen donor, and one ring, and derived from different salicylidene acylhydrazides. Binding modes and features of these molecules were investigated with a multistep molecular docking approach using Glide software. Virtual screening hits were further analyzed based on their docking score, chemical similarity, pharmacokinetic properties, and the key Arg144 interaction along with other active site residue interactions with the receptor. As a final outcome, a diverse set of ligands with inhibitory potential were proposed. PMID:25937949

  15. Discovery of novel GPVI receptor antagonists by structure-based repurposing.

    Directory of Open Access Journals (Sweden)

    Lewis Taylor

    Full Text Available Inappropriate platelet aggregation creates a cardiovascular risk that is largely managed with thienopyridines and aspirin. Although effective, these drugs carry risks of increased bleeding and drug 'resistance', underpinning a drive for new antiplatelet agents. To discover such drugs, one strategy is to identify a suitable druggable target and then find small molecules that modulate it. A good and unexploited target is the platelet collagen receptor, GPVI, which promotes thrombus formation. To identify inhibitors of GPVI that are safe and bioavailable, we docked an FDA-approved drug library into the GPVI collagen-binding site in silico. We now report that losartan and cinanserin inhibit GPVI-mediated platelet activation in a selective, competitive and dose-dependent manner. This mechanism of action likely underpins the cardioprotective effects of losartan that could not be ascribed to its antihypertensive effects. We have, therefore, identified small molecule inhibitors of GPVI-mediated platelet activation, and also demonstrated the utility of structure-based repurposing.

  16. Discovery of new inhibitors of Schistosoma mansoni PNP by pharmacophore-based virtual screening.

    Science.gov (United States)

    Postigo, Matheus P; Guido, Rafael V C; Oliva, Glaucius; Castilho, Marcelo S; da R Pitta, Ivan; de Albuquerque, Julianna F C; Andricopulo, Adriano D

    2010-09-27

    Schistosomiasis is considered the second most important tropical parasitic disease, with severe socioeconomic consequences for millions of people worldwide. Schistosoma mansoni, one of the causative agents of human schistosomiasis, is unable to synthesize purine nucleotides de novo, which makes the enzymes of the purine salvage pathway important targets for antischistosomal drug development. In the present work, we describe the development of a pharmacophore model for ligands of S. mansoni purine nucleoside phosphorylase (SmPNP) as well as a pharmacophore-based virtual screening approach, which resulted in the identification of three thioxothiazolidinones (1-3) with substantial in vitro inhibitory activity against SmPNP. Synthesis, biochemical evaluation, and structure-activity relationship investigations led to the successful development of a small set of thioxothiazolidinone derivatives harboring a novel chemical scaffold as new competitive inhibitors of SmPNP in the low-micromolar range. Seven compounds were identified with IC50 values below 100 μM. The most potent inhibitors 7, 10, and 17, with IC50 values of 2, 18, and 38 μM, respectively, could represent new potential lead compounds for further development of the therapy of schistosomiasis. PMID:20695479

  17. Evidence, discovery and justification: the case of evidence-based medicine.

    Science.gov (United States)

    Gaeta, Rodolfo; Gentile, Nelida

    2016-08-01

    The purpose of this paper is to develop some thoughts on philosophical issues surrounding evidence-based medicine (EBM), especially related to its epistemological dimensions. After considering the scope of several philosophical concepts that are relevant to the discussion, and drawing some distinctions among different aspects of EBM, we evaluate the status of EBM and suggest that EBM is mainly a meta-methodology. Then, we outline an evaluation of the thesis that EBM is a 'new paradigm' in the practice of medicine. We argue that EBM does not seem to have arisen in the way Kuhn imagined paradigms to arise but as a conscious, deliberate proposal, more as programme than as a reality. Furthermore, there is something paradoxical about appealing to evidence or to the best evidence as a way of promoting a new paradigm. For the proposal seems to assume that there is something that by its own virtue is the best evidence for a given time. But this idea would have been rejected by Kuhn. If EBM involves a genuine new alternative in the field of medicine and shows a way in which the discipline will endure henceforth, this indicates that it is not what Kuhn once called a 'paradigm' and even, paradoxically, it is good evidence that scientific paradigms do not exist, at least in medicine. PMID:26200433

  18. Discovery of a chemosynthesis-based community in the western South Atlantic Ocean

    Science.gov (United States)

    Giongo, Adriana; Haag, Taiana; Simão, Taiz L. Lopes; Medina-Silva, Renata; Utz, Laura R. P.; Bogo, Maurício R.; Bonatto, Sandro L.; Zamberlan, Priscilla M.; Augustin, Adolpho H.; Lourega, Rogério V.; Rodrigues, Luiz F.; Sbrissa, Gesiane F.; Kowsmann, Renato O.; Freire, Antonio F. M.; Miller, Dennis J.; Viana, Adriano R.; Ketzer, João M. M.; Eizirik, Eduardo

    2016-06-01

    Chemosynthetic communities have been described from a variety of deep-sea environments across the world's oceans. They constitute very interesting biological systems in terms of their ecology, evolution and biogeography, and also given their potential as indicators of the presence and abundance of consistent hydrocarbon-based nutritional sources. Up to now such peculiar biotic assemblages have not been reported for the western South Atlantic Ocean, leaving this large region undocumented with respect to the presence, composition and history of such communities. Here we report on the presence of a chemosynthetic community off the coast of southern Brazil, in an area where high-levels of methane and the presence of gas hydrates have been detected. We performed metagenomic analyses of the microbial community present at this site, and also employed molecular approaches to identify components of its benthic fauna. We conducted phylogenetic analyses comparing the components of this assemblage to those found elsewhere in the world, which allowed a historical assessment of the structure and dynamics of these systems. Our results revealed that the microbial community at this site is quite diverse, and contains many components that are very closely related to lineages previously sampled in ecologically similar environments across the globe. Anaerobic methanotrophic (ANME) archaeal groups were found to be very abundant at this site, suggesting that methane is indeed an important source of nutrition for this community. In addition, we document the presence at this site of a vestimentiferan siboglinid polychaete and the bivalve Acharax sp., both of which are typical components of deep-sea chemosynthetic communities. The remarkable similarity in biotic composition between this area and other deep-sea communities across the world supports the interpretation that these assemblages are historically connected across the global oceans, undergoing colonization from distant sites and

  19. Structure-Based Discovery of Novel Cyclophilin A Inhibitors for the Treatment of Hepatitis C Virus Infections.

    Science.gov (United States)

    Yang, Suhui; K R, Jyothi; Lim, Sangbin; Choi, Tae Gyu; Kim, Jin-Hwan; Akter, Salima; Jang, Miran; Ahn, Hyun-Jong; Kim, Hee-Young; Windisch, Marc P; Khadka, Daulat B; Zhao, Chao; Jin, Yifeng; Kang, Insug; Ha, Joohun; Oh, Byung-Chul; Kim, Meehyein; Kim, Sung Soo; Cho, Won-Jea

    2015-12-24

    Hepatitis C virus (HCV) is a major cause of end-stage liver disease. Direct-acting antivirals (DAAs), including inhibitors of nonstructural proteins (NS3/4A protease, NS5A, and NS5B polymerase), represent key components of anti-HCV treatment, but these are associated with increased drug resistance and toxicity. Thus, the development of host-targeted antiviral agents, such as cyclophilin A inhibitors, is an alternative approach for more effective, selective, and safer treatment. Starting with the discovery of a bis-amide derivative 5 through virtual screening, the lead compound 25 was developed using molecular modeling-based design and systematic exploration of the structure-activity relationship. The lead 25 lacked cytotoxicity, had potent anti-HCV activity, and showed selective and high binding affinity for CypA. Unlike cyclosporin A, 25 lacked immunosuppressive effects, successfully inhibited the HCV replication, restored host immune responses without acute toxicity in vitro and in vivo, and exhibited a high synergistic effect in combination with other drugs. These findings suggest that the bis-amides have significant potential to extend the arsenal of HCV therapeutics. PMID:26613291

  20. Discovery and validation of an INflammatory PROtein-driven GAstric cancer Signature (INPROGAS) using antibody microarray-based oncoproteomics

    Science.gov (United States)

    Puig-Costa, Manuel; Codina-Cazador, Antonio; Cortés-Pastoret, Elisabet; Oliveras-Ferraros, Cristina; Cufí, Sílvia; Flaquer, Sílvia; Llopis-Puigmarti, Francesca; Pujol-Amado, Eulalia; Corominas-Faja, Bruna; Cuyàs, Elisabet; Ortiz, Rosa; Lopez-Bonet, Eugeni; Queralt, Bernardo; Guardeño, Raquel; Martin-Castillo, Begoña; Roig, Josep; Joven, Jorge; Menendez, Javier A.

    2014-01-01

    This study aimed to improve gastric cancer (GC) diagnosis by identifying and validating an INflammatory PROtein-driven GAstric cancer Signature (hereafter INPROGAS) using low-cost affinity proteomics. The detection of 120 cytokines, 43 angiogenic factors, 41 growth factors, 40 inflammatory factors and 10 metalloproteinases was performed using commercially available human antibody microarray-based arrays. We identified 21 inflammation-related proteins (INPROGAS) with significant differences in expression between GC tissues and normal gastric mucosa in a discovery cohort of matched pairs (n=10) of tumor/normal gastric tissues. Ingenuity pathway analysis confirmed the “inflammatory response”, “cellular movement” and “immune cell trafficking” as the most overrepresented biofunctions within INPROGAS. Using an expanded independent validation cohort (n = 22), INPROGAS classified gastric samples as “GC” or “non-GC” with a sensitivity of 82% (95% CI 59-94) and a specificity of 73% (95% CI 49-89). The positive predictive value and negative predictive value in this validation cohort were 75% (95% CI 53-90) and 80% (95% CI 56-94), respectively. Antibody microarray analyses of the GC-associated inflammatory proteome identified a 21-protein INPROGAS that accurately discriminated GC from noncancerous gastric mucosa. PMID:24722433
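
    The four performance figures quoted above follow from a 2 x 2 confusion table, as the short sketch below shows. The counts used (18 true positives, 4 false negatives, 16 true negatives, 6 false positives) are an assumption: they are not given in the abstract, but they are what the reported percentages imply if the validation cohort comprised 22 tumor and 22 normal samples. The function name is illustrative.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic accuracy measures from confusion-table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all diseased
        "specificity": tn / (tn + fp),  # true negatives among all non-diseased
        "ppv": tp / (tp + fp),          # diseased among all positive calls
        "npv": tn / (tn + fn),          # non-diseased among all negative calls
    }


# Assumed counts, chosen to be consistent with the reported percentages.
metrics = diagnostic_metrics(tp=18, fn=4, tn=16, fp=6)
percentages = {name: round(100 * value) for name, value in metrics.items()}
# percentages == {"sensitivity": 82, "specificity": 73, "ppv": 75, "npv": 80}
```

    The confidence intervals in the abstract depend on the interval method used, which is not stated, so they are not reproduced here.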

  1. Exploring the potential benefits of false discovery rates for region-based testing of association with rare genetic variation

    Directory of Open Access Journals (Sweden)

    ChangJiang eXu

    2014-01-01

    Full Text Available When analyzing the data arising from exome or whole-genome sequencing studies, window-based tests, i.e. tests that jointly analyze all genetic data in a small genomic region, are very popular. However, power is known to be quite low for finding associations with phenotypes using these tests, and hence a variety of analytic strategies may be employed to potentially improve power. Using sequencing data from all of chromosome 3 in an interim release of data on 2,432 individuals from the UK10K project, we simulated phenotypes associated with rare genetic variation, and used the results to explore the power of window-based tests and to ask two specific questions. Firstly, we asked whether there could be substantial benefits associated with incorporating information from external annotation on the genetic variants, and secondly we asked whether false discovery rates (FDRs) would be a useful metric for assessing significance. Although, as expected, there are benefits to using additional information (such as annotation) when it is associated with causality, we confirmed the general pattern of low sensitivity and power for window-based tests. At least for our chosen example, even when power is high to detect some associations, many of the regions containing causal variants cannot be detected, despite using lax significance thresholds and optimal analytic methods. Furthermore, our estimated FDR values tended to be much smaller than the true FDRs. Long-range correlations between variants, due to linkage disequilibrium, likely explain some of this bias. A more sophisticated approach to using the annotation information may help the power, but many causal variants of realistic effect sizes may simply be undetectable, at least with this sample size. Perhaps annotation information could assist in distinguishing windows containing causal variants from windows that are merely correlated with causal variants.
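
    For readers unfamiliar with FDR control, the Benjamini-Hochberg step-up procedure is the most common way to turn per-window p-values into an FDR-based decision. The sketch below is a generic illustration, not the estimator used in this study, which the abstract does not specify; the function name and the example p-values are made up.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Finds the largest rank k such that p_(k) <= (k / m) * alpha and rejects
    the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for four genomic windows.
calls = benjamini_hochberg([0.01, 0.04, 0.03, 0.5], alpha=0.05)
# Only the first window passes: 0.01 <= 1/4 * 0.05, but 0.03 > 2/4 * 0.05.
```

    The procedure is guaranteed to control the FDR under independence or positive dependence of the tests; strong long-range correlation between windows, as discussed in the abstract, is exactly the kind of setting where FDR estimates can become unreliable.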

  2. Gene expression-based biomarkers for Anopheles gambiae age grading.

    Science.gov (United States)

    Wang, Mei-Hui; Marinotti, Osvaldo; Zhong, Daibin; James, Anthony A; Walker, Edward; Guda, Tom; Kweka, Eliningaya J; Githure, John; Yan, Guiyun

    2013-01-01

    Information on population age structure of mosquitoes under natural conditions is fundamental to the understanding of vectorial capacity and crucial for assessing the impact of vector control measures on malaria transmission. Transcriptional profiling has been proposed as a method for predicting mosquito age for Aedes and Anopheles mosquitoes, however, whether this new method is adequate for natural conditions is unknown. This study tests the applicability of transcriptional profiling for age-grading of Anopheles gambiae, the most important malaria vector in Africa. The transcript abundance of two An. gambiae genes, AGAP009551 and AGAP011615, was measured during aging under laboratory and field conditions in three mosquito strains. Age-dependent monotonic changes in transcript levels were observed in all strains evaluated. These genes were validated as age-grading biomarkers using the mark, release and recapture (MRR) method. The MRR method determined a good correspondence between actual and predicted age, and thus demonstrated the value of age classifications derived from the transcriptional profiling of these two genes. The technique was used to establish the age structure of mosquito populations from two malaria-endemic areas in western Kenya. The population age structure determined by the transcriptional profiling method was consistent with that based on mosquito parity. This study demonstrates that the transcription profiling method based on two genes is valuable for age determination of natural mosquitoes, providing a new approach for determining a key life history trait of malaria vectors. PMID:23936017

  3. Gene expression-based biomarkers for Anopheles gambiae age grading.

    Directory of Open Access Journals (Sweden)

    Mei-Hui Wang

    Full Text Available Information on population age structure of mosquitoes under natural conditions is fundamental to the understanding of vectorial capacity and crucial for assessing the impact of vector control measures on malaria transmission. Transcriptional profiling has been proposed as a method for predicting mosquito age for Aedes and Anopheles mosquitoes, however, whether this new method is adequate for natural conditions is unknown. This study tests the applicability of transcriptional profiling for age-grading of Anopheles gambiae, the most important malaria vector in Africa. The transcript abundance of two An. gambiae genes, AGAP009551 and AGAP011615, was measured during aging under laboratory and field conditions in three mosquito strains. Age-dependent monotonic changes in transcript levels were observed in all strains evaluated. These genes were validated as age-grading biomarkers using the mark, release and recapture (MRR) method. The MRR method determined a good correspondence between actual and predicted age, and thus demonstrated the value of age classifications derived from the transcriptional profiling of these two genes. The technique was used to establish the age structure of mosquito populations from two malaria-endemic areas in western Kenya. The population age structure determined by the transcriptional profiling method was consistent with that based on mosquito parity. This study demonstrates that the transcription profiling method based on two genes is valuable for age determination of natural mosquitoes, providing a new approach for determining a key life history trait of malaria vectors.

  4. Versatile Supramolecular Gene Vector Based on Host-Guest Interaction.

    Science.gov (United States)

    Liu, Jia; Hennink, Wim E; van Steenbergen, Mies J; Zhuo, Renxi; Jiang, Xulin

    2016-04-20

    It is a great challenge to arrange multiple functional components into one gene vector system to overcome the extra- and intracellular obstacles for gene therapy. In this study, we developed a supramolecular approach for constructing a versatile gene delivery system composed of adamantyl-terminated functional polymers and a β-cyclodextrin based polymer. Adamantyl-functionalized low molecular weight PEIs (PEI-Ad) and PEG (Ad-PEG) as well as poly(β-cyclodextrin) (PCD) were synthesized by one-step chemical reactions. The supramolecular inclusion complex formed from PCD to assemble LMW PEI-Ad4 via host-guest interactions can condense plasmid DNA to form nanopolyplexes by electrostatic interactions. The supramolecular polyplexes can be further PEGylated with Ad-PEG to form inclusion complexes, which showed increased salt and serum stability. In vitro experiments revealed that these supramolecular assembly polyplexes had good cytocompatibility and showed high transfection activity close to that of the commercial ExGen 500 at a high dose of DNA. Also, the supramolecular vector system exhibited about 60% silencing efficiency as a siRNA vector. Thus, a versatile effective supramolecular gene vector based on host-guest complexes was fabricated with good cytocompatibility and transfection activity. PMID:27019340

  5. Nanoparticle-based targeted gene therapy for lung cancer

    Science.gov (United States)

    Lee, Hung-Yen; Mohammed, Kamal A; Nasreen, Najmunnisa

    2016-01-01

    Despite striking insights into lung cancer progression and cutting-edge therapeutic approaches, the survival of patients with lung cancer remains poor. In recent years, targeted gene therapy with nanoparticles has become one of the most rapidly evolving and extensive areas of research for lung cancer. The major goal of targeted gene therapy is to bring forward a safe and efficient treatment for cancer patients by specifically targeting and deterring cancer cells in the body. To achieve high therapeutic efficacy of gene delivery, various carriers have been engineered and developed to protect the genetic material and deliver it efficiently to targeted cancer cells. Nanoparticles play an important role in the area of drug delivery and have been widely applied in cancer treatments for the purposes of controlled release and cancer cell targeting. Nanoparticles composed of artificial polymers, proteins, polysaccharides and lipids have been developed for the delivery of therapeutic deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) sequences to target cancer. In addition, the effectiveness of cancer targeting has been enhanced by surface modification or conjugation with biomolecules on the surface of nanoparticles. In this review article we provide an overview of the latest developments in nanoparticle-based targeted gene therapy for lung cancers. Firstly, we outline the conventional therapies and discuss strategies for targeted gene therapy using nanoparticles. Secondly, we present the most representative and recent research in lung cancers including malignant pleural mesothelioma, mainly focusing on the application of polymeric, lipid-based, and metal-based nanoparticles. Finally, we discuss current achievements and future challenges. PMID:27294004

  6. Precision multidimensional assay for high-throughput microRNA drug discovery

    OpenAIRE

    Haefliger, Benjamin; Prochazka, Laura; Angelici, Bartolomeo; Benenson, Yaakov

    2016-01-01

    Development of drug discovery assays that combine high content with throughput is challenging. Information-processing gene networks can address this challenge by integrating multiple potential targets of drug candidates' activities into a small number of informative readouts, reporting simultaneously on specific and non-specific effects. Here we show a family of networks implementing this concept in a cell-based drug discovery assay for miRNA drug targets. The networks comprise multiple modul...

  7. Integrating Discovery-Based Research Experiences into the Undergraduate STEM Curriculum: A Convocation Report from the National Academies of Sciences, Engineering and Medicine

    Science.gov (United States)

    Guertin, L. A.; Ambos, E. L.; Brenner, K.; Asher, P. M.; Ryan, J. G.

    2015-12-01

    New possibilities and challenges to providing and scaling up opportunities for large numbers of undergraduates to engage in discovery-based research and related activities reflect both the evidence base and the current systemic infrastructure of higher education. The National Research Council hosted a Convocation in May 2015 on this very topic, inspired by the 2012 PCAST report "Engage to Excel," which urged the STEM education community and funding agencies to "advocate and provide support for replacing standard laboratory courses with discovery-based research courses." The Convocation report "Integrating Discovery-Based Research into the Undergraduate STEM Curriculum" on which this session is based explores a number of critical issues: Is our current knowledge base robust enough to recommend best practices? Is offering such experiences actually beneficial for all undergraduates? What institutional changes will be required to make such opportunities available to large numbers of students? Can such programs drive institutional change? How can we manage the cost/benefit parameters of such programs? Exploring these important and connected issues is critical for allowing undergraduates to participate in meaningful and relevant research through their coursework, for faculty and administrators to examine and document the evidence for their impact, and institutions to identify variations in what works at different types of colleges and universities.

  8. A Gene Selection Approach based on Clustering for Classification Tasks in Colon Cancer

    Directory of Open Access Journals (Sweden)

    José Antonio CASTELLANOS GARZÓN

    2016-06-01

    Full Text Available Gene selection (GS) is an important research area in the analysis of DNA-microarray data, since it involves discovering genes that are meaningful for a particular target annotation or able to discriminate expression profiles of samples coming from different populations. In this context, a large number of filter methods have been proposed in the literature to identify subsets of relevant genes in accordance with prefixed targets. Despite this large number of proposals, the complexity imposed by this problem (GS) remains a challenge. Hence, this paper proposes a novel approach for gene selection that applies cluster techniques and then filter methods on the groupings found, in order to obtain informative gene subsets. By applying our methodology to colon cancer data, we have identified the most informative gene subset among several candidate subsets. The results obtained support the reliability of the approach given in this paper.

  9. Developing a Data Discovery Tool for Interdisciplinary Science: Leveraging a Web-based Mapping Application and Geosemantic Searching

    Science.gov (United States)

    Albeke, S. E.; Perkins, D. G.; Ewers, S. L.; Ewers, B. E.; Holbrook, W. S.; Miller, S. N.

    2015-12-01

    The sharing of data and results is paramount for advancing scientific research. The Wyoming Center for Environmental Hydrology and Geophysics (WyCEHG) is a multidisciplinary group that is driving scientific breakthroughs to help manage water resources in the Western United States. WyCEHG is mandated by the National Science Foundation (NSF) to share its data. However, the infrastructure from which to share such diverse, complex and massive amounts of data did not exist within the University of Wyoming. We developed an innovative framework to meet the data organization, sharing, and discovery requirements of WyCEHG by integrating both open and closed source software, embedded metadata tags, semantic web technologies, and a web-mapping application. The infrastructure uses a Relational Database Management System as the foundation, providing a versatile platform to store, organize, and query myriad datasets, taking advantage of both structured and unstructured formats. Detailed metadata are fundamental to the utility of datasets. We tag data with Uniform Resource Identifiers (URIs) to specify concepts with formal descriptions (i.e., semantic ontologies), thus allowing users to search metadata based on the intended context rather than by conventional keyword searches. Additionally, WyCEHG data are geographically referenced. Using the ArcGIS API for JavaScript, we developed a web mapping application leveraging database-linked spatial data services, providing a means to visualize and spatially query available data in an intuitive map environment. Using server-side scripting (PHP), the mapping application, in conjunction with semantic search modules, dynamically communicates with the database and file system, providing access to available datasets. Our approach provides a flexible, comprehensive infrastructure from which to store and serve WyCEHG's highly diverse research-based data. This framework has not only allowed WyCEHG to meet its data stewardship

  10. Development of gene diagnosis for diabetes and cholecystitis based on gene analysis of CCK-A receptor

    International Nuclear Information System (INIS)

    Base sequence analysis of the CCKAR gene (the gene for the A-type cholecystokinin receptor) from the OLETF rat, a model rat for insulin-independent diabetes, was carried out based on the base sequence of the wild-type CCKAR gene, which had been clarified in the previous year. DNA was extracted from the pancreas of the OLETF rat, fragmented, and transduced into λ phage to construct an OLETF gene library. Then, the λ phage DNA clone that bound labelled cDNA of the CCKAR gene was analyzed and its gene structure was compared with that of the wild-type gene. It was demonstrated that the CCKAR gene of OLETF had a deletion (6800 b.p.) ranging from the promoter region to Exon 2, suggesting that the CCKAR gene is not functional in the OLETF rat. The whole sequence of this mutant gene was registered in the Japan DNA Bank (D 50610). Then, F2 offspring rats were obtained by crossing OLETF (female) and F344 (male), and the time-course changes in the blood glucose level after glucose loading were compared among them. The blood glucose level after glucose loading was significantly higher in the homo-mutant F2 (CCKAR,-/-) as well as the parent OLETF rat than in the hetero-mutant F2 (CCKAR,-/+) or the wild-type rat (CCKAR,+/+). This suggests that the CCKAR gene might be involved in the control of the blood glucose level and that an alteration of the expression level or the functions of the CCKAR gene might affect the blood glucose level. (M.N.)

  11. Discovery Mondays

    CERN Multimedia

    2003-01-01

    Many people don't realise quite how much is going on at CERN. Would you like to gain first-hand knowledge of CERN's scientific and technological activities and their many applications? Try out some experiments for yourself, or pick the brains of the people in charge? If so, then the «Lundis Découverte» or Discovery Mondays, will be right up your street. Starting on May 5th, on every first Monday of the month you will be introduced to a different facet of the Laboratory. CERN staff, non-scientists, and members of the general public, everyone is welcome. So tell your friends and neighbours and make sure you don't miss this opportunity to satisfy your curiosity and enjoy yourself at the same time. You won't have to listen to a lecture, as the idea is to have open exchange with the expert in question and for each subject to be illustrated with experiments and demonstrations. There's no need to book, as Microcosm, CERN's interactive museum, will be open non-stop from 7.30 p.m. to 9 p.m. On the first Discovery M...

  12. Identification of crucial genes in intracranial aneurysm based on weighted gene coexpression network analysis.

    Science.gov (United States)

    Zheng, X; Xue, C; Luo, G; Hu, Y; Luo, W; Sun, X

    2015-05-01

    The rupture of intracranial aneurysm (IA) is the leading cause for devastating subarachnoid hemorrhage. This study aimed to investigate genes related to IA and potential diagnosis targets. Two data sets (GSE15629 and GSE54083) were downloaded from Gene Expression Omnibus database. GSE15629 contained eight RI (ruptured IA), six UI (unruptured IA) and five control IA samples. GSE54083 included 8 RI, 5 UI and 10 superficial temporal artery samples. In total, 452 differentially expressed genes (DEGs) between RI and control, and 570 DEGs between UI and control, were identified. Protein-protein interaction networks for two kinds of DEGs related to RI and UI were constructed, respectively. Module networks were searched for DEGs related to RI or UI based on WGCNA (weighted gene coexpression network analysis). In the significant modules, FOS, CCL2, COL4A2 and CXCL5 were screened as crucial nodes with high degrees. Among them, FOS and CCL2 were enriched in immune response and COL4A2 was involved in the ECM (extracellular matrix) pathway, whereas CXCL5 was related to cytokine-cytokine receptor pathway. Taken together, FOS, CCL2, COL4A2 and CXCL5 might participate in the pathogenesis of RI or UI, and could serve as potential diagnosis targets. PMID:25721208
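
    The WGCNA step mentioned above can be illustrated with a minimal sketch. It assumes the standard unsigned soft-threshold adjacency, a_ij = |cor(x_i, x_j)|^beta, and ranks genes by weighted connectivity (the sum of their adjacencies). The gene names, toy expression values, and beta = 6 are illustrative assumptions, not the data or settings of the study, and module detection itself is omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-zero variance assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta.

    expr maps a gene name to its expression profile across samples.
    beta=6 is an assumed soft-threshold power, not a value from the abstract.
    """
    genes = list(expr)
    adj = {g: {} for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            a = abs(pearson(expr[g], expr[h])) ** beta
            adj[g][h] = a
            adj[h][g] = a
    return adj


def connectivity(adj):
    """Weighted degree of each gene: the sum of its adjacencies to all other genes."""
    return {g: sum(neighbours.values()) for g, neighbours in adj.items()}


if __name__ == "__main__":
    # Hypothetical toy profiles (four samples each); names are placeholders.
    toy = {
        "geneA": [1.0, 2.0, 3.0, 4.0],
        "geneB": [2.0, 4.0, 6.0, 8.0],
        "geneC": [4.0, 3.0, 2.0, 1.0],
        "geneD": [1.0, -1.0, -1.0, 1.0],
    }
    k = connectivity(soft_threshold_adjacency(toy))
    for gene in sorted(k, key=k.get, reverse=True):
        print(gene, round(k[gene], 3))
```

    Raising the correlation to a power suppresses weak correlations while keeping strong ones, so genes with high connectivity are those tightly co-expressed with many others. Because the adjacency is unsigned, strongly anti-correlated genes count as connected as well.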

  13. A Framework for Ontology-Based Service Discovery and Composition

    Institute of Scientific and Technical Information of China (English)

    李伟平; 褚伟杰; 高福亮; 刘利; 童缙

    2009-01-01

    Currently a large number of web services as well as other kinds of services such as EJBs, COM, and even Java Classes are made available to the general public. Facilitating SOA-based system development by leveraging such kinds of services has become a challenge. A framework for service repository, ontology-based service discovery and service composition is put forward. The service repository can maintain web services, EJBs, and Java Classes with functions such as service registration, publishing, discovery, matching, versioning, and monitoring. The details of service description are analyzed. A domain ontology for Procurement, Selling, and Inventory is also given. Based on the domain ontology and the service repository, the semantic-enhanced service composition algorithm is discussed.

  14. A fisheye viewer for microarray-based gene expression data

    Directory of Open Access Journals (Sweden)

    Munson Ethan V

    2006-10-01

    Full Text Available Abstract Background Microarrays have been widely used to measure the relative amounts of every mRNA transcript from the genome in a single scan. Biologists have been accustomed to reading their experimental data directly from tables. However, microarray data are quite large and are stored in a series of files in a machine-readable format, so direct reading of the full data set is not feasible. The challenge is to design a user interface that allows biologists to usefully view large tables of raw microarray-based gene expression data. This paper presents one such interface: an electronic table (E-table) that uses fisheye distortion technology. Results The Fisheye Viewer for microarray-based gene expression data has been successfully developed to view MIAME data stored in the MAGE-ML format. The viewer can be downloaded from the project web site http://polaris.imt.uwm.edu:7777/fisheye/. The fisheye viewer was implemented in Java so that it could run on multiple platforms. We implemented the E-table by adapting JTable, a default table implementation in the Java Swing user interface library. Fisheye views use variable magnification to balance magnification for easy viewing and compression for maximizing the amount of data on the screen. Conclusion This Fisheye Viewer is a lightweight but useful tool for biologists to quickly get an overview of the raw microarray-based gene expression data in an E-table.

  15. A modular positive feedback-based gene amplifier

    Directory of Open Access Journals (Sweden)

    Bhalerao Kaustubh D

    2010-02-01

    Full Text Available Abstract Background Positive feedback is a common mechanism used in the regulation of many gene circuits as it can amplify the response to inducers and also generate binary outputs and hysteresis. In the context of electrical circuit design, positive feedback is often considered in the design of amplifiers. Similar approaches, therefore, may be used for the design of amplifiers in synthetic gene circuits with applications, for example, in cell-based sensors. Results We developed a modular positive feedback circuit that can function as a genetic signal amplifier, heightening the sensitivity to inducer signals as well as increasing maximum expression levels without the need for an external cofactor. The design utilizes a constitutively active, autoinducer-independent variant of the quorum-sensing regulator LuxR. We experimentally tested the ability of the positive feedback module to separately amplify the output of a one-component tetracycline sensor and a two-component aspartate sensor. In each case, the positive feedback module amplified the response to the respective inducers, with regard to both the dynamic range and the sensitivity. Conclusions The advantage of our design is that the actual feedback mechanism depends only on a single gene and does not require any other modulation. Furthermore, this circuit can amplify any transcriptional signal, not just one encoded within the circuit or tuned by an external inducer. As our design is modular, it can potentially be used as a component in the design of more complex synthetic gene circuits.

  16. PGMA-based gene carriers with lipid molecules.

    Science.gov (United States)

    Xu, Chen; Yu, Bingran; Hu, Hao; Nizam, Muhammad Naeem; Yuan, Wei; Ma, Jie; Xu, Fu-Jian

    2016-08-19

    Lipids, as the main constituent of cell membranes, have been widely used for biomedical applications because of their excellent biological properties. The introduction of membrane lipid molecules into gene vectors would provide greater biocompatibility, cellular uptake and transfection efficiency. In this work, one flexible strategy for readily conjugating lipid molecules with polycations was proposed based on atom transfer radical polymerization to produce a series of cholesterol (CHO)- and phosphatidylinositol (PI)-terminated ethanolamine-functionalized poly(glycidyl methacrylate)s, namely CHO-PGEAs and PI-PGEAs, as effective gene carriers. CHO-PGEAs and PI-PGEAs demonstrated much better transfection performance than their linear ethanolamine-functionalized poly(glycidyl methacrylate) (denoted as BUCT-PGEA) counterparts and the traditional standard branched polyethylenimine (PEI, 25 kDa). In addition, the good antitumor effects of CHO-PGEA and PI-PGEA were confirmed with tumor suppressor gene p53 systems in vitro and in vivo. The present work could provide a new strategy to develop effective cationic conjugation of lipid molecules for gene therapy. PMID:27374783

  17. Opportunistic Adaptation Knowledge Discovery

    OpenAIRE

    Badra, Fadi; Cordier, Amélie; Lieber, Jean

    2009-01-01

    Adaptation has long been considered as the Achilles' heel of case-based reasoning since it requires some domain-specific knowledge that is difficult to acquire. In this paper, two strategies are combined in order to reduce the knowledge engineering cost induced by the adaptation knowledge (CA) acquisition task: CA is learned from the case base by the means of knowledge discovery techniques, and the CA a...

  18. Data Mining and Knowledge Discovery in Gaia survey: GUASOM, an analysis tool based on Self Organizing Maps

    Science.gov (United States)

    Manteiga, Minia; Dafonte, Jose Carlos; Ulla, Ana; Alvarez, Marco Antonio; Garabato, Daniel; Fustes, Diego

    2015-08-01

    Gaia, the astrometric cornerstone mission of the European Space Agency (ESA), was successfully launched in December 2013. In June 2014 Gaia started its scientific operations phase, scanning the sky with the different instruments on board. Gaia was designed to measure positions, parallaxes and motions to the microarcsecond level, thus providing the first highly accurate 6-D map of about a thousand million objects of the Milky Way. A vast community of astronomers is looking forward to the promised first unbiased survey of the entire sky down to magnitude 20. We present GUASOM, a data mining tool designed for knowledge discovery in large astronomical spectrophotometric archives, which was developed in the framework of the Gaia DPAC (Data Processing and Analysis Consortium). Our tool is based on a type of unsupervised-learning artificial neural network known as self-organizing maps (SOMs). SOMs are used to organize the information into clusters of objects, as homogeneous as possible according to their spectral energy distributions, and to project them onto a 2D grid where the data structure can be visualized. Each cluster has a representative, called a prototype, which is a virtual pattern that best represents or resembles the set of input patterns belonging to that cluster. Prototypes make it easier to determine the physical nature of the objects populating each cluster. Our algorithm has been tested on SDSS observations and theoretical spectral libraries covering a wide sample of astronomical objects. Self-organizing maps permit the grouping and visualization of large amounts of data for which there is no a priori knowledge. GUASOM provides a useful toolbox for data visualization and crossmatching. To this end, we have used the SIMBAD catalog to perform astrometric crossmatching with a sample of SDSS classification outliers, seeking identifications.
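
    The self-organizing map idea described above (prototypes arranged on a 2D grid, with each input assigned to its closest prototype) can be sketched in a few lines. This is a generic textbook SOM under stated assumptions, not the GUASOM implementation: the function names, the linear decay of the learning rate and neighbourhood width, and the toy two-cluster data are all hypothetical.

```python
import math
import random


def best_matching_unit(protos, x):
    """Grid node whose prototype is closest to x (squared Euclidean distance)."""
    return min(protos, key=lambda node: sum((w - v) ** 2 for w, v in zip(protos[node], x)))


def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM; returns {(row, col): prototype vector}.

    Assumed schedule: learning rate and neighbourhood width decay linearly.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    # Initialise every prototype from a randomly chosen input pattern.
    protos = {(r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)}
    steps = epochs * len(data)
    t = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = t / steps
            lr = lr0 * (1.0 - frac)
            sigma = sigma0 * (1.0 - frac) + 1e-3
            bmu = best_matching_unit(protos, x)
            for node, w in protos.items():
                # Gaussian neighbourhood measured on the grid, not in data space.
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            t += 1
    return protos


if __name__ == "__main__":
    # Two hypothetical, well-separated groups of 2-D "spectra".
    group_a = [(0.01 * i, 0.02 * i) for i in range(20)]
    group_b = [(10.0 + 0.01 * i, 10.0 - 0.02 * i) for i in range(20)]
    som = train_som(group_a + group_b, rows=1, cols=2)
    print(best_matching_unit(som, (0.0, 0.0)), best_matching_unit(som, (10.0, 10.0)))
```

    After training, each prototype acts as the representative of the inputs mapped to its node, which is what makes the clusters easy to inspect on the grid.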

  19. A Resampling Based Clustering Algorithm for Replicated Gene Expression Data.

    Science.gov (United States)

    Li, Han; Li, Chun; Hu, Jie; Fan, Xiaodan

    2015-01-01

    In gene expression data analysis, clustering is a fruitful exploratory technique to reveal the underlying molecular mechanism by identifying groups of co-expressed genes. To reduce the noise, usually multiple experimental replicates are performed. An integrative analysis of the full replicate data, instead of reducing the data to the mean profile, carries the promise of yielding more precise and robust clusters. In this paper, we propose a novel resampling based clustering algorithm for genes with replicated expression measurements. Assuming those replicates are exchangeable, we formulate the problem in the bootstrap framework, and aim to infer the consensus clustering based on the bootstrap samples of replicates. In our approach, we adopt the mixed effect model to accommodate the heterogeneous variances and implement a quasi-MCMC algorithm to conduct statistical inference. Experiments demonstrate that by taking advantage of the full replicate data, our algorithm produces more reliable clusters and has robust performance in diverse scenarios, especially when the data is subject to multiple sources of variance. PMID:26671802

  20. Information dimension analysis of bacterial essential and nonessential genes based on chaos game representation

    International Nuclear Information System (INIS)

    Essential genes are indispensable for the survival of an organism. Investigating features associated with gene essentiality is fundamental to the prediction and identification of the essential genes. Selecting features associated with gene essentiality is fundamental to predicting essential genes with computational techniques. We use fractal theory to make a comparative analysis of essential and nonessential genes in bacteria. The information dimensions of essential genes and nonessential genes available in the DEG database for 27 bacteria are calculated based on their gene chaos game representations (CGRs). It is found that a weak positive linear correlation exists between information dimension and gene length. Moreover, for genes of similar length, the average information dimension of essential genes is larger than that of nonessential genes. This indicates that essential genes show less regularity and higher complexity than nonessential genes. Our results show that for bacteria with a similar number of essential genes and nonessential genes, the CGR information dimension is helpful for the classification of essential genes and nonessential genes. Therefore, the gene CGR information dimension is very probably a useful gene feature for a genetic algorithm predicting essential genes. (paper)
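
    As an illustration of the quantities named above, the following sketch builds a chaos game representation of a DNA sequence and estimates its information dimension by box counting. It is a minimal example under stated assumptions: the corner assignment (A, C, G, T) is one common convention, the grid scales and least-squares fit are a generic estimator, and the function names are hypothetical, so it may differ from the exact procedure used in the paper.

```python
import math
from collections import Counter

# Assumed corner convention for the unit square; other assignments are in use.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Chaos game: start at the centre and move halfway to each base's corner."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def box_entropy(points, k):
    """Shannon entropy (natural log) of point counts on a 2**k by 2**k grid."""
    n = 2 ** k
    counts = Counter((min(int(x * n), n - 1), min(int(y * n), n - 1)) for x, y in points)
    total = len(points)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def information_dimension(points, ks=(1, 2, 3, 4)):
    """Least-squares slope of entropy against ln(1/eps), with eps = 2**-k."""
    xs = [k * math.log(2.0) for k in ks]
    ys = [box_entropy(points, k) for k in ks]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    den = sum((a - mx) ** 2 for a in xs)
    return num / den


if __name__ == "__main__":
    import random

    rng = random.Random(1)
    random_seq = "".join(rng.choice("ACGT") for _ in range(4000))
    print(information_dimension(cgr_points(random_seq)))   # irregular sequence
    print(information_dimension(cgr_points("AT" * 2000)))  # highly regular sequence
```

    A sequence with little regularity fills the square almost uniformly and gives a value near 2, while a strongly repetitive sequence concentrates on a few points and gives a value near 0, which matches the reading of a larger information dimension as higher complexity.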