WorldWideScience

Sample records for gene discovery application

  1. Ribozymes: applications to functional analysis and gene discovery.

    Science.gov (United States)

    Shiota, Maki; Sano, Masayuki; Miyagishi, Makoto; Taira, Kazunari

    2004-08-01

    Ribozymes are catalytic RNA molecules that cleave RNAs with high specificity. Since the discovery of these non-protein enzymes, the rapidly developing field of ribozymes has been of particular interest because of the potential utility of ribozymes as tools for reversed genetics. However, despite extensive efforts, the activity of ribozymes in vivo has not usually been high enough to achieve the desirable biological effects. Now, by the use of RNA polymerase III (pol III) promoters, the ribozyme activity in cells has been successfully improved by developing efficient transport systems for the transcripts to the cytoplasm. In addition, it is possible to cleave a specific target RNA in cells by using an allosterically controllable ribozyme or an RNA-protein hybrid ribozyme. These ribozymes are potentially applicable to molecular gene therapy and efficient gene discovery systems. Furthermore, the developed pol III expression system is applicable to the expression of small interfering RNAs (siRNAs). The advantage of such ribozymes over siRNAs is the high specificity of the ribozyme that would not cause interferon responses.

  2. Strategic Applications of Gene Expression: From Drug Discovery/Development to Bedside

    OpenAIRE

    Bai, Jane P. F.; Alekseyenko, Alexander V.; Statnikov, Alexander; Wang, I-Ming; Wong, Peggy H.

    2013-01-01

    Gene expression is useful for identifying the molecular signature of a disease and for correlating a pharmacodynamic marker with the dose-dependent cellular responses to exposure of a drug. Gene expression offers utility to guide drug discovery by illustrating engagement of the desired cellular pathways/networks, as well as avoidance of acting on the toxicological pathways. Successful employment of gene-expression signatures in the later stages of drug development depends on their linkage to ...

  3. Gene discovery using mutagen-induced polymorphisms and deep sequencing: application to plant disease resistance.

    Science.gov (United States)

    Zhu, Ying; Mang, Hyung-gon; Sun, Qi; Qian, Jun; Hipps, Ashley; Hua, Jian

    2012-09-01

    Next-generation sequencing technologies are accelerating gene discovery by combining multiple steps of mapping and cloning used in the traditional map-based approach into one step using DNA sequence polymorphisms existing between two different accessions/strains/backgrounds of the same species. The existing next-generation sequencing method, like the traditional one, requires the use of a segregating population from a cross of a mutant organism in one accession with a wild-type (WT) organism in a different accession. It therefore could potentially be limited by modification of mutant phenotypes in different accessions and/or by the lengthy process required to construct a particular mapping parent in a second accession. Here we present mapping and cloning of an enhancer mutation with next-generation sequencing on bulked segregants in the same accession using sequence polymorphisms induced by a chemical mutagen. This method complements the conventional cloning approach and makes forward genetics more feasible and powerful in molecularly dissecting biological processes in any organisms. The pipeline developed in this study can be used to clone causal genes in background of single mutants or higher order of mutants and in species with or without sequence information on multiple accessions.

  4. SSHscreen and SSHdb, generic software for microarray based gene discovery: application to the stress response in cowpea

    Directory of Open Access Journals (Sweden)

    Oelofse Dean

    2010-04-01

    redundant clones together and illustrated that the SSHscreen plots are a useful tool for choosing anonymous clones for sequencing, since redundant clones cluster together on the enrichment ratio plots. Conclusions: We developed the SSHscreen-SSHdb software pipeline, which greatly facilitates gene discovery using suppression subtractive hybridization by improving the selection of clones for sequencing after screening the library on a small number of microarrays. Annotation of the sequence information and collaboration was further enhanced through a web-based SSHdb database, and we illustrated this through identification of drought responsive genes from cowpea, which can now be investigated in gene function studies. SSH is a popular and powerful gene discovery tool, and therefore this pipeline will have application for gene discovery in any biological system, particularly non-model organisms. SSHscreen 2.0.1 and a link to SSHdb are available from http://microarray.up.ac.za/SSHscreen.

  5. Antibiotic resistance gene discovery in food-producing animals.

    Science.gov (United States)

    Allen, Heather K

    2014-06-01

    Numerous environmental reservoirs contribute to the widespread antibiotic resistance problem in human pathogens. One environmental reservoir of particular importance is the intestinal bacteria of food-producing animals. In this review I examine recent discoveries of antibiotic resistance genes in agricultural animals. Two types of antibiotic resistance gene discoveries will be discussed: the use of classic microbiological and molecular techniques, such as culturing and PCR, to identify known genes not previously reported in animals; and the application of high-throughput technologies, such as metagenomics, to identify novel genes and gene transfer mechanisms. These discoveries confirm that antibiotics should be limited to prudent uses.

  6. Discovery of possible gene relationships through the application of self-organizing maps to DNA microarray databases.

    Directory of Open Access Journals (Sweden)

    Rocio Chavez-Alvarez

    DNA microarrays and cell cycle synchronization experiments have made possible the study of the mechanisms of cell cycle regulation of Saccharomyces cerevisiae by simultaneously monitoring the expression levels of thousands of genes at specific time points. On the other hand, pattern recognition techniques can contribute to the analysis of such massive measurements, providing a model of gene expression level evolution through the cell cycle process. In this paper, we propose the use of one such technique, an unsupervised artificial neural network called a Self-Organizing Map (SOM), which has been successfully applied to processes involving very noisy signals, classifying and organizing them, and assisting in the discovery of behavior patterns without requiring prior knowledge about the process under analysis. As a test bed for the use of SOMs in finding possible relationships among genes and their possible contribution in some biological processes, we selected 282 S. cerevisiae genes that have been shown through biological experiments to have an activity during the cell cycle. The expression level of these genes was analyzed in five of the most cited time series DNA microarray databases used in the study of the cell cycle of this organism. With the use of SOM, it was possible to find clusters of genes with similar behavior in the five databases along two cell cycles. This result suggested that some of these genes might be biologically related or might have a regulatory relationship, as was corroborated by comparing some of the clusters obtained with SOMs against a previously reported regulatory network that was generated using biological knowledge, such as protein-protein interactions, gene expression levels, metabolism dynamics, promoter binding, and modification, regulation and transport of proteins. The methodology described in this paper could be applied to the study of gene relationships of other biological processes in different organisms.
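
    The clustering idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-dimensional map, a Gaussian neighbourhood, linearly decaying learning rate and radius, and the hypothetical function names `train_som` and `best_matching_unit`. Each input is an expression profile (one value per time point), and profiles that land on the same node are treated as one cluster.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda j: sum((wi - xi) ** 2 for wi, xi in zip(weights[j], x)),
    )


def train_som(profiles, n_nodes=4, epochs=50, lr0=0.5, radius0=None, seed=0):
    """Train a 1-D self-organizing map on a list of equal-length profiles."""
    rng = random.Random(seed)
    dim = len(profiles[0])
    radius0 = radius0 or n_nodes / 2
    # Initialise each node from a randomly chosen input profile (an assumption).
    weights = [list(rng.choice(profiles)) for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                    # learning rate decays linearly
        radius = max(radius0 * (1 - frac), 0.5)  # neighbourhood shrinks over time
        for x in rng.sample(profiles, len(profiles)):  # shuffled pass over the data
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Nodes near the winner on the map are pulled toward x as well.
                h = math.exp(-((j - bmu) ** 2) / (2 * radius ** 2))
                for d in range(dim):
                    w[d] += lr * h * (x[d] - w[d])
    return weights


# Toy expression profiles (three time points each); the values are made up.
profiles = [[0.0, 0.1, 0.9], [0.1, 0.0, 1.0], [1.0, 0.9, 0.0], [0.9, 1.0, 0.1]]
weights = train_som(profiles, n_nodes=3, epochs=20)
clusters = [best_matching_unit(weights, p) for p in profiles]
```

    Here `clusters` holds, for each profile, the index of the map node it was assigned to. A two-dimensional grid, as is common for SOMs, would only change how the distance between nodes `j` and `bmu` is computed.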

  7. Independent Gene Discovery and Testing

    Science.gov (United States)

    Palsule, Vrushalee; Coric, Dijana; Delancy, Russell; Dunham, Heather; Melancon, Caleb; Thompson, Dennis; Toms, Jamie; White, Ashley; Shultz, Jeffry

    2010-01-01

    A clear understanding of basic gene structure is critical when teaching molecular genetics, the central dogma and the biological sciences. We sought to create a gene-based teaching project to improve students' understanding of gene structure and to integrate this into a research project that can be implemented by instructors at the secondary level…

  8. Biomedical Application of Knowledge Discovery

    Science.gov (United States)

    Koike, A.

    With rapid progress in biomedical fields, the knowledge accumulated in scientific papers has increased significantly. Most of these papers draw only fragmentary conclusions from scientific facts, so the discovery of hidden knowledge, or hypothesis generation that leverages this fragmentary information, has come into the limelight, and expectations have grown for systems that assist with it. To respond to these expectations, we have developed a system called BioTermNet (http://btn.ontology.ims.u-tokyo.ac.jp:8081/) that builds a conceptual network by connecting conceptual relationships (fragmentary information) explicitly described in papers and explores the hidden relationships in that network. The conceptual relationships are extracted by hybrid methods combining information-extraction and information-retrieval techniques. This system has potential for wide application. After validating system performance, we take up some topics in conceptual network-based analysis and refer to other applications in the future prospects section.

  9. Human brain evolution: from gene discovery to phenotype discovery.

    Science.gov (United States)

    Preuss, Todd M

    2012-06-26

    The rise of comparative genomics and related technologies has added important new dimensions to the study of human evolution. Our knowledge of the genes that underwent expression changes or were targets of positive selection in human evolution is rapidly increasing, as is our knowledge of gene duplications, translocations, and deletions. It is now clear that the genetic differences between humans and chimpanzees are far more extensive than previously thought; their genomes are not 98% or 99% identical. Despite the rapid growth in our understanding of the evolution of the human genome, our understanding of the relationship between genetic changes and phenotypic changes is tenuous. This is true even for the most intensively studied gene, FOXP2, which underwent positive selection in the human terminal lineage and is thought to have played an important role in the evolution of human speech and language. In part, the difficulty of connecting genes to phenotypes reflects our generally poor knowledge of human phenotypic specializations, as well as the difficulty of interpreting the consequences of genetic changes in species that are not amenable to invasive research. On the positive side, investigations of FOXP2, along with genomewide surveys of gene-expression changes and selection-driven sequence changes, offer the opportunity for "phenotype discovery," providing clues to human phenotypic specializations that were previously unsuspected. What is more, at least some of the specializations that have been proposed are amenable to testing with noninvasive experimental techniques appropriate for the study of humans and apes.

  10. Beegle: from literature mining to disease-gene discovery.

    Science.gov (United States)

    ElShal, Sarah; Tranchevent, Léon-Charles; Sifrim, Alejandro; Ardeshirdavani, Amin; Davis, Jesse; Moreau, Yves

    2016-01-29

    Disease-gene identification is a challenging process that has multiple applications within functional genomics and personalized medicine. Typically, this process involves both finding genes known to be associated with the disease (through literature search) and carrying out preliminary experiments or screens (e.g. linkage or association studies, copy number analyses, expression profiling) to determine a set of promising candidates for experimental validation. This requires extensive time and monetary resources. We describe Beegle, an online search and discovery engine that attempts to simplify this process by automating the typical approaches. It starts by mining the literature to quickly extract a set of genes known to be linked with a given query, then it integrates the learning methodology of Endeavour (a gene prioritization tool) to train a genomic model and rank a set of candidate genes to generate novel hypotheses. In a realistic evaluation setup, Beegle has an average recall of 84% in the top 100 returned genes as a search engine, which improves the discovery engine by 12.6% in the top 5% prioritized genes. Beegle is publicly available at http://beegle.esat.kuleuven.be/.

  11. Maximizing biomarker discovery by minimizing gene signatures

    Directory of Open Access Journals (Sweden)

    Chang Chang

    2011-12-01

    Background: The use of gene signatures can potentially be of considerable value in the field of clinical diagnosis. However, gene signatures defined with different methods can vary considerably even when applied to the same disease and the same endpoint. Previous studies have shown that the correct selection of subsets of genes from microarray data is key for the accurate classification of disease phenotypes, and a number of methods have been proposed for the purpose. However, these methods refine the subsets by only considering each single feature, and they do not confirm the association between the genes identified in each gene signature and the phenotype of the disease. We proposed an innovative new method termed Minimize Feature's Size (MFS), based on multiple level similarity analyses and association between the genes and disease for breast cancer endpoints by comparing classifier models generated from the second phase of MicroArray Quality Control (MAQC-II), trying to develop effective meta-analysis strategies to transform the MAQC-II signatures into a robust and reliable set of biomarkers for clinical applications. Results: We analyzed the similarity of the multiple gene signatures in an endpoint and between the two endpoints of breast cancer at probe and gene levels. The results indicate that disease-related genes can be preferably selected as the components of a gene signature, and that the gene signatures for the two endpoints could be interchangeable. The minimized signatures were built at probe level by using MFS for each endpoint. By applying the approach, we generated a much smaller gene signature with predictive power similar to that of the gene signatures from MAQC-II. Conclusions: Our results indicate that gene signatures of both large and small sizes could perform equally well in clinical applications. Besides, consistency and biological significances can be detected among different gene signatures, reflecting the

  12. Association Rule Discovery and Its Applications

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    Data mining, i.e., mining knowledge from large amounts of data, is a demanding field since huge amounts of data have been collected in various applications. The collected data far exceed people's ability to analyze them. Thus, new and efficient methods are needed to discover knowledge from large databases. Association rule discovery is an important problem in knowledge discovery and data mining. The association mining task consists of identifying the frequent item sets and then forming conditional implication rules among them. In this paper, we describe and summarize recent work on association rule discovery, offer a new method for association rule mining, and point out that association rule discovery can be applied in spatial data mining. It is useful for discovering knowledge from remote sensing and geographical information systems.
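
    The two-step task named in this abstract (find the frequent item sets, then form implication rules among them) can be sketched as follows. This is a brute-force illustration under stated assumptions, not the new method the paper proposes: the function names, the toy transactions, and the support and confidence thresholds are all hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            support = sum(1 for t in transactions if candidate <= t) / n
            if support >= min_support:
                result[candidate] = support
                found = True
        if not found:
            # Support never grows when items are added, so no larger set can qualify.
            break
    return result


def association_rules(itemsets, min_confidence):
    """Form rules (antecedent, consequent, confidence) from the frequent itemsets."""
    rules = []
    for itemset, support in itemsets.items():
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                a = frozenset(antecedent)
                # confidence(A -> B) = support(A and B) / support(A)
                confidence = support / itemsets[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, confidence))
    return rules


# Toy transactions; the item names are placeholders.
transactions = [{"a", "b"}, {"a", "b", "c"}, {"a"}, {"b", "c"}]
itemsets = frequent_itemsets(transactions, min_support=0.5)
rules = association_rules(itemsets, min_confidence=0.9)
```

    Enumerating every combination is exponential in the number of items, so this only suits tiny examples. Practical miners prune candidates level by level, which is what the early `break` hints at.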

  13. The Analysis of Multiple Genome Comparisons in Genus Escherichia and Its Application to the Discovery of Uncharacterised Metabolic Genes in Uropathogenic Escherichia coli CFT073

    Directory of Open Access Journals (Sweden)

    William A. Bryant

    2009-01-01

    A survey of a complete gene synteny comparison has been carried out between twenty fully sequenced strains from the genus Escherichia with the aim of finding yet uncharacterised genes implicated in the metabolism of uropathogenic strains of E. coli (UPEC). Several sets of adjacent colinear genes have been identified which are present in all four UPEC included in this study (CFT073, F11, UTI89, and 536), annotated with putative metabolic functions, but are not found in any other strains considered. An operon closely homologous to that encoding the L-sorbose degradation pathway in Klebsiella pneumoniae has been identified in E. coli CFT073; this operon is present in all of the UPEC considered, but only in 7 of the other 16 strains. The operon's function has been confirmed by cloning the genes into E. coli DH5α and testing for growth on L-sorbose. The functional genomic approach combining in silico and in vitro work presented here can be used as a basis for the discovery of other uncharacterised genes contributing to bacterial survival in specific environments.

  14. Gene discovery of modular diterpene metabolism in nonmodel systems.

    Science.gov (United States)

    Zerbe, Philipp; Hamberger, Björn; Yuen, Macaire M S; Chiang, Angela; Sandhu, Harpreet K; Madilao, Lina L; Nguyen, Anh; Hamberger, Britta; Bach, Søren Spanner; Bohlmann, Jörg

    2013-06-01

    Plants produce over 10,000 different diterpenes of specialized (secondary) metabolism, and fewer diterpenes of general (primary) metabolism. Specialized diterpenes may have functions in ecological interactions of plants with other organisms and also benefit humanity as pharmaceuticals, fragrances, resins, and other industrial bioproducts. Examples of high-value diterpenes are taxol and forskolin pharmaceuticals or ambroxide fragrances. Yields and purity of diterpenes obtained from natural sources or by chemical synthesis are often insufficient for large-volume or high-end applications. Improvement of agricultural or biotechnological diterpene production requires knowledge of biosynthetic genes and enzymes. However, specialized diterpene pathways are extremely diverse across the plant kingdom, and most specialized diterpenes are taxonomically restricted to a few plant species, genera, or families. Consequently, there is no single reference system to guide gene discovery and rapid annotation of specialized diterpene pathways. Functional diversification of genes and plasticity of enzyme functions of these pathways further complicate correct annotation. To address this challenge, we used a set of 10 different plant species to develop a general strategy for diterpene gene discovery in nonmodel systems. The approach combines metabolite-guided transcriptome resources, custom diterpene synthase (diTPS) and cytochrome P450 reference gene databases, phylogenies, and, as shown for select diTPSs, single and coupled enzyme assays using microbial and plant expression systems. In the 10 species, we identified 46 new diTPS candidates and over 400 putatively terpenoid-related P450s in a resource of nearly 1 million predicted transcripts of diterpene-accumulating tissues. Phylogenetic patterns of lineage-specific blooms of genes guided functional characterization.

  15. Non-syndromic retinal ciliopathies: translating gene discovery into therapy

    NARCIS (Netherlands)

    Estrada-Cuzcano, A.; Roepman, R.; Cremers, F.P.; Hollander, A.I. den; Mans, D.A.

    2012-01-01

    Homozygosity mapping and exome sequencing have accelerated the discovery of gene mutations and modifier alleles implicated in inherited retinal degeneration in humans. To date, 158 genes have been found to be mutated in individuals with retinal dystrophies. Approximately one-third of the gene defect

  16. Fast-track applications: The potential for direct delivery of proteins and nucleic acids to plant cells for the discovery of gene function

    Directory of Open Access Journals (Sweden)

    Roberts Michael R

    2005-12-01

    In animal systems, several methods exist for the direct delivery of nucleic acids and proteins into cells for functional analysis. Until recently, these methods have not been applied to plant systems. Now, however, several preliminary reports suggest that both nucleic acids and proteins can also be delivered into plant cells by very simple, direct application. This promises to open the way for high-throughput screening for gene function in a range of plant species.

  17. The Matchmaker Exchange: a platform for rare disease gene discovery.

    Science.gov (United States)

    Philippakis, Anthony A; Azzariti, Danielle R; Beltran, Sergi; Brookes, Anthony J; Brownstein, Catherine A; Brudno, Michael; Brunner, Han G; Buske, Orion J; Carey, Knox; Doll, Cassie; Dumitriu, Sergiu; Dyke, Stephanie O M; den Dunnen, Johan T; Firth, Helen V; Gibbs, Richard A; Girdea, Marta; Gonzalez, Michael; Haendel, Melissa A; Hamosh, Ada; Holm, Ingrid A; Huang, Lijia; Hurles, Matthew E; Hutton, Ben; Krier, Joel B; Misyura, Andriy; Mungall, Christopher J; Paschall, Justin; Paten, Benedict; Robinson, Peter N; Schiettecatte, François; Sobreira, Nara L; Swaminathan, Ganesh J; Taschner, Peter E; Terry, Sharon F; Washington, Nicole L; Züchner, Stephan; Boycott, Kym M; Rehm, Heidi L

    2015-10-01

    There are few better examples of the need for data sharing than in the rare disease community, where patients, physicians, and researchers must search for "the needle in a haystack" to uncover rare, novel causes of disease within the genome. Impeding the pace of discovery has been the existence of many small siloed datasets within individual research or clinical laboratory databases and/or disease-specific organizations, hoping for serendipitous occasions when two distant investigators happen to learn they have a rare phenotype in common and can "match" these cases to build evidence for causality. However, serendipity has never proven to be a reliable or scalable approach in science. As such, the Matchmaker Exchange (MME) was launched to provide a robust and systematic approach to rare disease gene discovery through the creation of a federated network connecting databases of genotypes and rare phenotypes using a common application programming interface (API). The core building blocks of the MME have been defined and assembled. Three MME services have now been connected through the API and are available for community use. Additional databases that support internal matching are anticipated to join the MME network as it continues to grow.

  18. Bioinformatics Assisted Gene Discovery and Annotation of Human Genome

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    As the sequencing stage of human genome project is near the end, the work has begun for discovering novel genes from genome sequences and annotating their biological functions. Here are reviewed current major bioinformatics tools and technologies available for large scale gene discovery and annotation from human genome sequences. Some ideas about possible future development are also provided.

  19. Crowdsourcing the nodulation gene network discovery environment.

    Science.gov (United States)

    Li, Yupeng; Jackson, Scott A

    2016-05-26

    The Legumes (Fabaceae) are an economically and ecologically important group of plant species with the conspicuous capacity for symbiotic nitrogen fixation in root nodules, specialized plant organs containing symbiotic microbes. With the aim of understanding the underlying molecular mechanisms leading to nodulation, many efforts are underway to identify nodulation-related genes and determine how these genes interact with each other. In order to accurately and efficiently reconstruct the nodulation gene network, a crowdsourcing platform, CrowdNodNet, was created. The platform implements the jQuery and vis.js JavaScript libraries, so that users are able to interactively visualize and edit the gene network, and easily access the information about the network, e.g. gene lists, gene interactions and gene functional annotations. In addition, all the gene information is written on MediaWiki pages, enabling users to edit and contribute to the network curation. Utilizing the continuously updated, collaboratively written, and community-reviewed Wikipedia model, the platform could, in a short time, become a comprehensive knowledge base of nodulation-related pathways. The platform could also be used for other biological processes, and thus has great potential for integrating and advancing our understanding of the functional genomics and systems biology of any process for any species. The platform is available at http://crowd.bioops.info/, and the source code can be openly accessed at https://github.com/bioops/crowdnodnet under the MIT License.

  20. SNP marker discovery in koala TLR genes.

    Directory of Open Access Journals (Sweden)

    Jian Cui

    Toll-like receptors (TLRs) play a crucial role in the early defence against invading pathogens, yet our understanding of TLRs in marsupial immunity is limited. Here, we describe the characterisation of nine TLRs from a koala immune tissue transcriptome and one TLR from a draft sequence of the koala genome and the subsequent development of an assay to study genetic diversity in these genes. We surveyed genetic diversity in 20 koalas from New South Wales, Australia and showed that one gene, TLR10, is monomorphic, while the other nine TLR genes have between two and 12 alleles. Forty SNPs (16 non-synonymous) were identified across the ten TLR genes. These markers provide a springboard to future studies on innate immunity in the koala, a species under threat from two major infectious diseases.

  1. Method and Application of Comprehensive Knowledge Discovery

    Institute of Scientific and Technical Information of China (English)

    SHA Zongyao; BIAN Fuling

    2003-01-01

    This paper proposes the principle of comprehensive knowledge discovery. Unlike most current knowledge discovery methods, comprehensive knowledge discovery considers both the spatial relations and the attributes of spatial entities or objects. We introduce the theory of the spatial knowledge expression system and some concepts, including comprehensive knowledge discovery and the spatial union information table (SUIT). In theory, SUIT records all information contained in the studied objects, but in reality, because of the complexity and variety of spatial relations, only those factors of interest to us are selected. In order to find comprehensive knowledge in spatial databases, an efficient comprehensive knowledge discovery algorithm called recycled algorithm (RAR) is suggested.

  2. Accelerators for Discovery Science and Security applications

    Energy Technology Data Exchange (ETDEWEB)

    Todd, A.M.M., E-mail: alan_todd@mail.aesys.net; Bluem, H.P.; Jarvis, J.D.; Park, J.H.; Rathke, J.W.; Schultheiss, T.J.

    2015-05-01

    Several Advanced Energy Systems (AES) accelerator projects that span applications in Discovery Science and Security are described. The design and performance of the IR and THz free electron laser (FEL) at the Fritz-Haber-Institut der Max-Planck-Gesellschaft in Berlin that is now an operating user facility for physical chemistry research in molecular and cluster spectroscopy as well as surface science, is highlighted. The device was designed to meet challenging specifications, including a final energy adjustable in the range of 15–50 MeV, low longitudinal emittance (<50 keV-psec) and transverse emittance (<20 π mm-mrad), at more than 200 pC bunch charge with a micropulse repetition rate of 1 GHz and a macropulse length of up to 15 μs. Secondly, we will describe an ongoing effort to develop an ultrafast electron diffraction (UED) source that is scheduled for completion in 2015 with prototype testing taking place at the Brookhaven National Laboratory (BNL) Accelerator Test Facility (ATF). This tabletop X-band system will find application in time-resolved chemical imaging and as a resource for drug–cell interaction analysis. A third active area at AES is accelerators for security applications where we will cover some top-level aspects of THz and X-ray systems that are under development and in testing for stand-off and portal detection.

  3. Rice mutant resources for gene discovery

    NARCIS (Netherlands)

    Hirochika, H.; Guiderdoni, E.; An, G.; Hsing, Y.I.; Eun, M.Y.; Han, C.D.; Upadhyaya, N.; Ramachandran, S.; Zhang, Q.F.; Pereira, A.B.; Sundaresan, V.; Leung, H.

    2004-01-01

    With the completion of genomic sequencing of rice, rice has been firmly established as a model organism for both basic and applied research. The next challenge is to uncover the functions of genes predicted by sequence analysis. Considering the amount of effort and the diversity of disciplines requi

  4. Psychiatric gene discoveries shape evidence on ADHD's biology

    NARCIS (Netherlands)

    Thapar, A.; Martin, J.; Mick, E.; Arias Vasquez, A.; Langley, K.; Scherer, S.W.; Schachar, R.; Crosbie, J.; Williams, N.; Franke, B.; Elia, J.; Glessner, J.; Hakonarson, H.; Owen, M.J.; Faraone, S.V; O'Donovan, M.C.; Holmans, P.

    2016-01-01

    A strong motivation for undertaking psychiatric gene discovery studies is to provide novel insights into unknown biology. Although attention-deficit hyperactivity disorder (ADHD) is highly heritable, and large, rare copy number variants (CNVs) contribute to risk, little is known about its pathogenes

  5. Application of machine learning in SNP discovery

    Directory of Open Access Journals (Sweden)

    Cregan Perry B

    2006-01-01

    Background: Single nucleotide polymorphisms (SNPs) constitute more than 90% of the genetic variation, and hence can account for most trait differences among individuals in a given species. Polymorphism detection software PolyBayes and PolyPhred give high false positive SNP predictions even with stringent parameter values. We developed a machine learning (ML) method to augment PolyBayes to improve its prediction accuracy. ML methods have also been successfully applied to other bioinformatics problems in predicting genes, promoters, transcription factor binding sites and protein structures. Results: The ML program C4.5 was applied to a set of features in order to build a SNP classifier from training data based on human expert decisions (True/False). The training data were 27,275 candidate SNPs generated by sequencing 1973 sequence-tagged sites (STS; 12 Mb) in both directions from 6 diverse homozygous soybean cultivars and PolyBayes analysis. Test data of 18,390 candidate SNPs were generated similarly from 1359 additional STS (8 Mb). SNPs from both sets were classified by experts. After training the ML classifier, it agreed with the experts on 97.3% of test data compared with 7.8% agreement between PolyBayes and experts. The PolyBayes positive predictive values (PPV; i.e., the fraction of candidate SNPs that are real) were 7.8% for all predictions and 16.7% for those with 100% posterior probability of being real. Using ML improved the PPV to 84.8%, a 5- to 10-fold increase. While both ML and PolyBayes produced a similar number of true positives, the ML program generated only 249 false positives as compared to 16,955 for PolyBayes. The complexity of the soybean genome may have contributed to high false SNP predictions by PolyBayes and hence results may differ for other genomes. Conclusion: A machine learning (ML) method was developed as a supplementary feature to the polymorphism detection software for improving prediction accuracies. The results from this study
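
    The positive predictive values quoted in this abstract can be cross-checked with a short calculation. The sketch below uses only figures stated in the abstract (18,390 test candidates, 16,955 PolyBayes false positives, 249 ML false positives, 84.8% ML PPV). It assumes, as the abstract's wording suggests but does not state outright, that every test candidate counts as a PolyBayes positive call. The function names are hypothetical.

```python
def ppv(true_positives, false_positives):
    """Positive predictive value: the fraction of positive calls that are real."""
    return true_positives / (true_positives + false_positives)


def true_positives_from_ppv(ppv_value, false_positives):
    """Solve PPV = TP / (TP + FP) for TP, given the PPV and the FP count."""
    return ppv_value * false_positives / (1 - ppv_value)


# PolyBayes: assume all 18,390 test candidates are positive calls (an assumption).
polybayes_fp = 16_955
polybayes_tp = 18_390 - polybayes_fp
polybayes_ppv = ppv(polybayes_tp, polybayes_fp)

# ML classifier: the abstract gives the PPV and the FP count but not the TP count.
ml_tp_estimate = true_positives_from_ppv(0.848, 249)
fold_increase = 0.848 / polybayes_ppv
```

    Under that assumption the PolyBayes PPV comes out close to the reported 7.8%, and the estimated ML true-positive count is of the same order as the PolyBayes count, consistent with the abstract's statement that both produced a similar number of true positives.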

  6. Gene discovery in the Entamoeba invadens genome.

    Science.gov (United States)

    Wang, Zheng; Samuelson, John; Clark, C Graham; Eichinger, Daniel; Paul, Jaishree; Van Dellen, Katrina; Hall, Neil; Anderson, Iain; Loftus, Brendan

    2003-06-01

    Entamoeba invadens, a parasite of reptiles, is a model for the study of encystation by the human enteric pathogen Entamoeba histolytica, because E. invadens forms cysts in axenic culture. With approximately 0.5-fold sequence coverage of the genome, we were able to get insights into E. invadens gene and genome features. Overall, the E. invadens genome displays many of the features that are emerging from ongoing genome sequencing efforts in E. histolytica. At the nucleotide level the E. invadens genome has on average 60% sequence identity with that of E. histolytica. The presence of introns in E. invadens was predicted with similar consensus (GTTTGT…A/TAG) sequences to those identified in E. histolytica and Entamoeba dispar. Sequences highly repeated in the genome of E. histolytica (rRNAs, tRNAs, CXXC-rich proteins, and Leu-rich repeat proteins) were found to be highly repeated in the E. invadens genome. Numerous proteins homologous to those implicated in amoebic virulence (Gal/GalNAc lectins, amoebapores, and cysteine proteinases) and drug resistance (p-glycoproteins) were identified. Homologs of proteins involved in cell cycle, vesicular trafficking and signal transduction were identified, which may be involved in en/excystation and cell growth of E. invadens. Finally, multiple copies of a number of E. invadens genes coding for predicted enzymes involved in core metabolism and the targets of anti-amoebic drugs were identified.

  7. Discovery of pinoresinol reductase genes in sphingomonads.

    Science.gov (United States)

    Fukuhara, Y; Kamimura, N; Nakajima, M; Hishiyama, S; Hara, H; Kasai, D; Tsuji, Y; Narita-Yamada, S; Nakamura, S; Katano, Y; Fujita, N; Katayama, Y; Fukuda, M; Kajita, S; Masai, E

    2013-01-10

    Bacterial genes for the degradation of major dilignols produced in lignifying xylem are expected to be useful tools for the structural modification of lignin in plants. For this purpose, we isolated pinZ involved in the conversion of pinoresinol from Sphingobium sp. strain SYK-6. pinZ showed 43-77% identity at amino acid level with bacterial NmrA-like proteins of unknown function, a subgroup of atypical short chain dehydrogenases/reductases, but revealed only 15-21% identity with plant pinoresinol/lariciresinol reductases. PinZ completely converted racemic pinoresinol to lariciresinol, showing a specific activity of 46±3 U/mg in the presence of NADPH at 30°C. In contrast, the activity for lariciresinol was negligible. This substrate preference is similar to a pinoresinol reductase, AtPrR1, of Arabidopsis thaliana; however, the specific activity of PinZ toward (±)-pinoresinol was significantly higher than that of AtPrR1. The role of pinZ and a pinZ ortholog of Novosphingobium aromaticivorans DSM 12444 were also characterized.

  8. Spark, an application based on Serendipitous Knowledge Discovery.

    Science.gov (United States)

    Workman, T Elizabeth; Fiszman, Marcelo; Cairelli, Michael J; Nahl, Diane; Rindflesch, Thomas C

    2016-04-01

    Findings from information-seeking behavior research can inform application development. In this report we provide a system description of Spark, an application based on findings from Serendipitous Knowledge Discovery studies and data structures known as semantic predications. Background information and the previously published IF-SKD model (outlining Serendipitous Knowledge Discovery in online environments) illustrate the potential use of information-seeking behavior in application design. A detailed overview of the Spark system illustrates how methodologies in design and retrieval functionality enable production of semantic predication graphs tailored to evoke Serendipitous Knowledge Discovery in users.

  9. Too New for Textbooks: The Biotechnology Discoveries & Applications Guidebook

    Science.gov (United States)

    Loftin, Madelene; Lamb, Neil E.

    2013-01-01

    The "Biotechnology Discoveries and Applications" guidebook aims to provide teachers with an overview of the recent advances in genetics and biotechnology, allowing them to share these findings with their students. The annual guidebook introduces a wealth of modern genomic discoveries and provides teachers with tools to integrate exciting…

  10. Integrated analysis of gene expression by association rules discovery

    Directory of Open Access Journals (Sweden)

    Carazo Jose M

    2006-02-01

    Background: Microarray technology is generating huge amounts of data about the expression level of thousands of genes, or even whole genomes, across different experimental conditions. To extract biological knowledge, and to fully understand such datasets, it is essential to include external biological information about genes and gene products in the analysis of expression data. However, most of the current approaches to analyzing microarray datasets are mainly focused on the analysis of experimental data, and external biological information is incorporated as a posterior process. Results: In this study we present a method for the integrative analysis of microarray data based on the Association Rules Discovery data mining technique. The approach integrates gene annotations and expression data to discover intrinsic associations among both data sources based on co-occurrence patterns. We applied the proposed methodology to the analysis of gene expression datasets in which genes were annotated with metabolic pathways, transcriptional regulators and Gene Ontology categories. Automatically extracted associations revealed significant relationships among these gene attributes and expression patterns, many of which are clearly supported by recently reported work. Conclusion: The integration of external biological information and gene expression data can provide insights about the biological processes associated with gene expression programs. In this paper we show that the proposed methodology is able to integrate multiple gene annotations and expression data in the same analytic framework and extract meaningful associations among heterogeneous sources of data. An implementation of the method is included in the Engene software package.
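The co-occurrence idea behind association rules can be illustrated with the two standard rule measures, support and confidence. This is a minimal sketch and not the Engene implementation: the gene names, annotation labels and expression labels below are hypothetical toy data, and each gene is treated as one "transaction" holding its annotations together with its expression pattern.

```python
# Hypothetical toy data: one item set per gene, mixing annotations
# ("pathway:", "GO:") with discretized expression patterns ("up:", "down:").
transactions = {
    "geneA": {"pathway:glycolysis", "up:cond1"},
    "geneB": {"pathway:glycolysis", "up:cond1"},
    "geneC": {"pathway:glycolysis", "down:cond1"},
    "geneD": {"GO:transport", "down:cond1"},
}


def support(itemset: set, data: dict) -> float:
    """Fraction of genes whose item set contains every item in `itemset`."""
    hits = sum(1 for items in data.values() if itemset <= items)
    return hits / len(data)


def confidence(antecedent: set, consequent: set, data: dict) -> float:
    """Estimated P(consequent | antecedent) over the genes in `data`."""
    return support(antecedent | consequent, data) / support(antecedent, data)


rule_support = support({"pathway:glycolysis", "up:cond1"}, transactions)
rule_confidence = confidence({"pathway:glycolysis"}, {"up:cond1"}, transactions)

print(f"support    = {rule_support:.2f}")
print(f"confidence = {rule_confidence:.2f}")
```

A full rule miner would enumerate frequent item sets (for example with the Apriori algorithm) and filter rules by minimum support and confidence; the sketch only scores one hand-picked rule.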

  11. Literature mining for the discovery of hidden connections between drugs, genes and diseases.

    Science.gov (United States)

    Frijters, Raoul; van Vugt, Marianne; Smeets, Ruben; van Schaik, René; de Vlieg, Jacob; Alkema, Wynand

    2010-09-23

    The scientific literature represents a rich source for retrieval of knowledge on associations between biomedical concepts such as genes, diseases and cellular processes. A commonly used method to establish relationships between biomedical concepts from literature is co-occurrence. Apart from its use in knowledge retrieval, the co-occurrence method is also well-suited to discover new, hidden relationships between biomedical concepts following a simple ABC-principle, in which A and C have no direct relationship, but are connected via shared B-intermediates. In this paper we describe CoPub Discovery, a tool that mines the literature for new relationships between biomedical concepts. Statistical analysis using ROC curves showed that CoPub Discovery performed well over a wide range of settings and keyword thesauri. We subsequently used CoPub Discovery to search for new relationships between genes, drugs, pathways and diseases. Several of the newly found relationships were validated using independent literature sources. In addition, new predicted relationships between compounds and cell proliferation were validated and confirmed experimentally in an in vitro cell proliferation assay. The results show that CoPub Discovery is able to identify novel associations between genes, drugs, pathways and diseases that have a high probability of being biologically valid. This makes CoPub Discovery a useful tool to unravel the mechanisms behind disease, to find novel drug targets, or to find novel applications for existing drugs.
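The ABC-principle described in this abstract can be sketched in a few lines: start from a concept A, walk to every B that co-occurs with A, then collect every C that co-occurs with B but not directly with A. The example is an illustration only and not CoPub Discovery's implementation; the concept names and co-occurrence pairs are hypothetical, and no statistical scoring of the links is attempted.

```python
from collections import defaultdict

# Hypothetical literature co-occurrence pairs (undirected).
pairs = [
    ("drugX", "geneB1"),
    ("drugX", "geneB2"),
    ("geneB1", "diseaseC"),
    ("geneB2", "diseaseC"),
    ("drugX", "diseaseD"),   # direct A-C link, so diseaseD is not "hidden"
    ("geneB1", "diseaseD"),
]


def hidden_links(cooccurrences, a):
    """Map each C with no direct link to `a` to the shared B-intermediates."""
    neighbors = defaultdict(set)
    for x, y in cooccurrences:
        neighbors[x].add(y)
        neighbors[y].add(x)

    found = {}
    for b in neighbors[a]:
        for c in neighbors[b]:
            if c != a and c not in neighbors[a]:
                found.setdefault(c, set()).add(b)
    return found


candidates = hidden_links(pairs, "drugX")
for concept, intermediates in candidates.items():
    print(concept, "via", sorted(intermediates))
```

Ranking candidates by the number or strength of shared B-intermediates would be a natural next step, but how to weight them is a design choice this sketch leaves open.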

  12. Mouse models for the discovery of colorectal cancer driver genes.

    Science.gov (United States)

    Clark, Christopher R; Starr, Timothy K

    2016-01-14

    Colorectal cancer (CRC) constitutes a major public health problem as the third most commonly diagnosed and third most lethal malignancy worldwide. The prevalence and the physical accessibility to colorectal tumors have made CRC an ideal model for the study of tumor genetics. Early research efforts using patient derived CRC samples led to the discovery of several highly penetrant mutations (e.g., APC, KRAS, MMR genes) in both hereditary and sporadic CRC tumors. This knowledge has enabled researchers to develop genetically engineered and chemically induced tumor models of CRC, both of which have had a substantial impact on our understanding of the molecular basis of CRC. Despite these advances, the morbidity and mortality of CRC remains a cause for concern and highlight the need to uncover novel genetic drivers of CRC. This review focuses on mouse models of CRC with particular emphasis on a newly developed cancer gene discovery tool, the Sleeping Beauty transposon-based mutagenesis model of CRC.

  13. Analysis of expressed sequence tags from Actinidia: applications of a cross species EST database for gene discovery in the areas of flavor, health, color and ripening

    Directory of Open Access Journals (Sweden)

    Richardson Annette C

    2008-07-01

    Background: Kiwifruit (Actinidia spp.) are a relatively new but economically important crop grown in many different parts of the world. Commercial success is driven by the development of new cultivars with novel consumer traits including flavor, appearance, healthful components and convenience. To increase our understanding of the genetic diversity and gene-based control of these key traits in Actinidia, we have produced a collection of 132,577 expressed sequence tags (ESTs). Results: The ESTs were derived mainly from four Actinidia species (A. chinensis, A. deliciosa, A. arguta and A. eriantha) and fell into 41,858 non-redundant clusters (18,070 tentative consensus sequences and 23,788 EST singletons). Analysis of flavor and fragrance-related gene families (acyltransferases and carboxylesterases) and pathways (terpenoid biosynthesis) is presented in comparison with a chemical analysis of the compounds present in Actinidia including esters, acids, alcohols and terpenes. ESTs are identified for most genes in color pathways controlling chlorophyll degradation and carotenoid biosynthesis. In the health area, data are presented on the ESTs involved in ascorbic acid and quinic acid biosynthesis showing not only that genes for many of the steps in these pathways are represented in the database, but that genes encoding some critical steps are absent. In the convenience area, genes related to different stages of fruit softening are identified. Conclusion: This large EST resource will allow researchers to undertake the tremendous challenge of understanding the molecular basis of genetic diversity in the Actinidia genus as well as provide an EST resource for comparative fruit genomics. The various bioinformatics analyses we have undertaken demonstrate the extent of coverage of ESTs for genes encoding different biochemical pathways in Actinidia.

  14. Species-independent MicroRNA Gene Discovery

    KAUST Repository

    Kamanu, Timothy K.

    2012-12-01

    MicroRNA (miRNA) are a class of small endogenous non-coding RNA that are mainly negative transcriptional and post-transcriptional regulators in both plants and animals. Recent studies have shown that miRNA are involved in different types of cancer and other incurable diseases such as autism and Alzheimer’s. Functional miRNAs are excised from hairpin-like sequences that are known as miRNA genes. There are about 21,000 known miRNA genes, most of which have been determined using experimental methods. miRNA genes are classified into different groups (miRNA families). This study reports about 19,000 unknown miRNA genes in nine species whereby approximately 15,300 predictions were computationally validated to contain at least one experimentally verified functional miRNA product. The predictions are based on a novel computational strategy which relies on miRNA family groupings and exploits the physics and geometry of miRNA genes to unveil the hidden palindromic signals and symmetries in miRNA gene sequences. Unlike conventional computational miRNA gene discovery methods, the algorithm developed here is species-independent: it allows prediction at higher accuracy and resolution from arbitrary RNA/DNA sequences in any species and thus enables examination of repeat-prone genomic regions which are thought to be non-informative or ’junk’ sequences. The information non-redundancy of uni-directional RNA sequences compared to information redundancy of bi-directional DNA is demonstrated, a fact that is overlooked by most pattern discovery algorithms. A novel method for computing upstream and downstream miRNA gene boundaries based on mathematical/statistical functions is suggested, as well as cutoffs for annotation of miRNA genes in different miRNA families. Another tool is proposed to allow hypotheses generation and visualization of data matrices, intra- and inter-species chromosomal distribution of miRNA genes or miRNA families. Our results indicate that: miRNA and mi

  15. Automated discovery of functional generality of human gene expression programs.

    Directory of Open Access Journals (Sweden)

    Georg K Gerber

    2007-08-01

    An important research problem in computational biology is the identification of expression programs, sets of co-expressed genes orchestrating normal or pathological processes, and the characterization of the functional breadth of these programs. The use of human expression data compendia for discovery of such programs presents several challenges including cellular inhomogeneity within samples, genetic and environmental variation across samples, uncertainty in the numbers of programs and sample populations, and temporal behavior. We developed GeneProgram, a new unsupervised computational framework based on Hierarchical Dirichlet Processes that addresses each of the above challenges. GeneProgram uses expression data to simultaneously organize tissues into groups and genes into overlapping programs with consistent temporal behavior, to produce maps of expression programs, which are sorted by generality scores that exploit the automatically learned groupings. Using synthetic and real gene expression data, we showed that GeneProgram outperformed several popular expression analysis methods. We applied GeneProgram to a compendium of 62 short time-series gene expression datasets exploring the responses of human cells to infectious agents and immune-modulating molecules. GeneProgram produced a map of 104 expression programs, a substantial number of which were significantly enriched for genes involved in key signaling pathways and/or bound by NF-kappaB transcription factors in genome-wide experiments. Further, GeneProgram discovered expression programs that appear to implicate surprising signaling pathways or receptor types in the response to infection, including Wnt signaling and neurotransmitter receptors. We believe the discovered map of expression programs involved in the response to infection will be useful for guiding future biological experiments; genes from programs with low generality scores might serve as new drug targets that exhibit minimal

  16. Genome-enabled Discovery of Carbon Sequestration Genes

    Energy Technology Data Exchange (ETDEWEB)

    Tuskan, Gerald A [ORNL; Tschaplinski, Timothy J [ORNL; Kalluri, Udaya C [ORNL; Yin, Tongming [ORNL; Yang, Xiaohan [ORNL; Zhang, Xinye [ORNL; Engle, Nancy L [ORNL; Ranjan, Priya [ORNL; Basu, Manojit M [ORNL; Gunter, Lee E [ORNL; Jawdy, Sara [ORNL; Martin, Madhavi Z [ORNL; Campbell, Alina S [ORNL; DiFazio, Stephen P [ORNL; Davis, John M [University of Florida; Hinchee, Maud [ORNL; Pinnacchio, Christa [U.S. Department of Energy, Joint Genome Institute; Meilan, R [Purdue University; Busov, V. [Michigan Technological University; Strauss, S [Oregon State University

    2009-01-01

    The fate of carbon below ground is likely to be a major factor determining the success of carbon sequestration strategies involving plants. Despite their importance, molecular processes controlling belowground C allocation and partitioning are poorly understood. This project is leveraging the Populus trichocarpa genome sequence to discover genes important to C sequestration in plants and soils. The focus is on the identification of genes that provide key control points for the flow and chemical transformations of carbon in roots, concentrating on genes that control the synthesis of chemical forms of carbon that result in slower turnover rates of soil organic matter (i.e., increased recalcitrance). We propose to enhance carbon allocation and partitioning to roots by 1) modifying the auxin signaling pathway, and the invertase family, which controls sucrose metabolism, and by 2) increasing root proliferation through transgenesis with genes known to control fine root proliferation (e.g., ANT), 3) increasing the production of recalcitrant C metabolites by identifying genes controlling secondary C metabolism by a major mQTL-based gene discovery effort, and 4) increasing aboveground productivity by enhancing drought tolerance to achieve maximum C sequestration. This broad, integrated approach is aimed at ultimately enhancing root biomass as well as root detritus longevity, providing the best prospects for significant enhancement of belowground C sequestration.

  17. Inflammatory bowel disease gene discovery. CRADA final report

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-09-09

    The ultimate goal of this project is to identify the human gene(s) responsible for the disorder known as IBD. The work was planned in two phases. The desired products resulting from Phase 1 were BAC clone(s) containing the genetic marker(s) identified by gene/Networks, Inc. as potentially linked to IBD, plasmid subclones of those BAC(s), and new genetic markers developed from these plasmid subclones. The newly developed markers would be genotyped by gene/Networks, Inc. to ascertain evidence for linkage or non-linkage of IBD to this region. If non-linkage was indicated, the project would move to investigation of other candidate chromosomal regions. Where linkage was indicated, the project would move to Phase 2, in which a physical map of the candidate region(s) would be developed. The products of this phase would be contig(s) of BAC clones in the region exhibiting linkage to IBD, as well as plasmid subclones of the BACs and further genetic marker development. There would also be continued genotyping with new polymorphic markers during this phase. It was anticipated that clones identified and developed during these two phases would provide the physical resources for eventual disease gene discovery.

  18. Gene-disease relationship discovery based on model-driven data integration and database view definition

    National Research Council Canada - National Science Library

    Yilmaz, S; Jonveaux, P; Bicep, C; Pierron, L; Smaïl-Tabbone, M; Devignes, M.D

    2009-01-01

    .... orthologous or interacting genes. These definitions guide data modelling in our database approach for gene-disease relationship discovery and are expressed as views which ultimately lead to the retrieval of documented sets of candidate genes...

  19. Psychiatric gene discoveries shape evidence on ADHD's biology

    Science.gov (United States)

    Thapar, A; Martin, J; Mick, E; Arias Vásquez, A; Langley, K; Scherer, S W; Schachar, R; Crosbie, J; Williams, N; Franke, B; Elia, J; Glessner, J; Hakonarson, H; Owen, M J; Faraone, S V; O'Donovan, M C; Holmans, P

    2016-01-01

    A strong motivation for undertaking psychiatric gene discovery studies is to provide novel insights into unknown biology. Although attention-deficit hyperactivity disorder (ADHD) is highly heritable, and large, rare copy number variants (CNVs) contribute to risk, little is known about its pathogenesis and it remains commonly misunderstood. We assembled and pooled five ADHD and control CNV data sets from the United Kingdom, Ireland, United States of America, Northern Europe and Canada. Our aim was to test for enrichment of neurodevelopmental gene sets, implicated by recent exome-sequencing studies of (a) schizophrenia and (b) autism as a means of testing the hypothesis that common pathogenic mechanisms underlie ADHD and these other neurodevelopmental disorders. We also undertook hypothesis-free testing of all biological pathways. We observed significant enrichment of individual genes previously found to harbour schizophrenia de novo non-synonymous single-nucleotide variants (SNVs; P=5.4 × 10^-4) and targets of the Fragile X mental retardation protein (P=0.0018). No enrichment was observed for activity-regulated cytoskeleton-associated protein (P=0.23) or N-methyl-D-aspartate receptor (P=0.74) post-synaptic signalling gene sets previously implicated in schizophrenia. Enrichment of ADHD CNV hits for genes impacted by autism de novo SNVs (P=0.019 for non-synonymous SNV genes) did not survive Bonferroni correction. Hypothesis-free testing yielded several highly significantly enriched biological pathways, including ion channel pathways. Enrichment findings were robust to multiple testing corrections and to sensitivity analyses that excluded the most significant sample. The findings reveal that CNVs in ADHD converge on biologically meaningful gene clusters, including ones now established as conferring risk of other neurodevelopmental disorders. PMID:26573769
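A Bonferroni correction compares each P-value against alpha divided by the number of tests. The sketch below applies it to the five P-values quoted in this abstract. The number of tests (m = 5) and the significance level (alpha = 0.05) are assumptions made for illustration; the abstract does not state how many tests the authors corrected for, and the short gene-set labels are shorthand introduced here.

```python
ALPHA = 0.05  # assumed family-wise significance level

# P-values quoted in the abstract; the keys are shorthand labels.
p_values = {
    "schizophrenia de novo SNV genes": 5.4e-4,
    "FMRP targets": 0.0018,
    "ARC gene set": 0.23,
    "NMDAR gene set": 0.74,
    "autism de novo SNV genes": 0.019,
}


def bonferroni(pvals: dict, alpha: float = ALPHA) -> dict:
    """Return True for each test whose P-value is below alpha / m."""
    threshold = alpha / len(pvals)  # m = number of tests (assumed: 5)
    return {name: p < threshold for name, p in pvals.items()}


survives = bonferroni(p_values)
for name, ok in survives.items():
    print(f"{name}: {'survives' if ok else 'does not survive'}")
```

With these assumed settings the threshold is 0.01, so the autism result (P=0.019) falls short while the schizophrenia and Fragile X results pass, which agrees with the pattern the abstract reports.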

  20. Non-syndromic retinal ciliopathies: translating gene discovery into therapy.

    Science.gov (United States)

    Estrada-Cuzcano, Alejandro; Roepman, Ronald; Cremers, Frans P M; den Hollander, Anneke I; Mans, Dorus A

    2012-10-15

    Homozygosity mapping and exome sequencing have accelerated the discovery of gene mutations and modifier alleles implicated in inherited retinal degeneration in humans. To date, 158 genes have been found to be mutated in individuals with retinal dystrophies. Approximately one-third of the gene defects underlying retinal degeneration affect the structure and/or function of the 'connecting cilium' in photoreceptors. This structure corresponds to the transition zone of a prototypic cilium, a region with increasing relevance for ciliary homeostasis. The connecting cilium connects the inner and outer segments of the photoreceptor, mediating bi-directional transport of phototransducing proteins required for vision. In fact, the outer segment, connecting cilium and associated basal body, forms a highly specialized sensory cilium, fully dedicated to photoreception and subsequent signal transduction to the brain. At least 21 genes that encode ciliary proteins are implicated in non-syndromic retinal dystrophies such as cone dystrophy, cone-rod dystrophy, Leber congenital amaurosis (LCA), macular degeneration or retinitis pigmentosa (RP). The generation and characterization of vertebrate retinal ciliopathy animal models have revealed insights into the molecular disease mechanism which are indispensable for the development and evaluation of therapeutic strategies. Gene augmentation therapy has proven to be safe and successful in restoring long-term sight in mice, dogs and humans suffering from LCA or RP. Here, we present a comprehensive overview of the genes, mutations and modifier alleles involved in non-syndromic retinal ciliopathies, review the progress in dissecting the associated retinal disease mechanisms and evaluate gene augmentation approaches to antagonize retinal degeneration in these ciliopathies.

  1. Genome Enabled Discovery of Carbon Sequestration Genes in Poplar

    Energy Technology Data Exchange (ETDEWEB)

    Filichkin, Sergei; Etherington, Elizabeth; Ma, Caiping; Strauss, Steve

    2007-02-22

    The goals of the S.H. Strauss laboratory portion of 'Genome-enabled discovery of carbon sequestration genes in poplar' are (1) to explore the functions of candidate genes using Populus transformation by inserting genes provided by Oak Ridge National Laboratory (ORNL) and the University of Florida (UF) into poplar; (2) to expand the poplar transformation toolkit by developing transformation methods for important genotypes; and (3) to allow induced expression, and efficient gene suppression, in roots and other tissues. As part of the transformation improvement effort, OSU developed transformation protocols for Populus trichocarpa 'Nisqually-1' clone and an early flowering P. alba clone, 6K10. Complete descriptions of the transformation systems were published (Ma et al. 2004, Meilan et al. 2004). Twenty-one 'Nisqually-1' and 622 6K10 transgenic plants were generated. To identify root predominant promoters, a set of three promoters were tested for their tissue-specific expression patterns in poplar and in Arabidopsis as a model system. A novel gene, ET304, was identified by analyzing a collection of poplar enhancer trap lines generated at OSU (Filichkin et al. 2006a, 2006b). Other promoters include the pGgMT1 root-predominant promoter from Casuarina glauca and the pAtPIN2 promoter from Arabidopsis root-specific PIN2 gene. OSU tested two induction systems, alcohol- and estrogen-inducible, in multiple poplar transgenics. Ethanol proved to be the more efficient when tested in tissue culture and greenhouse conditions. Two estrogen-inducible systems were evaluated in transgenic Populus, neither of which functioned reliably in tissue culture conditions. GATEWAY-compatible plant binary vectors were designed to compare the silencing efficiency of homologous (direct) RNAi vs. heterologous (transitive) RNAi inverted repeats. A set of genes was targeted for post-transcriptional silencing in the model Arabidopsis system; these include the floral

  2. Susceptibility gene discovery for common metabolic and endocrine traits.

    Science.gov (United States)

    McCarthy, M I

    2002-02-01

    Almost all major causes of ill-health and premature death in human societies worldwide - including cancer, cardiovascular disease, diabetes and many infectious diseases - are, at least in part, genetically determined. Typically, risk of succumbing to one of these illnesses is thought to depend on both the individual repertoire of variation within a number of key susceptibility genes and the history of exposure to relevant environmental factors. For many of these conditions, the molecular basis of disease pathogenesis remains obscure. This represents a major obstacle to development of improved, rational strategies for disease treatment, prevention and eradication. It is easy therefore to appreciate the importance attached to efforts to deliver more comprehensive understanding of the molecular basis of disease pathogenesis. Nor is it hard to understand that identification of major susceptibility genes should highlight those components of molecular machinery that are critical for the preservation of normal health. The benefits promised are great, but progress to gene identification in multifactorial traits has been rather disappointing to date. Why is this? This review aims to answer this question by describing current and future approaches to gene discovery in multifactorial traits. The examples quoted will mostly relate to type 2 diabetes, but the issues and approaches are generic, and apply equally to other multifactorial traits in the endocrine and metabolic arena - type 1 diabetes; obesity; hyperlipidaemia; autoimmune thyroid disease; polycystic ovarian syndrome - and beyond.

  3. Gene expression endophenotypes: a novel approach for gene discovery in Alzheimer's disease

    Directory of Open Access Journals (Sweden)

    Ertekin-Taner Nilüfer

    2011-05-01

    Uncovering the underlying genetic component of any disease is key to the understanding of its pathophysiology and may open new avenues for development of therapeutic strategies and biomarkers. In the past several years, there has been an explosion of genome-wide association studies (GWAS) resulting in the discovery of novel candidate genes conferring risk for complex diseases, including neurodegenerative diseases. Despite this success, there still remains a substantial genetic component for many complex traits and conditions that is unexplained by the GWAS findings. Additionally, in many cases, the mechanism of action of the newly discovered disease risk variants is not inherently obvious. Furthermore, a genetic region with multiple genes may be identified via GWAS, making it difficult to discern the true disease risk gene. Several alternative approaches are proposed to overcome these potential shortcomings of GWAS, including the use of quantitative, biologically relevant phenotypes. Gene expression levels represent an important class of endophenotypes. Genetic linkage and association studies that utilize gene expression levels as endophenotypes determined that the expression levels of many genes are under genetic influence. This led to the postulate that there may exist many genetic variants that confer disease risk via modifying gene expression levels. Results from the handful of genetic studies which assess gene expression level endophenotypes in conjunction with disease risk suggest that this combined phenotype approach may both increase the power for gene discovery and lead to an enhanced understanding of their mode of action. This review summarizes the evidence in support of gene expression levels as promising endophenotypes in the discovery and characterization of novel candidate genes for complex diseases, which may also represent a novel approach in the genetic studies of Alzheimer's and other neurodegenerative diseases.

  4. Application of chemical proteomics to biomarker discovery in cardiac research

    NARCIS (Netherlands)

    Aye, T.T.

    2010-01-01

    This thesis is primarily focused on (i.) exploring chemical probes to increase sensitivity and specificity for the investigation of low abundant cardiac proteins applicable to both biology and biomarker discovery, and (ii.) exploiting different aspects of mass spectrometry-based proteomics for build

  5. Genomics-Based Discovery of Plant Genes for Synthetic Biology of Terpenoid Fragrances: A Case Study in Sandalwood oil Biosynthesis.

    Science.gov (United States)

    Celedon, J M; Bohlmann, J

    2016-01-01

    Terpenoid fragrances are powerful mediators of ecological interactions in nature and have a long history of traditional and modern industrial applications. Plants produce a great diversity of fragrant terpenoid metabolites, which make them a superb source of biosynthetic genes and enzymes. Advances in fragrance gene discovery have enabled new approaches in synthetic biology of high-value speciality molecules toward applications in the fragrance and flavor, food and beverage, cosmetics, and other industries. Rapid developments in transcriptome and genome sequencing of nonmodel plant species have accelerated the discovery of fragrance biosynthetic pathways. In parallel, advances in metabolic engineering of microbial and plant systems have established platforms for synthetic biology applications of some of the thousands of plant genes that underlie fragrance diversity. While many fragrance molecules (eg, simple monoterpenes) are abundant in readily renewable plant materials, some highly valuable fragrant terpenoids (eg, santalols, ambroxides) are rare in nature and interesting targets for synthetic biology. As a representative example for genomics/transcriptomics enabled gene and enzyme discovery, we describe a strategy used successfully for elucidation of a complete fragrance biosynthetic pathway in sandalwood (Santalum album) and its reconstruction in yeast (Saccharomyces cerevisiae). We address questions related to the discovery of specific genes within large gene families and recovery of rare gene transcripts that are selectively expressed in recalcitrant tissues. To substantiate the validity of the approaches, we describe the combination of methods used in the gene and enzyme discovery of a cytochrome P450 in the fragrant heartwood of tropical sandalwood, responsible for the fragrance defining, final step in the biosynthesis of (Z)-santalols. © 2016 Elsevier Inc. All rights reserved.

  6. Application of large-scale sequencing to marker discovery in plants

    Indian Academy of Sciences (India)

    Robert J Henry; Mark Edwards; Daniel L E Waters; S Gopala Krishnan; Peter Bundock; Timothy R Sexton; Ardashir K Masouleh; Catherine J Nock; Julie Pattemore

    2012-11-01

    Advances in DNA sequencing provide tools for efficient large-scale discovery of markers for use in plants. Discovery options include large-scale amplicon sequencing, transcriptome sequencing, gene-enriched genome sequencing and whole genome sequencing. Examples of each of these approaches and their potential to generate molecular markers for specific applications have been described. Sequencing the whole genome of parents identifies all the polymorphisms available for analysis in their progeny. Sequencing PCR amplicons of sets of candidate genes from DNA bulks can be used to define the available variation in these genes that might be exploited in a population or germplasm collection. Sequencing of the transcriptomes of genotypes varying for the trait of interest may identify genes with patterns of expression that could explain the phenotypic variation. Sequencing genomic DNA enriched for genes by hybridization with probes for all or some of the known genes simplifies sequencing and analysis of differences in gene sequences between large numbers of genotypes and genes especially when working with complex genomes. Examples of application of the above-mentioned techniques have been described.

  7. Application of large-scale sequencing to marker discovery in plants.

    Science.gov (United States)

    Henry, Robert J; Edwards, Mark; Waters, Daniel L E; Gopala Krishnan, S; Bundock, Peter; Sexton, Timothy R; Masouleh, Ardashir K; Nock, Catherine J; Pattemore, Julie

    2012-11-01

    Advances in DNA sequencing provide tools for efficient large-scale discovery of markers for use in plants. Discovery options include large-scale amplicon sequencing, transcriptome sequencing, gene-enriched genome sequencing and whole genome sequencing. Examples of each of these approaches and their potential to generate molecular markers for specific applications have been described. Sequencing the whole genome of parents identifies all the polymorphisms available for analysis in their progeny. Sequencing PCR amplicons of sets of candidate genes from DNA bulks can be used to define the available variation in these genes that might be exploited in a population or germplasm collection. Sequencing of the transcriptomes of genotypes varying for the trait of interest may identify genes with patterns of expression that could explain the phenotypic variation. Sequencing genomic DNA enriched for genes by hybridization with probes for all or some of the known genes simplifies sequencing and analysis of differences in gene sequences between large numbers of genotypes and genes especially when working with complex genomes. Examples of application of the above-mentioned techniques have been described.

  8. Canonical correlation analysis for gene-based pleiotropy discovery.

    Directory of Open Access Journals (Sweden)

    Jose A Seoane

    2014-10-01

    Genome-wide association studies have identified a wealth of genetic variants involved in complex traits and multifactorial diseases. There is now considerable interest in testing variants for association with multiple phenotypes (pleiotropy) and for testing multiple variants for association with a single phenotype (gene-based association tests). Such approaches can increase statistical power by combining evidence for association over multiple phenotypes or genetic variants respectively. Canonical Correlation Analysis (CCA) measures the correlation between two sets of multidimensional variables, and thus offers the potential to combine these two approaches. To apply CCA, we must restrict the number of attributes relative to the number of samples. Hence we consider modules of genetic variation that can comprise a gene, a pathway or another biologically relevant grouping, and/or a set of phenotypes. In order to do this, we use an attribute selection strategy based on a binary genetic algorithm. Applied to a UK-based prospective cohort study of 4286 women (the British Women's Heart and Health Study), we find improved statistical power in the detection of previously reported genetic associations, and identify a number of novel pleiotropic associations between genetic variants and phenotypes. New discoveries include gene-based association of NSF with triglyceride levels and several genes (ACSM3, ERI2, IL18RAP, IL23RAP and NRG1) with left ventricular hypertrophy phenotypes. In multiple-phenotype analyses we find association of NRG1 with left ventricular hypertrophy phenotypes, fibrinogen and urea and pleiotropic relationships of F7 and F10 with Factor VII, Factor IX and cholesterol levels.
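
    The core quantity in the abstract above, the canonical correlation between a block of genetic variants and a block of phenotypes, can be illustrated with a minimal sketch. This is not the authors' pipeline: the attribute selection by genetic algorithm is omitted, the data are simulated, and the names `canonical_correlations`, `genotypes` and `phenotypes` are hypothetical. The sketch assumes more samples than attributes and full column rank in both blocks, which is the restriction the abstract mentions.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between the column spaces of X and Y.

    Rows are samples. Assumes both matrices have full column rank and
    more rows than columns (illustrative assumption, not from the source).
    """
    # Center each block so that correlations, not raw inner products, are measured.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the two centered column spaces.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy are the canonical correlations, sorted descending.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)


# Simulated data: 200 samples, 4 variants coded 0/1/2, 2 phenotypes.
rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(200, 4)).astype(float)
noise = rng.normal(size=(200, 2))
# The first phenotype depends on the first variant; the second is pure noise.
phenotypes = np.column_stack([genotypes[:, 0] + 0.5 * noise[:, 0], noise[:, 1]])

rho = canonical_correlations(genotypes, phenotypes)
print("canonical correlations:", rho)
```

    The first canonical correlation should be large because one phenotype was simulated from one variant, while the second should stay near zero. With one phenotype the same function gives a gene-based test statistic, and with one variant it gives a multi-phenotype one, which is the sense in which CCA combines the two approaches.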

  9. Applications of Pattern Recognition in Drug Discovery

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    This is a brief account of the plenary talk given to the meeting of the Chinese Society of Chemical Science and Technology held in Oxford on 6 October 2001. The talk covered the application of pattern recognition techniques to discover molecules which will bind to the binding sites of proteins. Three situations were considered: the structure of the protein being unknown; the structure known but the binding site unknown; and finally, and this is the most important case for the future, both the structure and nature of the target site available in atomic detail. For this case we have developed a massively distributed computer program using a screensaver which now involves over one million personal computers, including over a thousand in China. The project will involve the screening of 3.5 billion small molecules against 16 protein targets, all of which are implicated in the process of cancer.

  10. Applications of chemogenomic library screening in drug discovery.

    Science.gov (United States)

    Jones, Lyn H; Bunnage, Mark E

    2017-01-20

    The allure of phenotypic screening, combined with the industry preference for target-based approaches, has prompted the development of innovative chemical biology technologies that facilitate the identification of new therapeutic targets for accelerated drug discovery. A chemogenomic library is a collection of selective small-molecule pharmacological agents, and a hit from such a set in a phenotypic screen suggests that the annotated target or targets of that pharmacological agent may be involved in perturbing the observable phenotype. In this Review, we describe opportunities for chemogenomic screening to considerably expedite the conversion of phenotypic screening projects into target-based drug discovery approaches. Other applications are explored, including drug repositioning, predictive toxicology and the discovery of novel pharmacological modalities.

  11. Targeted SNP discovery in Atlantic salmon (Salmo salar) genes using a 3'UTR-primed SNP detection approach

    Directory of Open Access Journals (Sweden)

    Høyheim Bjørn

    2010-12-01

    Background Single nucleotide polymorphisms (SNPs) represent the most widespread type of DNA variation in vertebrates and may be used as genetic markers for a range of applications. This has led to an increased interest in identification of SNP markers in non-model species and farmed animals. The in silico SNP mining method used for discovery of most known SNPs in Atlantic salmon (Salmo salar) has applied a global (genome-wide) approach. In this study we present a targeted 3'UTR-primed SNP discovery strategy that utilizes sequence data from Salmo salar full length sequenced cDNAs (FLIcs). We compare the efficiency of this new strategy to the in silico SNP mining method when using both methods for targeted SNP discovery. Results The SNP discovery efficiency of the two methods was tested in a set of FLIc target genes. The 3'UTR-primed SNP discovery method detected novel SNPs in 35% of the target genes while the in silico SNP mining method detected novel SNPs in 15% of the target genes. Furthermore, the 3'UTR-primed SNP discovery strategy was the less labor intensive one and revealed a higher success rate than the in silico SNP mining method in the initial amplification step. When testing the methods we discovered 112 novel bi-allelic polymorphisms (type I markers) in 88 salmon genes [dbSNP: ss179319972-179320081, ss250608647-250608648], and three of the SNPs discovered were missense substitutions. Conclusions Full length insert cDNAs (FLIcs) are important genomic resources that have been developed in many farmed animals. The 3'UTR-primed SNP discovery strategy successfully utilized FLIc data to detect novel SNPs in the partially tetraploid Atlantic salmon. This strategy may therefore be useful for targeted SNP discovery in several species, and particularly useful in species that, like salmonids, have duplicated genomes.

  12. Risk genes for schizophrenia: translational opportunities for drug discovery.

    Science.gov (United States)

    Winchester, Catherine L; Pratt, Judith A; Morris, Brian J

    2014-07-01

    Despite intensive research over many years, the treatment of schizophrenia remains a major health issue. Current and emerging treatments for schizophrenia are based upon the classical dopamine and glutamate hypotheses of disease. Existing first and second generation antipsychotic drugs based upon the dopamine hypothesis are limited by their inability to treat all symptom domains and their undesirable side effect profiles. Third generation drugs based upon the glutamate hypothesis of disease are currently under evaluation but are more likely to be used as add-on treatments. Hence there is a large unmet clinical need. A major challenge in neuropsychiatric disease research is the relatively limited knowledge of disease mechanisms. However, as our understanding of the genetic causes of the disease evolves, novel strategies for the development of improved therapeutic agents will become apparent. In this review we consider the current status of knowledge of the genetic basis of schizophrenia, including methods for identifying genetic variants associated with the disorder and how they impact on gene function. Although the genetic architecture of schizophrenia is complex, some targets amenable to pharmacological intervention can be discerned. We conclude that many challenges lie ahead but that the stratification of patients according to biobehavioural constructs that cross existing disease classifications, but have common genetic and neurobiological bases, offers opportunities for new approaches to effective drug discovery.

  13. The discovery of the microphthalmia locus and its gene, Mitf.

    Science.gov (United States)

    Arnheiter, Heinz

    2010-12-01

    The history of the discovery of the microphthalmia locus and its gene, now called Mitf, is a testament to the triumph of serendipity. Although the first microphthalmia mutation was discovered among the descendants of a mouse that was irradiated for the purpose of mutagenesis, the mutation most likely was not radiation induced but occurred spontaneously in one of the parents of a later breeding. Although Mitf might eventually have been identified by other molecular genetic techniques, it was first cloned from a chance transgene insertion at the microphthalmia locus. And although Mitf was found to encode a member of a well-known transcription factor family, its analysis might still be in its infancy had Mitf not turned out to be of crucial importance for the physiology and pathology of many distinct organs, including eye, ear, immune system, bone, and skin, and in particular for melanoma. In fact, near seven decades of Mitf research have led to many insights about development, function, degeneration, and malignancies of a number of specific cell types, and it is hoped that these insights will one day lead to therapies benefitting those afflicted with diseases originating in these cell types.

  14. Validation of Context Based Service Discovery Protocol for Ubiquitous Applications

    Directory of Open Access Journals (Sweden)

    Anandi Giridharan

    2012-11-01

    Service Discovery Protocol (SDP) is important in ubiquitous applications, where a large number of devices and software components collaborate unobtrusively and provide numerous services without user intervention. Existing service discovery schemes use a service matching process in order to offer services of interest to the users. Potentially, the context information of the users and surrounding environment can be used to improve the quality of service matching. We propose a C-IOB (Context-Information, Observation and Belief) based service discovery model, which deals with the above challenges by processing the context information and by formulating beliefs on the basis of observations. With these formulated beliefs the required services will be provided to the users. In this work, we present an approach for automated validation of the C-IOB based service discovery model in a typical ubiquitous museum environment, where the external behavior of the system can be predicted and compared to a model of expected behavior from the original requirements. Formal specification using an SDL (Specification and Description Language) based system has been used to conduct verification and validation of the system. The purpose of this framework is to provide a formal basis for the performance evaluation and behavioral study of the SDP.

  15. Technology development for gene discovery and full-length sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Marcelo Bento Soares

    2004-07-19

    In previous years, with support from the U.S. Department of Energy, we developed methods for construction of normalized and subtracted cDNA libraries, and constructed hundreds of high-quality libraries for production of Expressed Sequence Tags (ESTs). Our clones were made widely available to the scientific community through the IMAGE Consortium, and millions of ESTs were produced from our libraries either by collaborators or by our own sequencing laboratory at the University of Iowa. During this grant period, we focused on (1) the development of a method for preferential cloning of tissue-specific and/or rare transcripts, (2) its utilization to expedite EST-based gene discovery for the NIH Mouse Brain Molecular Anatomy Project, (3) further development and optimization of a method for construction of full-length-enriched cDNA libraries, and (4) modification of a plasmid vector to maximize efficiency of full-length cDNA sequencing by the transposon-mediated approach. It is noteworthy that the technology developed for preferential cloning of rare mRNAs enabled identification of over 2,000 mouse transcripts differentially expressed in the hippocampus. In addition, the method that we optimized for construction of full-length-enriched cDNA libraries was successfully utilized for the production of approximately fifty libraries from the developing mouse nervous system, from which over 2,500 full-ORF-containing cDNAs have been identified and accurately sequenced in their entirety either by our group or by the NIH-Mammalian Gene Collection Program Sequencing Team.

  16. A bioinformatics knowledge discovery in text application for grid computing.

    Science.gov (United States)

    Castellano, Marcello; Mastronardi, Giuseppe; Bellotti, Roberto; Tarricone, Gianfranco

    2009-06-16

    A fundamental activity in biomedical research is Knowledge Discovery, which involves searching through large amounts of biomedical information such as documents and data. High performance computational infrastructures, such as Grid technologies, are emerging as a possible infrastructure to tackle the intensive use of Information and Communication resources in life science. The goal of this work was to develop a software middleware solution in order to exploit the many knowledge discovery applications on scalable and distributed computing systems to achieve intensive use of ICT resources. The development of a grid application for Knowledge Discovery in Text using a middleware solution based methodology is presented. The system must be able to perform a user application model and process the jobs with the aim of creating many parallel jobs to distribute on the computational nodes. Finally, the system must be aware of the computational resources available and their status, and must be able to monitor the execution of parallel jobs. These operational requirements led to the design of a middleware that can be specialized using user application modules. It included a graphical user interface providing access to a node search system, a load balancing system and a transfer optimizer to reduce communication costs. A middleware solution prototype and its performance evaluation in terms of the speed-up factor are shown. It was written in Java on Globus Toolkit 4 to build the grid infrastructure based on GNU/Linux computer grid nodes. A test was carried out and the results are shown for the named entity recognition search of symptoms and pathologies. The search was applied to a collection of 5,000 scientific documents taken from PubMed. In this paper we discuss the development of a grid application based on a middleware solution. It has been tested on a knowledge discovery in text process to extract new and useful information about symptoms and pathologies from a large collection of

  17. Privacy-aware knowledge discovery novel applications and new techniques

    CERN Document Server

    Bonchi, Francesco

    2010-01-01

    Covering research at the frontier of this field, Privacy-Aware Knowledge Discovery: Novel Applications and New Techniques presents state-of-the-art privacy-preserving data mining techniques for application domains, such as medicine and social networks, that face the increasing heterogeneity and complexity of new forms of data. Renowned authorities from prominent organizations not only cover well-established results but also explore complex domains where privacy issues are generally clear and well defined, but the solutions are still preliminary and in continuous development. Divided into seve

  18. Gene Discovery of Modular Diterpene Metabolism in Nonmodel Systems

    Science.gov (United States)

    Zerbe, Philipp; Hamberger, Björn; Yuen, Macaire M.S.; Chiang, Angela; Sandhu, Harpreet K.; Madilao, Lina L.; Nguyen, Anh; Hamberger, Britta; Bach, Søren Spanner; Bohlmann, Jörg

    2013-01-01

    Plants produce over 10,000 different diterpenes of specialized (secondary) metabolism, and fewer diterpenes of general (primary) metabolism. Specialized diterpenes may have functions in ecological interactions of plants with other organisms and also benefit humanity as pharmaceuticals, fragrances, resins, and other industrial bioproducts. Examples of high-value diterpenes are taxol and forskolin pharmaceuticals or ambroxide fragrances. Yields and purity of diterpenes obtained from natural sources or by chemical synthesis are often insufficient for large-volume or high-end applications. Improvement of agricultural or biotechnological diterpene production requires knowledge of biosynthetic genes and enzymes. However, specialized diterpene pathways are extremely diverse across the plant kingdom, and most specialized diterpenes are taxonomically restricted to a few plant species, genera, or families. Consequently, there is no single reference system to guide gene discovery and rapid annotation of specialized diterpene pathways. Functional diversification of genes and plasticity of enzyme functions of these pathways further complicate correct annotation. To address this challenge, we used a set of 10 different plant species to develop a general strategy for diterpene gene discovery in nonmodel systems. The approach combines metabolite-guided transcriptome resources, custom diterpene synthase (diTPS) and cytochrome P450 reference gene databases, phylogenies, and, as shown for select diTPSs, single and coupled enzyme assays using microbial and plant expression systems. In the 10 species, we identified 46 new diTPS candidates and over 400 putatively terpenoid-related P450s in a resource of nearly 1 million predicted transcripts of diterpene-accumulating tissues. Phylogenetic patterns of lineage-specific blooms of genes guided functional characterization. PMID:23613273

  19. Next-generation diagnostics and disease-gene discovery with the Exomiser.

    Science.gov (United States)

    Smedley, Damian; Jacobsen, Julius O B; Jäger, Marten; Köhler, Sebastian; Holtgrewe, Manuel; Schubach, Max; Siragusa, Enrico; Zemojtel, Tomasz; Buske, Orion J; Washington, Nicole L; Bone, William P; Haendel, Melissa A; Robinson, Peter N

    2015-12-01

    Exomiser is an application that prioritizes genes and variants in next-generation sequencing (NGS) projects for novel disease-gene discovery or differential diagnostics of Mendelian disease. Exomiser comprises a suite of algorithms for prioritizing exome sequences using random-walk analysis of protein interaction networks, clinical relevance and cross-species phenotype comparisons, as well as a wide range of other computational filters for variant frequency, predicted pathogenicity and pedigree analysis. In this protocol, we provide a detailed explanation of how to install Exomiser and use it to prioritize exome sequences in a number of scenarios. Exomiser requires ∼3 GB of RAM and roughly 15-90 s of computing time on a standard desktop computer to analyze a variant call format (VCF) file. Exomiser is freely available for academic use from http://www.sanger.ac.uk/science/tools/exomiser.
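
    The random-walk analysis of protein interaction networks mentioned above can be illustrated with a generic random walk with restart. The sketch below is not Exomiser's code or API: the function `random_walk_with_restart`, the four-node toy network and the restart probability of 0.5 are all illustrative assumptions. It assumes an undirected network given as a symmetric adjacency matrix with no isolated nodes.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Score every node by its proximity to the seed nodes.

    adjacency: symmetric 0/1 matrix (list of lists) of an undirected network
               with no isolated nodes (assumption of this sketch).
    seeds:     set of node indices, e.g. genes already linked to a disease.
    """
    n = len(adjacency)
    degree = [sum(row) for row in adjacency]
    # The walker restarts uniformly over the seed nodes.
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        new = []
        for i in range(n):
            # Probability mass arriving at i: each neighbour j splits its mass
            # evenly among its degree[j] edges.
            inflow = sum(adjacency[i][j] * p[j] / degree[j] for j in range(n))
            new.append((1.0 - restart) * inflow + restart * p0[i])
        converged = sum(abs(a - b) for a, b in zip(new, p)) < tol
        p = new
        if converged:
            break
    return p


# Toy network: a chain of four genes 0 - 1 - 2 - 3, with gene 0 as the seed.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(chain, seeds={0})
print(scores)
```

    In this toy chain the scores fall off with distance from the seed, which is the property that lets network proximity to known disease genes serve as one ranking signal among the others the abstract lists.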

  20. Application of CRISPR/Cas9 for biomedical discoveries.

    Science.gov (United States)

    Riordan, Sean M; Heruth, Daniel P; Zhang, Li Q; Ye, Shui Qing

    2015-01-01

    The Clustered Regularly Interspaced Short Palindromic Repeats-Cas9 (CRISPR/Cas9) system, a viral defense system found in bacteria and archaea, has emerged as a tour de force genome editing tool. The CRISPR/Cas9 system is much easier to customize and optimize because the site selection for DNA cleavage is guided by a short sequence of RNA rather than an engineered protein as in the systems of zinc finger nucleases (ZFN), transcription activator-like effector nucleases (TALEN), and meganucleases. Although it still suffers from some off-target effects, the CRISPR/Cas9 system has been broadly and successfully applied for biomedical discoveries in a number of areas. In this review, we present a brief history and development of the CRISPR system and focus on the application of this genome editing technology for biomedical discoveries. We then present concise concluding remarks and future directions for this fast-moving field.

  1. Applications and limitations of in silico models in drug discovery.

    Science.gov (United States)

    Sacan, Ahmet; Ekins, Sean; Kortagere, Sandhya

    2012-01-01

    Drug discovery in the late twentieth and early twenty-first century has witnessed a myriad of changes that were adopted to predict whether a compound is likely to be successful, or conversely enable identification of molecules with liabilities as early as possible. These changes include integration of in silico strategies for lead design and optimization that perform complementary roles to that of the traditional in vitro and in vivo approaches. The in silico models are facilitated by the availability of large datasets associated with high-throughput screening, bioinformatics algorithms to mine and annotate the data from a target perspective, and chemoinformatics methods to integrate chemistry methods into lead design process. This chapter highlights the applications of some of these methods and their limitations. We hope this serves as an introduction to in silico drug discovery.

  2. Gene discovery for the carcinogenic human liver fluke, Opisthorchis viverrini

    Directory of Open Access Journals (Sweden)

    Gasser Robin B

    2007-06-01

    Background Cholangiocarcinoma (CCA), cancer of the bile ducts, is associated with chronic infection with the liver fluke, Opisthorchis viverrini. Despite being the only eukaryote that is designated as a 'class I carcinogen' by the International Agency for Research on Cancer, little is known about its genome. Results Approximately 5,000 randomly selected cDNAs from the adult stage of O. viverrini were characterized and accounted for 1,932 contigs, representing ~14% of the entire transcriptome, and, presently, the largest sequence dataset for any species of liver fluke. Twenty percent of contigs were assigned GO classifications. Abundantly represented protein families included those involved in physiological functions that are essential to parasitism, such as anaerobic respiration, reproduction, detoxification, surface maintenance and feeding. GO assignments were well conserved in relation to other parasitic flukes; however, some categories were over-represented in O. viverrini, such as structural and motor proteins. An assessment of evolutionary relationships showed that O. viverrini was more similar to other parasitic (Clonorchis sinensis and Schistosoma japonicum) than to free-living (Schmidtea mediterranea) flatworms, and 105 sequences had close homologues in both parasitic species but not in S. mediterranea. A total of 164 O. viverrini contigs contained ORFs with signal sequences, many of which were platyhelminth-specific. Examples of convergent evolution between host and parasite secreted/membrane proteins were identified as were homologues of vaccine antigens from other helminths. Finally, ORFs representing secreted proteins with known roles in tumorigenesis were identified, and these might play roles in the pathogenesis of O. viverrini-induced CCA. Conclusion This gene discovery effort for O. viverrini should expedite molecular studies of cholangiocarcinogenesis and accelerate research focused on developing new interventions

  3. Third Generation Sequencing Techniques and Applications to Drug Discovery

    Science.gov (United States)

    Ozsolak, Fatih

    2012-01-01

    Introduction There is an immediate need for functional and molecular studies to decipher differences between disease and “normal” settings to identify large quantities of validated targets with the highest therapeutic utilities. Furthermore, drug mechanism of action and biomarkers to predict drug efficacy and safety need to be identified for effective design of clinical trials, decreasing attrition rates, regulatory agency approval process and drug repositioning. By expanding the power of genetics and pharmacogenetics studies, next generation nucleic acid sequencing technologies have started to play an important role in all stages of drug discovery. Areas covered This article reviews the first and second generation sequencing technologies (SGSTs) and challenges they pose to biomedicine. The article then focuses on the emerging third generation sequencing technologies (TGSTs), their technological foundations and potential contributions to drug discovery. Expert Opinion Despite the scientific and commercial success of SGSTs, the goal of rapid, comprehensive and unbiased sequencing of nucleic acids has not been achieved. TGSTs promise to increase sequencing throughput and read lengths, decrease costs, run times and error rates, eliminate biases inherent in SGSTs, and offer capabilities beyond nucleic acid sequencing. Such changes will have positive impact in all sequencing applications to drug discovery. PMID:22468954

  4. SPARCoC: a new framework for molecular pattern discovery and cancer gene identification.

    Directory of Open Access Journals (Sweden)

    Shiqian Ma

    It is challenging to cluster cancer patients of a certain histopathological type into molecular subtypes of clinical importance and identify gene signatures directly relevant to the subtypes. Current clustering approaches have inherent limitations, which prevent them from gauging the subtle heterogeneity of the molecular subtypes. In this paper we present a new framework: SPARCoC (Sparse-CoClust), which is based on a novel Common-background and Sparse-foreground Decomposition (CSD) model and the Maximum Block Improvement (MBI) co-clustering technique. SPARCoC has clear advantages compared with widely-used alternative approaches: hierarchical clustering (Hclust) and nonnegative matrix factorization (NMF). We apply SPARCoC to the study of lung adenocarcinoma (ADCA), an extremely heterogeneous histological type, and a significant challenge for molecular subtyping. For testing and verification, we use high quality gene expression profiling data of lung ADCA patients, and identify prognostic gene signatures which could cluster patients into subgroups that are significantly different in their overall survival (with p-values < 0.05). Our results are only based on gene expression profiling data analysis, without incorporating any other feature selection or clinical information; we are able to replicate our findings with completely independent datasets. SPARCoC is broadly applicable to large-scale genomic data to empower pattern discovery and cancer gene identification.

  5. Discovery and validation of gene classifiers for endocrine-disrupting chemicals in zebrafish (Danio rerio)

    Directory of Open Access Journals (Sweden)

    Wang Rong-Lin

    2012-08-01

    individual chemical-tissue conditions, thus suggesting a need for a preliminary survey of transcriptomic responses before launching a full scale classifier discovery effort. Classifier discovery based on individual TF networks could yield more mechanistically-oriented biomarkers. GSEA proved to be a flexible and effective tool for application of gene classifiers but a similar and more refined algorithm, connectivity mapping, should also be explored. The distribution characteristics of classifiers across tissues, chemicals, and TF networks suggested a differential biological impact among the EDCs on zebrafish transcriptome involving some basic cellular functions.

  6. Adeno-associated virus at 50: a golden anniversary of discovery, research, and gene therapy success--a personal perspective.

    Science.gov (United States)

    Hastie, Eric; Samulski, R Jude

    2015-05-01

    Fifty years after the discovery of adeno-associated virus (AAV) and more than 30 years after the first gene transfer experiment was conducted, dozens of gene therapy clinical trials are in progress, one vector is approved for use in Europe, and breakthroughs in virus modification and disease modeling are paving the way for a revolution in the treatment of rare diseases, cancer, as well as HIV. This review will provide a historical perspective on the progression of AAV for gene therapy from discovery to the clinic, focusing on contributions from the Samulski lab regarding basic science and cloning of AAV, optimized large-scale production of vectors, preclinical large animal studies and safety data, vector modifications for improved efficacy, and successful clinical applications.

  7. Gene Prioritization for Imaging Genetics Studies Using Gene Ontology and a Stratified False Discovery Rate Approach.

    Science.gov (United States)

    Patel, Sejal; Park, Min Tae M; Chakravarty, M Mallar; Knight, Jo

    2016-01-01

    Imaging genetics is an emerging field in which associations between genes and neuroimaging-based quantitative phenotypes are used to explore the functional role of genes in neuroanatomy and neurophysiology in the context of healthy function and neuropsychiatric disorders. The main obstacle for researchers in the field is the high dimensionality of the data in both the imaging phenotypes and the genetic variants commonly typed. In this article, we develop a novel method that utilizes Gene Ontology, an online database, to select and prioritize certain genes, employing a stratified false discovery rate (sFDR) approach to investigate their associations with imaging phenotypes. sFDR has the potential to increase power in genome wide association studies (GWAS), and is quickly gaining traction as a method for multiple testing correction. Our novel approach addresses the pressing need in genetic research to move beyond candidate gene studies without being overburdened by a loss of power due to multiple testing. As an example of our methodology, we perform a GWAS of hippocampal volume using both the Enhancing NeuroImaging Genetics through Meta-Analysis (ENIGMA2) and the Alzheimer's Disease Neuroimaging Initiative datasets. The analysis of ENIGMA2 data yielded a set of SNPs with sFDR values between 10 and 20%. Our approach demonstrates a potential method to prioritize genes based on biological systems impaired in a disease.
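
    The idea behind a stratified false discovery rate can be sketched as follows: tests are split into strata (for example, SNPs in prioritized genes versus all others) and the FDR correction is applied within each stratum separately. The sketch below is a simplification, not the authors' method: it uses Benjamini-Hochberg adjusted p-values as a stand-in for the q-values of a full sFDR procedure, and the function names, p-values and stratum labels are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def stratified_fdr(pvalues, strata):
    """Apply the BH adjustment separately within each stratum."""
    groups = {}
    for idx, label in enumerate(strata):
        groups.setdefault(label, []).append(idx)
    result = [None] * len(pvalues)
    for indices in groups.values():
        adjusted = bh_adjust([pvalues[i] for i in indices])
        for i, value in zip(indices, adjusted):
            result[i] = value
    return result


# Hypothetical p-values for six SNPs; the first two lie in prioritized genes.
pvalues = [0.001, 0.02, 0.03, 0.04, 0.5, 0.8]
strata = ["prioritized", "prioritized", "other", "other", "other", "other"]

pooled = bh_adjust(pvalues)
stratified = stratified_fdr(pvalues, strata)
for p, a, b in zip(pvalues, pooled, stratified):
    print(f"p={p:<6} pooled={a:.4f} stratified={b:.4f}")
```

    In this toy input the two prioritized SNPs receive smaller adjusted values under stratification than under the pooled correction, because their stratum is small and enriched for signal. The remaining SNPs pay a modest price, which is the trade-off that makes the choice of prioritization scheme important.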

  8. Gene-disease relationship discovery based on model-driven data integration and database view definition.

    Science.gov (United States)

    Yilmaz, S; Jonveaux, P; Bicep, C; Pierron, L; Smaïl-Tabbone, M; Devignes, M D

    2009-01-15

    Computational methods are widely used to discover gene-disease relationships hidden in vast masses of available genomic and post-genomic data. In most current methods, a similarity measure is calculated between gene annotations and known disease genes or disease descriptions. However, more explicit gene-disease relationships are required for better insights into the molecular bases of diseases, especially for complex multi-gene diseases. Explicit relationships between genes and diseases are formulated as candidate gene definitions that may include intermediary genes, e.g. orthologous or interacting genes. These definitions guide data modelling in our database approach for gene-disease relationship discovery and are expressed as views which ultimately lead to the retrieval of documented sets of candidate genes. A system called ACGR (Approach for Candidate Gene Retrieval) has been implemented and tested with three case studies including a rare orphan gene disease.

  9. Proteomics and Its Application in Biomarker Discovery and Drug Development

    Institute of Scientific and Technical Information of China (English)

    He Qing-Yu; Chiu Jen-Fu

    2004-01-01

    Proteomics is a research field aiming to characterize molecular and cellular dynamics in protein expression and function on a global level. The introduction of proteomics has greatly broadened our view and accelerated progress in various areas of medical research. The most significant advantage of proteomics is its ability to examine a whole proteome or sub-proteome in a single experiment so that the protein alterations corresponding to a pathological or biochemical condition at a given time can be considered in an integrated way. Proteomic technology has been extensively used to tackle a wide variety of medical subjects including biomarker discovery and drug development. Complemented by other new technical advances in genomics and bioinformatics, proteomics has great potential to contribute considerably to biomarker identification and to revolutionize the drug development process. A brief overview of the proteomic technologies will be provided and the application of proteomics in biomarker discovery and drug development will be discussed using our current research projects as examples.

  10. Computational method for discovery of estrogen responsive genes

    DEFF Research Database (Denmark)

    Tang, Suisheng; Tan, Sin Lam; Ramadoss, Suresh Kumar;

    2004-01-01

    Estrogen has a profound impact on human physiology and affects numerous genes. The classical estrogen reaction is mediated by its receptors (ERs), which bind to the estrogen response elements (EREs) in target gene's promoter region. Due to tedious and expensive experiments, a limited number...... of human genes are functionally well characterized. It is still unclear how many and which human genes respond to estrogen treatment. We propose a simple, economic, yet effective computational method to predict a subclass of estrogen responsive genes. Our method relies on the similarity of ERE frames...... across different promoters in the human genome. Matching ERE frames of a test set of 60 known estrogen responsive genes to the collection of over 18,000 human promoters, we obtained 604 candidate genes. Evaluating our result by comparison with the published microarray data and literature, we found...

  11. TOXICOGENOMICS DRUG DISCOVERY AND THE PATHOLOGIST

    Science.gov (United States)

    Toxicogenomics, drug discovery, and pathologist. The field of toxicogenomics, which currently focuses on the application of large-scale differential gene expression (DGE) data to toxicology, is starting to influence drug discovery and development in the pharmaceutical indu...

  12. Discovery and industrial applications of lytic polysaccharide mono-oxygenases.

    Science.gov (United States)

    Johansen, Katja S

    2016-02-01

    The recent discovery of copper-dependent lytic polysaccharide mono-oxygenases (LPMOs) has opened up a vast area of research covering several fields of application. The biotech company Novozymes A/S holds patents on the use of these enzymes for the conversion of steam-pre-treated plant residues such as straw to free sugars. These patents predate the correct classification of LPMOs, and the striking synergistic effect of fungal LPMOs when combined with canonical cellulases was discovered when fractions of fungal secretomes were evaluated in industrially relevant enzyme performance assays. Today, LPMOs are a central component in the Cellic CTec enzyme products which are used in several large-scale plants for the industrial production of lignocellulosic ethanol. LPMOs are characterized by an N-terminal histidine residue which, together with an internal histidine and a tyrosine residue, co-ordinates a single copper atom in a so-called histidine brace. The mechanism by which oxygen binds to the reduced copper atom has been reported and the general mechanism of copper-oxygen-mediated activation of carbon is being investigated in the light of these discoveries. LPMOs are widespread in both the fungal and the bacterial kingdoms, although the range of action of these enzymes remains to be elucidated. However, based on the high abundance of LPMOs expressed by microbes involved in the decomposition of organic matter, the importance of LPMOs in the natural carbon-cycle is predicted to be significant. In addition, it has been suggested that LPMOs play a role in the pathology of infectious diseases such as cholera and are thus relevant to the field of medicine. © 2016 Authors; published by Portland Press Limited.

  13. Speeding disease gene discovery by sequence based candidate prioritization

    Directory of Open Access Journals (Sweden)

    Porteous David J

    2005-03-01

    Background: Regions of interest identified through genetic linkage studies regularly exceed 30 centimorgans in size and can contain hundreds of genes. Traditionally this number is reduced by matching functional annotation to knowledge of the disease or phenotype in question. However, here we show that disease genes share patterns of sequence-based features that can provide a good basis for automatic prioritization of candidates by machine learning. Results: We examined a variety of sequence-based features and found that for many of them there are significant differences between the sets of genes known to be involved in human hereditary disease and those not known to be involved in disease. We have created an automatic classifier called PROSPECTR based on those features using the alternating decision tree algorithm which ranks genes in the order of likelihood of involvement in disease. On average, PROSPECTR enriches lists for disease genes two-fold 77% of the time, five-fold 37% of the time and twenty-fold 11% of the time. Conclusion: PROSPECTR is a simple and effective way to identify genes involved in Mendelian and oligogenic disorders. It performs markedly better than the single existing sequence-based classifier on novel data. PROSPECTR could save investigators looking at large regions of interest time and effort by prioritizing positional candidate genes for mutation detection and case-control association studies.

  14. Applications of Fiberoptics-Based Nanosensors to Drug Discovery

    Science.gov (United States)

    Vo-Dinh, Tuan; Scaffidi, Jonathan; Gregas, Molly; Zhang, Yan; Seewaldt, Victoria

    2013-01-01

    Background: Fiber-optic nanosensors are fabricated by heating and pulling optical fibers to yield sub-micron diameter tips, and have been used for in vitro analysis of individual living mammalian cells. Immobilization of bioreceptors (e.g., antibodies, peptides, DNA, etc.) selective to target analyte molecules of interest provides molecular specificity. Excitation light can be launched into the fiber, and the resulting evanescent field at the tip of the nanofiber can be used to excite target molecules bound to the bioreceptor molecules. The fluorescence or surface-enhanced Raman scattering produced by the analyte molecules is detected using an ultra-sensitive photodetector. Objective: This article provides an overview of the development and application of fiber-optic nanosensors for drug discovery. Conclusions: The nanosensors provide minimally invasive tools to probe sub-cellular compartments inside single living cells for health effect studies (e.g., detection of benzopyrene adducts) and medical applications (e.g., monitoring of apoptosis in cells treated with anti-cancer drugs). PMID: 23496274

  15. Network-based discovery through mechanistic systems biology. Implications for applications--SMEs and drug discovery: where the action is.

    Science.gov (United States)

    Benson, Neil

    2015-08-01

    Phase II attrition remains the most important challenge for drug discovery. Tackling the problem requires improved understanding of the complexity of disease biology. Systems biology approaches to this problem can, in principle, deliver this. This article reviews the reports of the application of mechanistic systems models to drug discovery questions and discusses the added value. Although we are on the journey to the virtual human, the length, path and rate of learning from this remain an open question. Success will be dependent on the will to invest and make the most of the insight generated along the way.

  16. GENOME-ENABLED DISCOVERY OF CARBON SEQUESTRATION GENES IN POPLAR

    Energy Technology Data Exchange (ETDEWEB)

    DAVIS J M

    2007-10-11

    Plants utilize carbon by partitioning the reduced carbon obtained through photosynthesis into different compartments and into different chemistries within a cell and subsequently allocating such carbon to sink tissues throughout the plant. Since the phytohormones auxin and cytokinin are known to influence sink strength in tissues such as roots (Skoog & Miller 1957, Nordstrom et al. 2004), we hypothesized that altering the expression of genes that regulate auxin-mediated (e.g., AUX/IAA or ARF transcription factors) or cytokinin-mediated (e.g., RR transcription factors) control of root growth and development would impact carbon allocation and partitioning belowground (Fig. 1 - Renewal Proposal). Specifically, the ARF, AUX/IAA and RR transcription factor gene families mediate the effects of the growth regulators auxin and cytokinin on cell expansion, cell division and differentiation into root primordia. Invertases (IVR), whose transcript abundance is enhanced by both auxin and cytokinin, are critical components of carbon movement and therefore of carbon allocation. Thus, we initiated comparative genomic studies to identify the AUX/IAA, ARF, RR and IVR gene families in the Populus genome that could impact carbon allocation and partitioning. Bioinformatics searches using Arabidopsis gene sequences as queries identified regions with high degrees of sequence similarities in the Populus genome. These Populus sequences formed the basis of our transgenic experiments. Transgenic modification of gene expression involving members of these gene families was hypothesized to have profound effects on carbon allocation and partitioning.

  17. Gene discovery in the horned beetle Onthophagus taurus

    Directory of Open Access Journals (Sweden)

    Yang Youngik

    2010-12-01

    Background: Horned beetles, in particular in the genus Onthophagus, are important models for studies on sexual selection, biological radiations, the origin of novel traits, developmental plasticity, biocontrol, conservation, and forensic biology. Despite their growing prominence as models for studying both basic and applied questions in biology, little genomic or transcriptomic data are available for this genus. We used massively parallel pyrosequencing (Roche 454-FLX platform) to produce a comprehensive EST dataset for the horned beetle Onthophagus taurus. To maximize sequence diversity, we pooled RNA extracted from a normalized library encompassing diverse developmental stages and both sexes. Results: We used 454 pyrosequencing to sequence ESTs from all post-embryonic stages of O. taurus. Approximately 1.36 million reads assembled into 50,080 non-redundant sequences encompassing a total of 26.5 Mbp. The non-redundant sequences match over half of the genes in Tribolium castaneum, the most closely related species with a sequenced genome. Analyses of Gene Ontology annotations and biochemical pathways indicate that the O. taurus sequences reflect a wide and representative sampling of biological functions and biochemical processes. An analysis of sequence polymorphisms revealed that SNP frequency was negatively related to overall expression level and the number of tissue types in which a given gene is expressed. The most variable genes were enriched for a limited number of GO annotations whereas the least variable genes were enriched for a wide range of GO terms directly related to fitness. Conclusions: This study provides the first large-scale EST database for horned beetles, a much-needed resource for advancing the study of these organisms. Furthermore, we identified instances of gene duplications and alternative splicing, useful for future study of gene regulation, and a large number of SNP markers that could be used in population

  18. Gene Expression Data Knowledge Discovery using Global and Local Clustering

    CERN Document Server

    H, Swathi

    2010-01-01

    To understand complex biological systems, the research community has produced a huge corpus of gene expression data. A large number of clustering approaches have been proposed for the analysis of gene expression data. However, extracting important biological knowledge remains difficult. To address this task, clustering techniques are used. In this paper, a hybrid Hierarchical k-Means algorithm is used for clustering and biclustering gene expression data. Biclustering and clustering algorithms are utilized to discover both local and global clustering structure. A validation technique, Figure of Merit, is used to determine the quality of clustering results. Appropriate knowledge is mined from the clusters by embedding a BLAST similarity search program into the clustering and biclustering process. Appropriate ...

  19. Discovery of Novel Gene Elements Associated with Prostate Cancer Progression

    Science.gov (United States)

    2012-10-01

    transcripts more closely, we performed 5’ and 3’ rapid amplification of cDNA ends (RACE) for PCAT-1 and PCAT-14. Interestingly, the PCAT-14 locus...Sequencing Core. RNA-ligase-mediated rapid amplification of cDNA ends (RACE) 5’ and 3’ RACE was performed using the GeneRacer RLM-RACE kit (Invitrogen

  20. Gene Discovery and Functional Analyses in the Model Plant Arabidopsis

    Institute of Scientific and Technical Information of China (English)

    Cai-Ping Feng; John Mundy

    2006-01-01

    The present mini-review describes newer methods and strategies, including transposon and T-DNA insertions, TILLING, Deleteagene, and RNA interference, to functionally analyze genes of interest in the model plant Arabidopsis. The relative advantages and disadvantages of the systems are also discussed.

  1. Gene Discovery and Functional Analyses in the Model Plant Arabidopsis

    DEFF Research Database (Denmark)

    Feng, Cai-ping; Mundy, J.

    2006-01-01

    The present mini-review describes newer methods and strategies, including transposon and T-DNA insertions, TILLING, Deleteagene, and RNA interference, to functionally analyze genes of interest in the model plant Arabidopsis. The relative advantages and disadvantages of the systems are also...

  2. Histamine receptors and antihistamines: from discovery to clinical applications.

    Science.gov (United States)

    Cataldi, Mauro; Borriello, Francesco; Granata, Francescopaolo; Annunziato, Lucio; Marone, Gianni

    2014-01-01

    The synthesis and the identification of histamine marked a milestone in both pharmacological and immunological research. Since Sir Henry Dale and Patrick Laidlaw described some of its physiological effects in vivo in 1910, histamine has been shown to play a key role in the control of gastric acid secretion and in allergic disorders. Using selective agonists and antagonists, as well as molecular biology tools, four histamine receptors (H1R, H2R, H3R and H4R) have been identified. The Nobel Prize in Physiology or Medicine was awarded to Daniel Bovet in 1957 for the discovery of antihistamines (anti-H1R) and to Sir James Black in 1988 for the identification of anti-H2R antagonists. Anti-H1R and anti-H2R histamine receptor antagonists have revolutionized the treatment of certain allergic disorders and gastric acid-related conditions, respectively. More recently, anti-H3R antagonists have entered early-phase clinical trials for possible application in obesity and a variety of neurologic disorders. The preferential expression of H4R by several immune cells and its involvement in the development of allergic inflammation provide the rationale for the use of anti-H4R antagonists in allergic and in other immune-related disorders.

  3. Africa: the next frontier for human disease gene discovery?

    Science.gov (United States)

    Ramsay, Michèle; Tiemessen, Caroline T; Choudhury, Ananyo; Soodyall, Himla

    2011-10-15

    The populations of Africa harbour the greatest human genetic diversity following an evolutionary history tracing its beginnings on the continent to time before the emergence of Homo sapiens. Signatures of selection are detectable as responses to ancient environments and cultural practices, modulated by more recent events including infectious epidemics, migrations, admixture and, of course, chance. The age of high-throughput biology is not passing Africa by. African-based cohort studies and networks with an African footprint are ideal springboards for disease-related genetic and genomic studies. Initiatives like HapMap, the 1000 Genomes Project, MalariaGEN, the INDEPTH network and Human Heredity and Health in Africa are catalysts to exploring African genetic diversity and its role in the spectrum from health to disease. The challenges are abundant in dissecting biological questions in the light of linguistic, cultural, geographic and political boundaries and their respective roles in shaping health-related profiles. Will studies based on African populations lead to a new wave of discovery of genetic contributors to disease?

  4. Improving functional modules discovery by enriching interaction networks with gene profiles

    KAUST Repository

    Salem, Saeed

    2013-05-01

    Recent advances in proteomic and transcriptomic technologies have resulted in the accumulation of vast amounts of high-throughput data that span multiple biological processes and characteristics in different organisms. Much of the data come in the form of interaction networks and mRNA expression arrays. An important task in systems biology is functional module discovery, where the goal is to uncover well-connected sub-networks (modules). These discovered modules help to unravel the underlying mechanisms of the observed biological processes. While most of the existing module discovery methods use only the interaction data, in this work we propose CLARM, which discovers biological modules by incorporating gene profile data with protein-protein interaction networks. We demonstrate the effectiveness of CLARM on Yeast and Human interaction datasets, and gene expression and molecular function profiles. Experiments on these real datasets show that the CLARM approach is competitive with well-established functional module discovery methods.

  5. Genomic discovery of potent chromatin insulators for human gene therapy.

    Science.gov (United States)

    Liu, Mingdong; Maurano, Matthew T; Wang, Hao; Qi, Heyuan; Song, Chao-Zhong; Navas, Patrick A; Emery, David W; Stamatoyannopoulos, John A; Stamatoyannopoulos, George

    2015-02-01

    Insertional mutagenesis and genotoxicity, which usually manifest as hematopoietic malignancy, represent major barriers to realizing the promise of gene therapy. Although insulator sequences that block transcriptional enhancers could mitigate or eliminate these risks, so far no human insulators with high functional potency have been identified. Here we describe a genomic approach for the identification of compact sequence elements that function as insulators. These elements are highly occupied by the insulator protein CTCF, are DNase I hypersensitive and represent only a small minority of the CTCF recognition sequences in the human genome. We show that the elements identified acted as potent enhancer blockers and substantially decreased the risk of tumor formation in a cancer-prone animal model. The elements are small, can be efficiently accommodated by viral vectors and have no detrimental effects on viral titers. The insulators we describe here are expected to increase the safety of gene therapy for genetic diseases.

  6. A Computer-Based Microarray Experiment Design-System for Gene-Regulation Pathway Discovery

    OpenAIRE

    2003-01-01

    This paper reports the methods and evaluation of a computer-based system that recommends microarray experimental design for biologists — causal discovery in Gene Expression data using Expected Value of Experimentation (GEEVE). The GEEVE system uses causal Bayesian networks and generates a decision tree for recommendations.

  7. TILLING in forage grasses for gene discovery and breeding improvement.

    Science.gov (United States)

    Manzanares, Chloe; Yates, Steven; Ruckle, Michael; Nay, Michelle; Studer, Bruno

    2016-09-25

    Mutation breeding has a long-standing history and in some major crop species, many of the most important cultivars have their origin in germplasm generated by mutation induction. For almost two decades, methods for TILLING (Targeting Induced Local Lesions IN Genomes) have been established in model plant species such as Arabidopsis (Arabidopsis thaliana L.), enabling the functional analysis of genes. Recent advances in mutation detection by second generation sequencing technology have brought its utility to major crop species. However, it has remained difficult to apply similar approaches in forage and turf grasses, mainly due to their outbreeding nature maintained by an efficient self-incompatibility system. Starting with a description of the extent to which traditional mutagenesis methods have contributed to crop yield increase in the past, this review focuses on technological approaches to implement TILLING-based strategies for the improvement of forage grass breeding through forward and reverse genetics. We present first results from TILLING in allogamous forage grasses for traits such as stress tolerance and evaluate prospects for rapid implementation of beneficial alleles to forage grass breeding. In conclusion, large-scale induced mutation resources, used for forward genetic screens, constitute a valuable tool to increase the genetic diversity for breeding and can be generated with relatively small investments in forage grasses. Furthermore, large libraries of sequenced mutations can be readily established, providing enhanced opportunities to discover mutations in genes controlling traits of agricultural importance and to study gene functions by reverse genetics.

  8. Network-based gene prediction for Plasmodium falciparum malaria towards genetics-based drug discovery.

    Science.gov (United States)

    Chen, Yang; Xu, Rong

    2015-01-01

    Malaria is the most deadly parasitic infectious disease. Existing drug treatments have limited efficacy in malaria elimination, and the complex pathogenesis of the disease is not fully understood. Detecting novel malaria-associated genes not only contributes to revealing the disease pathogenesis, but also facilitates discovering new targets for anti-malaria drugs. In this study, we developed a network-based approach to predict malaria-associated genes. We constructed a cross-species network to integrate human-human, parasite-parasite and human-parasite protein interactions. Then we extended the random walk algorithm on this network, and used known malaria genes as the seeds to find novel candidate genes for malaria. We validated our algorithms using 77 known malaria genes: 14 human genes and 63 parasite genes were ranked on average within the top 2% and top 4% of the human and parasite genomes, respectively. We also evaluated our method for predicting novel malaria genes using a set of 27 genes with literature supporting evidence. Our approach ranked 12 genes within the top 1% and 24 genes within the top 5%. In addition, we demonstrated that top-ranked candidate genes were enriched for drug targets, and identified commonalities underlying top-ranked malaria genes through pathway analysis. In summary, the candidate malaria-associated genes predicted by our data-driven approach have the potential to guide genetics-based anti-malaria drug discovery.
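
    Seed-based ranking by random walk, as described in the abstract above, can be sketched as follows. This is a generic random walk with restart on an undirected graph, not the authors' extended cross-species algorithm; the function name, the restart probability of 0.3 and the toy network are assumptions made for illustration. The sketch also assumes every node has at least one neighbour.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score nodes by proximity to seed nodes.

    adj   : dict mapping node -> list of neighbours (undirected, no isolated nodes)
    seeds : set of seed nodes (e.g. known disease genes)
    """
    nodes = sorted(adj)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            # Each node passes its walking mass equally to its neighbours.
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy path network A - B - C - D with A as the only seed (hypothetical data).
toy = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = random_walk_with_restart(toy, {"A"})
ranking = sorted((n for n in toy if n != "A"), key=lambda n: -scores[n])
print(ranking)
```

    Non-seed nodes are then ranked by their steady-state score, so nodes that are well connected to the seeds rise to the top of the candidate list.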

  9. Cross-pollination of research findings, although uncommon, may accelerate discovery of human disease genes

    Directory of Open Access Journals (Sweden)

    Duda Marlena

    2012-11-01

    Background: Technological leaps in genome sequencing have resulted in a surge in discovery of human disease genes. These discoveries have led to increased clarity on the molecular pathology of disease and have also demonstrated considerable overlap in the genetic roots of human diseases. In light of this large genetic overlap, we tested whether cross-disease research approaches lead to faster, more impactful discoveries. Methods: We leveraged several gene-disease association databases to calculate a Mutual Citation Score (MCS) for 10,853 pairs of genetically related diseases to measure the frequency of cross-citation between research fields. To assess the importance of cooperative research, we computed an Individual Disease Cooperation Score (ICS) and the average publication rate for each disease. Results: For all disease pairs with one gene in common, we found that the degree of genetic overlap was a poor predictor of cooperation (r² = 0.3198) and that the vast majority of disease pairs (89.56%) never cited previous discoveries of the same gene in a different disease, irrespective of the level of genetic similarity between the diseases. A fraction (0.25%) of the pairs demonstrated cross-citation in greater than 5% of their published genetic discoveries and 0.037% cross-referenced discoveries more than 10% of the time. We found strong positive correlations between ICS and publication rate (r² = 0.7931), and an even stronger correlation between the publication rate and the number of cross-referenced diseases (r² = 0.8585). These results suggested that cross-disease research may have the potential to yield novel discoveries at a faster pace than singular disease research. Conclusions: Our findings suggest that the frequency of cross-disease study is low despite the high level of genetic similarity among many human diseases, and that collaborative methods may accelerate and increase the impact of new genetic discoveries. Until we have a better

  10. Context-driven discovery of gene cassettes in mobile integrons using a computational grammar

    Directory of Open Access Journals (Sweden)

    Schaeffer Jaron

    2009-09-01

    Background: Gene discovery algorithms typically examine sequence data for low level patterns. A novel method to computationally discover higher order DNA structures is presented, using a context sensitive grammar. The algorithm was applied to the discovery of gene cassettes associated with integrons. The discovery and annotation of antibiotic resistance genes in such cassettes is essential for effective monitoring of antibiotic resistance patterns and formulation of public health antibiotic prescription policies. Results: We discovered two new putative gene cassettes using the method, from 276 integron features and 978 GenBank sequences. The system achieved κ = 0.972 annotation agreement with an expert gold standard of 300 sequences. In rediscovery experiments, we deleted 789,196 cassette instances over 2030 experiments and correctly relabelled 85.6% (α ≥ 95%, E ≤ 1%, mean sensitivity = 0.86, specificity = 1, F-score = 0.93), with no false positives. Error analysis demonstrated that for 72,338 missed deletions, two adjacent deleted cassettes were labeled as a single cassette, increasing performance to 94.8% (mean sensitivity = 0.92, specificity = 1, F-score = 0.96). Conclusion: Using grammars we were able to represent heuristic background knowledge about large and complex structures in DNA. Importantly, we were also able to use the context embedded in the model to discover new putative antibiotic resistance gene cassettes. The method is complementary to existing automatic annotation systems which operate at the sequence level.

  11. Metagenomics and novel gene discovery: promise and potential for novel therapeutics.

    Science.gov (United States)

    Culligan, Eamonn P; Sleator, Roy D; Marchesi, Julian R; Hill, Colin

    2014-04-01

    Metagenomics provides a means of assessing the total genetic pool of all the microbes in a particular environment, in a culture-independent manner. It has revealed unprecedented diversity in microbial community composition, which is further reflected in the encoded functional diversity of the genomes, a large proportion of which consists of novel genes. Herein, we review both sequence-based and functional metagenomic methods to uncover novel genes and outline some of the associated problems of each type of approach, as well as potential solutions. Furthermore, we discuss the potential for metagenomic biotherapeutic discovery, with a particular focus on the human gut microbiome and finally, we outline how the discovery of novel genes may be used to create bioengineered probiotics.

  12. Functional Gene Discovery and Characterization of Genes and Alleles Affecting Wood Biomass Yield and Quality in Populus

    Energy Technology Data Exchange (ETDEWEB)

    Busov, Victor [Michigan Technological Univ., Houghton, MI (United States)

    2017-02-12

    Adoption of biofuels as an economically and environmentally viable alternative to fossil fuels would require development of specialized bioenergy varieties. A major goal in the breeding of such varieties is the improvement of lignocellulosic biomass yield and quality. These are complex traits and understanding the underpinning molecular mechanism can assist and accelerate their improvement. This is particularly important for tree bioenergy crops like poplars (species and hybrids from the genus Populus), for which breeding progress is extremely slow due to long generation cycles. A variety of approaches have already been undertaken to better understand the molecular bases of biomass yield and quality in poplar. An obvious void in these undertakings has been the application of mutagenesis. Mutagenesis has been instrumental in the discovery and characterization of many plant traits, including those that affect biomass yield and quality. In this proposal we use activation tagging to discover genes that can significantly affect biomass-associated traits directly in poplar, a premier bioenergy crop. We screened a population of 5,000 independent poplar activation tagging lines under greenhouse conditions for a battery of biomass yield traits. These same plants were then analyzed for changes in wood chemistry using pyMBMS. As a result of these screens we have identified nearly 800 mutants, which are significantly (P<0.05) different when compared to wild type. Of these, the majority (~700) are affected in one of ten different biomass yield traits and 100 in biomass quality traits (e.g., lignin, S/G ratio and C6/C5 sugars). We successfully recovered the position of the tag in approximately 130 lines, showed activation in nearly half of them and performed recapitulation experiments with 20 genes prioritized by the significance of the phenotype. Recapitulation experiments are still ongoing for many of the genes but the results are encouraging. For example, we have shown successful

  13. A New Algorithm of Service Discovery Based on DHT for Mobile Application

    Directory of Open Access Journals (Sweden)

    De-gan Zhang

    2011-10-01

    To improve discovery efficiency and coverage, we put forward a new service discovery algorithm for mobile applications based on DHT (Distributed Hash Table) and Small-World Theory. In the traditional DHT discovery algorithm, each node maintains a finger table that stores information about adjacent nodes. Using Small-World Theory, we add a remote node to the finger table, together with the corresponding remote index. Rather than selecting the remote connection node randomly, we select it by a calculation at the local node, which assures the coverage of service discovery without increasing the length of the finger table, and which simplifies the calculation and maintenance of the finger table. The simulation showed that the algorithm can effectively reduce the path length of service discovery, improve success rate of service discovery
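
    The idea of adding a deterministically chosen long-range link can be illustrated with a toy ring overlay. This is only a sketch under stated assumptions, not the authors' algorithm: the names `ring_distance`, `build_tables` and `route` are hypothetical, the "half-way around the ring" rule stands in for the unspecified local calculation, and this toy simply appends one entry per node rather than reproducing how the paper keeps the finger table length unchanged.

```python
def ring_distance(a, b, size):
    """Shortest distance between two positions on a ring of `size` nodes."""
    d = abs(a - b) % size
    return min(d, size - d)


def build_tables(size, with_remote):
    """Give every node its two adjacent nodes and, optionally, one remote node."""
    tables = {}
    for n in range(size):
        links = [(n - 1) % size, (n + 1) % size]
        if with_remote:
            # Assumption: the remote link is computed from the local node id
            # (here, the node half-way around the ring) instead of at random.
            links.append((n + size // 2) % size)
        tables[n] = links
    return tables


def route(tables, src, dst, size):
    """Greedy routing: forward to the known node closest to the destination."""
    hops, cur = 0, src
    while cur != dst:
        cur = min(tables[cur], key=lambda x: ring_distance(x, dst, size))
        hops += 1
    return hops
```

    On a 64-node ring, `route(build_tables(64, False), 0, 32, 64)` walks neighbor by neighbor, whereas the table with a remote entry reaches the same destination in a single hop, which is the path-length effect the abstract describes.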

  14. Implementation of BacMam virus gene delivery technology in a drug discovery setting.

    Science.gov (United States)

    Kost, Thomas A; Condreay, J Patrick; Ames, Robert S; Rees, Stephen; Romanos, Michael A

    2007-05-01

    Membrane protein targets constitute a key segment of drug discovery portfolios and significant effort has gone into increasing the speed and efficiency of pursuing these targets. However, issues still exist in routine gene expression and stable cell-based assay development for membrane proteins, which are often multimeric or toxic to host cells. To enhance cell-based assay capabilities, modified baculovirus (BacMam virus) gene delivery technology has been successfully applied to the transient expression of target proteins in mammalian cells. Here, we review the development, full implementation and benefits of this platform-based gene expression technology in support of SAR and HTS assays across GlaxoSmithKline.

  15. Pine Gene Discovery Project - Final Report - 08/31/1997 - 02/28/2001

    Energy Technology Data Exchange (ETDEWEB)

    Whetten, R. W.; Sederoff, R. R.; Kinlaw, C.; Retzel, E.

    2001-04-30

    Integration of pines into the large scope of plant biology research depends on study of pines in parallel with study of annual plants, and on availability of research materials from pine to plant biologists interested in comparing pine with annual plant systems. The objectives of the Pine Gene Discovery Project were to obtain 10,000 partial DNA sequences of genes expressed in loblolly pine, to determine which of those pine genes were similar to known genes from other organisms, and to make the DNA sequences and isolated pine genes available to plant researchers to stimulate integration of pines into the wider scope of plant biology research. Those objectives have been completed, and the results are available to the public. Requests for pine genes have been received from a number of laboratories that would otherwise not have included pine in their research, indicating that progress is being made toward the goal of integrating pine research into the larger molecular biology research community.

  16. Gene-based SNP discovery and genetic mapping in pea.

    Science.gov (United States)

    Sindhu, Anoop; Ramsay, Larissa; Sanderson, Lacey-Anne; Stonehouse, Robert; Li, Rong; Condie, Janet; Shunmugam, Arun S K; Liu, Yong; Jha, Ambuj B; Diapari, Marwan; Burstin, Judith; Aubert, Gregoire; Tar'an, Bunyamin; Bett, Kirstin E; Warkentin, Thomas D; Sharpe, Andrew G

    2014-10-01

    Gene-based SNPs were identified and mapped in pea using five recombinant inbred line populations segregating for traits of agronomic importance. Pea (Pisum sativum L.) is one of the world's oldest domesticated crops and has been a model system in plant biology and genetics since the work of Gregor Mendel. Pea is the second most widely grown pulse crop in the world following common bean. The importance of pea as a food crop is growing due to its combination of moderate protein concentration, slowly digestible starch, high dietary fiber concentration, and its richness in micronutrients; however, pea has lagged behind other major crops in harnessing recent advances in molecular biology, genomics and bioinformatics, partly due to its large genome size with a large proportion of repetitive sequence, and to the relatively limited investment in research in this crop globally. The objective of this research was the development of a genome-wide transcriptome-based pea single-nucleotide polymorphism (SNP) marker platform using next-generation sequencing technology. A total of 1,536 polymorphic SNP loci selected from over 20,000 non-redundant SNPs identified using deep transcriptome sequencing of eight diverse Pisum accessions were used for genotyping in five RIL populations using an Illumina GoldenGate assay. The first high-density pea SNP map defining all seven linkage groups was generated by integrating with previously published anchor markers. Syntenic relationships of this map with the model legume Medicago truncatula and lentil (Lens culinaris Medik.) maps were established. The genic SNP map establishes a foundation for future molecular breeding efforts by enabling both the identification and tracking of introgression of genomic regions harbouring QTLs related to agronomic and seed quality traits.

  17. Expert-Guided Subgroup Discovery: Methodology and Application

    CERN Document Server

    Gamberger, D; 10.1613/jair.1089

    2011-01-01

    This paper presents an approach to expert-guided subgroup discovery. The main step of the subgroup discovery process, the induction of subgroup descriptions, is performed by a heuristic beam search algorithm, using a novel parametrized definition of rule quality which is analyzed in detail. The other important steps of the proposed subgroup discovery process are the detection of statistically significant properties of selected subgroups and subgroup visualization: statistically significant properties are used to enrich the descriptions of induced subgroups, while the visualization shows subgroup properties in the form of distributions of the numbers of examples in the subgroups. The approach is illustrated by the results obtained for a medical problem of early detection of patient risk groups.
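
    A beam search over rule descriptions with a parametrized quality measure can be sketched as follows. The abstract does not state the quality formula, so the form `TP / (FP + g)` used here is an assumption (a generalization parameter `g` trading precision against coverage), and all names (`covers`, `quality`, `beam_search`, the row layout with a `"target"` key) are hypothetical.

```python
def covers(rule, row):
    """A rule is a set of (attribute, value) conditions that must all hold."""
    return all(row[attr] == val for attr, val in rule)


def quality(rule, rows, g):
    """Assumed rule quality: true positives over (false positives + g)."""
    tp = sum(1 for r in rows if r["target"] and covers(rule, r))
    fp = sum(1 for r in rows if not r["target"] and covers(rule, r))
    return tp / (fp + g)


def beam_search(rows, conditions, g, beam_width=3, max_len=2):
    """Grow rules one condition at a time, keeping the best `beam_width`."""
    beam = [frozenset()]
    best = beam[0]
    for _ in range(max_len):
        candidates = {rule | {c} for rule in beam for c in conditions
                      if c not in rule}
        if not candidates:
            break
        beam = sorted(candidates, key=lambda r: quality(r, rows, g),
                      reverse=True)[:beam_width]
        if quality(beam[0], rows, g) > quality(best, rows, g):
            best = beam[0]
    return best
```

    With a small `g` the search tends to favor narrow, precise subgroups; a larger `g` tolerates more false positives and tends to return more general descriptions, which is where an expert could steer the result.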

  18. [Knowledge discovery in database and its application in clinical diagnosis].

    Science.gov (United States)

    Lui, Hui; Qiu, Tianshuang

    2004-08-01

    The tremendous amount of data now available far exceeds human capacity for comprehension, particularly in medical databases, and traditional statistical techniques are no longer adequate for analyzing this vast collection of data. Knowledge discovery in databases and data mining play an important role in analyzing data and uncovering important data patterns. This paper briefly presents the concepts of knowledge discovery in databases and data mining, then describes rough set theory and gives some examples based on rough sets.

  19. Gene set-based module discovery in the breast cancer transcriptome

    Directory of Open Access Journals (Sweden)

    Zhang Michael Q

    2009-02-01

    Abstract Background Although microarray-based studies have revealed a global view of gene expression in cancer cells, we still have little knowledge about regulatory mechanisms underlying the transcriptome. Several computational methods applied to yeast data have recently succeeded in identifying expression modules, which are defined as co-expressed gene sets under common regulatory mechanisms. However, such module discovery methods have not been applied to cancer transcriptome data. Results In order to decode oncogenic regulatory programs in cancer cells, we developed a novel module discovery method termed EEM by extending a previously reported module discovery method, and applied it to breast cancer expression data. Starting from seed gene sets prepared based on cis-regulatory elements, ChIP-chip data, and gene locus information, EEM identified 10 principal expression modules in breast cancer based on their expression coherence. Moreover, EEM depicted their activity profiles, which predict regulatory programs in each subtype of breast tumors. For example, our analysis revealed that the expression module regulated by the Polycomb repressive complex 2 (PRC2) is downregulated in triple negative breast cancers, suggesting similarity of transcriptional programs between stem cells and aggressive breast cancer cells. We also found that the activity of the PRC2 expression module is negatively correlated to the expression of EZH2, a component of PRC2 which belongs to the E2F expression module. E2F-driven EZH2 overexpression may be responsible for the repression of the PRC2 expression modules in triple negative tumors. Furthermore, our network analysis predicts regulatory circuits in breast cancer cells. Conclusion These results demonstrate that the gene set-based module discovery approach is a powerful tool to decode regulatory programs in cancer cells.

  20. Literature-Based Discovery of IFN-γ and Vaccine-Mediated Gene Interaction Networks

    Directory of Open Access Journals (Sweden)

    Arzucan Özgür

    2010-01-01

    Interferon-gamma (IFN-γ) regulates various immune responses that are often critical for vaccine-induced protection. In order to annotate the IFN-γ-related gene interaction network from a large amount of IFN-γ research reported in the literature, a literature-based discovery approach was applied with a combination of natural language processing (NLP) and network centrality analysis. The interaction network of human IFN-γ (Gene symbol: IFNG) and its vaccine-specific subnetwork were automatically extracted using abstracts from all articles in PubMed. Four network centrality metrics were further calculated to rank the genes in the constructed networks. The resulting generic IFNG network contains 1060 genes and 26313 interactions among these genes. The vaccine-specific subnetwork contains 102 genes and 154 interactions. Fifty-six genes such as TNF, NFKB1, IL2, IL6, and MAPK8 were ranked among the top 25 by at least one of the centrality methods in one or both networks. Gene enrichment analysis indicated that these genes were classified in various immune mechanisms such as response to extracellular stimulus, lymphocyte activation, and regulation of apoptosis. Literature evidence was manually curated for the IFN-γ relatedness of 56 genes and vaccine development relatedness for 52 genes. This study also generated many new hypotheses worth further experimental studies.
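
    Ranking genes by a centrality score can be sketched with degree centrality, the simplest such metric. The abstract does not name its four metrics, so degree centrality is an assumption here, and the function names and the placeholder node labels in the usage note are hypothetical (they are not extracted interactions).

```python
from collections import defaultdict


def degree_centrality(edges):
    """Fraction of the other nodes that each node is directly linked to."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    return {node: len(nb) / (n - 1) for node, nb in neighbors.items()}


def top_ranked(scores, top=3):
    """Node names sorted by descending score, ties broken alphabetically."""
    return sorted(scores, key=lambda k: (-scores[k], k))[:top]
```

    For a toy edge list such as `[("hub", "a"), ("hub", "b"), ("hub", "c"), ("a", "b")]`, the node `hub` touches every other node and is ranked first; taking the union of the top-ranked lists across several metrics would give a candidate set in the spirit of the "top 25 by at least one method" criterion.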

  1. Service-oriented discovery of knowledge : foundations, implementations and applications

    NARCIS (Netherlands)

    Bruin, Jeroen Sebastiaan de

    2010-01-01

    In this thesis we will investigate how a popular new way of distributed computing called service orientation can be used within the field of Knowledge Discovery. We critically investigate its principles and present models for developing within this paradigm. We then apply this model to create a web

  2. Some Applications of Fourier's Great Discovery for Beginners

    Science.gov (United States)

    Kraftmakher, Yaakov

    2012-01-01

    Nearly two centuries ago, Fourier discovered that any periodic function of period T can be presented as a sum of sine waveforms of frequencies equal to an integer times the fundamental frequency ω = 2π/T (Fourier's series). It is impossible to overestimate the importance of Fourier's discovery, and all physics or engineering students…
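
    As a small numerical illustration of the statement above, the partial sums of the standard Fourier series of a unit square wave, (4/π) Σ sin(kωt)/k over odd k, approach ±1 as terms are added. The function name is hypothetical and the square wave is chosen here only as a familiar example; it is not taken from the article.

```python
import math


def square_wave_partial_sum(t, period, n_terms):
    """Sum the first `n_terms` odd harmonics of a unit square wave at time t."""
    omega = 2 * math.pi / period  # fundamental angular frequency
    total = 0.0
    for j in range(n_terms):
        k = 2 * j + 1  # only odd multiples of omega contribute
        total += math.sin(k * omega * t) / k
    return 4 / math.pi * total
```

    Evaluating at a quarter period gives values close to +1, and at three quarters of a period values close to -1; near the jumps the partial sums overshoot (the Gibbs phenomenon), so convergence there is slower.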

  3. A comparison of methods for data-driven cancer outlier discovery, and an application scheme to semisupervised predictive biomarker discovery.

    Science.gov (United States)

    Karrila, Seppo; Lee, Julian Hock Ean; Tucker-Kellogg, Greg

    2011-04-18

    A core component in translational cancer research is biomarker discovery using gene expression profiling for clinical tumors. This is often based on cell line experiments; one population is sampled for inference in another. We disclose a semisupervised workflow focusing on binary (switch-like, bimodal) informative genes that are likely cancer relevant, to mitigate this non-statistical problem. Outlier detection is a key enabling technology of the workflow, and aids in identifying the focus genes. We compare outlier detection techniques MOST, LSOSS, COPA, ORT, OS, and t-test, using a publicly available NSCLC dataset. Removing genes with Gaussian distribution is computationally efficient and matches MOST particularly well, while COPA and OS also pick prognostically relevant genes in their top ranks. Our stability assessment also favours both MOST and COPA; the latter does not pair well with prefiltering for non-Gaussianity, but can handle data sets lacking non-cancer cases. We provide R code for replicating our approach or extending it.
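
    One of the compared techniques, COPA, can be sketched in a few lines. This follows the commonly described recipe (center each gene by its median, scale by the median absolute deviation, then read off a high percentile) and is an assumption-laden illustration, not the authors' R code; the function name, the percentile index rule and the default of 0.9 are hypothetical choices.

```python
import statistics


def copa_score(values, percentile=0.9):
    """Median/MAD-scaled expression value at a high percentile of the samples."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return 0.0  # degenerate gene: no spread to scale by
    scaled = sorted((v - med) / mad for v in values)
    # Simple index-based percentile; real implementations may interpolate.
    idx = min(len(scaled) - 1, int(percentile * len(scaled)))
    return scaled[idx]
```

    A gene whose expression is flat across samples gets a small score, while a gene with a switch-like subgroup of highly expressing samples gets a large one, so sorting genes by this score brings outlier-profile genes to the top.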

  4. From mouse to humans: discovery of the CACNG2 pain susceptibility gene.

    Science.gov (United States)

    Nissenbaum, J

    2012-10-01

    Chronic pain is a major healthcare problem affecting the daily lives of millions with enormous financial costs. The notorious variability of pain and the lack of efficient pain relief pharmaceuticals pose both genetic and therapeutic challenges. There are several genetic approaches that aim to dissect pain phenotypes into their genetic components. Gene mapping using model organisms for various pain phenotypes has led to the identification of novel genes affecting susceptibility and response to pain stimuli. Translational studies have succeeded in tying those genes to human pain syndromes, thus suggesting new targets for drug discovery. In this short review, a perspective on pain genetics and the trajectory from pain phenotype to pain gene involving fine-mapping strategies, bioinformatic analysis and microarray profiling alongside human association analysis will be introduced. This integrated approach has led to identification of CACNG2 as a novel neuropathic pain gene affecting pain susceptibility both in mice and humans. It also serves as a prototype for efficient and economic discovery of pain genes. Comparisons to other methods as well as future directions of pain genetics will be discussed as well.

  5. Applications of structure-based design to antibacterial drug discovery.

    Science.gov (United States)

    Cain, Ricky; Narramore, Sarah; McPhillie, Martin; Simmons, Katie; Fishwick, Colin W G

    2014-08-01

    In recent years bacterial resistance has been observed against many of our current antibiotics, for instance most worryingly against the cephalosporins which are typically the last line of defence against many bacterial infections. Additionally the failure of high throughput screening in the discovery of new antibacterial drug leads has led to a decline in the number of antibacterial agents reaching the market. Alternative methods of drug discovery including structure based drug design are needed to meet the threats caused by the emergence of resistance. In this review we explore the latest advancements in the identification of new antibacterial agents through the use of a number of structure based drug design programs.

  6. Response-Guided Community Detection: Application to Climate Index Discovery

    Energy Technology Data Exchange (ETDEWEB)

    Bello, Gonzalo [North Carolina State University (NCSU), Raleigh; Angus, Michael [North Carolina State University (NCSU), Raleigh; Pedemane, Navya [North Carolina State University (NCSU), Raleigh; Harlalka, Jitendra [North Carolina State University (NCSU), Raleigh; Semazzi, Fredrick [North Carolina State University (NCSU), Raleigh; Kumar, Vipin [University of Minnesota; Samatova, Nagiza F [ORNL

    2015-01-01

    Discovering climate indices, that is, time series that summarize spatiotemporal climate patterns, is a key task in the climate science domain. In this work, we approach this task as a problem of response-guided community detection; that is, identifying communities in a graph associated with a response variable of interest. To this end, we propose a general strategy for response-guided community detection that explicitly incorporates information of the response variable during the community detection process, and introduce a graph representation of spatiotemporal data that leverages information from multiple variables. We apply our proposed methodology to the discovery of climate indices associated with seasonal rainfall variability. Our results suggest that our methodology is able to capture the underlying patterns known to be associated with the response variable of interest and to improve its predictability compared to existing methodologies for data-driven climate index discovery and official forecasts.

  7. Transcriptome profiling for discovery of genes involved in shoot apical meristem and flower development

    Directory of Open Access Journals (Sweden)

    Vikash K. Singh

    2014-12-01

    Flower development is one of the major developmental processes that governs seed setting in angiosperms. However, little is known about the molecular mechanisms underlying flower development in legumes. Employing RNA-seq for various stages of flower development and a few vegetative tissues in chickpea, we identified differentially expressed genes in flower tissues/stages in comparison to vegetative tissues, which are related to various biological processes and molecular functions during flower development. Here, we provide details of experimental methods, RNA-seq data (available at Gene Expression Omnibus database under GSE42679) and analysis pipeline published by Singh and colleagues in the Plant Biotechnology Journal (Singh et al., 2013), along with additional analysis for discovery of genes involved in shoot apical meristem (SAM) development. Our data provide a resource for exploring the complex molecular mechanisms underlying SAM and flower development and identification of gene targets for functional and applied genomics in legumes.

  8. Discovery of the faithfulness gene: a model of transmission and transformation of scientific information.

    Science.gov (United States)

    Green, Eva G T; Clémence, Alain

    2008-09-01

    The purpose of this paper is to study the diffusion and transformation of scientific information in everyday discussions. Based on rumour models and social representations theory, the impact of interpersonal communication and pre-existing beliefs on transmission of the content of a scientific discovery was analysed. In three experiments, a communication chain was simulated to investigate how laypeople make sense of a genetic discovery first published in a scientific outlet, then reported in a mainstream newspaper and finally discussed in groups. Study 1 (N=40) demonstrated a transformation of information when the scientific discovery moved along the communication chain. During successive narratives, scientific expert terminology disappeared while scientific information associated with lay terminology persisted. Moreover, the idea of a discovery of a faithfulness gene emerged. Study 2 (N=70) revealed that transmission of the scientific message varied as a function of attitudes towards genetic explanations of behaviour (pro-genetics vs. anti-genetics). Pro-genetics employed more scientific terminology than anti-genetics. Study 3 (N=75) showed that endorsement of genetic explanations was related to descriptive accounts of the scientific information, whereas rejection of genetic explanations was related to evaluative accounts of the information.

  9. Application of lean manufacturing concepts to drug discovery: rapid analogue library synthesis.

    Science.gov (United States)

    Weller, Harold N; Nirschl, David S; Petrillo, Edward W; Poss, Michael A; Andres, Charles J; Cavallaro, Cullen L; Echols, Martin M; Grant-Young, Katherine A; Houston, John G; Miller, Arthur V; Swann, R Thomas

    2006-01-01

    The application of parallel synthesis to lead optimization programs in drug discovery has been an ongoing challenge since the first reports of library synthesis. A number of approaches to the application of parallel array synthesis to lead optimization have been attempted over the years, ranging from widespread deployment by (and support of) individual medicinal chemists to centralization as a service by an expert core team. This manuscript describes our experience with the latter approach, which was undertaken as part of a larger initiative to optimize drug discovery. In particular, we highlight how concepts taken from the manufacturing sector can be applied to drug discovery and parallel synthesis to improve the timeliness and thus the impact of arrays on drug discovery.

  10. Weighted gene co-expression based biomarker discovery for psoriasis detection.

    Science.gov (United States)

    Sundarrajan, Sudharsana; Arumugam, Mohanapriya

    2016-11-15

    Psoriasis is a chronic inflammatory disease of the skin with an unknown aetiology. The disease manifests itself as red and silvery scaly plaques distributed over the scalp, lower back and extensor aspects of the limbs. After receiving scant consideration for quite a few years, psoriasis has now become a prominent focus for new drug development. A group of closely connected and differentially co-expressed genes may act in a network and may serve as molecular signatures for an underlying phenotype. Weighted gene coexpression network analysis (WGCNA), a systems biology approach, was used to identify new molecular targets for psoriasis. Gene coexpression relationships were investigated in 58 psoriatic lesional samples, resulting in five gene modules clustered on the basis of gene coexpression patterns. The coexpression pattern was validated using three psoriatic datasets. Ten highly connected and informative genes from each module were selected and termed psoriasis-specific hub signatures. A random forest based binary classifier built using the expression profiles of signature genes robustly distinguished psoriatic samples from the normal samples in the validation set with an accuracy of 0.95 to 1. These signature genes may serve as potential candidates for biomarker discovery leading to new therapeutic targets. WGCNA, a network-based approach, has provided an alternative path to mine out key controllers and drivers of psoriasis. The study principle from the current work can be extended to other pathological conditions.
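
    The "weighted" part of a coexpression network can be illustrated with the usual unsigned soft-threshold adjacency, |cor(x_i, x_j)|^β, and a gene's connectivity as the sum of its adjacencies. This is a minimal sketch, not the study's pipeline: the function names, the dictionary layout and the default β = 6 are assumptions, and module detection and the random forest step are omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_adjacency(expr, beta=6):
    """Unsigned weighted adjacency |r|**beta for every ordered gene pair."""
    genes = list(expr)
    return {(g, h): abs(pearson(expr[g], expr[h])) ** beta
            for g in genes for h in genes if g != h}


def connectivity(adj, gene):
    """Sum of a gene's adjacencies; high values mark candidate hub genes."""
    return sum(w for (g, _), w in adj.items() if g == gene)
```

    Raising correlations to a power suppresses weak, noisy correlations while keeping strong ones close to 1, so genes with many strong partners stand out by connectivity; picking the top few per module mirrors the idea of hub signatures.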

  11. Ontological Discovery Environment: a system for integrating gene-phenotype associations.

    Science.gov (United States)

    Baker, Erich J; Jay, Jeremy J; Philip, Vivek M; Zhang, Yun; Li, Zuopan; Kirova, Roumyana; Langston, Michael A; Chesler, Elissa J

    2009-12-01

    The wealth of genomic technologies has enabled biologists to rapidly ascribe phenotypic characters to biological substrates. Central to effective biological investigation is the operational definition of the process under investigation. We propose an elucidation of categories of biological characters, including disease relevant traits, based on natural endogenous processes and experimentally observed biological networks, pathways and systems rather than on externally manifested constructs and current semantics such as disease names and processes. The Ontological Discovery Environment (ODE) is an Internet accessible resource for the storage, sharing, retrieval and analysis of phenotype-centered genomic data sets across species and experimental model systems. Any type of data set representing gene-phenotype relationships, such as quantitative trait loci (QTL) positional candidates, literature reviews, microarray experiments, ontological or even meta-data, may serve as inputs. To demonstrate a use case leveraging the homology capabilities of ODE and its ability to synthesize diverse data sets, we conducted an analysis of genomic studies related to alcoholism. The core of ODE's gene set similarity, distance and hierarchical analysis is the creation of a bipartite network of gene-phenotype relations, a unique discrete graph approach to analysis that enables set-set matching of non-referential data. Gene sets are annotated with several levels of metadata, including community ontologies, while gene set translations compare models across species. Computationally derived gene sets are integrated into hierarchical trees based on gene-derived phenotype interdependencies. Automated set identifications are augmented by statistical tools which enable users to interpret the confidence of modeled results. This approach allows data integration and hypothesis discovery across multiple experimental contexts, regardless of the face similarity and semantic annotation of the experimental
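
    Set-set matching over a bipartite gene-phenotype structure can be sketched as follows. The abstract does not say which similarity measure ODE uses, so the Jaccard index here is an assumption, and the function names and the gene-set labels in the usage note are hypothetical.

```python
def jaccard(a, b):
    """Shared genes divided by all genes appearing in either set."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def gene_to_sets(gene_sets):
    """Bipartite view: map each gene to the names of the sets containing it."""
    index = {}
    for name, genes in gene_sets.items():
        for gene in genes:
            index.setdefault(gene, set()).add(name)
    return index
```

    Given, say, a QTL candidate list and a microarray-derived list stored as `{"qtl": {...}, "array": {...}}`, `jaccard` scores their overlap without needing any shared annotation vocabulary, and `gene_to_sets` shows which genes connect otherwise unrelated experiments.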

  12. Deep data: discovery and visualization Application to hyperspectral ALMA imagery

    Science.gov (United States)

    Merényi, Erzsébet; Taylor, Joshua; Isella, Andrea

    2017-06-01

    Leading-edge telescopes such as the Atacama Large Millimeter and sub-millimeter Array (ALMA), and near-future ones, are capable of imaging the same sky area at hundreds-to-thousands of frequencies with both high spectral and spatial resolution. This provides unprecedented opportunities for discovery about the spatial, kinematical and compositional structure of sources such as molecular clouds or protoplanetary disks, and more. However, in addition to enormous volume, the data also exhibit unprecedented complexity, mandating new approaches for extracting and summarizing relevant information. Traditional techniques such as examining images at selected frequencies become intractable while tools that integrate data across frequencies or pixels (like moment maps) can no longer fully exploit and visualize the rich information. We present a neural map-based machine learning approach that can handle all spectral channels simultaneously, utilizing the full depth of these data for discovery and visualization of spectrally homogeneous spatial regions (spectral clusters) that characterize distinct kinematic behaviors. We demonstrate the effectiveness on an ALMA image cube of the protoplanetary disk HD142527. The tools we collectively name "NeuroScope" are efficient for "Big Data" due to intelligent data summarization that results in significant sparsity and noise reduction. We also demonstrate a new approach to automate our clustering for fast distillation of large data cubes.

  13. Gene expression, single nucleotide variant and fusion transcript discovery in archival material from breast tumors.

    Directory of Open Access Journals (Sweden)

    Nadine Norton

    Advantages of RNA-Seq over array based platforms are quantitative gene expression and discovery of expressed single nucleotide variants (eSNVs) and fusion transcripts from a single platform, but the sensitivity for each of these characteristics is unknown. We measured gene expression in a set of manually degraded RNAs, nine pairs of matched fresh-frozen, and FFPE RNA isolated from breast tumor with the hybridization-based NanoString nCounter (226-gene panel) and with whole transcriptome RNA-Seq using RiboZeroGold ScriptSeq V2 library preparation kits. We performed correlation analyses of gene expression between samples and across platforms. We then specifically assessed whole transcriptome expression of lincRNA and discovery of eSNVs and fusion transcripts in the FFPE RNA-Seq data. For gene expression in the manually degraded samples, we observed Pearson correlations of >0.94 and >0.80 with NanoString and ScriptSeq protocols, respectively. Gene expression data for matched fresh-frozen and FFPE samples yielded mean Pearson correlations of 0.874 and 0.783 for NanoString (226 genes) and ScriptSeq whole transcriptome protocols, respectively (p < 2 × 10^-16). Specifically for lincRNAs, we observed superb Pearson correlation (0.988) between matched fresh-frozen and FFPE pairs. FFPE samples across NanoString and RNA-Seq platforms gave a mean Pearson correlation of 0.838. In FFPE libraries, we detected 53.4% of high confidence SNVs and 24% of high confidence fusion transcripts. Sensitivity of fusion transcript detection was not overcome by an increase in depth of sequencing up to 3-fold (increase from ~56 to ~159 million reads). Both NanoString and ScriptSeq RNA-Seq technologies yield reliable gene expression data for degraded and FFPE material. The high degree of correlation between NanoString and RNA-Seq platforms suggests discovery based whole transcriptome studies from FFPE material will produce reliable expression data. The RiboZeroGold ScriptSeq protocol

  14. Testing Rich Internet Applications: The MAST Discovery Portal

    Science.gov (United States)

    Quick, L.

    2014-05-01

    Testing Rich Internet Applications (RIA) provides unique challenges to the overall success of a web application. Validating data driven, data intensive, dynamic, realtime events is imperative. In order to effectively and efficiently test such applications one needs to understand how, when and why different test types should be applied. Test types include application functionality, browser compatibility, data and database functions, performance and scalability, and regression testing.

  15. Immunologic applications of conditional gene modification technology in the mouse.

    Science.gov (United States)

    Sharma, Suveena; Zhu, Jinfang

    2014-04-02

    Since the success of homologous recombination in altering the mouse genome and the discovery of the Cre-loxP system, the combination of these two breakthroughs has created important applications for studying the immune system in the mouse. Here, we briefly summarize the general principles of this technology and its applications in studying immune cell development and responses; such applications include conditional gene knockout and inducible and/or tissue-specific gene over-expression, as well as lineage fate mapping. We then discuss the pros and cons of a few commonly used Cre-expressing mouse lines for studying lymphocyte development and functions. We also raise several general issues, such as efficiency of gene deletion, leaky activity of Cre, and Cre toxicity, all of which may have profound impacts on data interpretation. Finally, we selectively list some useful links to Web sites that serve as valuable mouse resources.

  16. Meiosis-specific gene discovery in plants: RNA-Seq applied to isolated Arabidopsis male meiocytes

    Directory of Open Access Journals (Sweden)

    May Gregory D

    2010-12-01

    Abstract Background Meiosis is a critical process in the reproduction and life cycle of flowering plants in which homologous chromosomes pair, synapse, recombine and segregate. Understanding meiosis will not only advance our knowledge of the mechanisms of genetic recombination, but also has substantial applications in crop improvement. Despite the tremendous progress in the past decade in other model organisms (e.g., Saccharomyces cerevisiae and Drosophila melanogaster), the global identification of meiotic genes in flowering plants has remained a challenge due to the lack of efficient methods to collect pure meiocytes for analyzing the temporal and spatial gene expression patterns during meiosis, and for the sensitive identification and quantitation of novel genes. Results A high-throughput approach to identify meiosis-specific genes by combining isolated meiocytes, RNA-Seq, bioinformatic and statistical analysis pipelines was developed. By analyzing the studied genes that have a meiosis function, a pipeline for identifying meiosis-specific genes has been defined. More than 1,000 genes that are specifically or preferentially expressed in meiocytes have been identified as candidate meiosis-specific genes. A group of 55 genes that have mitochondrial genome origins and a significant number of transposable element (TE) genes (1,036) were also found to have up-regulated expression levels in meiocytes. Conclusion These findings advance our understanding of meiotic genes, gene expression and regulation, especially the transcript profiles of MGI genes and TE genes, and provide a framework for functional analysis of genes in meiosis.

  17. Systematic discovery of unannotated genes in 11 yeast species using a database of orthologous genomic segments

    LENUS (Irish Health Repository)

    OhEigeartaigh, Sean S

    2011-07-26

    Abstract Background In standard BLAST searches, no information other than the sequences of the query and the database entries is considered. However, in situations where two genes from different species have only borderline similarity in a BLAST search, the discovery that the genes are located within a region of conserved gene order (synteny) can provide additional evidence that they are orthologs. Thus, for interpreting borderline search results, it would be useful to know whether the syntenic context of a database hit is similar to that of the query. This principle has often been used in investigations of particular genes or genomic regions, but to our knowledge it has never been implemented systematically. Results We made use of the synteny information contained in the Yeast Gene Order Browser database for 11 yeast species to carry out a systematic search for protein-coding genes that were overlooked in the original annotations of one or more yeast genomes but which are syntenic with their orthologs. Such genes tend to have been overlooked because they are short, highly divergent, or contain introns. The key features of our software - called SearchDOGS - are that the database entries are classified into sets of genomic segments that are already known to be orthologous, and that very weak BLAST hits are retained for further analysis if their genomic location is similar to that of the query. Using SearchDOGS we identified 595 additional protein-coding genes among the 11 yeast species, including two new genes in Saccharomyces cerevisiae. We found additional genes for the mating pheromone a-factor in six species including Kluyveromyces lactis. Conclusions SearchDOGS has proven highly successful for identifying overlooked genes in the yeast genomes. We anticipate that our approach can be adapted for study of further groups of species, such as bacterial genomes. More generally, the concept of doing sequence similarity searches against databases to which external
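    The retention rule described in this abstract (keep very weak BLAST hits only when their genomic location matches that of the query) can be sketched as follows. This is a minimal illustration, not the SearchDOGS implementation: the `Hit` record, the `retain_hits` function, the segment identifiers and both E-value thresholds are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    query_gene: str   # name of the query gene
    segment_id: str   # orthologous genomic segment that contains the hit
    evalue: float     # BLAST E-value of the hit


def retain_hits(hits, query_segment, strong_evalue=1e-5, weak_evalue=10.0):
    """Keep strong hits anywhere; keep weak hits only if they are syntenic.

    query_segment maps each query gene to the segment it lies in.
    Both thresholds are illustrative assumptions.
    """
    kept = []
    for h in hits:
        if h.evalue <= strong_evalue:
            kept.append(h)  # convincing on sequence similarity alone
        elif h.evalue <= weak_evalue and h.segment_id == query_segment.get(h.query_gene):
            kept.append(h)  # borderline hit rescued by conserved gene order
    return kept


hits = [
    Hit("geneA", "seg1", 1e-30),  # strong hit
    Hit("geneA", "seg2", 0.5),    # weak hit, wrong segment
    Hit("geneA", "seg1", 0.5),    # weak hit, same segment as the query
    Hit("geneA", "seg1", 50.0),   # too weak even with synteny
]
kept = retain_hits(hits, {"geneA": "seg1"})
print([(h.segment_id, h.evalue) for h in kept])
```

    The design point is that synteny acts as a second, independent line of evidence, so the sequence-similarity threshold can be relaxed only inside segments already known to be orthologous.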

  18. A computer-based microarray experiment design-system for gene-regulation pathway discovery.

    Science.gov (United States)

    Yoo, Changwon; Cooper, Gregory F

    2003-01-01

    This paper reports the methods and evaluation of a computer-based system that recommends microarray experimental design for biologists - causal discovery in Gene Expression data using Expected Value of Experimentation (GEEVE). The GEEVE system uses causal Bayesian networks and generates a decision tree for recommendations. To evaluate the GEEVE system, we first built an expression simulation model based on a gene regulation model assessed by an expert biologist. Using the simulation model, we conducted a controlled study that involved 10 biologists, some of whom used GEEVE and some of whom did not. The results show that biologists who used GEEVE reached correct causal assessments about gene regulation more often than did those biologists who did not use GEEVE.

  19. Systems Pharmacology‐Based Discovery of Natural Products for Precision Oncology Through Targeting Cancer Mutated Genes

    Science.gov (United States)

    Fang, J; Cai, C; Wang, Q; Lin, P

    2017-01-01

    Massive cancer genomics data have facilitated the rapid revolution of a novel oncology drug discovery paradigm through targeting clinically relevant driver genes or mutations for the development of precision oncology. Natural products with polypharmacological profiles have been demonstrated as promising agents for the development of novel cancer therapies. In this study, we developed an integrated systems pharmacology framework that facilitated identifying potential natural products that target mutated genes across 15 cancer types or subtypes in the realm of precision medicine. High performance was achieved for our systems pharmacology framework. In case studies, we computationally identified novel anticancer indications for several US Food and Drug Administration‐approved or clinically investigational natural products (e.g., resveratrol, quercetin, genistein, and fisetin) through targeting significantly mutated genes in multiple cancer types. In summary, this study provides a powerful tool for the development of molecularly targeted cancer therapies through targeting the clinically actionable alterations by exploiting the systems pharmacology of natural products. PMID:28294568

  20. Applications of genome editing tools in drug discovery and basic research

    OpenAIRE

    2015-01-01

    Since the discovery of the DNA double helix, major advances in biology have included the development of recombinant DNA technology in the 1970s, and methods to amplify DNA and gene targeting technology in the late 1980s. In organisms such as yeast and mice, the ability to accurately add or delete genetic information transformed biology, allowing an unmatched level of precision in studies of gene function. But, the ability to easily and specifically edit the genetic material of other cells and organi...

  1. A Review of Whole-Exome Sequencing Efforts Toward Hereditary Breast Cancer Susceptibility Gene Discovery.

    Science.gov (United States)

    Chandler, Madison R; Bilgili, Erin P; Merner, Nancy D

    2016-09-01

    Inherited genetic risk factors contribute toward breast cancer (BC) onset. BC risk variants can be divided into three categories of penetrance (high, moderate, and low) that reflect the probability of developing the disease. Traditional BC susceptibility gene discovery approaches that searched for high- and moderate-risk variants in familial BC cases have had limited success; to date, these risk variants explain only ∼30% of familial BC cases. Next-generation sequencing technologies can be used to search for novel high and moderate BC risk variants, and this manuscript reviews 12 familial BC whole-exome sequencing efforts. Study design, filtering strategies, and segregation and validation analyses are discussed. Overall, only a modest number of novel BC risk genes were identified, and 90% and 97% of the exome-sequenced families and cases, respectively, had no BC risk variants reported. It is important to learn from these studies and consider alternate strategies in order to make further advances. The discovery of new BC susceptibility genes is critical for improved risk assessment and to provide insight toward disease mechanisms for the development of more effective therapies.

  2. FORGE Canada Consortium: outcomes of a 2-year national rare-disease gene-discovery project.

    Science.gov (United States)

    Beaulieu, Chandree L; Majewski, Jacek; Schwartzentruber, Jeremy; Samuels, Mark E; Fernandez, Bridget A; Bernier, Francois P; Brudno, Michael; Knoppers, Bartha; Marcadier, Janet; Dyment, David; Adam, Shelin; Bulman, Dennis E; Jones, Steve J M; Avard, Denise; Nguyen, Minh Thu; Rousseau, Francois; Marshall, Christian; Wintle, Richard F; Shen, Yaoqing; Scherer, Stephen W; Friedman, Jan M; Michaud, Jacques L; Boycott, Kym M

    2014-06-01

    Inherited monogenic disease has an enormous impact on the well-being of children and their families. Over half of the children living with one of these conditions are without a molecular diagnosis because of the rarity of the disease, the marked clinical heterogeneity, and the reality that there are thousands of rare diseases for which causative mutations have yet to be identified. It is in this context that in 2010 a Canadian consortium was formed to rapidly identify mutations causing a wide spectrum of pediatric-onset rare diseases by using whole-exome sequencing. The FORGE (Finding of Rare Disease Genes) Canada Consortium brought together clinicians and scientists from 21 genetics centers and three science and technology innovation centers from across Canada. From nation-wide requests for proposals, 264 disorders were selected for study from the 371 submitted; disease-causing variants (including in 67 genes not previously associated with human disease; 41 of these have been genetically or functionally validated, and 26 are currently under study) were identified for 146 disorders over a 2-year period. Here, we present our experience with four strategies employed for gene discovery and discuss FORGE's impact in a number of realms, from clinical diagnostics to the broadening of the phenotypic spectrum of many diseases to the biological insight gained into both disease states and normal human development. Lastly, on the basis of this experience, we discuss the way forward for rare-disease genetic discovery both in Canada and internationally.

  3. SDAA: Towards Service Discovery Anywhere Anytime Mobile Based Application

    Directory of Open Access Journals (Sweden)

    Mehedi Masud

    2016-01-01

    Full Text Available Providing on-demand service based on customers' current location is an urgent need for many societies and individuals, especially for women, elderly people, single mothers, sick people, etc. Considering the need to provide localized services, this paper proposes a mobile application framework that allows an individual to receive services from neighborhood peers anywhere, anytime. The application allows an individual to find and select reliable service providers near his or her location. The application will give interested individuals an opportunity to use their free time to provide services to the community and earn some extra money. This application will benefit many stakeholders, such as elderly people, women at home, a person traveling in an unknown place, etc. A prototype application was developed, and an empirical evaluation was carried out to obtain qualitative measures of users' acceptance of and satisfaction with the application. It is observed that users' satisfaction is high.

  4. MAGIC Database and Interfaces: An Integrated Package for Gene Discovery and Expression

    Directory of Open Access Journals (Sweden)

    Lee H. Pratt

    2006-03-01

    Full Text Available The rapidly increasing rate at which biological data is being produced requires a corresponding growth in relational databases and associated tools that can help laboratories contend with that data. With this need in mind, we describe here a Modular Approach to a Genomic, Integrated and Comprehensive (MAGIC) Database. This Oracle 9i database derives from an initial focus in our laboratory on gene discovery via production and analysis of expressed sequence tags (ESTs), and subsequently on gene expression as assessed by both EST clustering and microarrays. The MAGIC Gene Discovery portion of the database focuses on information derived from DNA sequences and on its biological relevance. In addition to MAGIC SEQ-LIMS, which is designed to support activities in the laboratory, it contains several additional subschemas. The latter include MAGIC Admin for database administration, MAGIC Sequence for sequence processing as well as sequence and clone attributes, MAGIC Cluster for the results of EST clustering, MAGIC Polymorphism in support of microsatellite and single-nucleotide-polymorphism discovery, and MAGIC Annotation for electronic annotation by BLAST and BLAT. The MAGIC Microarray portion is a MIAME-compliant database with two components at present. These are MAGIC Array-LIMS, which makes possible remote entry of all information into the database, and MAGIC Array Analysis, which provides data mining and visualization. Because all aspects of interaction with the MAGIC Database are via a web browser, it is ideally suited not only for individual research laboratories but also for core facilities that serve clients at any distance.

  5. UPLC-MS(E) application in disease biomarker discovery: the discoveries in proteomics to metabolomics.

    Science.gov (United States)

    Zhao, Ying-Yong; Lin, Rui-Chao

    2014-05-25

    In the last decade, proteomics and metabolomics have contributed substantially to our understanding of different diseases. Proteomics and metabolomics aim to comprehensively identify proteins and metabolites to gain insight into the cellular signaling pathways underlying disease and to discover novel biomarkers for screening, early detection and diagnosis, as well as for determining prognoses and predicting responses to specific treatments. For comprehensive analysis of cellular proteins and metabolites, analytical methods with a wider dynamic range, higher resolution and good sensitivity are required. Ultra performance liquid chromatography-mass spectrometry (Elevated Energy) (UPLC-MS(E)) is currently one of the most versatile techniques. UPLC-MS(E) is an established technology in proteomics studies and is now expanding into metabolite research. MS(E) is used for simultaneous acquisition of precursor ion information and fragment ion data at low and high collision energy in one analytical run, providing similar information to conventional MS(2). In this review, the application of UPLC-MS(E) in proteomics and metabolomics is highlighted to assess protein and metabolite changes in different diseases, including cancer, neuropsychiatric pharmacology studies from clinical trials and animal models. In addition, the future prospects for complete proteomics and metabolomics are discussed. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  6. Abiotic Stress Tolerance: From Gene Discovery in Model Organisms to Crop Improvement

    Institute of Scientific and Technical Information of China (English)

    Ray Bressan; Hans Bohnert; Jian-Kang Zhu

    2009-01-01

    Productive and sustainable agriculture necessitates growing plants in sub-optimal environments with less input of precious resources such as fresh water. For a better understanding and rapid improvement of abiotic stress tolerance, it is important to link physiological and biochemical work to molecular studies in genetically tractable model organisms. With the use of several technologies for the discovery of stress tolerance genes and their appropriate alleles, transgenic approaches to improving stress tolerance in crops remarkably parallel breeding principles with a greatly expanded germplasm base and will succeed eventually.

  7. RNA-Seq analysis and gene discovery of Andrias davidianus using Illumina short read sequencing.

    Directory of Open Access Journals (Sweden)

    Fenggang Li

    Full Text Available The Chinese giant salamander, Andrias davidianus, is an important species in the course of evolution; however, there is insufficient genomic data in public databases for understanding its immunologic mechanisms. High-throughput transcriptome sequencing is necessary to generate an enormous number of transcript sequences from A. davidianus for gene discovery. In this study, we generated more than 40 million reads from samples of spleen and skin tissue using the Illumina paired-end sequencing technology. De novo assembly yielded 87,297 transcripts with a mean length of 734 base pairs (bp). Based on sequence similarity searches against known proteins, 38,916 genes were identified. Gene enrichment analysis determined that 981 transcripts were assigned to the immune system. Tissue-specific expression analysis indicated that 443 transcripts were specifically expressed in the spleen and skin. Among these transcripts, 147 were found to be involved in immune responses and inflammatory reactions, such as fucolectin, β-defensins and lymphotoxin beta. Eight tissue-specific genes were selected for validation using real time reverse transcription quantitative PCR (qRT-PCR). The results showed that these genes were significantly more expressed in spleen and skin than in other tissues, suggesting that these genes have vital roles in the immune response. This work provides a comprehensive genomic sequence resource for A. davidianus and lays the foundation for future research on the immunologic and disease resistance mechanisms of A. davidianus and other amphibians.

  8. Inherited retinal diseases in dogs: advances in gene/mutation discovery.

    Science.gov (United States)

    Miyadera, Keiko

    Inherited retinal diseases (RDs) are vision-threatening conditions affecting humans as well as many domestic animals. Through many years of clinical studies of the domestic dog population, a wide array of RDs has been phenotypically characterized. Extensive effort to map the causative genes and to identify the underlying mutations followed. Through candidate gene studies, linkage analysis, genome-wide association studies, and more recently, by means of next-generation sequencing, as many as 31 mutations in 24 genes have been identified as the underlying cause for canine RDs. Most of these genes have been associated with human RDs, providing opportunities to study their roles in the disease pathogenesis and in normal visual function. The canine model has also contributed to developing new treatments such as gene therapy, which has been clinically applied to human patients. Meanwhile, with increasing knowledge of the molecular architecture of RDs in different subpopulations of dogs, the conventional understanding of RDs as simple monogenic diseases is beginning to change. Emerging evidence of modifiers that alter the disease outcome is complicating the interpretation of DNA tests. In this review, advances in the gene/mutation discovery approaches and the emerging genetic complexity of canine RDs are discussed.

  9. Dynamic Service Discovery and Composition for Ubiquitous Networks Applications

    NARCIS (Netherlands)

    Bonino da Silva Santos, L.O.; Sinderen, van M.J.; Ferreira Pires, L.

    2006-01-01

    The realization of ubiquitous networks brings new challenges to application development. In this kind of network, services and, more specifically, web services have been used to provide the functionality required by its users and applications. In such environments features like automatic service dis

  10. Evaluation of gene association methods for coexpression network construction and biological knowledge discovery.

    Directory of Open Access Journals (Sweden)

    Sapna Kumari

    Full Text Available BACKGROUND: Constructing coexpression networks and performing network analysis using large-scale gene expression data sets is an effective way to uncover new biological knowledge; however, the methods used for gene association in constructing these coexpression networks have not been thoroughly evaluated. Since different methods lead to structurally different coexpression networks and provide different information, selecting the optimal gene association method is critical. METHODS AND RESULTS: In this study, we compared eight gene association methods - Spearman rank correlation, Weighted Rank Correlation, Kendall, Hoeffding's D measure, Theil-Sen, Rank Theil-Sen, Distance Covariance, and Pearson - and focused on their true knowledge discovery rates in associating pathway genes and constructing coordination networks of regulatory genes. We also examined the behaviors of different methods on microarray data with different properties, and whether the biological processes affect the efficiency of different methods. CONCLUSIONS: We found that the Spearman, Hoeffding and Kendall methods are effective in identifying coexpressed pathway genes, whereas the Theil-Sen, Rank Theil-Sen, Spearman, and Weighted Rank methods perform well in identifying coordinated transcription factors that control the same biological processes and traits. Surprisingly, the widely used Pearson method is generally less efficient, and so is the Distance Covariance method, which can find gene pairs with multiple types of relationships. Some of our analyses clearly show that the Pearson and Distance Covariance methods behave distinctly from the other six methods. The efficiencies of different methods vary with the data properties to some degree and are largely contingent upon the biological processes, which necessitates a pre-analysis to identify the best performing method for gene association and coexpression network construction.
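    To illustrate why rank-based association measures can behave differently from Pearson correlation, the sketch below implements three of the eight methods named above (Pearson, Spearman, and Kendall in its simple tau-a form) in plain Python. The two expression profiles are made-up values, and the function names are ours; this is not the evaluation pipeline used in the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical profiles: gene_b rises monotonically but non-linearly with gene_a.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [2.0, 4.0, 8.0, 16.0, 32.0]
print(pearson(gene_a, gene_b), spearman(gene_a, gene_b), kendall_tau_a(gene_a, gene_b))
```

    For a monotonic but non-linear relationship such as this one, the two rank-based measures report a perfect association while Pearson reports a weaker one, which is one plausible reason the methods can produce structurally different networks.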

  11. Central Nervous System Multiparameter Optimization Desirability: Application in Drug Discovery.

    Science.gov (United States)

    Wager, Travis T; Hou, Xinjun; Verhoest, Patrick R; Villalobos, Anabella

    2016-06-15

    Significant progress has been made in prospectively designing molecules using the central nervous system multiparameter optimization (CNS MPO) desirability tool, as evidenced by the analysis reported herein of a second wave of drug candidates that originated after the development and implementation of this tool. This simple-to-use design algorithm has expanded design space for CNS candidates and has further demonstrated the advantages of utilizing a flexible, multiparameter approach in drug discovery rather than individual parameters and hard cutoffs of physicochemical properties. The CNS MPO tool has helped to increase the percentage of compounds nominated for clinical development that exhibit alignment of ADME attributes, cross the blood-brain barrier, and reside in lower-risk safety space (low ClogP and high TPSA). The use of this tool has played a role in reducing the number of compounds submitted to exploratory toxicity studies and increasing the survival of our drug candidates through regulatory toxicology into First in Human studies. Overall, the CNS MPO algorithm has helped to improve the prioritization of design ideas and the quality of the compounds nominated for clinical development.
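    The contrast between a flexible multiparameter score and hard cutoffs can be sketched with piecewise-linear desirability functions. The abstract names only ClogP and TPSA and gives no numbers, so the six-property set and every threshold below are illustrative assumptions, and the function names are hypothetical; this is not presented as the published CNS MPO definition.

```python
def decreasing(value, full, zero):
    """Desirability 1.0 at or below `full`, 0.0 at or above `zero`, linear between."""
    if value <= full:
        return 1.0
    if value >= zero:
        return 0.0
    return (zero - value) / (zero - full)


def hump(value, lo_zero, lo_full, hi_full, hi_zero):
    """Desirability 1.0 inside [lo_full, hi_full], falling linearly to 0.0 outside."""
    if value <= lo_zero or value >= hi_zero:
        return 0.0
    if value < lo_full:
        return (value - lo_zero) / (lo_full - lo_zero)
    if value <= hi_full:
        return 1.0
    return (hi_zero - value) / (hi_zero - hi_full)


def mpo_score(props):
    """Sum of six desirabilities (range 0 to 6); all thresholds are assumed."""
    return (
        decreasing(props["clogp"], 3.0, 5.0)
        + decreasing(props["clogd"], 2.0, 4.0)
        + decreasing(props["mw"], 360.0, 500.0)
        + hump(props["tpsa"], 20.0, 40.0, 90.0, 120.0)
        + decreasing(props["hbd"], 0.5, 3.5)
        + decreasing(props["pka"], 8.0, 10.0)
    )


# Hypothetical compound, not taken from the paper.
compound = {"clogp": 2.5, "clogd": 1.5, "mw": 300.0, "tpsa": 60.0, "hbd": 1.0, "pka": 7.5}
print(round(mpo_score(compound), 2))
```

    Under a hard cutoff, a compound with ClogP of 4 would simply be rejected; here it still contributes a partial score of 0.5 for that property and can be rescued by favorable values elsewhere.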

  12. TargetMine, an integrated data warehouse for candidate gene prioritisation and target discovery.

    Directory of Open Access Journals (Sweden)

    Yi-An Chen

    Full Text Available Prioritising candidate genes for further experimental characterisation is a non-trivial challenge in drug discovery and biomedical research in general. An integrated approach that combines results from multiple data types is best suited for optimal target selection. We developed TargetMine, a data warehouse for efficient target prioritisation. TargetMine utilises the InterMine framework, with new data models such as protein-DNA interactions integrated in a novel way. It enables complicated searches that are difficult to perform with existing tools and it also offers integration of custom annotations and in-house experimental data. We proposed an objective protocol for target prioritisation using TargetMine and set up a benchmarking procedure to evaluate its performance. The results show that the protocol can identify known disease-associated genes with high precision and coverage. A demonstration version of TargetMine is available at http://targetmine.nibio.go.jp/.

  13. Biomedical discovery acceleration, with applications to craniofacial development.

    Science.gov (United States)

    Leach, Sonia M; Tipney, Hannah; Feng, Weiguo; Baumgartner, William A; Kasliwal, Priyanka; Schuyler, Ronald P; Williams, Trevor; Spritz, Richard A; Hunter, Lawrence

    2009-03-01

    The profusion of high-throughput instruments and the explosion of new results in the scientific literature, particularly in molecular biomedicine, is both a blessing and a curse to the bench researcher. Even knowledgeable and experienced scientists can benefit from computational tools that help navigate this vast and rapidly evolving terrain. In this paper, we describe a novel computational approach to this challenge, a knowledge-based system that combines reading, reasoning, and reporting methods to facilitate analysis of experimental data. Reading methods extract information from external resources, either by parsing structured data or using biomedical language processing to extract information from unstructured data, and track knowledge provenance. Reasoning methods enrich the knowledge that results from reading by, for example, noting two genes that are annotated to the same ontology term or database entry. Reasoning is also used to combine all sources into a knowledge network that represents the integration of all sorts of relationships between a pair of genes, and to calculate a combined reliability score. Reporting methods combine the knowledge network with a congruent network constructed from experimental data and visualize the combined network in a tool that facilitates the knowledge-based analysis of that data. An implementation of this approach, called the Hanalyzer, is demonstrated on a large-scale gene expression array dataset relevant to craniofacial development. The use of the tool was critical in the creation of hypotheses regarding the roles of four genes never previously characterized as involved in craniofacial development; each of these hypotheses was validated by further experimental work.
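    The abstract states that the reasoning step calculates a combined reliability score for the relationships between a pair of genes but does not say how. One common way to combine independent pieces of evidence is a noisy-OR rule, sketched below as an assumption; the function name and the per-source reliabilities are hypothetical.

```python
from math import prod


def noisy_or(reliabilities):
    """Combine independent evidence: 1 - product of (1 - r_i).

    Each r_i is the assumed probability that source i alone is correct.
    """
    if any(not 0.0 <= r <= 1.0 for r in reliabilities):
        raise ValueError("reliabilities must lie in [0, 1]")
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical edge between two genes supported by three sources.
edge_sources = {"shared ontology term": 0.5, "literature co-mention": 0.5, "interaction database": 0.8}
print(noisy_or(edge_sources.values()))
```

    With this rule, several weak but independent sources can add up to a strong link, and a single highly reliable source is never diluted by weaker ones.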

  14. Biomedical discovery acceleration, with applications to craniofacial development.

    Directory of Open Access Journals (Sweden)

    Sonia M Leach

    2009-03-01

    Full Text Available The profusion of high-throughput instruments and the explosion of new results in the scientific literature, particularly in molecular biomedicine, is both a blessing and a curse to the bench researcher. Even knowledgeable and experienced scientists can benefit from computational tools that help navigate this vast and rapidly evolving terrain. In this paper, we describe a novel computational approach to this challenge, a knowledge-based system that combines reading, reasoning, and reporting methods to facilitate analysis of experimental data. Reading methods extract information from external resources, either by parsing structured data or using biomedical language processing to extract information from unstructured data, and track knowledge provenance. Reasoning methods enrich the knowledge that results from reading by, for example, noting two genes that are annotated to the same ontology term or database entry. Reasoning is also used to combine all sources into a knowledge network that represents the integration of all sorts of relationships between a pair of genes, and to calculate a combined reliability score. Reporting methods combine the knowledge network with a congruent network constructed from experimental data and visualize the combined network in a tool that facilitates the knowledge-based analysis of that data. An implementation of this approach, called the Hanalyzer, is demonstrated on a large-scale gene expression array dataset relevant to craniofacial development. The use of the tool was critical in the creation of hypotheses regarding the roles of four genes never previously characterized as involved in craniofacial development; each of these hypotheses was validated by further experimental work.

  15. Discovery of Putative Herbicide Resistance Genes and Its Regulatory Network in Chickpea Using Transcriptome Sequencing

    Directory of Open Access Journals (Sweden)

    Mir A. Iquebal

    2017-06-01

    Full Text Available Background: Chickpea (Cicer arietinum L.) contributes 75% of total pulse production. Being cheaper than animal protein, it is important in the dietary requirements of developing countries. Weeds not only compete with chickpea, resulting in drastic yield reduction, but also create the problem of harboring fungi, bacterial diseases and insect pests. The chemical approach of new herbicide discovery is constrained by limited lead molecule options, statutory regulations and environmental clearance. Through the genetic approach, transgenic herbicide-tolerant crops have given successful results but have led to serious concern over ecological safety; thus a non-transgenic approach such as marker-assisted selection is desirable. Since large variability in the tolerance limit of herbicide already exists in chickpea varieties, the genes offering herbicide tolerance can be introgressed in variety improvement programmes. Transcriptome studies can discover such key genes associated with herbicide tolerance in chickpea. Results: This is the first transcriptomic study of chickpea, or indeed of any legume crop, using two herbicide susceptible and tolerant genotypes exposed to imidazoline (Imazethapyr). Approximately 90 million paired-end reads generated from four samples were processed and assembled into 30,803 contigs using reference based assembly. We report 6,310 differentially expressed genes (DEGs), of which 3,037 were regulated by 980 miRNAs, 1,528 transcription factors associated with 897 DEGs, 47 Hub proteins, 3,540 putative Simple Sequence Repeat-Functional Domain Markers (SSR-FDM), 13,778 genic Single Nucleotide Polymorphism (SNP) putative markers and 1,174 Indels. Twenty randomly selected DEGs were validated using qPCR. Pathway analysis suggested that the xenobiotic degradation related gene glutathione S-transferase (GST) was up-regulated only in the presence of herbicide. Down-regulation of DNA replication genes and up-regulation of abscisic acid pathway genes were observed. Study further reveals

  16. A genomics based discovery of secondary metabolite biosynthetic gene clusters in Aspergillus ustus.

    Directory of Open Access Journals (Sweden)

    Borui Pi

    Full Text Available Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, and little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that is commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were identified in Aspergillus for the first time and were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but far fewer with other Aspergilli such as A. niger and A. oryzae. These findings should help in understanding the diversity and evolution of SM biosynthesis pathways in the genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in the clinic.

  17. [Analysis of the halogenase gene in actinomycetes from different habitats and its implications for halometabolite discovery].

    Science.gov (United States)

    Gao, Peng; Xi, Lijun; Piao, Yuhua; Ruan, Jisheng; Huang, Ying

    2009-10-01

    To compare the halometabolite producing capability between actinomycetes of earth origin and marine origin, based on genetic screening of the 1,5-dihydroflavin adenine dinucleotide (FADH2)-dependent halogenase gene. We used 141 actinomycete isolates that were dereplicated by phenotype, 70 of earth origin and 71 of marine origin, and obtained halogenase gene fragments from them by PCR screening. We then sequenced the PCR products and analyzed corresponding amino acid sequences phylogenetically. We made further comparison of the halogenase sequences between actinomycetes of different origins, and between marine-origin streptomycetes and marine-origin Micromonospora isolates. In addition, we detected polyketide synthase (PKS) and non-ribosomal peptide synthetase (NRPS) genes by PCR in the halogenase gene-positive isolates. We observed higher occurrence of the halogenase gene in marine-origin actinomycetes (36.6%) than in earth-origin actinomycetes (14.3%), and in marine-origin streptomycetes (69.0%) than in marine-origin Micromonospora isolates (14.3%). Most (86.1%) of the halogenase gene-positive isolates contained PKS and/or NRPS genes. Moreover, the halogenase sequences of marine-origin isolates differed largely from the known ones, and clustered into a couple of distinct clades in the phylogenetic tree. In addition, we found greater diversity of the halogenase genes in marine-origin Micromonospora isolates than in marine-origin streptomycetes. Based on the results of this study, we propose that actinomycetes, especially streptomycetes, from marine habitat could serve as a good source for new bioactive halometabolite discovery in the future.

  18. Common characteristics of open source software development and applicability for drug discovery: a systematic review.

    Science.gov (United States)

    Ardal, Christine; Alstadsæter, Annette; Røttingen, John-Arne

    2011-09-28

    Innovation through an open source model has proven to be successful for software development. This success has led many to speculate if open source can be applied to other industries with similar success. We attempt to provide an understanding of open source software development characteristics for researchers, business leaders and government officials who may be interested in utilizing open source innovation in other contexts and with an emphasis on drug discovery. A systematic review was performed by searching relevant, multidisciplinary databases to extract empirical research regarding the common characteristics and barriers of initiating and maintaining an open source software development project. Common characteristics to open source software development pertinent to open source drug discovery were extracted. The characteristics were then grouped into the areas of participant attraction, management of volunteers, control mechanisms, legal framework and physical constraints. Lastly, their applicability to drug discovery was examined. We believe that the open source model is viable for drug discovery, although it is unlikely that it will exactly follow the form used in software development. Hybrids will likely develop that suit the unique characteristics of drug discovery. We suggest potential motivations for organizations to join an open source drug discovery project. We also examine specific differences between software and medicines, specifically how the need for laboratories and physical goods will impact the model as well as the effect of patents.

  19. Discovery of time-delayed gene regulatory networks based on temporal gene expression profiling

    Directory of Open Access Journals (Sweden)

    Guo Zheng

    2006-01-01

    Full Text Available Abstract Background It is one of the ultimate goals for modern biological research to fully elucidate the intricate interplays and the regulations of the molecular determinants that propel and characterize the progression of versatile life phenomena, to name a few, cell cycling, developmental biology, aging, and the progressive and recurrent pathogenesis of complex diseases. The vast amount of large-scale and genome-wide time-resolved data is becoming increasingly available, which provides the golden opportunity to unravel the challenging reverse-engineering problem of time-delayed gene regulatory networks. Results In particular, this methodological paper aims to reconstruct regulatory networks from temporal gene expression data by using delayed correlations between genes, i.e., pairwise overlaps of expression levels shifted in time relative to each other. We have thus developed a novel model-free computational toolbox termed TdGRN (Time-delayed Gene Regulatory Network) to address the underlying regulations of genes that can span any unit(s) of time intervals. This bioinformatics toolbox has provided a unified approach to uncovering time trends of gene regulations through decision analysis of the newly designed time-delayed gene expression matrix. We have applied the proposed method to yeast cell cycling and human HeLa cell cycling and have discovered most of the underlying time-delayed regulations that are supported by multiple lines of experimental evidence and that are remarkably consistent with the current knowledge on phase characteristics of the cell cycle. Conclusion We established a usable and powerful model-free approach to dissecting high-order dynamic trends of gene-gene interactions. We have carefully validated the proposed algorithm by applying it to two publicly available cell cycling datasets. In addition to uncovering the time trends of gene regulations for cell cycling, this unified approach can also be used to study the complex
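
    The notion of a delayed correlation between two genes can be illustrated with a minimal sketch. It is not the TdGRN toolbox: the function names, the use of Pearson correlation, and the toy sine-wave profiles are all assumptions made for illustration. The idea is to shift one expression profile against the other and keep the lag with the strongest correlation.

```python
from math import sin, pi, sqrt


def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / sqrt(va * vb)


def delayed_correlation(x, y, lag):
    """Correlate x[t] with y[t + lag], i.e. y responding `lag` steps after x."""
    if lag < 0 or lag > len(x) - 3:
        raise ValueError("lag must leave at least three overlapping points")
    return pearson(x[: len(x) - lag], y[lag:])


def best_delay(x, y, max_lag):
    """Return (lag, r) for the lag in 0..max_lag with the largest |r|."""
    scores = [(lag, delayed_correlation(x, y, lag)) for lag in range(max_lag + 1)]
    return max(scores, key=lambda s: abs(s[1]))


# Toy profiles (hypothetical data): gene B repeats gene A two time points later.
gene_a = [sin(2 * pi * t / 10) for t in range(20)]
gene_b = gene_a[-2:] + gene_a[:-2]
lag, r = best_delay(gene_a, gene_b, max_lag=4)
print(lag, round(r, 3))
```

    On real data the lag range, the correlation measure, and any significance threshold would all need to be chosen with care; the sketch only shows the shifting step.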

  20. Semi-Automated Discovery of Application Session Structure

    Energy Technology Data Exchange (ETDEWEB)

    Kannan, J.; Jung, J.; Paxson, V.; Koksal, C.

    2006-09-07

    While the problem of analyzing network traffic at the granularity of individual connections has seen considerable previous work and tool development, understanding traffic at a higher level---the structure of user-initiated sessions comprised of groups of related connections---remains much less explored. Some types of session structure, such as the coupling between an FTP control connection and the data connections it spawns, have prespecified forms, though the specifications do not guarantee how the forms appear in practice. Other types of sessions, such as a user reading email with a browser, only manifest empirically. Still other sessions might exist without us even knowing of their presence, such as a botnet zombie receiving instructions from its master and proceeding in turn to carry them out. We present algorithms rooted in the statistics of Poisson processes that can mine a large corpus of network connection logs to extract the apparent structure of application sessions embedded in the connections. Our methods are semi-automated in that we aim to present an analyst with high-quality information (expressed as regular expressions) reflecting different possible abstractions of an application's session structure. We develop and test our methods using traces from a large Internet site, finding diversity in the number of applications that manifest, their different session structures, and the presence of abnormal behavior. Our work has applications to traffic characterization and monitoring, source models for synthesizing network traffic, and anomaly detection.

  1. High-throughput gene and SNP discovery in Eucalyptus grandis, an uncharacterized genome

    Directory of Open Access Journals (Sweden)

    Pappas Georgios J

    2008-06-01

    Full Text Available Abstract Background Benefits from high-throughput sequencing using 454 pyrosequencing technology may be most apparent for species with high societal or economic value but few genomic resources. Rapid means of gene sequence and SNP discovery using this novel sequencing technology provide a set of baseline tools for genome-level research. However, it is questionable how effective the sequencing of large numbers of short reads for species with essentially no prior gene sequence information will support contig assemblies and sequence annotation. Results With the purpose of generating the first broad survey of gene sequences in Eucalyptus grandis, the most widely planted hardwood tree species, we used 454 technology to sequence and assemble 148 Mbp of expressed sequences (EST). EST sequences were generated from a normalized cDNA pool comprised of multiple tissues and genotypes, promoting discovery of homologues to almost half of Arabidopsis genes, and a comprehensive survey of allelic variation in the transcriptome. By aligning the sequencing reads from multiple genotypes we detected 23,742 SNPs, 83% of which were validated in a sample. Genome-wide nucleotide diversity was estimated for 2,392 contigs using a modified theta (θ) parameter, adapted for measuring genetic diversity from polymorphisms detected by randomly sequencing a multi-genotype cDNA pool. Diversity estimates in non-synonymous nucleotides were on average 4x smaller than in synonymous, suggesting purifying selection. Non-synonymous to synonymous substitutions (Ka/Ks) among 2,001 contigs averaged 0.30 and was skewed to the right, further supporting that most genes are under purifying selection. Comparison of these estimates among contigs identified major functional classes of genes under purifying and diversifying selection in agreement with previous research. Conclusion In providing an abundance of foundational transcript sequences where limited prior genomic information existed, this

  2. Effector genomics accelerates discovery and functional profiling of potato disease resistance and phytophthora infestans avirulence genes.

    Directory of Open Access Journals (Sweden)

    Vivianne G A A Vleeshouwers

    Full Text Available Potato is the world's fourth largest food crop yet it continues to endure late blight, a devastating disease caused by the Irish famine pathogen Phytophthora infestans. Breeding broad-spectrum disease resistance (R) genes into potato (Solanum tuberosum) is the best strategy for genetically managing late blight but current approaches are slow and inefficient. We used a repertoire of effector genes predicted computationally from the P. infestans genome to accelerate the identification, functional characterization, and cloning of potentially broad-spectrum R genes. An initial set of 54 effectors containing a signal peptide and a RXLR motif was profiled for activation of innate immunity (avirulence or Avr activity) on wild Solanum species and tentative Avr candidates were identified. The RXLR effector family IpiO induced hypersensitive responses (HR) in S. stoloniferum, S. papita and the more distantly related S. bulbocastanum, the source of the R gene Rpi-blb1. Genetic studies with S. stoloniferum showed cosegregation of resistance to P. infestans and response to IpiO. Transient co-expression of IpiO with Rpi-blb1 in a heterologous Nicotiana benthamiana system identified IpiO as Avr-blb1. A candidate gene approach led to the rapid cloning of S. stoloniferum Rpi-sto1 and S. papita Rpi-pta1, which are functionally equivalent to Rpi-blb1. Our findings indicate that effector genomics enables discovery and functional profiling of late blight R genes and Avr genes at an unprecedented rate and promises to accelerate the engineering of late blight resistant potato varieties.

  3. Coherent X-ray mirage: discovery and possible applications

    Institute of Scientific and Technical Information of China (English)

    Tatiana Pikuz; Anatoly Faenov; Sergey Magnitskiy; Nikolay Nagorskiy; Momoko Tanaka; Masahiko Ishino; Masaharu Nishikino; Yuji Fukuda; Masaki Kando; Yoshiaki Kato; Tetsuya Kawachi

    2014-01-01

    In the far field of the intensity distribution of the beam delivered by a two-stage transient–collisional excitation X-ray laser (XRL), an unexpected interference pattern that is stable from shot to shot has been discovered. It is demonstrated that the interference is caused by the emergence of an imaginary source in the amplifying plasma, which is phase matched to the radiation of the generator. The observed phenomenon is called an X-ray coherent mirage. To explain the obtained results, a new theoretical approach is developed. This paper details the experiments, formulates the necessary and sufficient conditions for formation of the X-ray mirage, and discusses possible applications.

  4. Bootstrapping of gene-expression data improves and controls the false discovery rate of differentially expressed genes

    Directory of Open Access Journals (Sweden)

    Goddard Mike E

    2004-03-01

    Full Text Available Abstract The ordinary-, penalized-, and bootstrap t-test, least squares and best linear unbiased prediction were compared for their false discovery rates (FDR), i.e. the fraction of falsely discovered genes, which was empirically estimated in a duplicate of the data set. The bootstrap-t-test yielded up to 80% lower FDRs than the alternative statistics, and its FDR was always as good as or better than any of the alternatives. Generally, the predicted FDR from the bootstrapped P-values agreed well with their empirical estimates, except when the number of mRNA samples is smaller than 16. In a cancer data set, the bootstrap-t-test discovered 200 differentially regulated genes at a FDR of 2.6%, and in a knock-out gene expression experiment 10 genes were discovered at a FDR of 3.2%. It is argued that, in the case of microarray data, control of the FDR takes sufficient account of the multiple testing, whilst being less stringent than Bonferroni-type multiple testing corrections. Extensions of the bootstrap simulations to more complicated test-statistics are discussed.
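
    A generic bootstrap t-test for a single gene can be sketched as follows. The abstract does not give the exact resampling scheme used in the paper, so the choices here are assumptions: a Welch-type t statistic, resampling with replacement from mean-centred groups to mimic the null hypothesis, and 999 bootstrap replicates with a fixed seed. All names and data are hypothetical.

```python
import math
import random


def t_stat(a, b):
    """Welch-type two-sample t statistic."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    se = math.sqrt(va / na + vb / nb)
    diff = ma - mb
    if se == 0.0:
        # Degenerate resample with no spread in either group.
        return 0.0 if diff == 0.0 else math.copysign(math.inf, diff)
    return diff / se


def bootstrap_t_pvalue(a, b, n_boot=999, seed=0):
    """Two-sided bootstrap P-value for a difference in group means."""
    rng = random.Random(seed)
    t_obs = abs(t_stat(a, b))
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    a0 = [x - ma for x in a]  # centring imposes the null of equal means
    b0 = [x - mb for x in b]
    hits = 0
    for _ in range(n_boot):
        ra = rng.choices(a0, k=len(a0))
        rb = rng.choices(b0, k=len(b0))
        if abs(t_stat(ra, rb)) >= t_obs:
            hits += 1
    return (hits + 1) / (n_boot + 1)


# Hypothetical expression values for one gene in two conditions.
treated = [10.1, 11.3, 9.8, 10.7, 11.0, 10.4, 9.9, 10.6]
control = [0.2, 1.1, -0.3, 0.7, 0.9, 0.1, -0.5, 0.4]
p_diff = bootstrap_t_pvalue(treated, control)
p_same = bootstrap_t_pvalue(control, control)
print(p_diff, p_same)
```

    Applying this to every gene yields a vector of P-values, which can then be passed to an FDR procedure.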

  5. Gene Discovery in the Apicomplexa as Revealed by EST Sequencing and Assembly of a Comparative Gene Database

    Science.gov (United States)

    Li, Li; Brunk, Brian P.; Kissinger, Jessica C.; Pape, Deana; Tang, Keliang; Cole, Robert H.; Martin, John; Wylie, Todd; Dante, Mike; Fogarty, Steven J.; Howe, Daniel K.; Liberator, Paul; Diaz, Carmen; Anderson, Jennifer; White, Michael; Jerome, Maria E.; Johnson, Emily A.; Radke, Jay A.; Stoeckert, Christian J.; Waterston, Robert H.; Clifton, Sandra W.; Roos, David S.; Sibley, L. David

    2003-01-01

    Large-scale EST sequencing projects for several important parasites within the phylum Apicomplexa were undertaken for the purpose of gene discovery. Included were several parasites of medical importance (Plasmodium falciparum, Toxoplasma gondii) and others of veterinary importance (Eimeria tenella, Sarcocystis neurona, and Neospora caninum). A total of 55,192 ESTs, deposited into dbEST/GenBank, were included in the analyses. The resulting sequences have been clustered into nonredundant gene assemblies and deposited into a relational database that supports a variety of sequence and text searches. This database has been used to compare the gene assemblies using BLAST similarity comparisons to the public protein databases to identify putative genes. Of these new entries, ∼15%–20% represent putative homologs with a conservative cutoff of p … PMID:12618375

  6. In Vitro Transcription Assays and Their Application in Drug Discovery.

    Science.gov (United States)

    Yang, Xiao; Ma, Cong

    2016-09-20

    In vitro transcription assays have been developed and widely used for many years to study the molecular mechanisms involved in transcription. This process requires multi-subunit DNA-dependent RNA polymerase (RNAP) and a series of transcription factors that act to modulate the activity of RNAP during gene expression. Sequencing gel electrophoresis of radiolabeled transcripts is used to provide detailed mechanistic information on how transcription proceeds and what parameters can affect it. In this paper we describe the protocol to study how the essential elongation factor NusA regulates transcriptional pausing, as well as a method to identify an antibacterial agent targeting transcription initiation through inhibition of RNAP holoenzyme formation. These methods can be used as a platform for the development of additional approaches to explore the mechanism of action of the transcription factors that still remain unclear, as well as new antibacterial agents targeting transcription, which is an underutilized drug target in antibiotic research and development.

  7. Discovery and analysis of pancreatic adenocarcinoma genes using cDNA microarrays

    Institute of Scientific and Technical Information of China (English)

    Gang Jin; Xian-Gui Hu; Kang Ying; Yan Tang; Rui Liu; Yi-Jie Zhang; Zai-Ping Jing; Yi Xie; Yu-Min Mao

    2005-01-01

    AIM: To study the pathogenetic processes and the role of gene expression by microarray analyses in expediting our understanding of the molecular pathophysiology of pancreatic adenocarcinoma, and to identify novel cancer-associated genes. METHODS: Nine histologically defined pancreatic head adenocarcinoma specimens associated with clinical data were studied. Total RNA and mRNA were isolated and labeled by reverse transcription reaction with Cy5 and Cy3 for cDNA probe. The cDNA microarrays that represent a set of 4 096 human genes were hybridized with labeled cDNA probe and screened for molecular profiling analyses. RESULTS: Using this methodology, 184 genes were screened out for differences in gene expression level after nine pairs of hybridizations. Of the 184 genes, 87 were upregulated and 97 downregulated, including 11 novel human genes. In pancreatic adenocarcinoma tissue, several invasion- and metastasis-related genes showed high expression levels, suggesting that poor prognosis of pancreatic adenocarcinoma might have a solid molecular biological basis. CONCLUSION: The application of cDNA microarray technique for analysis of gene expression patterns is a powerful strategy to identify novel cancer-associated genes, and to rapidly explore their role in clinical pancreatic adenocarcinoma. Microarray profiles provide us new insights into the carcinogenesis and invasive process of pancreatic adenocarcinoma. Our results suggest that a highly organized and structured process of tumor invasion exists in the pancreas.

  8. Research of united model of knowledge discovery state space and its application

    Institute of Scientific and Technical Information of China (English)

    You Fucheng; Song Wei; Yang Bingru

    2005-01-01

    There are both associations and differences between structured and unstructured data mining. How to unite them in a single theoretical framework that guides research on knowledge discovery and data mining has become an urgent problem. On the basis of analysis and study of existing research results, the united model of knowledge discovery state space (UMKDSS) is presented, in which structured data mining and complex type data mining are associated together. UMKDSS can provide theoretical guidance for complex type data mining. Finally, an application example of UMKDSS is given.

  9. Mathematical Tools for Discovery of Nanoporous Materials for Energy Applications

    Science.gov (United States)

    Haranczyk, M.; Martin, R. L.

    2015-01-01

    Porous materials such as zeolites and metal organic frameworks have been of growing importance as materials for energy-related applications such as CO2 capture, hydrogen and methane storage, and catalysis. The current state-of-the-art molecular simulations allow for accurate in silico prediction of materials' properties but the computational cost of such calculations prohibits their application in the characterisation of very large sets of structures, which would be required to perform brute-force screening. Our work focuses on the development of novel methodologies to efficiently characterize and explore this complex materials space. In particular, we have been developing algorithms and tools for enumeration and characterisation of porous material databases as well as efficient screening approaches. Our methodology represents an ensemble of mathematical methods. We have used Voronoi tessellation-based techniques to enable high-throughput structure characterisation, statistical techniques to perform comparison and screening, and continuous optimisation to design materials. This article outlines our developments in material design.

  10. Leveraging gene-environment interactions and endotypes for asthma gene discovery.

    Science.gov (United States)

    Bønnelykke, Klaus; Ober, Carole

    2016-03-01

    Asthma is a heterogeneous clinical syndrome that includes subtypes of disease with different underlying causes and disease mechanisms. Asthma is caused by a complex interaction between genes and environmental exposures; early-life exposures in particular play an important role. Asthma is also heritable, and a number of susceptibility variants have been discovered in genome-wide association studies, although the known risk alleles explain only a small proportion of the heritability. In this review, we present evidence supporting the hypothesis that focusing on more specific asthma phenotypes, such as childhood asthma with severe exacerbations, and on relevant exposures that are involved in gene-environment interactions (GEIs), such as rhinovirus infections, will improve detection of asthma genes and our understanding of the underlying mechanisms. We will discuss the challenges of considering GEIs and the advantages of studying responses to asthma-associated exposures in clinical birth cohorts, as well as in cell models of GEIs, to dissect the context-specific nature of genotypic risks, to prioritize variants in genome-wide association studies, and to identify pathways involved in pathogenesis in subgroups of patients. We propose that such approaches, in spite of their many challenges, present great opportunities for better understanding of asthma pathogenesis and heterogeneity and, ultimately, for improving prevention and treatment of disease.

  11. A comparative review of estimates of the proportion unchanged genes and the false discovery rate

    Directory of Open Access Journals (Sweden)

    Broberg Per

    2005-08-01

    Full Text Available Abstract Background In the analysis of microarray data one generally produces a vector of p-values that for each gene give the likelihood of obtaining equally strong evidence of change by pure chance. The distribution of these p-values is a mixture of two components corresponding to the changed genes and the unchanged ones. The focus of this article is how to estimate the proportion unchanged and the false discovery rate (FDR) and how to make inferences based on these concepts. Six published methods for estimating the proportion unchanged genes are reviewed, two alternatives are presented, and all are tested on both simulated and real data. All estimates but one make do without any parametric assumptions concerning the distributions of the p-values. Furthermore, the estimation and use of the FDR and the closely related q-value is illustrated with examples. Five published estimates of the FDR and one new are presented and tested. Implementations in R code are available. Results A simulation model based on the distribution of real microarray data plus two real data sets were used to assess the methods. The proposed alternative methods for estimating the proportion unchanged fared very well, and gave evidence of low bias and very low variance. Different methods perform well depending upon whether there are few or many regulated genes. Furthermore, the methods for estimating FDR showed a varying performance, and were sometimes misleading. The new method had a very low error. Conclusion The concept of the q-value or false discovery rate is useful in practical research, despite some theoretical and practical shortcomings. However, it seems possible to challenge the performance of the published methods, and there is likely scope for further developing the estimates of the FDR. The new methods provide the scientist with more options to choose a suitable method for any particular experiment. The article advocates the use of the conjoint information
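
    To make the two quantities concrete, here is a minimal sketch of one widely used nonparametric recipe: a threshold-based estimate of the proportion of unchanged genes, followed by step-up q-values scaled by that proportion. It is offered as an illustration only and is not claimed to be one of the methods reviewed or proposed in the article; the threshold `lam = 0.5`, the function names, and the toy p-values are assumptions.

```python
def estimate_pi0(pvalues, lam=0.5):
    """Estimate the proportion of unchanged genes.

    Under the null, p-values are uniform, so the fraction above `lam`
    divided by (1 - lam) approximates the null proportion.
    """
    m = len(pvalues)
    pi0 = sum(p > lam for p in pvalues) / (m * (1.0 - lam))
    return min(1.0, pi0)


def q_values(pvalues, pi0=1.0):
    """Step-up q-values; with pi0 = 1 these are Benjamini-Hochberg adjusted p-values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum
    # estimated FDR over all thresholds that still include that gene.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * m * pvalues[i] / rank)
        q[i] = running_min
    return q


# Hypothetical p-values for ten genes.
pvals = [0.001, 0.002, 0.003, 0.004, 0.3, 0.5, 0.6, 0.7, 0.8, 0.9]
pi0 = estimate_pi0(pvals)
qvals = q_values(pvals, pi0)
print(pi0, [round(q, 4) for q in qvals])
```

    With so few p-values the estimate is very noisy; in practice the choice of threshold and the variance of the estimate are exactly the issues such comparisons address.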

  12. SAGExplore: a web server for unambiguous tag mapping in serial analysis of gene expression oriented to gene discovery and annotation.

    Science.gov (United States)

    Norambuena, Tomás; Malig, Rodrigo; Melo, Francisco

    2007-07-01

    We describe a web server for the accurate mapping of experimental tags in serial analysis of gene expression (SAGE). The core of the server relies on a database of genomic virtual tags built by a recently described method that attempts to reduce the amount of ambiguous assignments for those tags that are not unique in the genome. The method provides a complete annotation of potential virtual SAGE tags within a genome, along with an estimation of their confidence for experimental observation that ranks tags that present multiple matches in the genome. The output of the server consists of a table in HTML format that contains links to a graphic representation of the results and to some external servers and databases, facilitating the tasks of analysis of gene expression and gene discovery. Also, a table in tab delimited text format is produced, allowing the user to export the results into custom databases and software for further analysis. The current server version provides the most accurate and complete SAGE tag mapping source that is available for the yeast organism. In the near future, this server will also allow the accurate mapping of experimental SAGE-tags from other model organisms such as human, mouse, frog and fly. The server is freely available on the web at: http://dna.bio.puc.cl/SAGExplore.html.
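
    Why some tags are ambiguous can be shown with a small sketch that enumerates virtual tags on both strands of a sequence and flags those occurring more than once. It does not reproduce the server's method or its confidence ranking. The anchoring site `CATG` and the 10-nucleotide tag length are assumptions taken from common SAGE practice rather than from the abstract, and the toy sequence is hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def virtual_tags(seq, site="CATG", tag_len=10):
    """List (strand, site_position, tag) for every anchoring site on both strands."""
    tags = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        start = s.find(site)
        while start != -1:
            tag = s[start + len(site): start + len(site) + tag_len]
            if len(tag) == tag_len:  # skip sites too close to the sequence end
                tags.append((strand, start, tag))
            start = s.find(site, start + 1)
    return tags


def ambiguous_tags(tags):
    """Return the set of tag sequences that occur at more than one site."""
    counts = Counter(tag for _, _, tag in tags)
    return {tag for tag, n in counts.items() if n > 1}


# Toy sequence with two sites followed by the same ten bases.
genome = "GG" + "CATG" + "A" * 10 + "CC" + "CATG" + "A" * 10 + "GG"
tags = virtual_tags(genome)
print(tags)
print(ambiguous_tags(tags))
```

    A tag that maps to a single site can be assigned directly; the repeated tag here is the kind of case that needs additional evidence to resolve.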

  13. Screening applications in drug discovery based on microfluidic technology.

    Science.gov (United States)

    Eribol, P; Uguz, A K; Ulgen, K O

    2016-01-01

    Microfluidics has been the focus of interest for the last two decades for all the advantages such as low chemical consumption, reduced analysis time, high throughput, better control of mass and heat transfer, downsizing a bench-top laboratory to a chip, i.e., lab-on-a-chip, and many others it has offered. Microfluidic technology quickly found applications in the pharmaceutical industry, which demands working with leading edge scientific and technological breakthroughs, as drug screening and commercialization are very long and expensive processes and require many tests due to unpredictable results. This review paper is on drug candidate screening methods with microfluidic technology and focuses specifically on fabrication techniques and materials for the microchip, types of flow such as continuous or discrete and their advantages, determination of kinetic parameters and their comparison with conventional systems, assessment of toxicities and cytotoxicities, concentration generations for high throughput, and the computational methods that were employed. An important conclusion of this review is that even though microfluidic technology has been in this field for around 20 years there is still room for research and development, as this cutting edge technology requires ingenuity to design and find solutions for each individual case. Recent extensions of these microsystems are microengineered organs-on-chips and organ arrays.

  14. Gene Discovery for Synthetic Biology: Exploring the Novel Natural Product Biosynthetic Capacity of Eukaryotic Microalgae.

    Science.gov (United States)

    O'Neill, E C; Saalbach, G; Field, R A

    2016-01-01

    Eukaryotic microalgae are an incredibly diverse group of organisms whose sole unifying feature is their ability to photosynthesize. They are known for producing a range of potent toxins, which can build up during harmful algal blooms causing damage to ecosystems and fisheries. Genome sequencing is lagging behind in these organisms because of their genetic complexity, but transcriptome sequencing is beginning to make up for this deficit. As more sequence data becomes available, it is apparent that eukaryotic microalgae possess a range of complex natural product biosynthesis capabilities. Some of the genes concerned are responsible for the biosynthesis of known toxins, but there are many more for which we do not know the products. Bioinformatic and analytical techniques have been developed for natural product discovery in bacteria and these approaches can be used to extract information about the products synthesized by algae. Recent analyses suggest that eukaryotic microalgae produce many complex natural products that remain to be discovered.

  15. Sleeping Beauty transposon insertional mutagenesis based mouse models for cancer gene discovery

    Science.gov (United States)

    Moriarity, Branden S; Largaespada, David A

    2016-01-01

    Large-scale genomic efforts to study human cancer, such as The Cancer Genome Atlas (TCGA), have identified numerous cancer drivers in a wide variety of tumor types. However, there are limitations to this approach: the mutations and expression or copy number changes that are identified are not always clearly functionally relevant, and only annotated genes and genetic elements are thoroughly queried. The use of complementary, nonbiased, functional approaches to identify drivers of cancer development and progression is ideal to maximize the rate at which cancer discoveries are achieved. One such approach that has been successful is the use of the Sleeping Beauty (SB) transposon-based mutagenesis system in mice. This system uses a conditionally expressed transposase and mutagenic transposon allele to target mutagenesis to somatic cells of a given tissue in mice to cause random mutations leading to tumor development. Analysis of tumors for transposon common insertion sites (CIS) identifies candidate cancer genes specific to that tumor type. While similar screens have been performed in mice with the PiggyBac (PB) transposon and viral approaches, we limit extensive discussion to SB. Here we discuss the basic structure of these screens, screens that have been performed, and methods used to identify CIS. PMID:26051241

  16. Advances and applications of binding affinity prediction methods in drug discovery.

    Science.gov (United States)

    Parenti, Marco Daniele; Rastelli, Giulio

    2012-01-01

    Nowadays, the improvement of R&D productivity is the primary commitment in pharmaceutical research, both in big pharma and smaller biotech companies. To reduce costs, to speed up the discovery process and to increase the chance of success, advanced methods of rational drug design are very helpful, as demonstrated by several successful applications. Among these, computational methods able to predict the binding affinity of small molecules to specific biological targets are of special interest because they can accelerate the discovery of new hit compounds. Here we provide an overview of the most widely used methods in the field of binding affinity prediction, as well as of our own work in developing BEAR, an innovative methodology specifically devised to overcome some limitations in existing approaches. The BEAR method was successfully validated against different biological targets, and proved its efficacy in retrieving active compounds from virtual screening campaigns. The results obtained so far indicate that BEAR may become a leading tool in the drug discovery pipeline. We primarily discuss advantages and drawbacks of each technique and show relevant examples and applications in drug discovery.

  17. Gene discovery in the hamster: a comparative genomics approach for gene annotation by sequencing of hamster testis cDNAs

    Directory of Open Access Journals (Sweden)

    Khan Shafiq A

    2003-06-01

    Full Text Available Abstract Background Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences combined with direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish) genomes. Results 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes and expression of these genes in testis was confirmed by Northern blotting. The genomic locations of each sequence were mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells.

  18. Harvest: an open platform for developing web-based biomedical data discovery and reporting applications.

    Science.gov (United States)

    Pennington, Jeffrey W; Ruth, Byron; Italia, Michael J; Miller, Jeffrey; Wrazien, Stacey; Loutrel, Jennifer G; Crenshaw, E Bryan; White, Peter S

    2014-01-01

    Biomedical researchers share a common challenge of making complex data understandable and accessible as they seek inherent relationships between attributes in disparate data types. Data discovery in this context is limited by a lack of query systems that efficiently show relationships between individual variables, but without the need to navigate underlying data models. We have addressed this need by developing Harvest, an open-source framework of modular components, and using it for the rapid development and deployment of custom data discovery software applications. Harvest incorporates visualizations of highly dimensional data in a web-based interface that promotes rapid exploration and export of any type of biomedical information, without exposing researchers to underlying data models. We evaluated Harvest with two cases: clinical data from pediatric cardiology and demonstration data from the OpenMRS project. Harvest's architecture and public open-source code offer a set of rapid application development tools to build data discovery applications for domain-specific biomedical data repositories. All resources, including the OpenMRS demonstration, can be found at http://harvest.research.chop.edu.

  19. Interestingness measures and strategies for mining multi-ontology multi-level association rules from gene ontology annotations for the discovery of new GO relationships.

    Science.gov (United States)

    Manda, Prashanti; McCarthy, Fiona; Bridges, Susan M

    2013-10-01

    The Gene Ontology (GO), a set of three sub-ontologies, is one of the most popular bio-ontologies used for describing gene product characteristics. GO annotation data containing terms from multiple sub-ontologies and at different levels in the ontologies is an important source of implicit relationships between terms from the three sub-ontologies. Data mining techniques such as association rule mining that are tailored to mine from multiple ontologies at multiple levels of abstraction are required for effective knowledge discovery from GO annotation data. We present a data mining approach, Multi-ontology data mining at All Levels (MOAL) that uses the structure and relationships of the GO to mine multi-ontology multi-level association rules. We introduce two interestingness measures: Multi-ontology Support (MOSupport) and Multi-ontology Confidence (MOConfidence) customized to evaluate multi-ontology multi-level association rules. We also describe a variety of post-processing strategies for pruning uninteresting rules. We use publicly available GO annotation data to demonstrate our methods with respect to two applications (1) the discovery of co-annotation suggestions and (2) the discovery of new cross-ontology relationships. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
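
    The abstract above builds on association rule mining, so a small illustration of the two classic measures may help. The sketch below computes ordinary support and confidence over sets of GO term labels; it is not the MOSupport or MOConfidence measures defined by the authors, and the annotation data, term labels and function names are invented for illustration only.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every term in `itemset`."""
    wanted = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for terms in transactions if wanted <= terms)
    return hits / len(transactions)


def confidence(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Estimate P(consequent | antecedent) as support(A and C) / support(A)."""
    antecedent = frozenset(antecedent)
    consequent = frozenset(consequent)
    base = support(transactions, antecedent)
    if base == 0.0:
        return 0.0  # the rule never fires, so report zero confidence
    return support(transactions, antecedent | consequent) / base


# Hypothetical annotations: one set of GO term labels per gene product.
annotations = [
    frozenset({"MF:kinase activity", "BP:phosphorylation"}),
    frozenset({"MF:kinase activity", "BP:phosphorylation", "CC:cytoplasm"}),
    frozenset({"MF:kinase activity", "CC:nucleus"}),
    frozenset({"MF:DNA binding", "CC:nucleus"}),
]

rule_support = support(annotations, {"MF:kinase activity", "BP:phosphorylation"})
rule_confidence = confidence(
    annotations, {"MF:kinase activity"}, {"BP:phosphorylation"}
)
print(rule_support, rule_confidence)
```

    A cross-ontology rule such as "MF:kinase activity implies BP:phosphorylation" would be kept or pruned depending on thresholds applied to these two numbers; the multi-level aspect of MOAL (using the GO hierarchy) is deliberately left out of this sketch.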

  20. Application of next generation sequencing to human gene fusion detection: computational tools, features and perspectives.

    Science.gov (United States)

    Wang, Qingguo; Xia, Junfeng; Jia, Peilin; Pao, William; Zhao, Zhongming

    2013-07-01

    Gene fusions are important genomic events in human cancer because their fusion gene products can drive the development of cancer and thus are potential prognostic tools or therapeutic targets in anti-cancer treatment. Major advancements have been made in computational approaches for fusion gene discovery over the past 3 years due to improvements and widespread applications of high-throughput next generation sequencing (NGS) technologies. To identify fusions from NGS data, existing methods typically leverage the strengths of both sequencing technologies and computational strategies. In this article, we review the NGS and computational features of existing methods for fusion gene detection and suggest directions for future development.

  1. A Performance/Cost Evaluation for a GPU-Based Drug Discovery Application on Volunteer Computing

    Directory of Open Access Journals (Sweden)

    Ginés D. Guerrero

    2014-01-01

    Bioinformatics is an interdisciplinary research field that develops tools for the analysis of large biological databases, and, thus, the use of high performance computing (HPC) platforms is mandatory for the generation of useful biological knowledge. The latest generation of graphics processing units (GPUs) has democratized the use of HPC as they push desktop computers to cluster-level performance. Many applications within this field have been developed to leverage these powerful and low-cost architectures. However, these applications still need to scale to larger GPU-based systems to enable remarkable advances in the fields of healthcare, drug discovery, genome research, etc. The inclusion of GPUs in HPC systems exacerbates power and temperature issues, increasing the total cost of ownership (TCO). This paper explores the benefits of volunteer computing to scale bioinformatics applications as an alternative to owning large GPU-based local infrastructures. We use as a benchmark a GPU-based drug discovery application called BINDSURF, whose computational requirements go beyond a single desktop machine. Volunteer computing is presented as a cheap and valid HPC system for those bioinformatics applications that need to process huge amounts of data and where the response time is not a critical factor.

  2. Drug discovery applications for KNIME: an open source data mining platform.

    Science.gov (United States)

    Mazanetz, Michael P; Marmon, Robert J; Reisser, Catherine B T; Morao, Inaki

    2012-01-01

    Technological advances in high-throughput screening methods, combinatorial chemistry and the design of virtual libraries have evolved in the pursuit of challenging drug targets. Over the last two decades a vast amount of data has been generated within these fields and as a consequence data mining methods have been developed to extract key pieces of information from these large data pools. Much of this data is now available in the public domain. This has been helpful in the arena of drug discovery for both academic groups and for small to medium sized enterprises which previously would not have had access to such data resources. Commercial data mining software is sometimes prohibitively expensive and the alternate open source data mining software is gaining momentum in both academia and in industrial applications as the costs of research and development continue to rise. KNIME, the Konstanz Information Miner, has emerged as a leader in open source data mining tools. KNIME provides an integrated solution for the data mining requirements across the drug discovery pipeline through a visual assembly of data workflows drawing from an extensive repository of tools. This review will examine KNIME as an open source data mining tool and its applications in drug discovery.

  3. Invariant Gaussian Process Latent Variable Models and Application in Causal Discovery

    CERN Document Server

    Zhang, Kun; Janzing, Dominik

    2012-01-01

    In nonlinear latent variable models or dynamic models, if we consider the latent variables as confounders (common causes), the noise dependencies imply further relations between the observed variables. Such models are then closely related to causal discovery in the presence of nonlinear confounders, which is a challenging problem. However, generally in such models the observation noise is assumed to be independent across data dimensions, and consequently the noise dependencies are ignored. In this paper we focus on the Gaussian process latent variable model (GPLVM), from which we develop an extended model called invariant GPLVM (IGPLVM), which can adapt to arbitrary noise covariances. With the Gaussian process prior put on a particular transformation of the latent nonlinear functions, instead of the original ones, the algorithm for IGPLVM involves almost the same computational loads as that for the original GPLVM. Besides its potential application in causal discovery, IGPLVM has the advantage that its estimat...

  4. Application of Glycoproteomics in the Discovery of Biomarkers for Lung Cancer

    Science.gov (United States)

    Li, Qing Kay; Gabrielson, Edward; Zhang, Hui

    2017-01-01

    Lung cancer is the leading cause of cancer-related deaths in the United States. Approximately 40–60% of lung cancer patients present with locally advanced or metastatic disease at the time of diagnosis. In order to improve the survival rate of lung cancer patients, the discovery of early diagnostic and prognostic biomarkers is urgently needed. Lung cancer development and progression are a multistep process which is characterized by abnormal gene and protein expressions ultimately leading to phenotypic change. In lung cancer, the expression of cellular glycoproteins directly reflects the physiological and/or pathological status of the lung parenchyma. Glycoproteins have long been recognized to play fundamental roles in many physiological and pathological processes, particularly in cancer genesis and progression. Although numerous papers have already acknowledged the importance of the discovery of cancer biomarkers, the systemic study of glycoproteins in lung cancer using glycoproteomic approaches is still suboptimal. Herein, we review the recent technological development of glycoproteomics in highlighting their utility and limitations for the discovery of glycoprotein biomarkers in lung cancer. PMID:22641610

  5. An ensemble method for gene discovery based on DNA microarray data

    Institute of Scientific and Technical Information of China (English)

    2004-01-01

    The advent of DNA microarray technology has offered the promise of casting new insights onto deciphering secrets of life by monitoring activities of thousands of genes simultaneously. Current analyses of microarray data focus on precise classification of biological types, for example, tumor versus normal tissues. A further scientifically challenging task is to extract disease-relevant genes from the bewildering amounts of raw data, which is one of the most critical themes in the post-genomic era, but it is generally ignored due to lack of an efficient approach. In this paper, we present a novel ensemble method for gene extraction that can be tailored to fulfill multiple biological tasks including (i) precise classification of biological types; (ii) disease gene mining; and (iii) target-driven gene networking. We also give a numerical application for (i) and (ii) using a public microarray data set and set aside a separate paper to address (iii).

  6. Using Phenomic Analysis of Photosynthetic Function for Abiotic Stress Response Gene Discovery

    KAUST Repository

    Rungrat, Tepsuda

    2016-09-09

    Monitoring the photosynthetic performance of plants is a major key to understanding how plants adapt to their growth conditions. Stress tolerance traits have a high genetic complexity as plants are constantly, and unavoidably, exposed to numerous stress factors, which limits their growth rates in the natural environment. Arabidopsis thaliana, with its broad genetic diversity and wide climatic range, has been shown to successfully adapt to stressful conditions to ensure the completion of its life cycle. As a result, A. thaliana has become a robust and renowned plant model system for studying natural variation and conducting gene discovery studies. Genome wide association studies (GWAS) in restructured populations combining natural and recombinant lines is a particularly effective way to identify the genetic basis of complex traits. As most abiotic stresses affect photosynthetic activity, chlorophyll fluorescence measurements are a potential phenotyping technique for monitoring plant performance under stress conditions. This review focuses on the use of chlorophyll fluorescence as a tool to study genetic variation underlying the stress tolerance responses to abiotic stress in A. thaliana.

  7. An Evaluation of Active Learning Causal Discovery Methods for Reverse-Engineering Local Causal Pathways of Gene Regulation.

    Science.gov (United States)

    Ma, Sisi; Kemmeren, Patrick; Aliferis, Constantin F; Statnikov, Alexander

    2016-03-04

    Reverse-engineering of causal pathways that implicate diseases and vital cellular functions is a fundamental problem in biomedicine. Discovery of the local causal pathway of a target variable (that consists of its direct causes and direct effects) is essential for effective intervention and can facilitate accurate diagnosis and prognosis. Recent research has provided several active learning methods that can leverage passively observed high-throughput data to draft causal pathways and then refine the inferred relations with a limited number of experiments. The current study provides a comprehensive evaluation of the performance of active learning methods for local causal pathway discovery in real biological data. Specifically, 54 active learning methods/variants from 3 families of algorithms were applied for local causal pathways reconstruction of gene regulation for 5 transcription factors in S. cerevisiae. Four aspects of the methods' performance were assessed, including adjacency discovery quality, edge orientation accuracy, complete pathway discovery quality, and experimental cost. The results of this study show that some methods provide significant performance benefits over others and therefore should be routinely used for local causal pathway discovery tasks. This study also demonstrates the feasibility of local causal pathway reconstruction in real biological systems with significant quality and low experimental cost.

  8. Induced pluripotent stem cells: applications in regenerative medicine, disease modeling, and drug discovery.

    Science.gov (United States)

    Singh, Vimal K; Kalsan, Manisha; Kumar, Neeraj; Saini, Abhishek; Chandra, Ramesh

    2015-01-01

    such as animal models. Many toxic compounds (different chemical compounds, pharmaceutical drugs, other hazardous chemicals, or environmental conditions) which are encountered by humans and newly designed drugs may be evaluated for toxicity and effects by using iPSCs. Thus, the applications of iPSCs in regenerative medicine, disease modeling, and drug discovery are enormous and should be explored in a more comprehensive manner.

  9. On the Use of Social Networks in Web Services: Application to the Discovery Stage

    Science.gov (United States)

    Maamar, Zakaria; Wives, Leandro Krug; Boukadi, Khouloud

    This chapter discusses the use of social networks in Web services with focus on the discovery stage that characterizes the life cycle of these Web services. Other stages in this life cycle include description, publication, invocation, and composition. Web services are software applications that end users or other peers can invoke and compose to satisfy different needs such as hotel booking and car rental. Discovering the relevant Web services is, and continues to be, a major challenge due to the dynamic nature of these Web services. Indeed, Web services appear/disappear or suspend/resume operations without prior notice. Traditional discovery techniques are based on registries such as Universal Description, Discovery and Integration (UDDI) and Electronic Business using eXtensible Markup Language (ebXML). Unfortunately, despite the different improvements that these techniques have been subject to, they still suffer from various limitations that could slow down the acceptance trend of Web services by the IT community. Social networks seem to offer solutions to some of these limitations but raise, at the same time, some issues that are discussed in this chapter. The contributions of this chapter are three: social network definition in the particular context of Web services; mechanisms that support Web services build, use, and maintain their respective social networks; and social networks adoption to discover Web services.

  10. Managing Innovation to Maximize Value Along the Discovery-Translation-Application Continuum.

    Science.gov (United States)

    Waldman, S A; Terzic, A

    2017-01-01

    Success in pharmaceutical development led to a record 51 drugs approved in the past year, surpassing every previous year since 1950. Technology innovation enabled identification and exploitation of increasingly precise disease targets ensuring next generation diagnostic and therapeutic products for patient management. The expanding biopharmaceutical portfolio stands, however, in contradistinction to the unsustainable costs that reflect remarkable challenges of clinical development programs. This annual Therapeutic Innovations issue juxtaposes advances in translating molecular breakthroughs into transformative therapies with essential considerations for lowering attrition and improving the cost-effectiveness of the drug-development paradigm. Realizing the discovery-translation-application continuum mandates a congruent approval, adoption, and access triad. © 2016 ASCPT.

  11. Comparative GO: a web application for comparative gene ontology and gene ontology-based gene selection in bacteria.

    Directory of Open Access Journals (Sweden)

    Mario Fruzangohar

    The primary means of classifying new functions for genes and proteins relies on Gene Ontology (GO), which defines genes/proteins using a controlled vocabulary in terms of their Molecular Function, Biological Process and Cellular Component. The challenge is to present this information to researchers to compare and discover patterns in multiple datasets using visually comprehensible and user-friendly statistical reports. Importantly, while there are many GO resources available for eukaryotes, there are none suitable for simultaneous, graphical and statistical comparison between multiple datasets. In addition, none of them supports comprehensive resources for bacteria. By using Streptococcus pneumoniae as a model, we identified and collected GO resources including genes, proteins, taxonomy and GO relationships from NCBI, UniProt and GO organisations. Then, we designed database tables in a PostgreSQL database server and developed a Java application to extract data from source files and load it into the database automatically. We developed a PHP web application based on Model-View-Control architecture, and used a specific data structure as well as current and novel algorithms to estimate GO graph parameters. We designed different navigation and visualization methods on the graphs and integrated these into graphical reports. This tool is particularly significant when comparing GO groups between multiple samples (including those of pathogenic bacteria) from different sources simultaneously. Comparing GO protein distribution among up- or down-regulated genes from different samples can improve understanding of biological pathways and mechanism(s) of infection. It can also aid in the discovery of genes associated with specific function(s) for investigation as novel vaccine or therapeutic targets. http://turing.ersa.edu.au/BacteriaGO.

  12. GOrilla: a tool for discovery and visualization of enriched GO terms in ranked gene lists

    Directory of Open Access Journals (Sweden)

    Steinfeld Israel

    2009-02-01

    Background: Since the inception of the GO annotation project, a variety of tools have been developed that support exploring and searching the GO database. In particular, a variety of tools that perform GO enrichment analysis are currently available. Most of these tools require as input a target set of genes and a background set and seek enrichment in the target set compared to the background set. A few tools also exist that support analyzing ranked lists. The latter typically rely on simulations or on union-bound correction for assigning statistical significance to the results. Results: GOrilla is a web-based application that identifies enriched GO terms in ranked lists of genes, without requiring the user to provide explicit target and background sets. This is particularly useful in many typical cases where genomic data may be naturally represented as a ranked list of genes (e.g. by level of expression or of differential expression). GOrilla employs a flexible threshold statistical approach to discover GO terms that are significantly enriched at the top of a ranked gene list. Building on a complete theoretical characterization of the underlying distribution, called mHG, GOrilla computes an exact p-value for the observed enrichment, taking threshold multiple testing into account without the need for simulations. This enables rigorous statistical analysis of thousands of genes and thousands of GO terms in the order of seconds. The output of the enrichment analysis is visualized as a hierarchical structure, providing a clear view of the relations between enriched GO terms. Conclusion: GOrilla is an efficient GO analysis tool with unique features that make it a useful addition to the existing repertoire of GO enrichment tools. GOrilla's unique features and advantages over other threshold-free enrichment tools include rigorous statistics, fast running time and an effective graphical representation.
GOrilla is publicly available at: http://cbl-gorilla.cs.technion.ac.il
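
    The flexible-threshold idea described above can be illustrated with the minimum hypergeometric (mHG) score: for every cutoff in the ranked list, compute the hypergeometric tail probability of seeing at least the observed number of annotated genes above the cutoff, and keep the smallest value. The sketch below is a plain reimplementation of that score only, written under the assumption that the ranked list is given as 0/1 labels; it is not GOrilla's code and it does not compute the exact, multiple-testing-corrected p-value mentioned in the abstract. Function names and the example labels are hypothetical.

```python
from math import comb
from typing import Sequence, Tuple


def hypergeom_tail(N: int, B: int, n: int, b: int) -> float:
    """P(X >= b) when n genes are drawn from N, of which B carry the GO term."""
    upper = min(n, B)
    favourable = sum(comb(B, k) * comb(N - B, n - k) for k in range(b, upper + 1))
    return favourable / comb(N, n)


def mhg_score(labels: Sequence[int]) -> Tuple[float, int]:
    """Return (minimum tail probability, cutoff n at which it is attained).

    `labels` is the ranked gene list encoded as 1 (annotated with the
    GO term) or 0 (not annotated), best-ranked gene first.
    """
    N = len(labels)
    B = sum(labels)
    best_score, best_cutoff = 1.0, 0
    seen = 0  # annotated genes among the top n
    for n, label in enumerate(labels, start=1):
        seen += label
        tail = hypergeom_tail(N, B, n, seen)
        if tail < best_score:
            best_score, best_cutoff = tail, n
    return best_score, best_cutoff


# Hypothetical ranked list: the three annotated genes sit at the very top.
ranked_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
score, cutoff = mhg_score(ranked_labels)
print(score, cutoff)
```

    Because the minimum is taken over all cutoffs, the raw score is optimistic; this is the multiple-testing issue that the exact mHG p-value addresses and that this sketch leaves uncorrected.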

  13. De novo assembly and characterization of the transcriptome of broomcorn millet (Panicum miliaceum L.) for gene discovery and marker development

    Directory of Open Access Journals (Sweden)

    Hong Yue

    2016-07-01

    Broomcorn millet (Panicum miliaceum L.) is one of the world’s oldest cultivated cereals, which is well adapted to extreme environments such as drought, heat and salinity with an efficient C4 carbon fixation. Discovery and identification of genes involved in these processes will provide valuable information to improve the crop for meeting the challenge of global climate change. However, the lack of genetic resources and genomic information makes gene discovery and molecular mechanism studies very difficult. Here, we sequenced and assembled the transcriptome of broomcorn millet using Illumina sequencing technology. After sequencing, a total of 45,406,730 and 51,160,820 clean paired-end reads were obtained for the two genotypes Yumi No.2 and Yumi No.3. These reads were mixed and then assembled into 113,643 unigenes, with lengths ranging from 351 to 15,691 bp, of which 62,543 contigs could be assigned to 315 gene ontology (GO) categories. Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses assigned 51,020 unigenes to 25 COG categories and mapped 15,514 unigenes into 202 KEGG pathways, respectively. Furthermore, 35,216 simple sequence repeats (SSRs) were identified in 27,055 unigene sequences, of which trinucleotides were the most abundant repeat unit, accounting for 66.72% of SSRs. In addition, 292 differentially expressed genes (DEGs) were identified between the two genotypes, which were significantly enriched in 88 GO terms and 12 KEGG pathways. Finally, the expression patterns of 4 selected transcripts were validated through quantitative reverse transcription PCR (qRT-PCR) analysis. Our study sequenced and assembled the transcriptome of broomcorn millet for the first time, which not only provided a rich sequence resource for gene discovery and marker development in this important crop, but will also facilitate further investigation of the molecular mechanism of its favored agronomic traits and beyond.
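
    As an illustration of the SSR identification step mentioned above, the following sketch scans a nucleotide sequence for perfect di-, tri- and tetranucleotide repeats using regular expressions. It is a minimal example, not the pipeline used in the study: the minimum repeat counts, the function names and the test sequence are all assumptions chosen for demonstration.

```python
import re
from typing import Dict, List, Optional, Tuple


def is_primitive(unit: str) -> bool:
    """True if `unit` is not itself a repetition of a shorter motif.

    Rejects e.g. "AAA" (a mononucleotide run) and "ATAT" (a repeated "AT").
    """
    return unit not in (unit + unit)[1:-1]


def find_ssrs(
    seq: str, min_repeats: Optional[Dict[int, int]] = None
) -> List[Tuple[int, str, int]]:
    """Return (start, repeat unit, repeat count) for each perfect SSR found.

    `min_repeats` maps unit length to the minimum number of tandem copies;
    the defaults below are illustrative thresholds, not those of the paper.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for unit_len, copies in min_repeats.items():
        # One captured unit followed by at least (copies - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, copies - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if is_primitive(unit):
                hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)


# Hypothetical unigene fragment with one trinucleotide and one dinucleotide SSR.
fragment = "GGC" + "ATG" * 6 + "TTCA" + "AG" * 7 + "C"
for start, unit, count in find_ssrs(fragment):
    print(start, unit, count)
```

    Counting the hits by unit length over all unigenes would give the kind of summary reported in the abstract, such as the share of trinucleotide repeats among all SSRs.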

  14. Gene editing for cell engineering: trends and applications.

    Science.gov (United States)

    Gupta, Sanjeev K; Shukla, Pratyoosh

    2016-08-18

    Gene editing has made the manipulation of various production hosts easier through the discovery and implementation of modern tools such as CRISPR (clustered regularly interspaced short palindromic repeats), TALENs (transcription activator-like effector nucleases) and ZFNs (zinc finger nucleases). With the advent of these tools, it is now possible to manipulate the genome of industrial production hosts such as yeast and mammalian cells, which allows the development of potential, cost-effective recombinant therapeutic proteins. These tools also allow edits ranging from a single gene to multiple genes, so that knock-ins and knock-outs in a host genome can be made quickly and efficiently. A recent study on "multiplexed" gene editing revolutionized knock-out and knock-in events in the genomes of yeast and CHO mammalian cells, for metabolic engineering as well as for high, stable, and consistent expression of a transgene encoding a complex therapeutic protein such as a monoclonal antibody. The gene of interest can either be integrated or deleted at single or multiple loci depending on the strategy and production requirement. This review gives the gist of the modern tools, with a brief description of each and of advances in genetic manipulation using the three major tools being implemented for the modification of such hosts. The emphasis is on the use of CRISPR-Cas9 for the "multiplexing gene-editing approach" to genetic manipulation of yeast and CHO mammalian hosts, which ultimately leads to fast-track product development with consistent, improved product yield and quality, and thus affordability for the population at large.

  15. Gene Therapy and its applications in Dentistry

    Directory of Open Access Journals (Sweden)

    Sharma Lakhanpal Manisha

    2006-01-01

    This era of advanced technology is marked by progress in identifying and understanding the molecular and cellular cause of a disease. With the conventional methods of treatment failing to render satisfactory results, gene therapy is not only being used for the cure of inherited diseases but also the acquired ones. The broad spectrum of gene therapy includes its application in the treatment of oral cancer and precancerous conditions and lesions, treatment of salivary gland diseases, bone repair, autoimmune diseases, DNA vaccination, etc. The aim of this article is to throw light on the history, methodology, applications and future of gene therapy as it would change the nature and face of dentistry in the coming years.

  16. Targeted proteomics by selected reaction monitoring mass spectrometry: applications to systems biology and biomarker discovery.

    Science.gov (United States)

    Elschenbroich, Sarah; Kislinger, Thomas

    2011-02-01

    Mass Spectrometry-based proteomics is now considered a relatively established strategy for protein analysis, ranging from global expression profiling to the identification of protein complexes and specific post-translational modifications. Recently, Selected Reaction Monitoring Mass Spectrometry (SRM-MS) has become increasingly popular in proteome research for the targeted quantification of proteins and post-translational modifications. Using triple quadrupole instrumentation (QqQ), specific analyte molecules are targeted in a data-directed mode. Used routinely for the quantitative analysis of small molecular compounds for at least three decades, the technology is now experiencing broadened application in the proteomics community. In the current review, we will provide a detailed summary of current developments in targeted proteomics, including some of the recent applications to biological research and biomarker discovery.

  17. Improved applications of the tetracycline-regulated gene depletion system.

    Science.gov (United States)

    Nishijima, Hitoshi; Yasunari, Takami; Nakayama, Tatsuo; Adachi, Noritaka; Shibahara, Kei-ichi

    2009-10-01

    Tightly controlled expression of transgenes in mammalian cells is an important tool for biological research, drug discovery, and future genetic therapies. The tetracycline-regulated gene depletion (Tet-Off) system has been widely used to control gene activities in mammalian cells, because it allows strict regulation of transgenes but no pleiotropic effects of prokaryotic regulatory proteins. However, the Tet-Off system is not compatible with every cell type and this is the main remaining obstacle left for this system. Recently, we overcame this problem by inserting an internal ribosome entry site (IRES) to drive a selectable marker from the same tetracycline-responsive promoter for the transgene. We also employed a CMV immediate early enhancer/beta-actin (CAG) promoter to express a Tet-controlled transactivator. Indeed, the Tet-Off system with these technical modifications was applied successfully to the human pre-B Nalm-6 cell line in which conventional Tet-Off systems had not worked efficiently. These methodological improvements should be applicable for many other mammalian proliferating cells. In this review we give an overview and introduce a new method for the improved application of the Tet-Off system.

  18. Trimetallic nitride template endohedral metallofullerenes: discovery, structural characterization, reactivity, and applications.

    Science.gov (United States)

    Zhang, Jianyuan; Stevenson, Steven; Dorn, Harry C

    2013-07-16

    Shortly after the discovery of the carbon fullerene allotrope, C₆₀, researchers recognized that the hollow spheroidal shape could accommodate metal atoms, or clusters, which quickly led to the discovery of endohedral metallofullerenes (EMFs). In the past 2 decades, the unique features of EMFs have attracted broad interest in many fields, including inorganic chemistry, organic chemistry, materials chemistry, and biomedical chemistry. Some EMFs produce new metallic clusters that do not exist outside of a fullerene cage, and some other EMFs can boost the efficiency of magnetic resonance (MR) imaging 10-50-fold, in comparison with commercial contrast agents. In 1999, the Dorn laboratory discovered the trimetallic nitride template (TNT) EMFs, which consist of a trimetallic nitride cluster and a host fullerene cage. The TNT-EMFs (A₃N@C₂ₙ, n = 34-55, A = Sc, Y, or lanthanides) are typically formed in relatively high yields (sometimes only exceeded by empty-cage C₆₀ and C₇₀, but yields may decrease with increasing TNT cluster size), and exhibit high chemical and thermal stability. In this Account, we give an overview of TNT-EMF research, starting with the discovery of these structures and then describing their synthesis and applications. First, we describe our serendipitous discovery of the first member of this class, Sc₃N@Ih-C₈₀. Second, we discuss the methodology for the synthesis of several TNT-EMFs. These results emphasize the importance of chemically adjusting plasma temperature, energy, and reactivity (CAPTEAR) to optimize the type and yield of TNT-EMFs produced. Third, we review the approaches that are used to separate and purify pristine TNT-EMF molecules from their corresponding product mixtures. Although we used high-performance liquid chromatography (HPLC) to separate TNT-EMFs in early studies, we have more recently achieved facile separation based on the reduced chemical reactivity of the TNT-EMFs. These improved production yields and

  19. Helping Students Understand Gene Regulation with Online Tools: A Review of MEME and Melina II, Motif Discovery Tools for Active Learning in Biology

    Directory of Open Access Journals (Sweden)

    David Treves

    2012-08-01

    Review of: MEME and Melina II, which are two free and easy-to-use online motif discovery tools that can be employed to actively engage students in learning about gene regulatory elements.

  20. Whole Genome Shotgun Sequences for Microsatellite Discovery and Application in Cultivated and Wild Macadamia (Proteaceae)

    Directory of Open Access Journals (Sweden)

    Catherine J. Nock

    2014-03-01

    Full Text Available Premise of the study: Next-generation sequencing (NGS) data are widely used for single-nucleotide polymorphism discovery and genetic marker development in species with limited available genome information. We developed microsatellite primers for the Proteaceae nut crop species Macadamia integrifolia and assessed cross-species transferability in all congeners to investigate genetic identification of cultivars and gene flow. Methods and Results: Primers were designed from both raw and assembled Illumina NGS paired-end reads. The final 12 microsatellite markers selected were polymorphic among wild individuals of all four Macadamia species (M. integrifolia, M. tetraphylla, M. ternifolia, and M. jansenii) and in commercial macadamia cultivars including hybrids. Conclusions: We demonstrate the utility of raw and assembled Illumina NGS reads from total genomic DNA for the rapid development of microsatellites in Macadamia. These primers will facilitate future studies of population structure, hybridization, parentage, and cultivar identification in cultivated and wild Macadamia populations.

  1. Arctic Research Mapping Application (ARMAP) Showcases discovery level metadata for US Funded Research Projects

    Science.gov (United States)

    Score, R.; Gaylord, A. G.; Kassin, A.; Cody, R. P.; Copenhaver, W.; Manley, W. F.; Dover, M.; Tweedie, C. E.

    2014-12-01

    The Arctic Research Mapping Application (ARMAP) is a suite of online applications and data services that support Arctic science by providing project tracking information (who's doing what, when and where in the region) for United States Government funded projects. Development of an interagency standard for tracking discovery level metadata for projects has been achieved through collaboration with the Alaska Data Integration work group. The US National Science Foundation plus 17 other agencies and organizations have adopted the standard with several entities successfully implementing XML based REST webservices. With ARMAP's web mapping applications and data services (http://armap.org), users can search for research projects by location, year, funding program, keyword, investigator, and discipline, among other variables. Key information about each project is displayed within the application with links to web pages that provide additional information. The ARMAP 2D mapping application has been significantly enhanced to include support for multiple projections, improved base maps, additional reference data layers, and optimization for better performance. In 2014, ship tracks for US National Science Foundation supported vessel based surveys have been expanded. These enhancements have been made to increase awareness of projects funded by numerous entities in the Arctic, enhance coordination for logistics support, help identify geographic gaps in research efforts and potentially foster more collaboration amongst researchers working in the region. Additionally, ARMAP can be used to demonstrate past, present, and future research efforts supported by the U.S. Government.

  2. Chronicles in drug discovery.

    Science.gov (United States)

    Davies, Shelley L; Moral, Maria Angels; Bozzo, Jordi

    2007-03-01

    Chronicles in Drug Discovery features special interest reports on advances in drug discovery. This month we highlight agents that target and deplete immunosuppressive regulatory T cells, which are produced by tumor cells to hinder innate immunity against, or chemotherapies targeting, tumor-associated antigens. Antiviral treatments for respiratory syncytial virus, a severe and prevalent infection in children, are limited due to their side effect profiles and cost. New strategies currently under clinical development include monoclonal antibodies, siRNAs, vaccines and oral small molecule inhibitors. Recent therapeutic lines for Huntington's disease include gene therapies that target the mutated human huntingtin gene or deliver neuroprotective growth factors and cellular transplantation in apoptotic regions of the brain. Finally, we highlight the antiinflammatory and antinociceptive properties of new compounds targeting the somatostatin receptor subtype sst4, which warrant further study for their potential application as clinical analgesics.

  3. Biomarker discovery and applications for foods and beverages: proteomics to nanoproteomics.

    Science.gov (United States)

    Agrawal, Ganesh Kumar; Timperio, Anna Maria; Zolla, Lello; Bansal, Vipul; Shukla, Ravi; Rakwal, Randeep

    2013-11-20

    Foods and beverages have been at the heart of our society for centuries, sustaining humankind - health, life, and the pleasures that go with it. The more we grow and develop as a civilization, the more we feel the need to know about the food we eat and beverages we drink. Moreover, with an ever increasing demand for food due to the growing human population food security remains a major concern. Food safety is another growing concern as the consumers prefer varied foods and beverages that are not only traded nationally but also globally. The 21st century science and technology is at a new high, especially in the field of biological sciences. The availability of genome sequences and associated high-throughput sensitive technologies means that foods are being analyzed at various levels. For example and in particular, high-throughput omics approaches are being applied to develop suitable biomarkers for foods and beverages and their applications in addressing quality, technology, authenticity, and safety issues. Proteomics are one of those technologies that are increasingly being utilized to profile expressed proteins in different foods and beverages. Acquired knowledge and protein information have now been translated to address safety of foods and beverages. Very recently, the power of proteomic technology has been integrated with another highly sensitive and miniaturized technology called nanotechnology, yielding a new term nanoproteomics. Nanoproteomics offer a real-time multiplexed analysis performed in a miniaturized assay, with low-sample consumption and high sensitivity. To name a few, nanomaterials - quantum dots, gold nanoparticles, carbon nanotubes, and nanowires - have demonstrated potential to overcome the challenges of sensitivity faced by proteomics for biomarker detection, discovery, and application. 
In this review, we will discuss the importance of biomarker discovery and applications for foods and beverages, the contribution of proteomic technology in

  4. IMG-ABC: An Atlas of Biosynthetic Gene Clusters to Fuel the Discovery of Novel Secondary Metabolites

    Energy Technology Data Exchange (ETDEWEB)

    Chen, I-Min; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Huang, Jinghua; Reddy, T. B.K.; Cimermancic, Peter; Fischbach, Michael; Ivanova, Natalia; Markowitz, Victor; Kyrpides, Nikos; Pati, Amrita

    2014-10-28

    In the discovery of secondary metabolites (SMs), large-scale analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of relevant computational resources. We present IMG-ABC (https://img.jgi.doe.gov/abc/) -- An Atlas of Biosynthetic gene Clusters within the Integrated Microbial Genomes (IMG) system. IMG-ABC is a rich repository of both validated and predicted biosynthetic clusters (BCs) in cultured isolates, single-cells and metagenomes linked with the SM chemicals they produce and enhanced with focused analysis tools within IMG. The underlying scalable framework enables traversal of phylogenetic dark matter and chemical structure space -- serving as a doorway to a new era in the discovery of novel molecules.

  5. A control study to evaluate a computer-based microarray experiment design recommendation system for gene-regulation pathways discovery.

    Science.gov (United States)

    Yoo, Changwon; Cooper, Gregory F; Schmidt, Martin

    2006-04-01

    The main topic of this paper is evaluating a system that uses the expected value of experimentation for discovering causal pathways in gene expression data. By experimentation we mean both interventions (e.g., a gene knock-out experiment) and observations (e.g., passively observing the expression level of a "wild-type" gene). We introduce a system called GEEVE (causal discovery in Gene Expression data using Expected Value of Experimentation), which implements expected value of experimentation in discovering causal pathways using gene expression data. GEEVE provides the following assistance, which is intended to help biologists in their quest to discover gene-regulation pathways: (1) recommending which experiments to perform (with a focus on "knock-out" experiments) using an expected value of experimentation (EVE) method; (2) recommending the number of measurements (observational and experimental) to include in the experimental design, again using an EVE method; and (3) providing a Bayesian analysis that combines prior knowledge with the results of recent microarray experiments to derive posterior probabilities of gene regulation relationships. In recommending which experiments to perform (and how many times to repeat them), the EVE approach considers the biologist's preferences for which genes to focus the discovery process on. Also, since exact EVE calculations are exponential in time, GEEVE incorporates approximation methods. GEEVE is able to combine data from knock-out experiments with data from wild-type experiments to suggest additional experiments to perform and then to analyze the results of those experiments. It models the possibility that unmeasured (latent) variables may be responsible for some of the statistical associations among the expression levels of the genes under study. To evaluate the GEEVE system, we used a gene expression simulator to generate data from specified models of gene regulation. 
Using the simulator, we evaluated the GEEVE
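
    The Bayesian step described in this abstract, combining prior knowledge with experimental results to obtain a posterior probability of a regulation relationship, can be illustrated with a minimal sketch. This is not the GEEVE algorithm: the function name, the two-hypothesis setup, and all numbers are assumptions made for illustration only.

```python
def posterior_regulation(prior, lik_if_regulates, lik_if_not):
    """Bayes' rule for the binary hypothesis 'gene A regulates gene B'.

    prior            -- prior probability that A regulates B (hypothetical)
    lik_if_regulates -- probability of the observed data if A regulates B
    lik_if_not       -- probability of the observed data if it does not
    """
    evidence = prior * lik_if_regulates + (1.0 - prior) * lik_if_not
    if evidence == 0:
        raise ValueError("data has zero probability under both hypotheses")
    return prior * lik_if_regulates / evidence


# Hypothetical example: a knock-out result that is three times more
# likely if the regulation relationship exists than if it does not.
post = posterior_regulation(prior=0.2, lik_if_regulates=0.9, lik_if_not=0.3)
print(f"posterior = {post:.3f}")
```

    A real system of this kind would compute the likelihoods from a model of the expression data; here they are simply supplied as inputs.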

  6. HiGate (High Grade Anti-Tamper Equipment) Prototype and Application to e-Discovery

    Directory of Open Access Journals (Sweden)

    Yui Sakurai

    2010-06-01

    Full Text Available These days, most data is digitized and processed in various ways by computers. In the past, computer owners were free to process data as desired and to observe the inputted data as well as the interim results. However, the unrestricted processing of data and accessing of interim results even by computer users is associated with an increasing number of adverse events. These adverse events often occur when sensitive data such as personal or confidential business information must be handled by two or more parties, such as in the case of e-Discovery, used in legal proceedings, or epidemiologic studies. To solve this problem, providers encrypt data, and the owner of the computer performs decoding in the memory for encrypted data. The computer owner can be limited to performing only certain processing of data and to observing only the final results. As an implementation that uses existing technology to realize this solution, the processing of data contained in a smart card was considered, but such an implementation would not be practical due to issues related to computer capacity and processing speed. Accordingly, the authors present the concept of PC-based High Grade Anti-Tamper Equipment (HiGATE), which allows data to be handled without revealing the data content to administrators or users. To verify this concept, an e-Discovery application on a prototype was executed and the results are reported here.

  7. Gene Regulation, Modulation, and Their Applications in Gene Expression Data Analysis

    Directory of Open Access Journals (Sweden)

    Mario Flores

    2013-01-01

    Full Text Available Common microarray and next-generation sequencing data analyses concentrate on tumor subtype classification, marker detection, and transcriptional regulation discovery during biological processes by exploring correlated gene expression patterns and their shared functions. Genetic regulatory network (GRN)-based approaches have been employed in many large studies to scrutinize dysregulation and potential treatment controls. In addition to gene regulation and network construction, the concept of the network modulator that has significant systemic impact has been proposed, and detection algorithms have been developed in past years. Here we provide a unified mathematical description of these methods, followed by a brief survey of these modulator identification algorithms. As an early attempt to extend the concept to a new RNA regulation mechanism, competitive endogenous RNA (ceRNA), within a modulator framework, we provide two applications to illustrate the network construction, the modulation effect, and the preliminary findings from these networks. The methods we surveyed and developed are used to dissect the regulated network under different modulators. Beyond these, the concept of “modulation” can be adapted to various biological mechanisms to discover novel gene regulation mechanisms.

  8. Discovery of germline-related genes in Cephalochordate amphioxus: A genome wide survey using genome annotation and transcriptome data.

    Science.gov (United States)

    Yue, Jia-Xing; Li, Kun-Lung; Yu, Jr-Kai

    2015-12-01

    The generation of germline cells is a critical process in the reproduction of multicellular organisms. Studies in animal models have identified a common repertoire of genes that play essential roles in primordial germ cell (PGC) formation. However, comparative studies also indicate that the timing and regulation of this core genetic program vary considerably in different animals, raising the intriguing questions regarding the evolution of PGC developmental mechanisms in metazoans. Cephalochordates (commonly called amphioxus or lancelets) represent one of the invertebrate chordate groups and can provide important information about the evolution of developmental mechanisms in the chordate lineage. In this study, we used genome and transcriptome data to identify germline-related genes in two distantly related cephalochordate species, Branchiostoma floridae and Asymmetron lucayanum. Branchiostoma and Asymmetron diverged more than 120 MYA, and the most conspicuous difference between them is their gonadal morphology. We used important germline developmental genes in several model animals to search the amphioxus genome and transcriptome dataset for conserved homologs. We also annotated the assembled transcriptome data using Gene Ontology (GO) terms to facilitate the discovery of putative genes associated with germ cell development and reproductive functions in amphioxus. We further confirmed the expression of 14 genes in developing oocytes or mature eggs using whole mount in situ hybridization, suggesting their potential functions in amphioxus germ cell development. The results of this global survey provide a useful resource for testing potential functions of candidate germline-related genes in cephalochordates and for investigating differences in gonad developmental mechanisms between Branchiostoma and Asymmetron species.

  9. Mass spectrometry in biomarker applications: from untargeted discovery to targeted verification, and implications for platform convergence and clinical application

    Energy Technology Data Exchange (ETDEWEB)

    Smith, Richard D.

    2012-03-01

    It is really only in the last ten years that mass spectrometry (MS) has had a truly significant (but still small) impact on biomedical research. Much of this impact can be attributed to proteomics and its more basic applications. Early biomedical applications have included a number of efforts aimed at developing new biomarkers; however, the success of these endeavors to date has been quite modest - essentially confined to preclinical applications - and has often suffered from combinations of immature technology and hubris. Now that MS-based proteomics is reaching adolescence, it is appropriate to ask if and when biomarker-related applications will extend to the clinical realm, and what developments will be essential for this transition. Biomarker development can be described as a multistage process consisting of discovery, qualification, verification, research assay optimization, validation, and commercialization (1). From an MS perspective, it is possible to 'bin' measurements into 1 of 2 categories - those aimed at discovering potential protein biomarkers and those seeking to verify and validate biomarkers. Approaches in both categories generally involve digesting proteins (e.g., with trypsin) as a first step to yield peptides that can be effectively detected and identified with MS. Discovery-based approaches use broad 'unbiased' or 'undirected' measurements that attempt to cover as many proteins as possible in the hope of revealing promising biomarker candidates. A key challenge with this approach stems from the extremely large dynamic range (i.e., relative stoichiometry) of proteins of potential interest in biofluids such as plasma and the expectation that biomarker proteins of the greatest clinical value for many diseases may very well be present at low relative abundances (2). Protein concentrations in plasma extend from approximately 10^10 pg/mL for albumin to approximately 10 pg/mL and below for interleukins and other

  10. STATE-OF-THE-ART HUMAN GENE THERAPY: PART II. GENE THERAPY STRATEGIES AND APPLICATIONS

    OpenAIRE

    2014-01-01

    In Part I of this Review, we introduced recent advances in gene delivery technologies and explained how they have powered some of the current human gene therapy applications. In Part II, we expand the discussion on gene therapy applications, focusing on some of the most exciting clinical uses. To help readers to grasp the essence and to better organize the diverse applications, we categorize them under four gene therapy strategies: (1) gene replacement therapy for monogenic diseases, (2) gene...

  11. Application of shotgun proteomics for discovery-driven protein-protein interaction.

    Science.gov (United States)

    Goto-Silva, Livia; Maliga, Zoltan; Slabicki, Mikolaj; Murillo, Jimmy Rodriguez; Junqueira, Magno

    2014-01-01

    Affinity purification of protein complexes and identification of co-purified proteins by mass spectrometry is a powerful method to discover novel protein-protein interactions. Application of this method to the study of biological systems often requires the ability to process a large number of samples. Hence, there is great need to generate proteomic workflows compatible with large-scale studies. The major goal of this protocol is to present a fast, reliable, and scalable method to characterize protein complexes by mass spectrometry to overcome the limitations of conventional geLC-MS/MS or MudPIT protocols. This method was successfully employed for the discovery and characterization of novel protein complexes in cultured yeast, mammalian cells, and mice.

  12. NeurphologyJ: An automatic neuronal morphology quantification method and its application in pharmacological discovery

    Directory of Open Access Journals (Sweden)

    Huang Hui-Ling

    2011-06-01

    Full Text Available Abstract. Background: Automatic quantification of neuronal morphology from images of fluorescence microscopy plays an increasingly important role in high-content screenings. However, there exist very few freeware tools and methods which provide automatic neuronal morphology quantification for pharmacological discovery. Results: This study proposes an effective quantification method, called NeurphologyJ, capable of automatically quantifying neuronal morphologies such as soma number and size, neurite length, and neurite branching complexity (which is highly related to the numbers of attachment points and ending points). NeurphologyJ is implemented as a plugin to ImageJ, an open-source Java-based image processing and analysis platform. The high performance of NeurphologyJ arises mainly from an elegant image enhancement method. Consequently, some morphology operations of image processing can be efficiently applied. We evaluated NeurphologyJ by comparing it with both the computer-aided manual tracing method NeuronJ and an existing ImageJ-based plugin method NeuriteTracer. Our results reveal that NeurphologyJ is comparable to NeuronJ, with a correlation coefficient between the estimated neurite lengths as high as 0.992. NeurphologyJ can accurately measure neurite length, soma number, neurite attachment points, and neurite ending points from a single image. Furthermore, the quantification result of nocodazole perturbation is consistent with its known inhibitory effect on neurite outgrowth. We were also able to calculate the IC50 of nocodazole using NeurphologyJ. This reveals that NeurphologyJ is effective enough to be utilized in applications of pharmacological discoveries. Conclusions: This study proposes an automatic and fast neuronal quantification method, NeurphologyJ. The ImageJ plugin, which supports batch processing, is easily customized for high-content screening applications. 
The source codes of NeurphologyJ (interactive and high
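
    The abstract mentions calculating an IC50 from quantified neurite measurements but does not say how. As a minimal sketch, one common approach (assumed here, not taken from the paper) is log-linear interpolation between the two doses that bracket 50% of the control response. The function name and the dose-response values below are hypothetical.

```python
import math


def ic50_by_interpolation(doses, responses):
    """Estimate IC50 by interpolating on a log10 dose axis.

    doses     -- ascending, strictly positive concentrations
    responses -- measurements (e.g. mean neurite length) normalized to the
                 untreated control, so 1.0 means no inhibition
    """
    points = list(zip(doses, responses))
    for (d_lo, r_lo), (d_hi, r_hi) in zip(points, points[1:]):
        # Find the first pair of adjacent doses whose responses bracket 0.5.
        if r_lo >= 0.5 >= r_hi and r_lo != r_hi:
            frac = (r_lo - 0.5) / (r_lo - r_hi)
            log_lo, log_hi = math.log10(d_lo), math.log10(d_hi)
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    raise ValueError("responses never cross 50% of control")


# Hypothetical dose-response data (arbitrary concentration units).
estimate = ic50_by_interpolation([1, 10, 100], [0.9, 0.7, 0.3])
print(f"estimated IC50 = {estimate:.2f}")
```

    Fitting a four-parameter logistic curve is the more usual choice when enough dose points are available; interpolation is shown only because it needs no external libraries.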

  13. Ataxin1L is a regulator of HSC function highlighting the utility of cross-tissue comparisons for gene discovery.

    Science.gov (United States)

    Kahle, Juliette J; Souroullas, George P; Yu, Peng; Zohren, Fabian; Lee, Yoontae; Shaw, Chad A; Zoghbi, Huda Y; Goodell, Margaret A

    2013-03-01

    Hematopoietic stem cells (HSCs) are rare quiescent cells that continuously replenish the cellular components of the peripheral blood. Observing that the ataxia-associated gene Ataxin-1-like (Atxn1L) was highly expressed in HSCs, we examined its role in HSC function through in vitro and in vivo assays. Mice lacking Atxn1L had greater numbers of HSCs that regenerated the blood more quickly than their wild-type counterparts. Molecular analyses indicated Atxn1L null HSCs had gene expression changes that regulate a program consistent with their higher level of proliferation, suggesting that Atxn1L is a novel regulator of HSC quiescence. To determine if additional brain-associated genes were candidates for hematologic regulation, we examined genes encoding proteins from autism- and ataxia-associated protein-protein interaction networks for their representation in hematopoietic cell populations. The interactomes were found to be highly enriched for proteins encoded by genes specifically expressed in HSCs relative to their differentiated progeny. Our data suggest a heretofore unappreciated similarity between regulatory modules in the brain and HSCs, offering a new strategy for novel gene discovery in both systems.

  14. Ataxin1L is a regulator of HSC function highlighting the utility of cross-tissue comparisons for gene discovery.

    Directory of Open Access Journals (Sweden)

    Juliette J Kahle

    2013-03-01

    Full Text Available Hematopoietic stem cells (HSCs) are rare quiescent cells that continuously replenish the cellular components of the peripheral blood. Observing that the ataxia-associated gene Ataxin-1-like (Atxn1L) was highly expressed in HSCs, we examined its role in HSC function through in vitro and in vivo assays. Mice lacking Atxn1L had greater numbers of HSCs that regenerated the blood more quickly than their wild-type counterparts. Molecular analyses indicated Atxn1L null HSCs had gene expression changes that regulate a program consistent with their higher level of proliferation, suggesting that Atxn1L is a novel regulator of HSC quiescence. To determine if additional brain-associated genes were candidates for hematologic regulation, we examined genes encoding proteins from autism- and ataxia-associated protein-protein interaction networks for their representation in hematopoietic cell populations. The interactomes were found to be highly enriched for proteins encoded by genes specifically expressed in HSCs relative to their differentiated progeny. Our data suggest a heretofore unappreciated similarity between regulatory modules in the brain and HSCs, offering a new strategy for novel gene discovery in both systems.

  15. ConservedPrimers 2.0: a high-throughput pipeline for comparative genome referenced intron-flanking PCR primer design and its application in wheat SNP discovery.

    Science.gov (United States)

    You, Frank M; Huo, Naxin; Gu, Yong Q; Lazo, Gerard R; Dvorak, Jan; Anderson, Olin D

    2009-10-13

    In some genomic applications it is necessary to design large numbers of PCR primers in exons flanking one or several introns on the basis of orthologous gene sequences in related species. The primer pairs designed by this target gene approach are called "intron-flanking primers" or, because they are located in exonic sequences which are usually conserved between related species, "conserved primers". They are useful for large-scale single nucleotide polymorphism (SNP) discovery and marker development, especially in species, such as wheat, for which a large number of ESTs are available but for which genome sequences and intron/exon boundaries are not available. To date, no suitable high-throughput tool is available for this purpose. We have developed the ConservedPrimers 2.0 pipeline for designing intron-flanking primers for large-scale SNP discovery and marker development, and demonstrated its utility in wheat. This tool uses non-redundant wheat EST sequences, such as wheat contigs and singleton ESTs, and related genomic sequences, such as those of rice, as inputs. It aligns the ESTs to the genomic sequences to identify unique colinear exon blocks and predicts intron lengths. Intron-flanking primers are then designed based on the intron/exon information using the Primer3 core program or BatchPrimer3. Finally, a tab-delimited file containing intron-flanking primer pair sequences and their primer properties is generated for primer ordering and their PCR applications. Using this tool, 1,922 bin-mapped wheat ESTs (31.8% of the 6,045 in total) were found to have unique colinear exon blocks suitable for primer design and 1,821 primer pairs were designed from these single- or low-copy genes for PCR amplification and SNP discovery. With these primers and subsequently designed genome-specific primers, a total of 1,527 loci were found to contain one or more genome-specific SNPs. The ConservedPrimers 2.0 pipeline for designing intron-flanking primers was developed and its
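
    The intron-length prediction step and the reported 31.8% figure can be checked with a short sketch. It assumes that colinear exon blocks are given as sorted, 1-based inclusive genomic coordinates, so that each predicted intron is the gap between consecutive blocks. The function name and the coordinates are hypothetical and are not taken from the ConservedPrimers 2.0 code.

```python
def predicted_intron_lengths(exon_blocks):
    """Return the gap lengths between consecutive colinear exon blocks.

    exon_blocks -- list of (genomic_start, genomic_end) tuples, 1-based
                   inclusive, sorted by position on the reference genome
    """
    return [
        next_start - prev_end - 1
        for (_, prev_end), (next_start, _) in zip(exon_blocks, exon_blocks[1:])
    ]


# Hypothetical EST-to-genome alignment with three exon blocks.
introns = predicted_intron_lengths([(100, 250), (400, 520), (900, 1000)])
print(introns)

# Share of bin-mapped ESTs with usable exon blocks, using the counts
# given in the abstract (1,922 of 6,045).
usable_percent = round(100 * 1922 / 6045, 1)
print(usable_percent)
```

    Primer design itself is delegated to Primer3 or BatchPrimer3 in the described pipeline and is deliberately left out of this sketch.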

  16. Discovery and analysis of inflammatory disease-related genes using cDNA microarrays

    OpenAIRE

    1997-01-01

    cDNA microarray technology is used to profile complex diseases and discover novel disease-related genes. In inflammatory disease such as rheumatoid arthritis, expression patterns of diverse cell types contribute to the pathology. We have monitored gene expression in this disease state with a microarray of selected human genes of probable significance in inflammation as well as with genes expressed in peripheral human blood cells. Messenger RNA from cultured macrophages, chondrocyte cell lines...

  17. Natural product proteomining, a quantitative proteomics platform, allows rapid discovery of biosynthetic gene clusters for different classes of natural products.

    Science.gov (United States)

    Gubbens, Jacob; Zhu, Hua; Girard, Geneviève; Song, Lijiang; Florea, Bogdan I; Aston, Philip; Ichinose, Koji; Filippov, Dmitri V; Choi, Young H; Overkleeft, Herman S; Challis, Gregory L; van Wezel, Gilles P

    2014-06-19

    Information on gene clusters for natural product biosynthesis is accumulating rapidly because of the current boom of available genome sequencing data. However, linking a natural product to a specific gene cluster remains challenging. Here, we present a widely applicable strategy for the identification of gene clusters for specific natural products, which we name natural product proteomining. The method is based on using fluctuating growth conditions that ensure differential biosynthesis of the bioactivity of interest. Subsequent combination of metabolomics and quantitative proteomics establishes correlations between abundance of natural products and concomitant changes in the protein pool, which allows identification of the relevant biosynthetic gene cluster. We used this approach to elucidate gene clusters for different natural products in Bacillus and Streptomyces, including a novel juglomycin-type antibiotic. Natural product proteomining does not require prior knowledge of the gene cluster or secondary metabolite and therefore represents a general strategy for identification of all types of gene clusters.

  18. Update of the Gene Discovery Program in Schistosoma mansoni with the Expressed Sequence Tag Approach

    Directory of Open Access Journals (Sweden)

    Élida ML Rabelo

    1997-09-01

    Full Text Available Continuing the Schistosoma mansoni Genome Project, 363 new templates were sequenced, generating 205 more ESTs corresponding to 91 genes. Seventy-four of these genes (81%) had not previously been described in S. mansoni. Among the newly discovered genes there are several of significant biological interest, such as synaptophysin, NIFs-like and rho-GDP dissociation inhibitor.

  19. Discovery of Novel NOx Catalysts for CIDI Applications by High-throughput Methods

    Energy Technology Data Exchange (ETDEWEB)

    Blint, Richard J. [General Motors Corporation, Warren, MI (United States)]

    2007-12-31

    DOE project DE-PS26-00NT40758 has developed very active lean-exhaust NOx reduction catalysts that have been tested on the discovery system, laboratory reactors and engine dynamometer systems. The goal of this project is the development of effective, affordable NOx reduction catalysts for lean combustion engines in the US light duty vehicle market which can meet Tier II emission standards with hydrocarbon-based reductants for reducing NOx. General Motors (prime contractor) along with subcontractors BASF (Engelhard) (a catalytic converter developer) and ACCELRYS (an informatics supplier) carried out this project, which began in August of 2002. BASF (Engelhard) has run over 16,000 tests of 6100 possible catalytic materials on a high throughput discovery system suitable for automotive catalytic materials. Accelrys developed a new database informatics system which allowed material tracking and data mining. A program catalyst was identified and evaluated at all levels of the program. Dynamometer evaluations of the program catalyst both with and without additives show 92% NOx conversions on the HWFET, 76% on the US06, 60% on the cold FTP and 65% on the Set 13 heavy duty test using diesel fuel. Conversions of over 92% on the heavy duty FTP using ethanol as a second fluid reductant have been measured. These can be competitive with both of the alternative lean NOx reduction technologies presently in the market. Conversions of about 80% were measured on the EUDC for lean gasoline applications without using active dosing to adjust the C:N ratio for optimum NOx reduction at all points in the certification cycle. A feasibility analysis has been completed and demonstrates the advantages and disadvantages of the technology using these materials compared with other potential technologies. The teaming agreements among the partners contain no obstacles to commercialization of new technologies to any potential catalyst customers.

  20. DEVELOPING GUIDED DISCOVERY LEARNING MATERIALS USING MATHEMATICS MOBILE LEARNING APPLICATION AS AN ALTERNATIVE MEDIA FOR THE STUDENTS CALCULUS II

    Directory of Open Access Journals (Sweden)

    Sunismi .

    2015-12-01

    Full Text Available Abstract: This development research aims to develop guided-discovery learning materials for Calculus II by implementing Mathematics Mobile Learning (MML). The products developed are MML media for Calculus II using the guided-discovery model for students and a guide book for lecturers. The study employed the 4-D development model, consisting of define, design, develop, and disseminate. The draft of the learning materials was validated by experts and tried out with a group of students. The data were analyzed qualitatively and quantitatively by using a descriptive technique and a t-test. The findings indicated that the materials were appropriate for use as teaching media for the students. The students responded positively that the MML media for Calculus II using the guided-discovery model was interestingly structured and easily operated through handphones (all JAVA, android, and blackberry-based handphones), and could be used as their learning guide anytime. The result of the field testing showed that the guided-discovery learning materials for Calculus II using the Mathematics Mobile Learning (MML) application were effective to adopt in learning Calculus II. Keywords: learning materials, guided-discovery, mathematics mobile learning (MML), calculus II. DEVELOPMENT OF GUIDED DISCOVERY MODEL TEACHING MATERIALS WITH THE MATHEMATICS MOBILE LEARNING APPLICATION AS AN ALTERNATIVE LEARNING MEDIUM FOR STUDENTS OF THE CALCULUS II COURSE. Abstract: This development research aims to develop teaching materials for the Calculus II course based on the guided discovery model with the Mathematics Mobile Learning (MML) application. The products developed are MML media for Calculus II with the guided discovery model for students and a lecturer's guide book. The development model uses 4-D, which comprises the define, design, develop, and dissemination stages. The draft teaching materials were validated by experts and tried out with a number of students. The data were analyzed qualitatively and quantitatively using descriptive techniques and the t-test. The research findings

  1. Discovery by the Epistasis Project of an epistatic interaction between the GSTM3 gene and the HHEX/IDE/KIF11 locus in the risk of Alzheimer's disease

    NARCIS (Netherlands)

    J.M. Bullock (James); C. Medway (Christopher); M. Cortina-Borja (Mario); J.C. Turton (James); J.A. Prince (Jonathan); C.A. Ibrahim-Verbaas (Carla); M. Schuur (Maaike); M.M.B. Breteler (Monique); C.M. van Duijn (Cock); P.G. Kehoe (Patrick); R. Barber (Rachel); E. Coto (Eliecer); V. Alvarez (Victoria); P. Deloukas (Panagiotis); N. Hammond (Naomi); O. Combarros (Onofre); I. Mateo (Ignacio); D.R. Warden (Donald); M.G. Lehmann (Michael); O. Belbin (Olivia); K. Brown (Kristelle); G.K. Wilcock (Gordon); R. Heun (Reinhard); H. Kölsch (Heike); A.D. Smith; D.J. Lehmann (Donald); K. Morgan (Kevin)

    2013-01-01

    Despite recent discoveries in the genetics of sporadic Alzheimer's disease, there remains substantial "hidden heritability." It is thought that some of this missing heritability may be because of gene-gene, i.e., epistatic, interactions. We examined potential epistasis between 110 candi

  2. Application in pesticide analysis: Liquid chromatography - A review of the state of science for biomarker discovery and identification

    Science.gov (United States)

    Book Chapter 18, titled Application in pesticide analysis: Liquid chromatography - A review of the state of science for biomarker discovery and identification, will be published in the book titled High Performance Liquid Chromatography in Pesticide Residue Analysis (Part of the C...

  3. Discovery of CTCF-sensitive Cis-spliced fusion RNAs between adjacent genes in human prostate cells.

    Directory of Open Access Journals (Sweden)

    Fujun Qin

    2015-02-01

    Full Text Available Genes or their encoded products are not expected to mingle with each other unless in some disease situations. In cancer, a frequent mechanism that can produce gene fusions is chromosomal rearrangement. However, recent discoveries of RNA trans-splicing and cis-splicing between adjacent genes (cis-SAGe) support other mechanisms for generating fusion RNAs. In our transcriptome analyses of 28 prostate normal and cancer samples, 30% of fusion RNAs on average are transcripts that contain exons belonging to same-strand neighboring genes. These fusion RNAs may be the products of cis-SAGe, which was previously thought to be rare. To validate this finding and to better understand the phenomenon, we used LNCaP, a prostate cell line, as a model, and identified 16 additional cis-SAGe events by silencing transcription factor CTCF and paired-end RNA sequencing. About half of the fusions are expressed at a significant level compared to their parental genes. Silencing one of the in-frame fusions resulted in reduced cell motility. Most out-of-frame fusions are likely to function as non-coding RNAs. The majority of the 16 fusions are also detected in other prostate cell lines, as well as in the 14 clinical prostate normal and cancer pairs. By studying the features associated with these fusions, we developed a set of rules: 1) the parental genes are same-strand neighboring genes; 2) the distance between the genes is within 30 kb; 3) the 5' genes are actively transcribing; and 4) the chimeras tend to have the second-to-last exon in the 5' genes joined to the second exon in the 3' genes. We then randomly selected 20 neighboring genes in the genome, and detected four fusion events using these rules in prostate cancer and non-cancerous cells. These results suggest that splicing between neighboring gene transcripts is a rather frequent phenomenon, and it is not a feature unique to cancer cells.
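
The first three rules in this abstract can be read as a simple filter over gene pairs. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the `Gene` record, its `transcribed` flag, and the function names are hypothetical, the sketch does not check that no other gene lies between the pair, and rule 4 is left out because the abstract describes it as a tendency rather than a hard criterion.

```python
from dataclasses import dataclass

MAX_GAP_BP = 30_000  # rule 2: the genes lie within 30 kb of each other


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record used only for this illustration."""
    name: str
    chrom: str
    strand: str        # "+" or "-"
    start: int         # leftmost genomic coordinate
    end: int           # rightmost genomic coordinate
    transcribed: bool  # assumed flag: is the gene actively transcribed?


def is_cis_sage_candidate(a: Gene, b: Gene) -> bool:
    """Apply rules 1-3 from the abstract to one pair of genes."""
    # Rule 1: same chromosome and same strand.
    if a.chrom != b.chrom or a.strand != b.strand:
        return False
    left, right = sorted((a, b), key=lambda g: g.start)
    # Rule 2: non-overlapping and separated by at most 30 kb.
    gap = right.start - left.end
    if gap < 0 or gap > MAX_GAP_BP:
        return False
    # Rule 3: the 5' (upstream) gene must be actively transcribed.
    # On the "+" strand the upstream gene is the left one; on "-" it is the right one.
    five_prime = left if a.strand == "+" else right
    return five_prime.transcribed


if __name__ == "__main__":
    upstream = Gene("geneA", "chr1", "+", 1_000, 5_000, True)
    downstream = Gene("geneB", "chr1", "+", 20_000, 30_000, False)
    print(is_cis_sage_candidate(upstream, downstream))
```

Keeping the 30 kb threshold in a named constant makes it easy to vary when exploring how sensitive the candidate list is to rule 2.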

  4. Live cell in vitro and in vivo imaging applications: accelerating drug discovery.

    Science.gov (United States)

    Isherwood, Beverley; Timpson, Paul; McGhee, Ewan J; Anderson, Kurt I; Canel, Marta; Serrels, Alan; Brunton, Valerie G; Carragher, Neil O

    2011-04-04

    Dynamic regulation of specific molecular processes and cellular phenotypes in live cell systems reveals unique insights into cell fate and drug pharmacology that are not gained from traditional fixed endpoint assays. Recent advances in microscopic imaging platform technology combined with the development of novel optical biosensors and sophisticated image analysis solutions have increased the scope of live cell imaging applications in drug discovery. We highlight recent literature examples where live cell imaging has uncovered novel insight into biological mechanism or drug mode-of-action. We survey distinct types of optical biosensors and associated analytical methods for monitoring molecular dynamics, in vitro and in vivo. We describe the recent expansion of live cell imaging into automated target validation and drug screening activities through the development of dedicated brightfield and fluorescence kinetic imaging platforms. We provide specific examples of how temporal profiling of phenotypic response signatures using such kinetic imaging platforms can increase the value of in vitro high-content screening. Finally, we offer a prospective view of how further application and development of live cell imaging technology and reagents can accelerate preclinical lead optimization cycles and enhance the in vitro to in vivo translation of drug candidates.

  5. GENE KICKED MOUSE: KNOCK OUT MOUSE AND ITS APPLICATION

    Directory of Open Access Journals (Sweden)

    Rajashekar B

    2013-07-01

    Full Text Available A knockout mouse is a laboratory mouse in which an existing gene has been inactivated, or "knocked out," by replacing it or disrupting it with an artificial piece of DNA. The 2007 Nobel Prize in Physiology or Medicine was awarded to Drs Mario R. Capecchi, Martin J. Evans and Oliver Smithies for their discoveries of principles for introducing specific gene modifications in mice by using embryonic stem cells. Progress toward gene targeting using embryonic stem cells was made by Evans and his co-workers. An ingenious development of gene targeting has been the introduction of recognition sites for the enzyme Cre recombinase, called loxP sites, into existing genes. When mice carrying such "floxed" genes are mated with transgenic mice expressing Cre recombinase, the target gene of the offspring is modified through Cre action. Gene targeting has transformed scientific medicine by permitting experimental testing of hypotheses regarding the function of specific genes. The first area to which experimental geneticists turned their attention after the birth of gene targeting in mammals was monogenic diseases. Gene targeting has been exceptionally useful in cancer research. A large number of proto-oncogenes, tumor suppressor genes, angiogenetic factors, etc. have been targeted in different tissues in mice to shed light on the induction and spreading of tumours. Gene-targeted mouse models have also become increasingly important in studies of host defense against pathogens. Gene-targeted mice have become indispensable in virtually all aspects of medical research.

  6. The web server of IBM's Bioinformatics and Pattern Discovery group

    OpenAIRE

    Huynh, Tien; Rigoutsos, Isidore; Parida, Laxmi; Platt, Daniel; Shibuya, Tetsuo

    2003-01-01

    We herein present and discuss the services and content which are available on the web server of IBM's Bioinformatics and Pattern Discovery group. The server is operational around the clock and provides access to a variety of methods that have been published by the group's members and collaborators. The available tools correspond to applications ranging from the discovery of patterns in streams of events and the computation of multiple sequence alignments, to the discovery of genes in nucleic ...

  7. Discoveries and application of prostate-specific antigen, and some proposals to optimize prostate cancer screening

    Directory of Open Access Journals (Sweden)

    Tokudome S

    2016-05-01

    Full Text Available Shinkan Tokudome,1 Ryosuke Ando,2 Yoshiro Koda,3 1Department of Nutritional Epidemiology, National Institute of Health and Nutrition, Shinjuku-ku, Tokyo, 2Department of Nephro-urology, Nagoya City University Graduate School of Medical Sciences, Mizuho-ku, Nagoya, 3Department of Forensic Medicine and Human Genetics, Kurume University School of Medicine, Kurume, Japan Abstract: The discoveries and application of prostate-specific antigen (PSA) have been much appreciated because PSA-based screening has saved millions of lives of prostate cancer (PCa) patients. Historically speaking, Flocks et al first identified antigenic properties in prostate tissue in 1960. Then, Barnes et al detected immunologic characteristics in prostatic fluid in 1963. Hara et al characterized γ-semino-protein in semen in 1966, and it has been proven to be identical to PSA. Subsequently, Ablin et al independently reported the presence of precipitation antigens in the prostate in 1970. Wang et al purified the PSA in 1979, and Kuriyama et al first applied an enzyme-linked immunosorbent assay for PSA in 1980. However, the positive predictive value with a cutoff figure of 4.0 ng/mL appeared substantially low (~30%). There are overdiagnoses and overtreatments for latent/low-risk PCa. Controversies exist in the PCa mortality-reducing effects of PSA screening between the European Randomized Study of Screening for Prostate Cancer (ERSPC) and the US Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial. For optimizing PCa screening, PSA-related items may require the following: 1) adjustment of the cutoff values according to age, as well as setting limits to age and screening intervals; 2) improving test performance using doubling time, density, and ratio of free: total PSA; and 3) fostering active surveillance for low-risk PCa with monitoring by PSA value. Other items needing consideration may include the following: 1) examinations of cell proliferation and cell cycle markers

  8. Discovery and Replication of Gene Influences on Brain Structure Using LASSO Regression.

    Science.gov (United States)

    Kohannim, Omid; Hibar, Derrek P; Stein, Jason L; Jahanshad, Neda; Hua, Xue; Rajagopalan, Priya; Toga, Arthur W; Jack, Clifford R; Weiner, Michael W; de Zubicaray, Greig I; McMahon, Katie L; Hansell, Narelle K; Martin, Nicholas G; Wright, Margaret J; Thompson, Paul M

    2012-01-01

    We implemented least absolute shrinkage and selection operator (LASSO) regression to evaluate gene effects in genome-wide association studies (GWAS) of brain images, using an MRI-derived temporal lobe volume measure from 729 subjects scanned as part of the Alzheimer's Disease Neuroimaging Initiative (ADNI). Sparse groups of SNPs in individual genes were selected by LASSO, which identifies efficient sets of variants influencing the data. These SNPs were considered jointly when assessing their association with neuroimaging measures. We discovered 22 genes that passed genome-wide significance for influencing temporal lobe volume. This was a substantially greater number of significant genes compared to those found with standard, univariate GWAS. These top genes are all expressed in the brain and include genes previously related to brain function or neuropsychiatric disorders such as MACROD2, SORCS2, GRIN2B, MAGI2, NPAS3, CLSTN2, GABRG3, NRXN3, PRKAG2, GAS7, RBFOX1, ADARB2, CHD4, and CDH13. The top genes we identified with this method also displayed significant and widespread post hoc effects on voxelwise, tensor-based morphometry (TBM) maps of the temporal lobes. The most significantly associated gene was an autism susceptibility gene known as MACROD2. We were able to successfully replicate the effect of the MACROD2 gene in an independent cohort of 564 young, Australian healthy adult twins and siblings scanned with MRI (mean age: 23.8 ± 2.2 SD years). Our approach powerfully complements univariate techniques in detecting influences of genes on the living brain.
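
To illustrate how LASSO yields the sparse variant sets mentioned in this abstract, the following is a minimal sketch of LASSO fitted by cyclic coordinate descent with soft-thresholding. It is a generic textbook formulation, not the authors' implementation: the function names, the objective scaling (1/(2n) times the squared error plus `lam` times the L1 norm), the absence of an intercept (predictors and response are assumed centered), and the toy data are all assumptions made for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values inside the band become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of n rows with p predictors each (e.g. centered SNP dosages),
    y is a list of n centered responses. Returns the coefficient list b.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b while b is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant-zero predictor carries no information
            # Correlation of predictor j with the partial residual (b_j removed).
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


if __name__ == "__main__":
    # Toy data: the first predictor has a strong effect, the second a weak one.
    X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
    y = [3.0, -3.0, 0.1, -0.1]
    print(lasso_coordinate_descent(X, y, lam=0.2))
```

In the toy data the weak predictor's coefficient is set exactly to zero while the strong one is only shrunk, which is the selection behaviour that makes the penalty useful for picking a few variants per gene.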

  9. Exploring the Transcriptome Landscape of Pomegranate Fruit Peel for Natural Product Biosynthetic Gene and SSR Marker Discovery(F).

    Science.gov (United States)

    Ono, Nadia Nicole; Britton, Monica Therese; Fass, Joseph Nathaniel; Nicolet, Charles Meyer; Lin, Dawei; Tian, Li

    2011-10-01

    Pomegranate fruit peel is rich in bioactive plant natural products, such as hydrolyzable tannins and anthocyanins. Despite their documented roles in human nutrition and fruit quality, genes involved in natural product biosynthesis have not been cloned from pomegranate and very little sequence information is available on pomegranate in the public domain. Shotgun transcriptome sequencing of pomegranate fruit peel cDNA was performed using RNA-Seq on the Illumina Genome Analyzer platform. Over 100 million raw sequence reads were obtained and assembled into 9,839 transcriptome assemblies (TAs) (>200 bp). Candidate genes for hydrolyzable tannin, anthocyanin, flavonoid, terpenoid and fatty acid biosynthesis and/or regulation were identified. Three lipid transfer proteins were obtained that may contribute to the previously reported IgE reactivity of pomegranate fruit extracts. In addition, 115 SSR markers were identified from the pomegranate fruit peel transcriptome and primers were designed for 77 SSR markers. The pomegranate fruit peel transcriptome set provides a valuable platform for natural product biosynthetic gene and SSR marker discovery in pomegranate. This work also demonstrates that next-generation transcriptome sequencing is an economical and effective approach for investigating natural product biosynthesis, identifying genes controlling important agronomic traits, and discovering molecular markers in non-model specialty crop species.

  10. Exploring the Transcriptome Landscape of Pomegranate Fruit Peel for Natural Product Biosynthetic Gene and SSR Marker Discovery

    Institute of Scientific and Technical Information of China (English)

    Nadia Nicole Ono; Monica Therese Britton; Joseph Nathaniel Fass; Charles Meyer Nicolet; Dawei Lin; Li Tian

    2011-01-01

    Pomegranate fruit peel is rich in bioactive plant natural products, such as hydrolyzable tannins and anthocyanins. Despite their documented roles in human nutrition and fruit quality, genes involved in natural product biosynthesis have not been cloned from pomegranate and very little sequence information is available on pomegranate in the public domain. Shotgun transcriptome sequencing of pomegranate fruit peel cDNA was performed using RNA-Seq on the Illumina Genome Analyzer platform. Over 100 million raw sequence reads were obtained and assembled into 9,839 transcriptome assemblies (TAs) (>200 bp). Candidate genes for hydrolyzable tannin, anthocyanin, flavonoid, terpenoid and fatty acid biosynthesis and/or regulation were identified. Three lipid transfer proteins were obtained that may contribute to the previously reported IgE reactivity of pomegranate fruit extracts. In addition, 115 SSR markers were identified from the pomegranate fruit peel transcriptome and primers were designed for 77 SSR markers. The pomegranate fruit peel transcriptome set provides a valuable platform for natural product biosynthetic gene and SSR marker discovery in pomegranate. This work also demonstrates that next-generation transcriptome sequencing is an economical and effective approach for investigating natural product biosynthesis, identifying genes controlling important agronomic traits, and discovering molecular markers in non-model specialty crop species.

  11. ETS gene fusions in prostate cancer: from discovery to daily clinical practice.

    NARCIS (Netherlands)

    Tomlins, S.A.; Bjartell, A.; Chinnaiyan, A.M.; Jenster, G.; Nam, R.K.; Rubin, M.A.; Schalken, J.A.

    2009-01-01

    CONTEXT: In 2005, fusions between the androgen-regulated transmembrane protease serine 2 gene, TMPRSS2, and E twenty-six (ETS) transcription factors were discovered in prostate cancer. OBJECTIVE: To review advances in our understanding of ETS gene fusions, focusing on challenges affecting

  12. ETS gene fusions in prostate cancer: from discovery to daily clinical practice.

    NARCIS (Netherlands)

    Tomlins, S.A.; Bjartell, A.; Chinnaiyan, A.M.; Jenster, G.; Nam, R.K.; Rubin, M.A.; Schalken, J.A.

    2009-01-01

    CONTEXT: In 2005, fusions between the androgen-regulated transmembrane protease serine 2 gene, TMPRSS2, and E twenty-six (ETS) transcription factors were discovered in prostate cancer. OBJECTIVE: To review advances in our understanding of ETS gene fusions, focusing on challenges affecting translatio

  13. Discovery of differentially expressed genes in cashmere goat (Capra hircus) hair follicles by RNA sequencing.

    Science.gov (United States)

    Qiao, X; Wu, J H; Wu, R B; Su, R; Li, C; Zhang, Y J; Wang, R J; Zhao, Y H; Fan, Y X; Zhang, W G; Li, J Q

    2016-09-02

    The mammalian hair follicle (HF) is a unique, highly regenerative organ with a distinct developmental cycle. Cashmere goat (Capra hircus) HFs can be divided into two categories based on structure and development time: primary and secondary follicles. To identify differentially expressed genes (DEGs) in the primary and secondary HFs of cashmere goats, the RNA sequencing of six individuals from Arbas, Inner Mongolia, was performed. A total of 617 DEGs were identified; 297 were upregulated while 320 were downregulated. Gene ontology analysis revealed that the main functions of the upregulated genes were electron transport, respiratory electron transport, mitochondrial electron transport, and gene expression. The downregulated genes were mainly involved in cell autophagy, protein complexes, neutrophil aggregation, and bacterial fungal defense reactions. According to the Kyoto Encyclopedia of Genes and Genomes database, these genes are mainly involved in the metabolism of cysteine and methionine, RNA polymerization, and the MAPK signaling pathway, and were enriched in primary follicles. A microRNA-target network revealed that secondary follicles are involved in several important biological processes, such as the synthesis of keratin-associated proteins and enzymes involved in amino acid biosynthesis. In summary, these findings will increase our understanding of the complex molecular mechanisms of HF development and cycling, and provide a basis for the further study of the genes and functions of HF development.

  14. Correlating overrepresented upstream motifs to gene expression: a computational approach to regulatory element discovery in eukaryotes

    CERN Document Server

    Caselle, M; Provero, P

    2002-01-01

    Gene regulation in eukaryotes is mainly effected through transcription factors binding to rather short recognition motifs generally located upstream of the coding region. We present a novel computational method to identify regulatory elements in the upstream region of eukaryotic genes. The genes are grouped in sets sharing an overrepresented short motif in their upstream sequence. For each set, the average expression level from a microarray experiment is determined: if this level is significantly higher or lower than the average taken over the whole genome, then the overrepresented motif shared by the genes in the set is likely to play a role in their regulation. The method was tested by applying it to the genome of Saccharomyces cerevisiae, using the publicly available results of a DNA microarray experiment, in which expression levels for virtually all the genes were measured during the diauxic shift from fermentation to respiration. Several known motifs were correctly identified, and a new candidate regulat...
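
The comparison step described in this abstract (set average versus genome-wide average) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' method: the abstract does not say which significance test was used, so a plain z-score of the set mean against the genome-wide mean is swapped in, and the function name, the `min_set_size` parameter, and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def motif_set_z_scores(expression, motif_genes, min_set_size=2):
    """Score each motif-defined gene set by how far its mean expression deviates.

    expression  : dict mapping gene name -> expression level (e.g. a log ratio)
    motif_genes : dict mapping motif -> iterable of genes sharing that motif upstream
    Returns a dict mapping motif -> z-score of the set mean against the genome mean.
    """
    values = list(expression.values())
    genome_mean = mean(values)
    genome_sd = pstdev(values)
    scores = {}
    if genome_sd == 0:
        return scores  # no variation across the genome, nothing to score
    for motif, genes in motif_genes.items():
        levels = [expression[g] for g in genes if g in expression]
        if len(levels) < min_set_size:
            continue  # too few measured genes to average meaningfully
        standard_error = genome_sd / sqrt(len(levels))
        scores[motif] = (mean(levels) - genome_mean) / standard_error
    return scores


if __name__ == "__main__":
    # Toy data with made-up gene and motif names.
    expression = {"g1": 2.0, "g2": 2.0, "g3": -2.0, "g4": -2.0, "g5": 0.0, "g6": 0.0}
    motif_genes = {"AAAATTT": ["g1", "g2"], "CCGGCC": ["g3", "g4"], "TATATA": ["g5", "g6"]}
    for motif, z in sorted(motif_set_z_scores(expression, motif_genes).items()):
        print(motif, round(z, 3))
```

Motifs whose sets score strongly positive or negative would be the candidates for a regulatory role; with many motifs tested at once, a multiple-testing correction would be needed before calling any of them significant.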

  15. Network-Guided Key Gene Discovery for a Given Cellular Process

    DEFF Research Database (Denmark)

    He, Feng Q; Ollert, Markus

    2017-01-01

    Identification of key genes for a given physiological or pathological process is an essential but still very challenging task for the entire biomedical research community. Statistics-based approaches, such as genome-wide association study (GWAS)- or quantitative trait locus (QTL)-related analysis, have already made enormous contributions to identifying key genes associated with a given disease or phenotype, the success of which is however very much dependent on a huge number of samples. Recent advances in network biology, especially network inference directly from genome-scale data and the follow-up network analysis, open up new avenues to predict key genes driving a given biological process or cellular function. Here we review and compare the current approaches in predicting key genes, which have no chances to stand out by classic differential expression analysis, from gene...

  16. Discovery of mitochondrial chimeric-gene associated with cytoplasmic male sterility of HL-rice

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The mitochondrial genome libraries of the HL-type sterile line (A) and maintainer line (B) have been constructed. The mitochondrial gene atp6 was used to screen the libraries, because Southern and Northern blot results differed between the sterile and maintainer lines. Sequencing analysis of positive clones proved that there were two copies of the atp6 gene in the sterile line and only one in the maintainer line. One copy of atp6 in the sterile line was the same as that in the maintainer line; the other showed a different flanking sequence from the 49th nucleotide downstream of the termination codon of the atp6 gene. A new chimeric gene, orfH79, was found in the region. OrfH79 had homology to the mitochondrial genes coxⅡ and orf107, and was specific to the HL-sterile cytoplasm.

  17. Cancer driver gene discovery through an integrative genomics approach in a non-parametric Bayesian framework.

    Science.gov (United States)

    Yang, Hai; Wei, Qiang; Zhong, Xue; Yang, Hushan; Li, Bingshan

    2017-02-15

    Comprehensive catalogue of genes that drive tumor initiation and progression in cancer is key to advancing diagnostics, therapeutics and treatment. Given the complexity of cancer, the catalogue is far from complete yet. Increasing evidence shows that driver genes exhibit consistent aberration patterns across multiple-omics in tumors. In this study, we aim to leverage complementary information encoded in each of the omics data to identify novel driver genes through an integrative framework. Specifically, we integrated mutations, gene expression, DNA copy numbers, DNA methylation and protein abundance, all available in The Cancer Genome Atlas (TCGA) and developed iDriver, a non-parametric Bayesian framework based on multivariate statistical modeling to identify driver genes in an unsupervised fashion. iDriver captures the inherent clusters of gene aberrations and constructs the background distribution that is used to assess and calibrate the confidence of driver genes identified through multi-dimensional genomic data. We applied the method to 4 cancer types in TCGA and identified candidate driver genes that are highly enriched with known drivers (e.g., P < 3.40 × 10^-36 for breast cancer). We are particularly interested in novel genes and observed multiple lines of supporting evidence. Using systematic evaluation from multiple independent aspects, we identified 45 candidate driver genes that were not previously known across these 4 cancer types. The finding has important implications that integrating additional genomic data with multivariate statistics can help identify cancer drivers and guide the next stage of cancer genomics research. The C++ source code is freely available at https://medschool.vanderbilt.edu/cgg/. Contact: hai.yang@vanderbilt.edu or bingshan.li@Vanderbilt.Edu. Supplementary data are available at Bioinformatics online.

  18. Current Applications of Liquid Chromatography/Mass Spectrometry in Pharmaceutical Discovery After a Decade of Innovation

    Science.gov (United States)

    Ackermann, Bradley L.; Berna, Michael J.; Eckstein, James A.; Ott, Lee W.; Chaudhary, Ajai K.

    2008-07-01

    Current drug discovery involves a highly iterative process pertaining to three core disciplines: biology, chemistry, and drug disposition. For most pharmaceutical companies the path to a drug candidate comprises similar stages: target identification, biological screening, lead generation, lead optimization, and candidate selection. Over the past decade, the overall efficiency of drug discovery has been greatly improved by a single instrumental technique, liquid chromatography/mass spectrometry (LC/MS). Transformed by the commercial introduction of the atmospheric pressure ionization interface in the mid-1990s, LC/MS has expanded into almost every area of drug discovery. In many cases, drug discovery workflow has been changed owing to vastly improved efficiency. This review examines recent trends for these three core disciplines and presents seminal examples where LC/MS has altered the current approach to drug discovery.

  19. Discovery of putative capsaicin biosynthetic genes by RNA-Seq and digital gene expression analysis of pepper

    Science.gov (United States)

    Zhang, Zi-Xin; Zhao, Shu-Niu; Liu, Gao-Feng; Huang, Zu-Mei; Cao, Zhen-Mu; Cheng, Shan-Han; Lin, Shi-Sen

    2016-01-01

    The Indian pepper ‘Guijiangwang’ (Capsicum frutescens L.), one of the world’s hottest chili peppers, is rich in capsaicinoids. The accumulation of the alkaloid capsaicin and its analogs in the epidermal cells of the placenta contribute to the pungency of Capsicum fruits. To identify putative genes involved in capsaicin biosynthesis, RNA-Seq was used to analyze the pepper’s expression profiles over five developmental stages. Five cDNA libraries were constructed from the total RNA of placental tissue and sequenced using an Illumina HiSeq 2000. More than 19 million clean reads were obtained from each library, and greater than 50% of the reads were assignable to reference genes. Digital gene expression (DGE) profile analysis using Solexa sequencing was performed at five fruit developmental stages and resulted in the identification of 135 genes of known function; their expression patterns were compared to the capsaicin accumulation pattern. Ten genes of known function were identified as most likely to be involved in regulating capsaicin synthesis. Additionally, 20 new candidate genes were identified related to capsaicin synthesis. We use a combination of RNA-Seq and DGE analyses to contribute to the understanding of the biosynthetic regulatory mechanism(s) of secondary metabolites in a nonmodel plant and to identify candidate enzyme-encoding genes. PMID:27756914

  20. A horizontal alignment tool for numerical trend discovery in sequence data: application to protein hydropathy.

    Directory of Open Access Journals (Sweden)

    Omar Hadzipasic

    Full Text Available An algorithm is presented that returns the optimal pairwise gapped alignment of two sets of signed numerical sequence values. One distinguishing feature of this algorithm is a flexible comparison engine (based on both relative shape and absolute similarity measures) that does not rely on explicit gap penalties. Additionally, an empirical probability model is developed to estimate the significance of the returned alignment with respect to randomized data. The algorithm's utility for biological hypothesis formulation is demonstrated with test cases including database search and pairwise alignment of protein hydropathy. However, the algorithm and probability model could possibly be extended to accommodate other diverse types of protein or nucleic acid data, including positional thermodynamic stability and mRNA translation efficiency. The algorithm requires only numerical values as input and will readily compare data other than protein hydropathy. The tool is therefore expected to complement, rather than replace, existing sequence and structure based tools and may inform medical discovery, as exemplified by proposed similarity between a chlamydial ORFan protein and bacterial colicin pore-forming domain. The source code, documentation, and a basic web-server application are available.

  1. Optical oxygen sensing systems for drug discovery applications: Respirometric Screening Technology (RST)

    Science.gov (United States)

    Papkovsky, Dmitri B.; Hynes, James; Fernandes, Richard

    2005-11-01

    Quenched-fluorescence oxygen sensing allows non-chemical, reversible, real-time monitoring of molecular oxygen and rates of oxygen consumption in biological samples. Using this approach we have developed Respirometric Screening Technology (RST), a platform which facilitates the convenient analysis of cellular oxygen uptake. This in turn allows the investigation of compounds and processes which affect respiratory activity. The RST platform employs soluble phosphorescent oxygen-sensitive probes, which may be assessed in standard microtiter plates on a fluorescence plate reader. New formats of RST assays and time-resolved fluorescence detection instrumentation developed by Luxcel provide improvements in assay sensitivity, miniaturization and overall performance. RST has a diverse range of applications in the drug discovery area, including high throughput analysis of mitochondrial function; studies of mechanisms of toxicity and apoptosis; cell and animal based screening of compound libraries and environmental samples; and sterility testing. RST has been successfully validated with a range of practical targets and adopted by several leading pharmaceutical companies.

  2. Gene discovery in the freshwater fish parasite Trypanosoma carassii: identification of trans-sialidase-like and mucin-like genes.

    Science.gov (United States)

    Agüero, Fernán; Campo, Vanina; Cremona, Laura; Jäger, Adriana; Di Noia, Javier M; Overath, Peter; Sánchez, Daniel O; Frasch, Alberto Carlos

    2002-12-01

    A total of 1,921 expressed sequence tags (ESTs) were obtained from bloodstream trypomastigotes of Trypanosoma carassii, a parasite of economic importance due to its high prevalence in fish farms. Analysis of the data set allowed us to identify a trans-sialidase (TS)-like gene and three ESTs coding for putative mucin-like genes. TS activity was detected in cell extracts of bloodstream trypomastigotes. We have also used the sequence information obtained to identify genes that have not been previously described in trypanosomatids. (Additional information on these ESTs can be found at http://genoma.unsam.edu.ar/projects/tca.)

  3. Molecular profiling of breast cancer cell lines defines relevant tumor models and provides a resource for cancer gene discovery.

    Directory of Open Access Journals (Sweden)

    Jessica Kao

    novel candidate breast cancer genes. CONCLUSIONS: Overall, breast cancer cell lines were genetically more complex than tumors, but retained expression patterns with relevance to the luminal-basal subtype distinction. The compendium of molecular profiles defines cell lines suitable for investigations of subtype-specific pathobiology, cancer stem cell biology, biomarkers and therapies, and provides a resource for discovery of new breast cancer genes.

  4. Gene discovery for the bark beetle-vectored fungal tree pathogen Grosmannia clavigera

    Directory of Open Access Journals (Sweden)

    Robertson Gordon

    2010-10-01

    Full Text Available Abstract. Background: Grosmannia clavigera is a bark beetle-vectored fungal pathogen of pines that causes wood discoloration and may kill trees by disrupting nutrient and water transport. Trees respond to attacks from beetles and associated fungi by releasing terpenoid and phenolic defense compounds. It is unclear which genes are important for G. clavigera's ability to overcome antifungal pine terpenoids and phenolics. Results: We constructed seven cDNA libraries from eight G. clavigera isolates grown under various culture conditions, and Sanger sequenced the 5' and 3' ends of 25,000 cDNA clones, resulting in 44,288 high quality ESTs. The assembled dataset of unique transcripts (unigenes) consists of 6,265 contigs and 2,459 singletons that mapped to 6,467 locations on the G. clavigera reference genome, representing ~70% of the predicted G. clavigera genes. Although only 54% of the unigenes matched characterized proteins at the NCBI database, this dataset extensively covers major metabolic pathways, cellular processes, and genes necessary for response to environmental stimuli and genetic information processing. Furthermore, we identified genes expressed in spores prior to germination, and genes involved in response to treatment with lodgepole pine phloem extract (LPPE). Conclusions: We provide a comprehensively annotated EST dataset for G. clavigera that represents a rich resource for gene characterization in this and other ophiostomatoid fungi. Genes expressed in response to LPPE treatment are indicative of fungal oxidative stress response. We identified two clusters of potentially functionally related genes responsive to LPPE treatment. Furthermore, we report a simple method for identifying contig misassemblies in de novo assembled EST collections caused by gene overlap on the genome.

  5. Discovery of clubroot-resistant genes in Brassica napus by transcriptome sequencing.

    Science.gov (United States)

    Chen, S W; Liu, T; Gao, Y; Zhang, C; Peng, S D; Bai, M B; Li, S J; Xu, L; Zhou, X Y; Lin, L B

    2016-01-01

    Clubroot significantly affects plants of the Brassicaceae family and is one of the main diseases causing serious losses in B. napus yield. Few studies have investigated the clubroot-resistance mechanism in B. napus. Identification of clubroot-resistant genes may be used in clubroot-resistant breeding, as well as to elucidate the molecular mechanism behind B. napus clubroot-resistance. We used three B. napus transcriptome samples to construct a transcriptome sequencing library by using Illumina HiSeq™ 2000 sequencing and bioinformatic analysis. In total, 171 million high-quality reads were obtained, containing 96,149 unigenes of N50-value. We aligned the obtained unigenes with the Nr, Swiss-Prot, clusters of orthologous groups, and gene ontology databases and annotated their functions. In the Kyoto encyclopedia of genes and genomes database, 25,033 unigenes (26.04%) were assigned to 124 pathways. Many genes, including broad-spectrum disease-resistance genes, specific clubroot-resistant genes, and genes related to indole-3-acetic acid (IAA) signal transduction, cytokinin synthesis, and myrosinase synthesis in the Huashuang 3 variety of B. napus were found to be related to clubroot-resistance. The effective clubroot-resistance observed in this variety may be due to the induced increased expression of these disease-resistant genes and strong inhibition of the IAA signal transduction, cytokinin synthesis, and myrosinase synthesis. Unigenes 0048482 and 0061770 shared 94% nucleotide similarity with the Crr1 gene. Furthermore, unigene 0061770 could have originated from an inversion of the Crr1 5'-end sequence.

  6. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    Science.gov (United States)

    2010-01-01

    Background: Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Results: We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index. Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that background correction is
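
    The two ingredients named in this abstract, gene selection by high standard deviation and scoring a clustering against known classes with the adjusted Rand index, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the dictionary-based expression format, and the use of the population standard deviation are all choices made here for brevity.

```python
from collections import Counter
from math import comb
from statistics import pstdev


def select_high_sd_genes(expr, n_genes):
    """Return the n_genes gene names with the largest standard deviation.

    expr is a hypothetical format: {gene_name: [value per sample, ...]}.
    """
    ranked = sorted(expr, key=lambda g: pstdev(expr[g]), reverse=True)
    return ranked[:n_genes]


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between a known labelling and a clustering."""
    n = len(labels_true)
    if n < 2:
        return 1.0  # degenerate case: nothing to disagree about
    # Pair counts within contingency cells, rows (known classes) and columns (clusters)
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # both partitions are trivial and identical in structure
    return (sum_cells - expected) / (max_index - expected)
```

    The index is 1 when the clustering reproduces the known classes up to a relabelling, and is close to 0 (or negative) when agreement is no better than chance.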

  7. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    Directory of Open Access Journals (Sweden)

    Landfors Mattias

    2010-10-01

    Background: Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Results: We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index. Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that

  8. Network-based gene prediction for Plasmodium falciparum malaria towards genetics-based drug discovery

    OpenAIRE

    Chen, Yang; Xu, Rong

    2015-01-01

    Background: Malaria is the deadliest parasitic infectious disease. Existing drug treatments have limited efficacy in malaria elimination, and the complex pathogenesis of the disease is not fully understood. Detecting novel malaria-associated genes not only contributes to revealing the disease pathogenesis, but also facilitates the discovery of new targets for anti-malaria drugs. Methods: In this study, we developed a network-based approach to predict malaria-associated genes. We constructed a cros...

  9. Pattern Discovery using Fuzzy FP-growth Algorithm from Gene Expression Data

    OpenAIRE

    Sabita Barik; Debahuti Mishra; Shruti Mishra; Sandeep Ku. Satapathy; Amiya Ku. Rath; Milu Acharya

    2010-01-01

    The goal of microarray experiments is to identify genes that are differentially transcribed with respect to different biological conditions of cell cultures and samples. Hence, methods of data analysis such as clustering, classification, and prediction need to be carefully evaluated. In this paper, we propose an efficient frequent-pattern-based clustering to find the genes that form frequent patterns showing similar phenotypes leading to specific symptoms for specific disease...

  10. Genome-wide discovery of Pax7 target genes during development.

    Science.gov (United States)

    White, Robert B; Ziman, Melanie R

    2008-03-14

    Pax7 plays critical roles in development of brain, spinal cord, neural crest, and skeletal muscle. As a sequence-specific DNA-binding transcription factor, any direct functional role played by Pax7 during development is mediated through target gene selection. Thus, we have sought to identify genes targeted by Pax7 during embryonic development using an unbiased chromatin immunoprecipitation (ChIP) cloning assay to isolate cis-regulatory regions bound by Pax7 in vivo. Sequencing and genomic localization of a library of chromatin-DNA fragments bound by Pax7 has identified 34 candidate Pax7 target genes, with occupancy of a selection confirmed with independent chromatin enrichment tests (ChIP-PCR). To assess the capacity of Pax7 to regulate transcription from these loci, we have cloned alternate transcripts of Pax7 (differing significantly in their DNA binding domain) into expression vectors and transfected cultured cells with these constructs, then analyzed target gene expression levels using RT-PCR. We show that Pax7 directly occupies sites within genes encoding transcription factors Gbx1 and Eya4, the neurogenic cytokine receptor ciliary neurotrophic factor receptor, the neuronal potassium channel Kcnk2, and the signal transduction kinase Camk1d in vivo and regulates the transcriptional state of these genes in cultured cells. This analysis gives us greater insight into the direct functional role played by Pax7 during embryonic development.

  11. Discovery and characterization of novel vascular and hematopoietic genes downstream of etsrp in zebrafish.

    Directory of Open Access Journals (Sweden)

    Gustavo A Gomez

    The transcription factor Etsrp is required for vasculogenesis and primitive myelopoiesis in zebrafish. When ectopically expressed, etsrp is sufficient to induce the expression of many vascular and myeloid genes in zebrafish. The mammalian homolog of etsrp, ER71/Etv2, is also essential for vascular and hematopoietic development. To identify genes downstream of etsrp, gain-of-function experiments were performed for etsrp in zebrafish embryos followed by transcription profile analysis by microarray. Subsequent in vivo expression studies resulted in the identification of fourteen genes with blood and/or vascular expression, six of these being completely novel. Regulation of these genes by etsrp was confirmed by ectopic induction in etsrp overexpressing embryos and decreased expression in etsrp deficient embryos. Additional functional analysis of two newly discovered genes, hapln1b and sh3gl3, demonstrates their importance in embryonic vascular development. The results described here identify a group of genes downstream of etsrp likely to be critical for vascular and/or myeloid development.

  12. Discovery of antibiotics-derived polymers for gene delivery using combinatorial synthesis and cheminformatics modeling.

    Science.gov (United States)

    Potta, Thrimoorthy; Zhen, Zhuo; Grandhi, Taraka Sai Pavan; Christensen, Matthew D; Ramos, James; Breneman, Curt M; Rege, Kaushal

    2014-02-01

    We describe the combinatorial synthesis and cheminformatics modeling of aminoglycoside antibiotics-derived polymers for transgene delivery and expression. Fifty-six polymers were synthesized by polymerizing aminoglycosides with diglycidyl ether cross-linkers. Parallel screening resulted in identification of several lead polymers that resulted in high transgene expression levels in cells. The role of polymer physicochemical properties in determining efficacy of transgene expression was investigated using Quantitative Structure-Activity Relationship (QSAR) cheminformatics models based on Support Vector Regression (SVR) and 'building block' polymer structures. The QSAR model exhibited high predictive ability, and investigation of descriptors in the model, using molecular visualization and correlation plots, indicated that physicochemical attributes related to both, aminoglycosides and diglycidyl ethers facilitated transgene expression. This work synergistically combines combinatorial synthesis and parallel screening with cheminformatics-based QSAR models for discovery and physicochemical elucidation of effective antibiotics-derived polymers for transgene delivery in medicine and biotechnology. Copyright © 2013 Elsevier Ltd. All rights reserved.

  13. Natural and man-made V-gene repertoires for antibody discovery.

    Science.gov (United States)

    Finlay, William J J; Almagro, Juan C

    2012-01-01

    Antibodies are the fastest-growing segment of the biologics market. The success of antibody-based drugs resides in their exquisite specificity, high potency, stability, solubility, safety, and relatively inexpensive manufacturing process in comparison with other biologics. We outline here the structural studies and fundamental principles that define how antibodies interact with diverse targets. We also describe the antibody repertoires and affinity maturation mechanisms of humans, mice, and chickens, plus the use of novel single-domain antibodies in camelids and sharks. These species all utilize diverse evolutionary solutions to generate specific and high affinity antibodies and illustrate the plasticity of natural antibody repertoires. In addition, we discuss the multiple variations of man-made antibody repertoires designed and validated in the last two decades, which have served as tools to explore how the size, diversity, and composition of a repertoire impact the antibody discovery process.

  14. Current parallel chemistry principles and practice: application to the discovery of biologically active molecules.

    Science.gov (United States)

    Edwards, Paul J

    2009-11-01

    This article describes the use of parallel chemistry techniques for drug discovery, based on publications from January 2006 to December 2008. Chemical libraries that yielded active compounds across a range of biological targets are presented, together with synthetic details when appropriate. Background information for the biological targets involved and any SAR that could be discerned within members of a library series also is discussed. New technological developments, as applied to library design and synthesis and, more generally, in the discovery of biologically active entities, are highlighted. In addition, the likely future directions for parallel chemistry in its ability to impact upon drug discovery are also presented.

  15. Gene discovery at the human T-cell receptor alpha/delta locus.

    Science.gov (United States)

    Haynes, Marsha R; Wu, Gillian E

    2007-02-01

    The human T-cell receptor (TCR) alpha/delta variable loci are interspersed on chromosome 14q11 and consist of 57 intergenic spaces ranging from 4 to 100 kb in length. To elucidate the evolutionary history of this locus, we searched the intergenic spaces of all TCR alpha/delta variable (TRAV/DV) genes for pseudogenes and potential protein-coding genes. We applied direct open reading frame (ORF) searches, an exon-finding algorithm and comparative genomics. Two TRAV/DV pseudogenes were discovered bearing 80 and 65% sequence similarity to TRAV14DV4 and TRAV9-1/9-2 genes, respectively. A gene bearing 85% sequence identity to B lymphocyte activation-related protein, BC-1514, upstream of TRAV26-2 was also discovered. This ORF (BC-1514tcra) is a member of a gene family whose evolutionary history and function are not known. In total, 36 analogs of this gene exist in the human, the chimpanzee, the Rhesus monkey, the frog and the zebrafish. Phylogenetic analyses show convergent evolution of these genes. Assays for the expression of BC-1514tcra revealed transcripts in the bone marrow, thymus, spleen, and small intestine. These assays also showed the expression of another analog of BC-1514, found on chromosome 5, in bone marrow and thymus RNA. The existence of at least 17 analogs at various locations in the human genome and in nonsyntenic chromosomes of the chimpanzee suggests that BC-1514tcra, along with its analogs, may be transposable elements with evolved function(s). The identification of conserved putative serine phosphorylation sites provides evidence of their possible role(s) in signal transduction events involved in B cell development and differentiation.

  16. Transcriptome analysis and discovery of genes involved in immune pathways from hepatopancreas of microbial challenged mitten crab Eriocheir sinensis.

    Directory of Open Access Journals (Sweden)

    Xihong Li

    BACKGROUND: The Chinese mitten crab Eriocheir sinensis is an economically important crustacean that has been seriously affected by various diseases, which calls for more information on immune-relevant genes at the genome level. Recently, high-throughput RNA sequencing (RNA-seq) technology has provided a powerful and efficient method for transcript analysis and immune gene discovery. METHODS/PRINCIPAL FINDINGS: A cDNA library from hepatopancreas of E. sinensis challenged by a mixture of three pathogen strains (Gram-positive bacteria Micrococcus luteus, Gram-negative bacteria Vibrio alginolyticus and fungi Pichia pastoris; 10^8 cfu·mL^-1) was constructed and randomly sequenced using the Illumina technique. In total, 39.76 million clean reads were assembled into 70,300 unigenes. After ruling out short-length and low-quality sequences, 52,074 non-redundant unigenes were compared to public databases for homology searching and 17,617 of them showed high similarity to sequences in the NCBI non-redundant protein (Nr) database. For function classification and pathway assignment, 18,734 (36.00%) unigenes were categorized to three Gene Ontology (GO) categories, 12,243 (23.51%) were classified to 25 Clusters of Orthologous Groups (COG), and 8,983 (17.25%) were assigned to six Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Potentially, 24, 14, 47 and 132 unigenes were characterized to be involved in Toll, IMD, JAK-STAT and MAPK pathways, respectively. CONCLUSIONS/SIGNIFICANCE: This is the first systematic transcriptome analysis of components relating to innate immune pathways in E. sinensis. Functional genes and putative pathways identified here will contribute to a better understanding of the immune system and to the prevention of various diseases in crab.

  17. A new essential protein discovery method based on the integration of protein-protein interaction and gene expression data

    Directory of Open Access Journals (Sweden)

    Li Min

    2012-03-01

    Background: Identification of essential proteins is always a challenging task since it requires experimental approaches that are time-consuming and laborious. With the advances in high throughput technologies, a large number of protein-protein interactions are available, which have produced unprecedented opportunities for detecting proteins' essentialities from the network level. There have been a series of computational approaches proposed for predicting essential proteins based on network topologies. However, the network topology-based centrality measures are very sensitive to the robustness of network. Therefore, a new robust essential protein discovery method would be of great value. Results: In this paper, we propose a new centrality measure, named PeC, based on the integration of protein-protein interaction and gene expression data. The performance of PeC is validated based on the protein-protein interaction network of Saccharomyces cerevisiae. The experimental results show that the predicted precision of PeC clearly exceeds that of the other fifteen previously proposed centrality measures: Degree Centrality (DC), Betweenness Centrality (BC), Closeness Centrality (CC), Subgraph Centrality (SC), Eigenvector Centrality (EC), Information Centrality (IC), Bottle Neck (BN), Density of Maximum Neighborhood Component (DMNC), Local Average Connectivity-based method (LAC), Sum of ECC (SoECC), Range-Limited Centrality (RL), L-index (LI), Leader Rank (LR), Normalized α-Centrality (NC), and Moduland-Centrality (MC). In particular, the improvement of PeC over the classic centrality measures (BC, CC, SC, EC, and BN) is more than 50% when predicting no more than 500 proteins. Conclusions: We demonstrate that the integration of protein-protein interaction network and gene expression data can help improve the precision of predicting essential proteins. The new centrality measure, PeC, is an effective essential protein discovery method.
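
    The abstract does not state the formula behind PeC, so the sketch below should be read only as one plausible way to integrate interaction topology with expression data, under assumptions made here: each protein is scored by summing, over its interaction partners, an edge clustering coefficient (shared neighbours divided by the smaller degree minus one) multiplied by the Pearson correlation of the two expression profiles. All names and the input formats are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def edge_clustering(adj, u, v):
    """Assumed edge weight: common neighbours of u and v over min(deg - 1)."""
    shared = len(adj[u] & adj[v])
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    return shared / denom if denom > 0 else 0.0


def combined_centrality(adj, expr):
    """Score each protein from topology (adj) and co-expression (expr).

    adj:  {protein: set of interacting proteins}   (hypothetical format)
    expr: {protein: [expression value per time point]}
    """
    return {
        u: sum(edge_clustering(adj, u, v) * pearson(expr[u], expr[v]) for v in adj[u])
        for u in adj
    }
```

    Under this assumed scoring, a protein ranks highly only when its interactions sit inside densely connected neighbourhoods and its partners are co-expressed with it; a protein attached by a single edge scores zero.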

  18. Discovery of Phytophthora infestans genes expressed in planta through mining of cDNA libraries.

    Directory of Open Access Journals (Sweden)

    Roberto Sierra

    BACKGROUND: Phytophthora infestans (Mont.) de Bary causes late blight of potato and tomato, and has a broad host range within the Solanaceae family. Most studies of the Phytophthora-Solanum pathosystem have focused on gene expression in the host and have not analyzed pathogen gene expression in planta. METHODOLOGY/PRINCIPAL FINDINGS: We describe in detail an in silico approach to mine ESTs from inoculated host plants deposited in a database in order to identify particular pathogen sequences associated with disease. We identified candidate effector genes through mining of 22,795 ESTs corresponding to P. infestans cDNA libraries in compatible and incompatible interactions with hosts from the Solanaceae family. CONCLUSIONS/SIGNIFICANCE: We annotated genes of P. infestans expressed in planta associated with late blight using different approaches and assigned putative functions to 373 out of the 501 sequences found in the P. infestans genome draft, including putative secreted proteins, domains associated with pathogenicity and poorly characterized proteins ideal for further experimental studies. Our study provides a methodology for analyzing cDNA libraries and provides an understanding of the plant-oomycete pathosystems that is independent of the host, condition, or type of sample by identifying genes of the pathogen expressed in planta.

  19. Ten years' venturing in ZnO nanostructures: from discovery to scientific understanding and to technology applications

    Institute of Scientific and Technical Information of China (English)

    Zhong Lin WANG

    2009-01-01

    Zinc oxide is a unique material that exhibits semiconducting, piezoelectric and pyroelectric properties. Nanostructures of ZnO are as important as carbon nanotubes and silicon nanowires (NWs) for nanotechnology, and have great potential applications in nano-electronics, optoelectronics, sensors, field emission, light emitting diodes, photocatalysis, nanogenerators, and nanopiezotronics. Ever since the discovery of nanobelts (NBs) in 2001 by my group, worldwide research on ZnO has taken off. This review introduces my group's experience in venturing into the discovery, understanding and applications of ZnO NWs and NBs. The aim is to introduce the progress made in my research in the last 10 years, alongside the huge social advances and economic development taking place in China over the same period.

  20. Plant gravitropic signal transduction: A network analysis leads to gene discovery

    Science.gov (United States)

    Wyatt, Sarah

    Gravity plays a fundamental role in plant growth and development. Although a significant body of research has helped define the events of gravity perception, the role of the plant growth regulator auxin, and the mechanisms resulting in the gravity response, the events of signal transduction (those that link the biophysical action of perception to a biochemical signal that results in auxin redistribution, and those that regulate the gravitropic effects on plant growth) remain, for the most part, a “black box.” Using a cold effect, dubbed the gravity persistent signal (GPS) response, we developed a mutant screen to specifically identify components of the signal transduction pathway. Cloning of the GPS genes has identified new proteins involved in gravitropic signaling. We have further exploited the GPS response using a multi-faceted approach, including gene expression microarrays, proteomics analysis, bioinformatics analysis and continued mutant analysis, to identify additional genes and physiological and biochemical processes. Gene expression data provided the foundation of a regulatory network for gravitropic signaling. Based on these gene expression data and related data sets/information from the literature/repositories, we constructed a gravitropic signaling network for Arabidopsis inflorescence stems. To generate the network, both a dynamic Bayesian network approach and a time-lagged correlation coefficient approach were used. The dynamic Bayesian network added existing information of protein-protein interaction while the time-lagged correlation coefficient allowed incorporation of temporal regulation and thus could incorporate the time-course metric from the data set. Thus the methods complemented each other and provided us with a more comprehensive evaluation of connections. Each method generated a list of possible interactions associated with a statistical significance value. The two networks were then overlaid to generate a more rigorous, intersected
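
    The time-lagged correlation coefficient mentioned in this abstract can be illustrated with a small sketch. It assumes the simplest form of the idea: the Pearson correlation between one gene's expression at time t and another gene's expression at time t + lag, with the lag of strongest absolute correlation taken as a candidate regulatory delay. The function names and the list-based time-course format are hypothetical, and the abstract does not say how the authors computed or thresholded the statistic.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def lagged_correlation(x, y, lag):
    """Correlate x[t] with y[t + lag] for a non-negative integer lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if len(x) != len(y) or len(x) - lag < 2:
        raise ValueError("series must have equal length and overlap in at least 2 points")
    return pearson(x[: len(x) - lag], y[lag:])


def best_lag(x, y, max_lag):
    """Lag in 0..max_lag with the strongest absolute correlation (first on ties)."""
    return max(range(max_lag + 1), key=lambda k: abs(lagged_correlation(x, y, k)))
```

    With a short time course, larger lags leave fewer overlapping points, so in practice the maximum lag would need to be kept small relative to the number of time points.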

  1. Discovery and identification of candidate genes from the chitinase gene family for Verticillium dahliae resistance in cotton.

    Science.gov (United States)

    Xu, Jun; Xu, Xiaoyang; Tian, Liangliang; Wang, Guilin; Zhang, Xueying; Wang, Xinyu; Guo, Wangzhen

    2016-06-29

    Verticillium dahliae, a destructive and soil-borne fungal pathogen, causes massive losses in cotton yields. However, the resistance mechanism to V. dahliae in cotton is still poorly understood. Accumulating evidence indicates that chitinases are crucial hydrolytic enzymes, which attack fungal pathogens by catalyzing the fungal cell wall degradation. Although they form a large gene family, the chitinase genes (Chis) have not to date been systematically analyzed and effectively utilized in cotton. Here, we identified 47, 49, 92, and 116 Chis from four sequenced cotton species, diploid Gossypium raimondii (D5), G. arboreum (A2), tetraploid G. hirsutum acc. TM-1 (AD1), and G. barbadense acc. 3-79 (AD2), respectively. The orthologous genes were not in one-to-one correspondence between the diploid and tetraploid cotton species, implying changes in the number of Chis in different cotton species during the evolution of Gossypium. Phylogenetic classification indicated that these Chis could be classified into six groups, with distinguishable structural characteristics. The expression patterns of Chis indicated their various expressions in different organs and tissues, and in the V. dahliae response. Silencing of Chi23, Chi32, or Chi47 in cotton significantly impaired the resistance to V. dahliae, suggesting these genes might act as positive regulators in disease resistance to V. dahliae.

  2. A Sorghum Mutant Resource as an Efficient Platform for Gene Discovery in Grasses.

    Science.gov (United States)

    Jiao, Yinping; Burke, John; Chopra, Ratan; Burow, Gloria; Chen, Junping; Wang, Bo; Hayes, Chad; Emendack, Yves; Ware, Doreen; Xin, Zhanguo

    2016-07-01

    Sorghum (Sorghum bicolor) is a versatile C4 crop and a model for research in family Poaceae. High-quality genome sequence is available for the elite inbred line BTx623, but functional validation of genes remains challenging due to the limited genomic and germplasm resources available for comprehensive analysis of induced mutations. In this study, we generated 6400 pedigreed M4 mutant pools from EMS-mutagenized BTx623 seeds through single-seed descent. Whole-genome sequencing of 256 phenotyped mutant lines revealed >1.8 million canonical EMS-induced mutations, affecting >95% of genes in the sorghum genome. The vast majority (97.5%) of the induced mutations were distinct from natural variations. To demonstrate the utility of the sequenced sorghum mutant resource, we performed reverse genetics to identify eight genes potentially affecting drought tolerance, three of which had allelic mutations and two of which exhibited exact cosegregation with the phenotype of interest. Our results establish that a large-scale resource of sequenced pedigreed mutants provides an efficient platform for functional validation of genes in sorghum, thereby accelerating sorghum breeding. Moreover, findings made in sorghum could be readily translated to other members of the Poaceae via integrated genomics approaches.

  3. Discovery and functional prioritization of Parkinson's disease candidate genes from large-scale whole exome sequencing

    NARCIS (Netherlands)

    I. Jansen (Iris); Ye, H. (Hui); Heetveld, S. (Sasja); Lechler, M.C. (Marie C.); Michels, H. (Helen); Seinstra, R.I. (Renée I.); Lubbe, S.J. (Steven J.); Drouet, V. (Valérie); S. Lesage (Suzanne); E. Majounie (Elisa); Gibbs, J.R. (J.Raphael); M.A. Nalls (Michael); M. Ryten (Mina); Botia, J.A. (Juan A.); J. Vandrovcova (Jana); J. Simón-Sánchez (Javier); Castillo-Lizardo, M. (Melissa); P. Rizzu (Patrizia); Blauwendraat, C. (Cornelis); Chouhan, A.K. (Amit K.); Li, Y. (Yarong); Yogi, P. (Puja); N. Amin (Najaf); C.M. van Duijn (Cock); Morris, H.R. (Huw R.); Brice, A. (Alexis); A. Singleton (Andrew); David, D.C. (Della C.); Nollen, E.A. (Ellen A.); A. Jain (Ashok); J.M. Shulman; P. Heutink (Peter); D.G. Hernandez (Dena); S. Arepalli (Sampath); J. Brooks (Janet); Price, R. (Ryan); Nicolas, A. (Aude); S. Chong (Sean); M.R. Cookson (Mark); A. Dillman (Allissa); M. Moore (Matt); B.J. Traynor (Bryan); A. Singleton (Andrew); V. Plagnol (Vincent); Nicholas W Wood,; U.-M. Sheerin (Una-Marie); Jose M Bras,; K. Charlesworth (Kate); M. Gardner (Mac); R. Guerreiro (Rita); D. Trabzuni (Danyah); Hardy, J. (John); M. Sharma; M. Saad (Mohamad); Javier Simón-Sánchez,; C. Schulte (Claudia); J.C. Corvol (Jean-Christophe); Dürr, A. (Alexandra); M. Vidailhet (M.); S. Sveinbjörnsdóttir (Sigurlaug); R.A. Barker (Roger); Caroline H Williams-Gray,; Y. Ben-Shlomo; H.W. Berendse (Henk W.); K.D. van Dijk (Karin); D. Berg (Daniela); K. Brockmann; K.D. Wurster (Kathrin); Mätzler, W. (Walter); Gasser, T. (Thomas); M. Martinez (Maria); R.M.A. de Bie (Rob); A. Biffi (Alessandro); D. Velseboer (Daan); B.R. Bloem (Bastiaan); B. Post (Bart); M. Wickremaratchi (Mirdhu); B. van de Warrenburg (Bart); Z. Bochdanovits (Zoltan); M. von Bonin (Malte); H. Pétursson (Hjörvar); O. Riess (Olaf); D.J. Burn (David); Lubbe, S. (Steven); Cooper, J.M. (J Mark); N.H. McNeill (Nathan); Schapira, A. (Anthony); Lungu, C. (Codrin); Chen, H. (Honglei); Dong, J. (Jing); Chinnery, P.F. (Patrick F.); G. 
Hudson (Gavin); Clarke, C.E. (Carl E.); C. Moorby (Catriona); C. Counsell (Carl); P. Damier (Philippe); J.-F. Dartigues; P. Deloukas (Panagiotis); E. Gray (Emma); T. Edkins (Ted); Hunt, S.E. (Sarah E.); S.C. Potter (Simon); A. Tashakkori-Ghanbaria (Avazeh); G. Deuschl (Günther); D. Lorenz (Delia); D.T. Dexter (David); F. Durif (Frank); J. Evans (Jonathan Mark); Langford, C. (Cordelia); T. Foltynie (Thomas); A.M. Goate (Alison); C. Harris (Clare); J.J. van Hilten (Jacobus); A. Hofman (Albert); J.R. Hollenbeck (John R.); J.L. Holton (Janice); Hu, M. (Michele); X. Huang (Xiaohong); Illig, T. (Thomas); P.V. Jónsson (Pálmi); J.-C. Lambert; S.S. O'Sullivan (Sean); T. Revesz (Tamas); K. Shaw (Karen); A.J. Lees (Andrew); P. Lichtner (Peter); P. Limousin (Patricia); G. Lopez; Escott-Price, V. (Valentina); J. Pearson (Justin); N. Williams (Nigel); E. Mudanohwo (Ese); J.S. Perlmutter (Joel); Pollak, P. (Pierre); F. Rivadeneira Ramirez (Fernando); A.G. Uitterlinden (André); S.J. Sawcer (Stephen); H. Scheffer (Hans); I. Shoulson (Ira); L. Shulman (Lee); Smith, C. (Colin); R. Walker (Robert); C.C.A. Spencer (Chris C.); A. Strange (Amy); H. Stefansson (Hreinn); F. Bettella (Francesco); J-A. Zwart (John-Anker); Stockton, J.D. (Joanna D.); D. Talbot; C.M. Tanner (Carlie); F. Tison (François); S. Winder-Rhodes (Sophie); K.P. Bhatia (Kailash)

    2017-01-01

    Background: Whole-exome sequencing (WES) has been successful in identifying genes that cause familial Parkinson's disease (PD). However, until now this approach has not been deployed to study large cohorts of unrelated participants. To discover rare PD susceptibility variants, we perform

  4. Discovery of Chemosensory Genes in the Oriental Fruit Fly, Bactrocera dorsalis.

    Science.gov (United States)

    Wu, Zhongzhen; Zhang, He; Wang, Zhengbing; Bin, Shuying; He, Hualiang; Lin, Jintian

    2015-01-01

    The oriental fruit fly, Bactrocera dorsalis, is a devastating fruit fly pest in tropical and sub-tropical countries. Like other insects, this fly uses its chemosensory system to efficiently interact with its environment. However, our understanding of the molecular components comprising B. dorsalis chemosensory system is limited. Using next generation sequencing technologies, we sequenced the transcriptome of four B. dorsalis developmental stages: egg, larva, pupa and adult chemosensory tissues. A total of 31 candidate odorant binding proteins (OBPs), 4 candidate chemosensory proteins (CSPs), 23 candidate odorant receptors (ORs), 11 candidate ionotropic receptors (IRs), 6 candidate gustatory receptors (GRs) and 3 candidate sensory neuron membrane proteins (SNMPs) were identified. The tissue distributions of the OBP and CSP transcripts were determined by RT-PCR and a subset of nine genes were further characterized. The predicted proteins from these genes shared high sequence similarity to Drosophila melanogaster pheromone binding protein related proteins (PBPRPs). Interestingly, one OBP (BdorOBP19c) was exclusively expressed in the sex pheromone glands of mature females. RT-PCR was also used to compare the expression of the candidate genes in the antennae of male and female B. dorsalis adults. These antennae-enriched OBPs, CSPs, ORs, IRs and SNMPs could play a role in the detection of pheromones and general odorants and thus could be useful target genes for the integrated pest management of B. dorsalis and other agricultural pests.

  5. Discovery of genes involved with learning and memory: an experimental synthesis of Hirschian and Benzerian perspectives.

    Science.gov (United States)

    Tully, T

    1996-11-26

    The biological bases of learning and memory are being revealed today with a wide array of molecular approaches, most of which entail the analysis of dysfunction produced by gene disruptions. This perspective derives both from early "genetic dissections" of learning in mutant Drosophila by Seymour Benzer and colleagues and from earlier behavior-genetic analyses of learning in Diptera by Jerry Hirsch and coworkers. Three quantitative-genetic insights derived from these latter studies serve as guiding principles for the former. First, interacting polygenes underlie complex traits. Consequently, learning/memory defects associated with single-gene mutants can be quantified accurately only in equilibrated, heterogeneous genetic backgrounds. Second, complex behavioral responses will be composed of genetically distinct functional components. Thus, genetic dissection of complex traits into specific biobehavioral properties is likely. Finally, disruptions of genes involved with learning/memory are likely to have pleiotropic effects. As a result, task-relevant sensorimotor responses required for normal learning must be assessed carefully to interpret performance in learning/memory experiments. In addition, more specific conclusions will be obtained from reverse-genetic experiments, in which gene disruptions are restricted in time and/or space.

  6. Transcriptome Analysis and Discovery of Genes Relevant to Development in Bradysia odoriphaga at Three Developmental Stages.

    Directory of Open Access Journals (Sweden)

    Huanhuan Gao

    Bradysia odoriphaga (Diptera: Sciaridae) is the most important pest of Chinese chive (Allium tuberosum) in Asia; however, its molecular genetics are poorly understood. To explore the molecular biological mechanism of development, Illumina sequencing and de novo assembly were performed in third-instar, fourth-instar, and pupal B. odoriphaga. The study resulted in 16.2 Gb of clean data and 47,578 unigenes (≥125 bp) contained in 7,632,430 contigs, 46.21% of which were annotated from the non-redundant protein (NR), Gene Ontology (GO), Clusters of Orthologous Groups (COG), Eukaryotic Orthologous Groups (KOG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. It was found that 19.67% of unigenes mainly matched homologous species, including Aedes aegypti, Culex quinquefasciatus, Ceratitis capitata, and Anopheles gambiae. According to differentially expressed gene (DEG) analysis, 143, 490, and 309 DEGs were annotated in the GO database as involved in the developmental process in the comparisons of third-instar and fourth-instar larvae, third-instar larvae and pupae, and fourth-instar larvae and pupae, respectively. Twenty-five genes were closely related to these processes, including developmental process, reproduction process, reproductive organ development, and programmed cell death (PCD). The information on unigenes assembled in B. odoriphaga through transcriptome and DEG analyses could provide a detailed genetic basis and regulatory information for elaborating the developmental mechanism from the larval and pre-pupal to pupal stages of B. odoriphaga.

  7. Using Osteoclast Differentiation as a Model for Gene Discovery in an Undergraduate Cell Biology Laboratory

    Science.gov (United States)

    Birnbaum, Mark J.; Picco, Jenna; Clements, Meghan; Witwicka, Hanna; Yang, Meiheng; Hoey, Margaret T.; Odgren, Paul R.

    2010-01-01

    A key goal of molecular/cell biology/biotechnology is to identify essential genes in virtually every physiological process to uncover basic mechanisms of cell function and to establish potential targets of drug therapy combating human disease. This article describes a semester-long, project-oriented molecular/cellular/biotechnology laboratory…

  9. Human transporter database: comprehensive knowledge and discovery tools in the human transporter genes.

    Directory of Open Access Journals (Sweden)

    Adam Y Ye

    Transporters are essential in homeostatic exchange of endogenous and exogenous substances at the systematic, organic, cellular, and subcellular levels. Gene mutations of transporters are often related to pharmacogenetics traits. Recent developments in high throughput technologies on genomics, transcriptomics and proteomics allow in depth studies of transporter genes in normal cellular processes and diverse disease conditions. The flood of high throughput data have resulted in urgent need for an updated knowledgebase with curated, organized, and annotated human transporters in an easily accessible way. Using a pipeline with the combination of automated keywords query, sequence similarity search and manual curation on transporters, we collected 1,555 human non-redundant transporter genes to develop the Human Transporter Database (HTD) (http://htd.cbi.pku.edu.cn). Based on the extensive annotations, global properties of the transporter genes were illustrated, such as expression patterns and polymorphisms in relationships with their ligands. We noted that the human transporters were enriched in many fundamental biological processes such as oxidative phosphorylation and cardiac muscle contraction, and significantly associated with Mendelian and complex diseases such as epilepsy and sudden infant death syndrome. Overall, HTD provides a well-organized interface to facilitate research communities to search detailed molecular and genetic information of transporters for development of personalized medicine.

  10. Human transporter database: comprehensive knowledge and discovery tools in the human transporter genes.

    Science.gov (United States)

    Ye, Adam Y; Liu, Qing-Rong; Li, Chuan-Yun; Zhao, Min; Qu, Hong

    2014-01-01

    Transporters are essential in homeostatic exchange of endogenous and exogenous substances at the systematic, organic, cellular, and subcellular levels. Gene mutations of transporters are often related to pharmacogenetics traits. Recent developments in high throughput technologies on genomics, transcriptomics and proteomics allow in depth studies of transporter genes in normal cellular processes and diverse disease conditions. The flood of high throughput data have resulted in urgent need for an updated knowledgebase with curated, organized, and annotated human transporters in an easily accessible way. Using a pipeline with the combination of automated keywords query, sequence similarity search and manual curation on transporters, we collected 1,555 human non-redundant transporter genes to develop the Human Transporter Database (HTD) (http://htd.cbi.pku.edu.cn). Based on the extensive annotations, global properties of the transporter genes were illustrated, such as expression patterns and polymorphisms in relationships with their ligands. We noted that the human transporters were enriched in many fundamental biological processes such as oxidative phosphorylation and cardiac muscle contraction, and significantly associated with Mendelian and complex diseases such as epilepsy and sudden infant death syndrome. Overall, HTD provides a well-organized interface to facilitate research communities to search detailed molecular and genetic information of transporters for development of personalized medicine.

  12. Disease model discovery from 3,328 gene knockouts by The International Mouse Phenotyping Consortium.

    Science.gov (United States)

    Meehan, Terrence F; Conte, Nathalie; West, David B; Jacobsen, Julius O; Mason, Jeremy; Warren, Jonathan; Chen, Chao-Kung; Tudose, Ilinca; Relac, Mike; Matthews, Peter; Karp, Natasha; Santos, Luis; Fiegel, Tanja; Ring, Natalie; Westerberg, Henrik; Greenaway, Simon; Sneddon, Duncan; Morgan, Hugh; Codner, Gemma F; Stewart, Michelle E; Brown, James; Horner, Neil; Haendel, Melissa; Washington, Nicole; Mungall, Christopher J; Reynolds, Corey L; Gallegos, Juan; Gailus-Durner, Valerie; Sorg, Tania; Pavlovic, Guillaume; Bower, Lynette R; Moore, Mark; Morse, Iva; Gao, Xiang; Tocchini-Valentini, Glauco P; Obata, Yuichi; Cho, Soo Young; Seong, Je Kyung; Seavitt, John; Beaudet, Arthur L; Dickinson, Mary E; Herault, Yann; Wurst, Wolfgang; de Angelis, Martin Hrabe; Lloyd, K C Kent; Flenniken, Ann M; Nutter, Lauryl M J; Newbigging, Susan; McKerlie, Colin; Justice, Monica J; Murray, Stephen A; Svenson, Karen L; Braun, Robert E; White, Jacqueline K; Bradley, Allan; Flicek, Paul; Wells, Sara; Skarnes, William C; Adams, David J; Parkinson, Helen; Mallon, Ann-Marie; Brown, Steve D M; Smedley, Damian

    2017-08-01

    Although next-generation sequencing has revolutionized the ability to associate variants with human diseases, diagnostic rates and development of new therapies are still limited by a lack of knowledge of the functions and pathobiological mechanisms of most genes. To address this challenge, the International Mouse Phenotyping Consortium is creating a genome- and phenome-wide catalog of gene function by characterizing new knockout-mouse strains across diverse biological systems through a broad set of standardized phenotyping tests. All mice will be readily available to the biomedical community. Analyzing the first 3,328 genes identified models for 360 diseases, including the first models, to our knowledge, for type C Bernard-Soulier, Bardet-Biedl-5 and Gordon Holmes syndromes. 90% of our phenotype annotations were novel, providing functional evidence for 1,092 genes and candidates in genetically uncharacterized diseases including arrhythmogenic right ventricular dysplasia 3. Finally, we describe our role in variant functional validation with The 100,000 Genomes Project and others.

  13. The Application of Computer-Aided Discovery to Spacecraft Site Selection

    Science.gov (United States)

    Pankratius, V.; Blair, D. M.; Gowanlock, M.; Herring, T.

    2015-12-01

    The selection of landing and exploration sites for interplanetary robotic or human missions is a complex task. Historically it has been labor-intensive, with large groups of scientists manually interpreting a planetary surface across a variety of datasets to identify potential sites based on science and engineering constraints. This search process can be lengthy, and excellent sites may get overlooked when the aggregate value of site selection criteria is non-obvious or non-intuitive. As planetary data collection leads to Big Data repositories and a growing set of selection criteria, scientists will face a combinatorial search space explosion that requires scalable, automated assistance. We are currently exploring more general computer-aided discovery techniques in the context of planetary surface deformation phenomena that can lend themselves to application in the landing site search problem. In particular, we are developing a general software framework that addresses key difficulties: characterizing a given phenomenon or site based on data gathered from multiple instruments (e.g. radar interferometry, gravity, thermal maps, or GPS time series), and examining a variety of possible workflows whose individual configurations are optimized to isolate different features. The framework allows algorithmic pipelines and hypothesized models to be perturbed or permuted automatically within well-defined bounds established by the scientist. For example, even simple choices for outlier and noise handling or data interpolation can drastically affect the detectability of certain features. These techniques aim to automate repetitive tasks that scientists routinely perform in exploratory analysis, and make them more efficient and scalable by executing them in parallel in the cloud. We also explore ways in which machine learning can be combined with human feedback to prune the search space and converge to desirable results. Acknowledgements: We acknowledge support from NASA AIST

  14. Application of multiple statistical tests to enhance mass spectrometry-based biomarker discovery

    Directory of Open Access Journals (Sweden)

    Garner Harold R

    2009-05-01

    Background: Mass spectrometry-based biomarker discovery has long been hampered by the difficulty in reconciling lists of discriminatory peaks identified by different laboratories for the same diseases studied. We describe a multi-statistical analysis procedure that combines several independent computational methods. This approach capitalizes on the strengths of each to analyze the same high-resolution mass spectral data set to discover consensus differential mass peaks that should be robust biomarkers for distinguishing between disease states. Results: The proposed methodology was applied to a pilot narcolepsy study using logistic regression, hierarchical clustering, t-test, and CART. Consensus, differential mass peaks with high predictive power were identified across three of the four statistical platforms. Based on the diagnostic accuracy measures investigated, the performance of the consensus-peak model was a compromise between logistic regression and CART, which produced better models than hierarchical clustering and t-test. However, consensus peaks confer a higher level of confidence in their ability to distinguish between disease states since they do not represent peaks that are a result of biases to a particular statistical algorithm. Instead, they were selected as differential across differing data distribution assumptions, demonstrating their true discriminatory potential. Conclusion: The methodology described here is applicable to any high-resolution MALDI mass spectrometry-derived data set with minimal mass drift, which is essential for peak-to-peak comparison studies. Four statistical approaches with differing data distribution assumptions were applied to the same raw data set to obtain consensus peaks that were found to be statistically differential between the two groups compared. These consensus peaks demonstrated high diagnostic accuracy when used to form a predictive model as evaluated by receiver operating characteristics
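
    The consensus idea in the abstract above (keep peaks flagged by at least three of four statistical methods) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `consensus_peaks`, the `min_methods` parameter, and all m/z values are hypothetical, and peaks are assumed to match exactly across methods.

```python
from collections import Counter


def consensus_peaks(selections, min_methods=3):
    """Return peaks flagged by at least `min_methods` of the methods.

    `selections` maps a method name to the peaks that method flagged.
    Peaks are compared by exact equality, which is an assumption; real
    spectra would need tolerance-based matching.
    """
    counts = Counter(
        peak for peaks in selections.values() for peak in set(peaks)
    )
    return sorted(peak for peak, n in counts.items() if n >= min_methods)


# Hypothetical m/z values, invented purely for illustration.
selections = {
    "logistic_regression": [1012.5, 2045.1, 3310.7],
    "hierarchical_clustering": [1012.5, 2780.2],
    "t_test": [1012.5, 2045.1, 4100.9],
    "cart": [2045.1, 3310.7, 1012.5],
}

print(consensus_peaks(selections))  # [1012.5, 2045.1]
```

    Exact matching is only reasonable when mass drift is minimal, which the abstract names as a precondition for peak-to-peak comparison.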

  15. High capacity hydrogen storage materials: attributes for automotive applications and techniques for materials discovery.

    Science.gov (United States)

    Yang, Jun; Sudik, Andrea; Wolverton, Christopher; Siegel, Donald J

    2010-02-01

    Widespread adoption of hydrogen as a vehicular fuel depends critically upon the ability to store hydrogen on-board at high volumetric and gravimetric densities, as well as on the ability to extract/insert it at sufficiently rapid rates. As current storage methods based on physical means--high-pressure gas or (cryogenic) liquefaction--are unlikely to satisfy targets for performance and cost, a global research effort focusing on the development of chemical means for storing hydrogen in condensed phases has recently emerged. At present, no known material exhibits a combination of properties that would enable high-volume automotive applications. Thus new materials with improved performance, or new approaches to the synthesis and/or processing of existing materials, are highly desirable. In this critical review we provide a practical introduction to the field of hydrogen storage materials research, with an emphasis on (i) the properties necessary for a viable storage material, (ii) the computational and experimental techniques commonly employed in determining these attributes, and (iii) the classes of materials being pursued as candidate storage compounds. Starting from the general requirements of a fuel cell vehicle, we summarize how these requirements translate into desired characteristics for the hydrogen storage material. Key amongst these are: (a) high gravimetric and volumetric hydrogen density, (b) thermodynamics that allow for reversible hydrogen uptake/release under near-ambient conditions, and (c) fast reaction kinetics. To further illustrate these attributes, the four major classes of candidate storage materials--conventional metal hydrides, chemical hydrides, complex hydrides, and sorbent systems--are introduced and their respective performance and prospects for improvement in each of these areas is discussed. Finally, we review the most valuable experimental and computational techniques for determining these attributes, highlighting how an approach that

  16. Multi-class computational evolution: development, benchmark evaluation and application to RNA-Seq biomarker discovery.

    Science.gov (United States)

    Crabtree, Nathaniel M; Moore, Jason H; Bowyer, John F; George, Nysia I

    2017-01-01

    A computational evolution system (CES) is a knowledge discovery engine that can identify subtle, synergistic relationships in large datasets. Pareto optimization allows CESs to balance accuracy with model complexity when evolving classifiers. Using Pareto optimization, a CES is able to identify a very small number of features while maintaining high classification accuracy. A CES can be designed for various types of data, and the user can exploit expert knowledge about the classification problem in order to improve discrimination between classes. These characteristics give CES an advantage over other classification and feature selection algorithms, particularly when the goal is to identify a small number of highly relevant, non-redundant biomarkers. Previously, CESs have been developed only for binary class datasets. In this study, we developed a multi-class CES. The multi-class CES was compared to three common feature selection and classification algorithms: support vector machine (SVM), random k-nearest neighbor (RKNN), and random forest (RF). The algorithms were evaluated on three distinct multi-class RNA sequencing datasets. The comparison criteria were run-time, classification accuracy, number of selected features, and stability of selected feature set (as measured by the Tanimoto distance). The performance of each algorithm was data-dependent. CES performed best on the dataset with the smallest sample size, indicating that CES has a unique advantage since the accuracy of most classification methods suffer when sample size is small. The multi-class extension of CES increases the appeal of its application to complex, multi-class datasets in order to identify important biomarkers and features.
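
    The abstract above measures feature-set stability with the Tanimoto distance but does not give the formula. A common set-based form is one minus the ratio of shared features to all features; the sketch below assumes that form, and the function name and gene labels are hypothetical.

```python
def tanimoto_distance(a, b):
    """Set-based Tanimoto distance: 1 - |A ∩ B| / |A ∪ B|.

    Returns 0.0 for identical sets and 1.0 for disjoint sets. Two empty
    sets are treated as identical (distance 0.0), which is a convention
    chosen here rather than something stated in the source.
    """
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical feature sets selected in two runs of the same algorithm.
run_1 = {"GENE_A", "GENE_B", "GENE_C"}
run_2 = {"GENE_B", "GENE_C", "GENE_D"}

print(tanimoto_distance(run_1, run_2))  # 0.5
```

    Under this form, a lower distance between runs means the selected feature set is more stable.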

  17. Gene discovery in the threatened elkhorn coral: 454 sequencing of the Acropora palmata transcriptome.

    Directory of Open Access Journals (Sweden)

    Nicholas R Polato

    BACKGROUND: Cnidarians, including corals and anemones, offer unique insights into metazoan evolution because they harbor genetic similarities with vertebrates beyond that found in model invertebrates and retain genes known only from non-metazoans. Cataloging genes expressed in Acropora palmata, a foundation species of reefs in the Caribbean and western Atlantic, will advance our understanding of the genetic basis of ecologically important traits in corals and comes at a time when sequencing efforts in other cnidarians allow for multi-species comparisons. RESULTS: A cDNA library from a sample enriched for symbiont-free larval tissue was sequenced on the 454 GS-FLX platform. Over 960,000 reads were obtained and assembled into 42,630 contigs. Annotation data was acquired for 57% of the assembled sequences. Analysis of the assembled sequences indicated that 83-100% of all A. palmata transcripts were tagged, and provided a rough estimate of the total number of genes expressed in our samples (~18,000-20,000). The coral annotation data contained many of the same molecular components as in the Bilateria, particularly in pathways associated with oxidative stress and DNA damage repair, and provided evidence that homologs of p53, a key player in DNA repair pathways, have experienced selection along the branch separating Cnidaria and Bilateria. Transcriptome-wide screens of paralog groups and transition/transversion ratios highlighted genes including green fluorescent proteins, carbonic anhydrase, and oxidative stress proteins, and functional groups involved in protein and nucleic acid metabolism and the formation of structural molecules. These results provide a starting point for study of adaptive evolution in corals. CONCLUSIONS: Currently available transcriptome data now make comparative studies of the mechanisms underlying coral's evolutionary success possible. Here we identified candidate genes that enable corals to maintain genomic integrity despite

  18. Gene expression and epigenetic discovery screen reveal methylation of SFRP2 in prostate cancer.

    LENUS (Irish Health Repository)

    Perry, Antoinette S

    2013-04-15

    Aberrant activation of Wnts is common in human cancers, including prostate. Hypermethylation associated transcriptional silencing of Wnt antagonist genes SFRPs (Secreted Frizzled-Related Proteins) is a frequent oncogenic event. The significance of this is not known in prostate cancer. The objectives of our study were to (i) profile Wnt signaling related gene expression and (ii) investigate methylation of Wnt antagonist genes in prostate cancer. Using TaqMan Low Density Arrays, we identified 15 Wnt signaling related genes with significantly altered expression in prostate cancer; the majority of which were upregulated in tumors. Notably, histologically benign tissue from men with prostate cancer appeared more similar to tumor (r = 0.76) than to benign prostatic hyperplasia (BPH; r = 0.57, p < 0.001). Overall, the expression profile was highly similar between tumors of high (≥ 7) and low (≤ 6) Gleason scores. Pharmacological demethylation of PC-3 cells with 5-Aza-CdR reactivated 39 genes (≥ 2-fold); 40% of which inhibit Wnt signaling. Methylation frequencies in prostate cancer were 10% (2/20) (SFRP1), 64.86% (48/74) (SFRP2), 0% (0/20) (SFRP4) and 60% (12/20) (SFRP5). SFRP2 methylation was detected at significantly lower frequencies in high-grade prostatic intraepithelial neoplasia (HGPIN; 30%, (6/20), p = 0.0096), tumor adjacent benign areas (8.82%, (7/69), p < 0.0001) and BPH (11.43% (4/35), p < 0.0001). The quantitative level of SFRP2 methylation (normalized index of methylation) was also significantly higher in tumors (116) than in the other samples (HGPIN = 7.45, HB = 0.47, and BPH = 0.12). We show that SFRP2 hypermethylation is a common event in prostate cancer. SFRP2 methylation in combination with other epigenetic markers may be a useful biomarker of prostate cancer.
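
    The methylation frequencies quoted above are simple ratios of methylated samples to samples tested, so they can be recomputed from the counts printed in the abstract. The sketch below does only that arithmetic; the helper name `percent` and the dictionary labels are illustrative, not taken from the paper.

```python
def percent(methylated, total):
    """Methylation frequency as a percentage, rounded to two decimals."""
    return round(100.0 * methylated / total, 2)


# (methylated, total) counts exactly as printed in the abstract.
counts = {
    "SFRP1 tumor": (2, 20),
    "SFRP2 tumor": (48, 74),
    "SFRP4 tumor": (0, 20),
    "SFRP5 tumor": (12, 20),
    "SFRP2 HGPIN": (6, 20),
    "SFRP2 tumor-adjacent benign": (7, 69),
    "SFRP2 BPH": (4, 35),
}

for label, (k, n) in counts.items():
    print(f"{label}: {k}/{n} = {percent(k, n)}%")
```

    Six of the seven printed percentages are reproduced by their counts. The exception is the tumor-adjacent benign group: 7/69 evaluates to about 10.14%, not the 8.82% given, and the abstract does not indicate whether the count or the percentage is in error.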

  19. Leveraging a Sturge-Weber Gene Discovery: An Agenda for Future Research.

    Science.gov (United States)

    Comi, Anne M; Sahin, Mustafa; Hammill, Adrienne; Kaplan, Emma H; Juhász, Csaba; North, Paula; Ball, Karen L; Levin, Alex V; Cohen, Bernard; Morris, Jill; Lo, Warren; Roach, E Steve

    2016-05-01

    Sturge-Weber syndrome (SWS) is a vascular neurocutaneous disorder that results from a somatic mosaic mutation in GNAQ, which is also responsible for isolated port-wine birthmarks. Infants with SWS are born with a cutaneous capillary malformation (port-wine birthmark) of the forehead or upper eyelid which can signal an increased risk of brain and/or eye involvement prior to the onset of specific symptoms. This symptom-free interval represents a time when a targeted intervention could help to minimize the neurological and ophthalmologic manifestations of the disorder. This paper summarizes a 2015 SWS workshop in Bethesda, Maryland that was sponsored by the National Institutes of Health. Meeting attendees included a diverse group of clinical and translational researchers with a goal of establishing research priorities for the next few years. The initial portion of the meeting included a thorough review of the recent genetic discovery and what is known of the pathogenesis of SWS. Breakout sessions related to neurology, dermatology, and ophthalmology aimed to establish SWS research priorities in each field. Key priorities for future development include the need for clinical consensus guidelines, further work to develop a clinical trial network, improvement of tissue banking for research purposes, and the need for multiple animal and cell culture models of SWS.

  20. The need for operating guidelines and a decision making framework applicable to the discovery of non-intelligent extraterrestrial life

    Science.gov (United States)

    Race, Margaret S.; Randolph, Richard O.

    While formal principles have been adopted for the eventuality of detecting intelligent life in our galaxy (SETI Principles), no such guidelines exist for the discovery of non-intelligent extraterrestrial life within the solar system. Current scientifically based planetary protection policies for solar system exploration address how to undertake exploration, but do not provide clear guidance on what to do if and when life is detected. Considering that martian life could be detected under several different robotic and human exploration scenarios in the coming decades, it is appropriate to anticipate how detection of non-intelligent, microbial life could impact future exploration missions and activities, especially on Mars. This paper discusses a proposed set of interim guidelines based loosely on the SETI Principles and addresses issues extending from the time of discovery through future handling and treatment of extraterrestrial life on Mars or elsewhere. Based on an analysis of both scientific and ethical considerations, there is a clear need for developing operating protocols applicable at the time of discovery and a decision making framework that anticipates future missions and activities, both robotic and human. There is growing scientific confidence that the discovery of extraterrestrial life in some form is nearly inevitable. If and when life is discovered beyond Earth, non-scientific dimensions may strongly influence decisions about the nature and scope of future missions and activities. It is appropriate to encourage international discussion and consideration of the issues prior to an event of such historical significance.

  1. Gene Therapy Applications in Gastroenterology and Hepatology

    Directory of Open Access Journals (Sweden)

    Catherine H Wu

    2000-01-01

    Advantages and disadvantages of viral vectors and nonviral vectors for gene delivery to digestive organs are reviewed. Advances in systems for the introduction of new gene expression are described, including self-deleting retroviral transfer vectors, chimeric viruses and chimeric oligonucleotides. Systems for inhibition of gene expression are discussed, including antisense oligonucleotides, ribozymes and dominant-negative genes.

  2. Gene discovery for improvement of kernel quality-related traits in maize

    Directory of Open Access Journals (Sweden)

    Motto M.

    2010-01-01

    Developing maize plants with improved kernel quality traits involves the ability to use existing genetic variation and to identify and manipulate commercially important genes. This will open avenues for designing novel variation in grain composition and will provide the basis for the development of the next generation of specialty maize. This paper provides an overview of current knowledge on the identification and exploitation of genes affecting the composition, development, and structure of the maize kernel with particular emphasis on pathways relevant to endosperm growth and development, differentiation of starch-filled cells, and biosynthesis of starches, storage proteins, lipids, and carotenoids. The potential that the new technologies of cell and molecular biology will provide for the creation of new variation in the future is also indicated and discussed.

  3. Discovery of Nuclear-Encoded Genes for the Neurotoxin Saxitoxin in Dinoflagellates

    OpenAIRE

    Anke Stüken; Orr, Russell J. S.; Ralf Kellmann; Murray, Shauna A.; Neilan, Brett A.; Kjetill S Jakobsen

    2011-01-01

    Saxitoxin is a potent neurotoxin that occurs in aquatic environments worldwide. Ingestion of vector species can lead to paralytic shellfish poisoning, a severe human illness that may lead to paralysis and death. In freshwaters, the toxin is produced by prokaryotic cyanobacteria; in marine waters, it is associated with eukaryotic dinoflagellates. However, several studies suggest that saxitoxin is not produced by dinoflagellates themselves, but by co-cultured bacteria. Here, we show that genes ...

  4. Discovery and functional assessment of gene variants in the vascular endothelial growth factor pathway

    OpenAIRE

    Paré-Brunet, Laia; Glubb, Dylan; Evans, Patrick; Berenguer-Llergo, Antoni; Etheridge, Amy S.; Skol, Andrew D.; Di Rienzo, Anna; Duan, Shiwei; Gamazon, Eric R.; Innocenti, Federico

    2013-01-01

    Angiogenesis is a host-mediated mechanism in disease pathophysiology. The vascular endothelial growth factor (VEGF) pathway is a major determinant of angiogenesis, and a comprehensive annotation of the functional variation in this pathway is essential to understand the genetic basis of angiogenesis-related diseases. We assessed the allelic heterogeneity of gene expression, population specificity of cis expression quantitative trait loci (eQTLs), and eQTL function in luciferase assays in CEU a...

  5. Gene discovery in an invasive tephritid model pest species, the Mediterranean fruit fly, Ceratitis capitata

    Science.gov (United States)

    Gomulski, Ludvik M; Dimopoulos, George; Xi, Zhiyong; Soares, Marcelo B; Bonaldo, Maria F; Malacrida, Anna R; Gasperi, Giuliano

    2008-01-01

    Background: The medfly, Ceratitis capitata, is a highly invasive agricultural pest that has become a model insect for the development of biological control programs. Despite research into the behavior and classical and population genetics of this organism, the quantity of sequence data available is limited. We have utilized an expressed sequence tag (EST) approach to obtain detailed information on transcriptome signatures that relate to a variety of physiological systems in the medfly; this information emphasizes reproduction, sex determination, and chemosensory perception, since the study was based on normalized cDNA libraries from embryos and adult heads. Results: A total of 21,253 high-quality ESTs were obtained from the embryo and head libraries. Clustering analyses performed separately for each library resulted in 5201 embryo and 6684 head transcripts. Considering an estimated 19% overlap in the transcriptomes of the two libraries, they represent about 9614 unique transcripts involved in a wide range of biological processes and molecular functions. Of particular interest are the sequences that share homology with Drosophila genes involved in sex determination, olfaction, and reproductive behavior. The medfly transformer2 (tra2) homolog was identified among the embryonic sequences, and its genomic organization and expression were characterized. Conclusion: The sequences obtained in this study represent the first major dataset of expressed genes in a tephritid species of agricultural importance. This resource provides essential information to support the investigation of numerous questions regarding the biology of the medfly and other related species and also constitutes an invaluable tool for the annotation of complete genome sequences. Our study has revealed intriguing findings regarding the transcript regulation of tra2 and other sex determination genes, as well as insights into the comparative genomics of genes implicated in chemosensory reception and
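
    The transcript counts in the record above can be checked with a short calculation. This is only a sketch: it assumes the 19% overlap is expressed as a fraction of the combined embryo and head transcript counts, which the abstract does not state, and the function name is hypothetical.

```python
def unique_transcripts(count_a: int, count_b: int, overlap_fraction: float) -> float:
    """Estimate unique transcripts across two libraries.

    Assumption (not stated in the abstract): overlap_fraction is the share
    of the combined count that is redundant between the two libraries.
    """
    return (count_a + count_b) * (1.0 - overlap_fraction)


# Figures quoted in the abstract.
embryo_transcripts = 5201
head_transcripts = 6684
reported_unique = 9614

combined = embryo_transcripts + head_transcripts  # 11885
# Overlap implied by the reported unique count, under the same assumption.
implied_overlap = 1.0 - reported_unique / combined
# Unique count obtained from a rounded 19% overlap.
estimate = unique_transcripts(embryo_transcripts, head_transcripts, 0.19)

print(f"combined transcripts: {combined}")
print(f"implied overlap: {implied_overlap:.3f}")
print(f"estimate from 19% overlap: {estimate:.0f}")
```

    Under this reading the implied overlap is slightly above 19%, so the reported figure of about 9614 is consistent with a rounded percentage.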

  6. Discovery of nuclear-encoded genes for the neurotoxin saxitoxin in dinoflagellates.

    Science.gov (United States)

    Stüken, Anke; Orr, Russell J S; Kellmann, Ralf; Murray, Shauna A; Neilan, Brett A; Jakobsen, Kjetill S

    2011-01-01

    Saxitoxin is a potent neurotoxin that occurs in aquatic environments worldwide. Ingestion of vector species can lead to paralytic shellfish poisoning, a severe human illness that may lead to paralysis and death. In freshwaters, the toxin is produced by prokaryotic cyanobacteria; in marine waters, it is associated with eukaryotic dinoflagellates. However, several studies suggest that saxitoxin is not produced by dinoflagellates themselves, but by co-cultured bacteria. Here, we show that genes required for saxitoxin synthesis are encoded in the nuclear genomes of dinoflagellates. We sequenced >1.2×10^6 mRNA transcripts from the two saxitoxin-producing dinoflagellate strains Alexandrium fundyense CCMP1719 and A. minutum CCMP113 using high-throughput sequencing technology. In addition, we used in silico transcriptome analyses, RACE, qPCR and conventional PCR coupled with Sanger sequencing. These approaches successfully identified genes required for saxitoxin synthesis in the two transcriptomes. We focused on sxtA, the unique starting gene of saxitoxin synthesis, and show that the dinoflagellate transcripts of sxtA have the same domain structure as the cyanobacterial sxtA genes. But, in contrast to the bacterial homologs, the dinoflagellate transcripts are monocistronic, have a higher GC content, occur in multiple copies, and contain typical dinoflagellate spliced-leader sequences and eukaryotic polyA-tails. Further, we investigated 28 saxitoxin-producing and non-producing dinoflagellate strains from six different genera for the presence of genomic sxtA homologs. Our results show very good agreement between the presence of sxtA and saxitoxin synthesis, except in three strains of A. tamarense, for which we amplified sxtA, but did not detect the toxin. Our work opens possibilities to develop molecular tools to detect saxitoxin-producing dinoflagellates in the environment.

  7. Use of eQTL Analysis for the Discovery of Target Genes Identified by GWAS

    Science.gov (United States)

    2014-04-01

    candidate genes for existing prostate cancer (PC) risk-single nucleotide polymorphisms (SNPs) that could then be followed up in future studies. To accomplish...a radical prostatectomy at Mayo Clinic and were available to investigators through the Prostate Cancer SPORE. Typically, one to three pieces of...916 cases re-examined, 93 cases met the criteria above, but also contained Benign Prostatic Hyperplasia (BPH), seminal vesicle, urethra, or adjacent

  8. Discovery of inhibitors of aberrant gene transcription from Libraries of DNA binding molecules: inhibition of LEF-1-mediated gene transcription and oncogenic transformation.

    Science.gov (United States)

    Stover, James S; Shi, Jin; Jin, Wei; Vogt, Peter K; Boger, Dale L

    2009-03-11

    The screening of a >9000 compound library of synthetic DNA binding molecules for selective binding to the consensus sequence of the transcription factor LEF-1 followed by assessment of the candidate compounds in a series of assays that characterized functional activity (disruption of DNA-LEF-1 binding) at the intended target and site (inhibition of intracellular LEF-1-mediated gene transcription) resulting in a desired phenotypic cellular change (inhibit LEF-1-driven cell transformation) provided two lead compounds: lefmycin-1 and lefmycin-2. The sequence of screens defining the approach assures that activity in the final functional assay may be directly related to the inhibition of gene transcription and DNA binding properties of the identified molecules. Central to the implementation of this generalized approach to the discovery of DNA binding small molecule inhibitors of gene transcription was (1) the use of a technically nondemanding fluorescent intercalator displacement (FID) assay for initial assessment of the DNA binding affinity and selectivity of a library of compounds for any sequence of interest, and (2) the technology used to prepare a sufficiently large library of DNA binding compounds.

  9. Marine metagenomics: strategies for the discovery of novel enzymes with biotechnological applications from marine environments

    Directory of Open Access Journals (Sweden)

    Dobson Alan DW

    2008-08-01

    Metagenomic based strategies have previously been successfully employed as powerful tools to isolate and identify enzymes with novel biocatalytic activities from the unculturable component of microbial communities from various terrestrial environmental niches. Both sequence based and function based screening approaches have been employed to identify genes encoding novel biocatalytic activities and metabolic pathways from metagenomic libraries. While much of the focus to date has centred on terrestrial based microbial ecosystems, it is clear that the marine environment has enormous microbial biodiversity that remains largely unstudied. Marine microbes are both extremely abundant and diverse; the environments they occupy likewise consist of very diverse niches. As culture-dependent methods have thus far resulted in the isolation of only a tiny percentage of the marine microbiota, the application of metagenomic strategies holds great potential to study and exploit the enormous microbial biodiversity which is present within these marine environments.

  10. A comprehensive resource of drought- and salinity- responsive ESTs for gene discovery and marker development in chickpea (Cicer arietinum L.).

    Science.gov (United States)

    Varshney, Rajeev K; Hiremath, Pavana J; Lekha, Pazhamala; Kashiwagi, Junichi; Balaji, Jayashree; Deokar, Amit A; Vadez, Vincent; Xiao, Yongli; Srinivasan, Ramamurthy; Gaur, Pooran M; Siddique, Kadambot Hm; Town, Christopher D; Hoisington, David A

    2009-11-15

    Chickpea (Cicer arietinum L.), an important grain legume crop of the world, is seriously challenged by terminal drought and salinity stresses. However, only a very limited number of molecular markers and candidate genes is available for undertaking molecular breeding in chickpea to tackle these stresses. This study reports the generation and analysis of a comprehensive resource of drought- and salinity-responsive expressed sequence tags (ESTs) and gene-based markers. A total of 20,162 (18,435 high quality) drought- and salinity-responsive ESTs were generated from ten different root tissue cDNA libraries of chickpea. Sequence editing, clustering and assembly analysis resulted in 6,404 unigenes (1,590 contigs and 4,814 singletons). Functional annotation of unigenes based on BLASTX analysis showed that 46.3% (2,965) had significant similarity (chickpea. Of 2,965 (46.3%) significant unigenes, only 2,071 (32.3%) unigenes could be functionally categorised according to Gene Ontology (GO) descriptions. A total of 2,029 sequences containing 3,728 simple sequence repeats (SSRs) were identified and 177 new EST-SSR markers were developed. Experimental validation of a set of 77 SSR markers on 24 genotypes revealed 230 alleles with an average of 4.6 alleles per marker and average polymorphism information content (PIC) value of 0.43. Besides SSR markers, 21,405 high confidence single nucleotide polymorphisms (SNPs) in 742 contigs (with ≥ 5 ESTs) were also identified. Recognition sites for restriction enzymes were identified for 7,884 SNPs in 240 contigs. Hierarchical clustering of 105 selected contigs provided clues about stress-responsive candidate genes and their expression profile showed predominance in specific stress-challenged libraries. The generated set of chickpea ESTs serves as a resource of high quality transcripts for gene discovery and development of functional markers associated with abiotic stress tolerance that will be helpful to facilitate chickpea breeding. Mapping of
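
    The polymorphism information content (PIC) reported for the SSR markers above is a per-marker statistic computed from allele frequencies. The sketch below uses the commonly cited formula PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2; the abstract does not say which formula or software the authors used, so this is an assumption, and the allele frequencies are made up for illustration.

```python
from itertools import combinations
from typing import Sequence


def pic(allele_freqs: Sequence[float]) -> float:
    """Polymorphism information content for one marker.

    allele_freqs: frequencies of each allele at the marker; must sum to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Sum of squared frequencies (expected homozygosity).
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction term over all unordered pairs of distinct alleles.
    pair_term = sum(2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - pair_term


# Hypothetical markers: one with two equally common alleles,
# one with four equally common alleles, and one monomorphic marker.
two_allele_marker = pic([0.5, 0.5])
four_allele_marker = pic([0.25, 0.25, 0.25, 0.25])
monomorphic_marker = pic([1.0])
print(two_allele_marker, four_allele_marker, monomorphic_marker)
```

    A marker with more, evenly distributed alleles gives a higher PIC, and a monomorphic marker gives zero, which is why an average PIC is reported alongside the average allele count.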

  11. ShrimpGPAT: a gene and protein annotation tool for knowledge sharing and gene discovery in shrimp.

    Science.gov (United States)

    Korshkari, Parpakron; Vaiwsri, Sirintra; Flegel, Timothy W; Ngamsuriyaroj, Sudsanguan; Sonthayanon, Burachai; Prachumwat, Anuphap

    2014-06-21

    Although captured and cultivated marine shrimp constitute highly important seafood in terms of both economic value and production quantity, biologists have little knowledge of the shrimp genome and this partly hinders their ability to improve shrimp aquaculture. To help improve this situation, the Shrimp Gene and Protein Annotation Tool (ShrimpGPAT) was conceived as a community-based annotation platform for the acquisition and updating of full-length complementary DNAs (cDNAs), Expressed Sequence Tags (ESTs), transcript contigs and protein sequences of penaeid shrimp and their decapod relatives and for in-silico functional annotation and sequence analysis. ShrimpGPAT currently holds quality-filtered, molecular sequences of 14 decapod species (~500,000 records for six penaeid shrimp and eight other decapods). The database predominantly comprises transcript sequences derived by both traditional EST Sanger sequencing and more recently by massive-parallel sequencing technologies. The analysis pipeline provides putative functions in terms of sequence homologs, gene ontologies and protein-protein interactions. Data retrieval can be conducted easily either by a keyword text search or by a sequence query via BLAST, and users can save records of interest for later investigation using tools such as multiple sequence alignment and BLAST searches against pre-defined databases. In addition, ShrimpGPAT provides space for community insights by allowing functional annotation with tags and comments on sequences. Community-contributed information will allow for continuous database enrichment, for improvement of functions and for other aspects of sequence analysis. ShrimpGPAT is a new, free and easily accessed service for the shrimp research community that provides a comprehensive and up-to-date database of quality-filtered decapod gene and protein sequences together with putative functional prediction and sequence analysis tools. An important feature is its community

  12. Functional Analysis and Discovery of Microbial Genes Transforming Metallic and Organic Pollutants: Database and Experimental Tools

    Energy Technology Data Exchange (ETDEWEB)

    Lawrence P. Wackett; Lynda B.M. Ellis

    2004-12-09

    Microbial functional genomics is faced with a burgeoning list of genes which are denoted as unknown or hypothetical for lack of any knowledge about their function. The majority of microbial genes encode enzymes. Enzymes are the catalysts of metabolism; catabolism, anabolism, stress responses, and many other cell functions. A major problem facing microbial functional genomics is proposed here to derive from the breadth of microbial metabolism, much of which remains undiscovered. The breadth of microbial metabolism has been surveyed by the PIs and represented according to reaction types on the University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD): http://umbbd.ahc.umn.edu/search/FuncGrps.html The database depicts metabolism of 49 chemical functional groups, representing most of current knowledge. Twice that number of chemical groups are proposed here to be metabolized by microbes. Thus, at least 50% of the unique biochemical reactions catalyzed by microbes remain undiscovered. This further suggests that many unknown and hypothetical genes encode functions yet undiscovered. This gap will be partly filled by the current proposal. The UM-BBD will be greatly expanded as a resource for microbial functional genomics. Computational methods will be developed to predict microbial metabolism which is not yet discovered. Moreover, a concentrated effort to discover new microbial metabolism will be conducted. The research will focus on metabolism of direct interest to DOE, dealing with the transformation of metals, metalloids, organometallics and toxic organics. This is precisely the type of metabolism which has been characterized most poorly to date. Moreover, these studies will directly impact functional genomic analysis of DOE-relevant genomes.

  13. Transcriptomics Analysis of Crassostrea hongkongensis for the Discovery of Reproduction-Related Genes.

    Directory of Open Access Journals (Sweden)

    Ying Tong

    The reproductive mechanisms of mollusk species have been interesting targets in biological research because of the diverse reproductive strategies observed in this phylum. These species have also been studied for the development of fishery technologies in molluscan aquaculture. Although the molecular mechanisms underlying the reproductive process have been well studied in animal models, the relevant information from mollusks remains limited, particularly in species of great commercial interest. Crassostrea hongkongensis is the dominant oyster species that is distributed along the coast of the South China Sea and little genomic information on this species is available. Currently, high-throughput sequencing techniques have been widely used for investigating the basis of physiological processes and facilitating the establishment of adequate genetic selection programs. The C. hongkongensis transcriptome included a total of 1,595,855 reads, which were generated by 454 sequencing and were assembled into 41,472 contigs using de novo methods. Contigs were clustered into 33,920 isotigs and further grouped into 22,829 isogroups. Approximately 77.6% of the isogroups were successfully annotated by the Nr database. More than 1,910 genes were identified as being related to reproduction. Some key genes involved in germline development, sex determination and differentiation were identified for the first time in C. hongkongensis (nanos, piwi, ATRX, FoxL2, β-catenin, etc.). Gene expression analysis indicated that vasa, nanos, piwi, ATRX, FoxL2, β-catenin and SRD5A1 were highly or specifically expressed in C. hongkongensis gonads. Additionally, 94,056 single nucleotide polymorphisms (SNPs) and 1,699 simple sequence repeats (SSRs) were compiled. Our study significantly increased C. hongkongensis genomic information based on transcriptomics analysis. The group of reproduction-related genes identified in the present study constitutes a new tool for research on bivalve

  14. Gene delivery with cationic lipids : fundamentals and potential applications

    NARCIS (Netherlands)

    Wasungu, Luc Bakomma

    2006-01-01

    Principle of gene therapy. Although the objectives and principles of gene therapy have been well-defined over the last decades, its application as a versatile, therapeutically successful approach has not yet met expectations. At the onset, the primary goal of gene therapy was to replace a deficient

  15. Gene delivery with cationic lipids : fundamentals and potential applications

    NARCIS (Netherlands)

    Wasungu, Luc Bakomma

    2006-01-01

    Principle of gene therapy. Although the objectives and principles of gene therapy have been well-defined over the last decades, its application as a versatile, therapeutically successful approach has not yet met expectations. At the onset, the primary goal of gene therapy was to replace a deficient

  16. Gene delivery with cationic lipids : fundamentals and potential applications

    NARCIS (Netherlands)

    Wasungu, L.B.

    2006-01-01

    Principle of gene therapy. Although the objectives and principles of gene therapy have been well-defined over the last decades, its application as a versatile, therapeutically successful approach has not yet met expectations. At the onset, the primary goal of gene therapy was to replace a deficient g

  17. Gene discovery and molecular marker development, based on high-throughput transcript sequencing of Paspalum dilatatum Poir.

    Directory of Open Access Journals (Sweden)

    Andrea Giordano

    BACKGROUND: Paspalum dilatatum Poir. (common name dallisgrass) is a native grass species of South America, with special relevance to dairy and red meat production. P. dilatatum exhibits higher forage quality than other C4 forage grasses and is tolerant to frost and water stress. This species is predominantly cultivated in an apomictic monoculture, with an inherent high risk that biotic and abiotic stresses could potentially devastate productivity. Therefore, advanced breeding strategies that characterise and use available genetic diversity, or assess germplasm collections effectively are required to deliver advanced cultivars for production systems. However, there are limited genomic resources available for this forage grass species. RESULTS: Transcriptome sequencing using second-generation sequencing platforms has been employed using pooled RNA from different tissues (stems, roots, leaves and inflorescences) at the final reproductive stage of P. dilatatum cultivar Primo. A total of 324,695 sequence reads were obtained, corresponding to c. 102 Mbp. The sequences were assembled, generating 20,169 contigs of a combined length of 9,336,138 nucleotides. The contigs were BLAST analysed against the fully sequenced grass species of Oryza sativa subsp. japonica, Brachypodium distachyon, the closely related Sorghum bicolor and foxtail millet (Setaria italica) genomes as well as against the UniRef 90 protein database allowing a comprehensive gene ontology analysis to be performed. The contigs generated from the transcript sequencing were also analysed for the presence of simple sequence repeats (SSRs). A total of 2,339 SSR motifs were identified within 1,989 contigs and corresponding primer pairs were designed. Empirical validation of a cohort of 96 SSRs was performed, with 34% being polymorphic between sexual and apomictic biotypes. CONCLUSIONS: The development of genetic and genomic resources for P. dilatatum will contribute to gene discovery and expression
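
    Screening contigs for simple sequence repeats, as described in the record above, can be sketched with a regular expression. The abstract does not name the SSR detection tool or its thresholds, so the motif lengths (2 to 6 bp), the minimum of four tandem copies, the function name, and the test sequence below are all assumptions chosen for illustration.

```python
import re
from typing import List, Tuple

# A motif of 2-6 bases followed by at least three further copies of itself
# (four or more tandem copies in total). The lazy quantifier makes the
# engine try the shortest motif length first at each position.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{3,}")


def find_ssrs(sequence: str) -> List[Tuple[str, int, int]]:
    """Return (motif, copy_number, start_index) for each SSR found."""
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        # Skip runs of a single base, which are homopolymers rather than
        # di- to hexanucleotide repeats.
        if len(set(motif)) == 1:
            continue
        copies = len(match.group(0)) // len(motif)
        hits.append((motif, copies, match.start()))
    return hits


# Hypothetical contig fragment containing an (AC)6 and a (CTG)5 repeat.
contig = "TTG" + "AC" * 6 + "GGA" + "CTG" * 5 + "T"
print(find_ssrs(contig))
```

    Production SSR finders typically apply different minimum copy numbers per motif length and handle imperfect repeats; this sketch only detects perfect tandem repeats.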

  18. Gene discovery in the Antarctic fur seal (Arctocephalus gazella) skin transcriptome.

    Science.gov (United States)

    Hoffman, Joseph I

    2011-07-01

    Next-generation sequencing provides a powerful new approach for developing functional genomic tools for nonmodel species, helping to narrow the gap between studies of model organisms and those of natural populations. Consequently, massively parallel 454 sequencing was used to characterize a normalized cDNA library derived from skin biopsy samples of twelve Antarctic fur seal (Arctocephalus gazella) individuals. Over 412 Mb of sequence data were generated, comprising 1.4 million reads of average length 286 bp. De novo assembly using Newbler 2.3 yielded 156 contigs plus 22,869 isotigs, which in turn clustered into 18,576 isogroups. Almost half of the assembled transcript sequences showed significant similarity to the nr database, revealing a functionally diverse array of genes. Moreover, 97.9% of these mapped to the dog (Canis lupus familiaris) genome, with a strong positive relationship between the number of sequences locating to a given chromosome and the length of that chromosome in the dog indicating a broad genomic distribution. Average depth of coverage was also almost 20-fold, sufficient to detect several thousand putative microsatellite loci and single nucleotide polymorphisms. This study constitutes an important step towards developing genomic resources with which to address consequential questions in pinniped ecology and evolution. It also supports an earlier but smaller study showing that skin tissue can be a rich source of expressed genes, with important implications for studying the genomics not only of marine mammals, but also more generally of species that cannot be destructively sampled.

  19. Gene discovery and transcript analyses in the corn smut pathogen Ustilago maydis: expressed sequence tag and genome sequence comparison

    Directory of Open Access Journals (Sweden)

    Saville Barry J

    2007-09-01

    Background: Ustilago maydis is the basidiomycete fungus responsible for common smut of corn and is a model organism for the study of fungal phytopathogenesis. To aid in the annotation of the genome sequence of this organism, several expressed sequence tag (EST) libraries were generated from a variety of U. maydis cell types. In addition to utility in the context of gene identification and structure annotation, the ESTs were analyzed to identify differentially abundant transcripts and to detect evidence of alternative splicing and anti-sense transcription. Results: Four cDNA libraries were constructed using RNA isolated from U. maydis diploid teliospores (U. maydis strains 518 × 521) and haploid cells of strain 521 grown under nutrient rich, carbon starved, and nitrogen starved conditions. Using the genome sequence as a scaffold, the 15,901 ESTs were assembled into 6,101 contiguous expressed sequences (contigs); among these, 5,482 corresponded to predicted genes in the MUMDB (MIPS Ustilago maydis database), while 619 aligned to regions of the genome not yet designated as genes in MUMDB. A comparison of EST abundance identified numerous genes that may be regulated in a cell type or starvation-specific manner. The transcriptional response to nitrogen starvation was assessed using RT-qPCR. The results of this suggest that there may be cross-talk between the nitrogen and carbon signalling pathways in U. maydis. Bioinformatic analysis identified numerous examples of alternative splicing and anti-sense transcription. While intron retention was the predominant form of alternative splicing in U. maydis, other varieties were also evident (e.g. exon skipping). Selected instances of both alternative splicing and anti-sense transcription were independently confirmed using RT-PCR. Conclusion: Through this work: 1) substantial sequence information has been provided for U. maydis genome annotation; 2) new genes were identified through the discovery of 619

  20. Discovery and characterization of a novel CCND1/MRCK gene fusion in mantle cell lymphoma

    Directory of Open Access Journals (Sweden)

    Chioniso Patience Masamha

    2016-03-01

    The t(11;14) translocation resulting in constitutive cyclin D1 expression is an early event in mantle cell lymphoma (MCL) transformation. Patients with a highly proliferative phenotype produce cyclin D1 transcripts with truncated 3′UTRs that evade miRNA regulation. Here, we report the recurrence of a novel gene fusion in MCL cell lines and MCL patient isolates that consists of the full protein coding region of cyclin D1 (CCND1) and a 3′UTR consisting of sequences from both the CCND1 3′UTR and myotonic dystrophy kinase-related Cdc42-binding kinase's (MRCK) intron one. The resulting CCND1/MRCK mRNA is resistant to CCND1-targeted miRNA regulation, and targeting the MRCK region of the chimeric 3′UTR with siRNA results in decreased CCND1 levels.

  1. Discovery of Gene Sources for Economic Traits in Hanwoo by Whole-genome Resequencing

    Directory of Open Access Journals (Sweden)

    Younhee Shin

    2016-09-01

    Hanwoo, a Korean native cattle (Bos taurus coreana), has great economic value due to high meat quality. Also, the breed has genetic variations that are associated with production traits such as health, disease resistance, reproduction, growth as well as carcass quality. In this study, next generation sequencing technologies and the availability of an appropriate reference genome were applied to discover a large amount of single nucleotide polymorphisms (SNPs) in ten Hanwoo bulls. Analysis of whole-genome resequencing generated a total of 26.5 Gb data, of which 594,716,859 and 592,990,750 reads covered 98.73% and 93.79% of the bovine reference genomes of UMD 3.1 and Btau 4.6.1, respectively. In total, 2,473,884 and 2,402,997 putative SNPs were discovered, of which 1,095,922 (44.3%) and 982,674 (40.9%) novel SNPs were discovered against UMD 3.1 and Btau 4.6.1, respectively. Among the SNPs, the 46,301 (UMD 3.1) and 28,613 (Btau 4.6.1) SNPs that were identified as Hanwoo-specific SNPs were included in the functional genes that may be involved in the mechanisms of milk production, tenderness, juiciness, marbling of Hanwoo beef and yellow hair. Most of the Hanwoo-specific SNPs were identified in the promoter region, suggesting that the SNPs influence differential expression of the regulated genes relative to the relevant traits. In particular, the non-synonymous (ns) SNPs found in CORIN, which is a negative regulator of Agouti, might be a causal variant to determine yellow hair of Hanwoo. Our results will provide abundant genetic sources of variation to characterize Hanwoo genetics and for subsequent breeding.

  2. Discovery and characterization of the first genuine avian leptin gene in the rock dove (Columba livia).

    Science.gov (United States)

    Friedman-Einat, Miriam; Cogburn, Larry A; Yosefi, Sara; Hen, Gideon; Shinder, Dmitry; Shirak, Andrey; Seroussi, Eyal

    2014-09-01

    Leptin, the key regulator of mammalian energy balance, has been at the center of a great controversy in avian biology for the last 15 years since initial reports of a putative leptin gene (LEP) in chickens. Here, we characterize a novel LEP in rock dove (Columba livia) with low similarity of the predicted protein sequence (30% identity, 47% similarity) to the human ortholog. Searching the Sequence-Read-Archive database revealed leptin transcripts, in the dove's liver, with 2 noncoding exons preceding 2 coding exons. This unusual 4-exon structure was validated by sequencing of a GC-rich product (76% GC, 721 bp) amplified from liver RNA by RT-PCR. Sequence alignment of the dove leptin with orthologous leptins indicated that it consists of a leader peptide (21 amino acids; aa) followed by the mature protein (160 aa), which has a putative structure typical of 4-helical-bundle cytokines except that it is 12 aa longer than human leptin. Extra residues (10 aa) were located within the loop between 2 5'-helices, interrupting the amino acid motif that is conserved in tetrapods and considered essential for activation of leptin receptor (LEPR) but not for receptor binding per se. Quantitative RT-PCR of 11 tissues showed highest (P < .05) expression of LEP in the dove's liver, whereas the dove LEPR peaked (P < .01) in the pituitary. Both genes were prominently expressed in the gonads and at lower levels in tissues involved in mammalian leptin signaling (adipose; hypothalamus). A bioassay based on activation of the chicken LEPR in vitro showed leptin activity in the dove's circulation, suggesting that dove LEP encodes an active protein, despite the interrupted loop motif. Providing tools to study energy-balance control from an evolutionary perspective, our original demonstration of leptin signaling in dove predicts a more ancient role of leptin in growth and reproduction in birds, rather than appetite control.

  3. The Effect of Discovery Learning Method Application on Increasing Students' Listening Outcome and Social Attitude

    Science.gov (United States)

    Hanafi

    2016-01-01

    The 2013 Curriculum has been introduced in schools appointed as implementers. For the English subject, this curriculum requires students to improve their skills. To achieve this, one of the suggested methods is discovery learning, since it is considered appropriate for increasing students' ability, especially to fulfill minimum…

  4. Application of Knowledge Discovery in Databases Methodologies for Predictive Models for Pregnancy Adverse Events

    Science.gov (United States)

    Taft, Laritza M.

    2010-01-01

    In its report "To Err is Human", The Institute of Medicine recommended the implementation of internal and external voluntary and mandatory automatic reporting systems to increase detection of adverse events. Knowledge Discovery in Databases (KDD) allows the detection of patterns and trends that would be hidden or less detectable if analyzed by…

  5. Genes2GO: A web application for querying gene sets for specific GO terms.

    Science.gov (United States)

    Chawla, Konika; Kuiper, Martin

    2016-01-01

    Gene ontology annotations have become an essential resource for biological interpretations of experimental findings. The process of gathering basic annotation information in tables that link gene sets with specific gene ontology terms can be cumbersome, in particular if it requires above average computer skills or bioinformatics expertise. We have therefore developed Genes2GO, an intuitive R-based web application. Genes2GO uses the biomaRt package of Bioconductor in order to retrieve custom sets of gene ontology annotations for any list of genes from organisms covered by the Ensembl database. Genes2GO produces a binary matrix file indicating, for each gene, the presence or absence of specific annotations. It should be noted that other GO tools do not offer this user-friendly access to annotations. Genes2GO is freely available and listed under http://www.semantic-systems-biology.org/tools/externaltools/.
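
    The binary matrix described above can be illustrated with a minimal sketch. This is not the Genes2GO implementation (which is R-based and uses biomaRt); it is a plain Python illustration of the output format only. The function name `binary_annotation_matrix`, the gene names, and the gene-to-term assignments are hypothetical, chosen for illustration.

```python
from typing import Dict, List, Set


def binary_annotation_matrix(
    gene_to_terms: Dict[str, Set[str]], terms: List[str]
) -> Dict[str, List[int]]:
    """Return, for each gene, a 0/1 row with one entry per requested GO term.

    A 1 means the gene carries that annotation, a 0 means it does not.
    Genes are emitted in sorted order; columns follow the order of `terms`.
    """
    return {
        gene: [1 if term in gene_to_terms[gene] else 0 for term in terms]
        for gene in sorted(gene_to_terms)
    }


# Hypothetical input: gene names and their annotation sets are made up.
annotations = {
    "geneA": {"GO:0006915", "GO:0007049"},
    "geneB": {"GO:0007049"},
    "geneC": set(),
}
columns = ["GO:0006915", "GO:0007049"]
matrix = binary_annotation_matrix(annotations, columns)

for gene, row in matrix.items():
    print(gene, *row, sep="\t")
```

    Each row is a gene and each column a requested GO term, so the result can be written out directly as a tab-separated file.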

  6. State-of-the-art human gene therapy: part II. Gene therapy strategies and clinical applications.

    Science.gov (United States)

    Wang, Dan; Gao, Guangping

    2014-09-01

    In Part I of this Review (Wang and Gao, 2014), we introduced recent advances in gene delivery technologies and explained how they have powered some of the current human gene therapy applications. In Part II, we expand the discussion on gene therapy applications, focusing on some of the most exciting clinical uses. To help readers to grasp the essence and to better organize the diverse applications, we categorize them under four gene therapy strategies: (1) gene replacement therapy for monogenic diseases, (2) gene addition for complex disorders and infectious diseases, (3) gene expression alteration targeting RNA, and (4) gene editing to introduce targeted changes in host genome. Human gene therapy started with the simple idea that replacing a faulty gene with a functional copy can cure a disease. It has been a long and bumpy road to finally translate this seemingly straightforward concept into reality. As many disease mechanisms unraveled, gene therapists have employed a gene addition strategy backed by a deep knowledge of what goes wrong in diseases and how to harness host cellular machinery to battle against diseases. Breakthroughs in other biotechnologies, such as RNA interference and genome editing by chimeric nucleases, have the potential to be integrated into gene therapy. Although clinical trials utilizing these new technologies are currently sparse, these innovations are expected to greatly broaden the scope of gene therapy in the near future.

  7. Systems biology discoveries using non-human primate pluripotent stem and germ cells: novel gene and genomic imprinting interactions as well as unique expression patterns.

    Science.gov (United States)

    Ben-Yehudah, Ahmi; Easley, Charles A; Hermann, Brian P; Castro, Carlos; Simerly, Calvin; Orwig, Kyle E; Mitalipov, Shoukhrat; Schatten, Gerald

    2010-08-05

    The study of pluripotent stem cells has generated much interest in both biology and medicine. Understanding the fundamentals of biological decisions, including what permits a cell to maintain pluripotency, that is, its ability to self-renew and thereby remain immortal, or to differentiate into multiple types of cells, is of profound importance. For clinical applications, pluripotent cells, including both embryonic stem cells and adult stem cells, have been proposed for cell replacement therapy for a number of human diseases and disorders, including Alzheimer's, Parkinson's, spinal cord injury and diabetes. One challenge in their usage for such therapies is understanding the mechanisms that allow the maintenance of pluripotency and controlling the specific differentiation into required functional target cells. Because of regulatory restrictions and biological feasibilities, there are many crucial investigations that are just impossible to perform using pluripotent stem cells (PSCs) from humans (for example, direct comparisons among panels of inbred embryonic stem cells from prime embryos obtained from pedigreed and fertile donors; genomic analysis of parent versus progeny PSCs and their identical differentiated tissues; intraspecific chimera analyses for pluripotency testing; and so on). However, PSCs from nonhuman primates are being investigated to bridge these knowledge gaps between discoveries in mice and vital information necessary for appropriate clinical evaluations. In this review, we consider the mRNAs and novel genes with unique expression and imprinting patterns that were discovered using systems biology approaches with primate pluripotent stem and germ cells.

  8. Chemiluminescent detection of sequential DNA hybridizations to high-density, filter-arrayed cDNA libraries: a subtraction method for novel gene discovery.

    Science.gov (United States)

    Guiliano, D; Ganatra, M; Ware, J; Parrot, J; Daub, J; Moran, L; Brennecke, H; Foster, J M; Supali, T; Blaxter, M; Scott, A L; Williams, S A; Slatko, B E

    1999-07-01

    A chemiluminescent approach for sequential DNA hybridizations to high-density filter arrays of cDNAs, using a biotin-based random priming method followed by a streptavidin/alkaline phosphatase/CDP-Star detection protocol, is presented. The method has been applied to the Brugia malayi genome project, wherein cDNA libraries, cosmid and bacterial artificial chromosome (BAC) libraries have been gridded at high density onto nylon filters for subsequent analysis by hybridization. Individual probes and pools of rRNA probes, ribosomal protein probes and expressed sequence tag probes show correct specificity and high signal-to-noise ratios even after ten rounds of hybridization, detection, stripping of the probes from the membranes and rehybridization with additional probe sets. This approach provides a subtraction method that leads to a reduction in redundant DNA sequencing, thus increasing the rate of novel gene discovery. The method is also applicable for detecting target sequences, which are present in one or only a few copies per cell; it has proven useful for physical mapping of BAC and cosmid high-density filter arrays, wherein multiple probes have been hybridized at one time (multiplexed) and subsequently "deplexed" into individual components for specific probe localizations.

  9. Discovery of molecular associations among aging, stem cells, and cancer based on gene expression profiling

    Institute of Scientific and Technical Information of China (English)

    Xiaosheng Wang

    2013-01-01

    The emergence of a huge volume of "omics" data enables a computational approach to the investigation of the biology of cancer. The cancer informatics approach is a useful supplement to the traditional experimental approach. I reviewed several reports that used a bioinformatics approach to analyze the associations among aging, stem cells, and cancer by microarray gene expression profiling. The high expression of aging- or human embryonic stem cell-related molecules in cancer suggests that certain important mechanisms are commonly underlying aging, stem cells, and cancer. These mechanisms are involved in cell cycle regulation, metabolic process, DNA damage response, apoptosis, p53 signaling pathway, immune/inflammatory response, and other processes, suggesting that cancer is a developmental and evolutional disease that is strongly related to aging. Moreover, these mechanisms demonstrate that the initiation, proliferation, and metastasis of cancer are associated with the deregulation of stem cells. These findings provide insights into the biology of cancer. Certainly, the findings that are obtained by the informatics approach should be justified by experimental validation. This review also noted that next-generation sequencing data provide enriched sources for cancer informatics study.

  10. Identification and Validation of HCC-specific Gene Transcriptional Signature for Tumor Antigen Discovery.

    Science.gov (United States)

    Petrizzo, Annacarmen; Caruso, Francesca Pia; Tagliamonte, Maria; Tornesello, Maria Lina; Ceccarelli, Michele; Costa, Valerio; Aprile, Marianna; Esposito, Roberta; Ciliberto, Gennaro; Buonaguro, Franco M; Buonaguro, Luigi

    2016-07-08

    A novel two-step bioinformatics strategy was applied for identification of signatures with therapeutic implications in hepatitis-associated HCC. Transcriptional profiles from HBV- and HCV-associated HCC samples were compared with non-tumor liver controls. Resulting HCC modulated genes were subsequently compared with different non-tumor tissue samples. Two related signatures were identified, namely "HCC-associated" and "HCC-specific". Expression data were validated by RNA-Seq analysis carried out on unrelated HCC samples and protein expression was confirmed according to The Human Protein Atlas (http://proteinatlas.org/), a public repository of immunohistochemistry data. Among all, aldo-keto reductase family 1 member B10 and IGF2 mRNA-binding protein 3 were found to be strictly HCC-specific, with no expression in 18/20 normal tissues. Target peptides for vaccine design were predicted for both proteins associated with the most prevalent HLA-class I and II alleles. The described novel strategy proved feasible for identification of HCC-specific proteins as high-potential targets for HCC immunotherapy.

  11. Discovery of molecular associations among aging, stem cells, and cancer based on gene expression profiling.

    Science.gov (United States)

    Wang, Xiaosheng

    2013-04-01

    The emergence of a huge volume of "omics" data enables a computational approach to the investigation of the biology of cancer. The cancer informatics approach is a useful supplement to the traditional experimental approach. I reviewed several reports that used a bioinformatics approach to analyze the associations among aging, stem cells, and cancer by microarray gene expression profiling. The high expression of aging- or human embryonic stem cell-related molecules in cancer suggests that certain important mechanisms are commonly underlying aging, stem cells, and cancer. These mechanisms are involved in cell cycle regulation, metabolic process, DNA damage response, apoptosis, p53 signaling pathway, immune/inflammatory response, and other processes, suggesting that cancer is a developmental and evolutional disease that is strongly related to aging. Moreover, these mechanisms demonstrate that the initiation, proliferation, and metastasis of cancer are associated with the deregulation of stem cells. These findings provide insights into the biology of cancer. Certainly, the findings that are obtained by the informatics approach should be justified by experimental validation. This review also noted that next-generation sequencing data provide enriched sources for cancer informatics study.

  12. Application of multidisciplinary analysis to gene expression.

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Xuefel (University of New Mexico, Albuquerque, NM); Kang, Huining (University of New Mexico, Albuquerque, NM); Fields, Chris (New Mexico State University, Las Cruces, NM); Cowie, Jim R. (New Mexico State University, Las Cruces, NM); Davidson, George S.; Haaland, David Michael; Sibirtsev, Valeriy (New Mexico State University, Las Cruces, NM); Mosquera-Caro, Monica P. (University of New Mexico, Albuquerque, NM); Xu, Yuexian (University of New Mexico, Albuquerque, NM); Martin, Shawn Bryan; Helman, Paul (University of New Mexico, Albuquerque, NM); Andries, Erik (University of New Mexico, Albuquerque, NM); Ar, Kerem (University of New Mexico, Albuquerque, NM); Potter, Jeffrey (University of New Mexico, Albuquerque, NM); Willman, Cheryl L. (University of New Mexico, Albuquerque, NM); Murphy, Maurice H. (University of New Mexico, Albuquerque, NM)

    2004-01-01

    Molecular analysis of cancer, at the genomic level, could lead to individualized patient diagnostics and treatments. The developments to follow will signal a significant paradigm shift in the clinical management of human cancer. Despite our initial hopes, however, it seems that simple analysis of microarray data cannot elucidate clinically significant gene functions and mechanisms. Extracting biological information from microarray data requires a complicated path involving multidisciplinary teams of biomedical researchers, computer scientists, mathematicians, statisticians, and computational linguists. The integration of the diverse outputs of each team is the limiting factor in the progress to discover candidate genes and pathways associated with the molecular biology of cancer. Specifically, one must deal with sets of significant genes identified by each method and extract whatever useful information may be found by comparing these different gene lists. Here we present our experience with such comparisons, and share methods developed in the analysis of an infant leukemia cohort studied on Affymetrix HG-U95A arrays. In particular, spatial gene clustering, hyper-dimensional projections, and computational linguistics were used to compare different gene lists. In spatial gene clustering, different gene lists are grouped together and visualized on a three-dimensional expression map, where genes with similar expressions are co-located. In another approach, projections from gene expression space onto a sphere clarify how groups of genes can jointly have more predictive power than groups of individually selected genes. Finally, online literature is automatically rearranged to present information about genes common to multiple groups, or to contrast the differences between the lists. The combination of these methods has improved our understanding of infant leukemia. 
While the complicated reality of the biology dashed our initial, optimistic hopes for simple answers from

  13. The first set of EST resource for gene discovery and marker development in pigeonpea (Cajanus cajan L.)

    Directory of Open Access Journals (Sweden)

    Byregowda Munishamappa

    2010-03-01

    .8% in molecular function. Further, 19 genes were identified as differentially expressed between FW-responsive genotypes and 20 between SMD-responsive genotypes. Generated ESTs were compiled together with 908 ESTs available in the public domain at the time of analysis, and a set of 5,085 unigenes was defined and used for identification of molecular markers in pigeonpea. For instance, 3,583 simple sequence repeat (SSR) motifs were identified in 1,365 unigenes and 383 primer pairs were designed. Assessment of a set of 84 primer pairs on 40 elite pigeonpea lines showed polymorphism with 15 (28.8%) markers with an average of four alleles per marker and an average polymorphic information content (PIC) value of 0.40. Similarly, in silico mining of 133 contigs with ≥ 5 sequences detected 102 single nucleotide polymorphisms (SNPs) in 37 contigs. As an example, a set of 10 contigs was used for confirming in silico predicted SNPs in a set of four genotypes using wet lab experiments. Occurrence of SNPs was confirmed for all the 6 contigs for which scorable and sequenceable amplicons were generated. PCR amplicons were not obtained in the case of 4 contigs. Recognition sites for restriction enzymes were identified for 102 SNPs in 37 contigs, which indicates the possibility of assaying SNPs in 37 genes using the cleaved amplified polymorphic sequences (CAPS) assay. Conclusion: The pigeonpea EST dataset generated here provides a transcriptomic resource for gene discovery and development of functional markers associated with biotic stress resistance. Sequence analyses of this dataset have shown conservation of a considerable number of pigeonpea transcripts across legume and model plant species analysed as well as some putative pigeonpea specific genes. Validation of identified biotic stress responsive genes should provide candidate genes for allele mining as well as candidate markers for molecular breeding.
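
    The polymorphic information content (PIC) reported above is a per-marker statistic computed from allele frequencies. The sketch below uses the commonly cited definition, PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2; the abstract does not state which formula the authors used, so this choice is an assumption, and the function name `pic` and the example frequencies are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def pic(allele_freqs: Sequence[float]) -> float:
    """Polymorphic information content for one marker.

    `allele_freqs` are the population frequencies of the marker's alleles
    and are expected to sum to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Expected homozygosity term.
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction for uninformative matings between identical heterozygotes.
    pair_term = sum(
        2.0 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2)
    )
    return 1.0 - homozygosity - pair_term


# Hypothetical markers: two and four equally frequent alleles.
print(pic([0.5, 0.5]))
print(pic([0.25, 0.25, 0.25, 0.25]))
```

    A marker with a single allele has a PIC of 0, and the value rises as alleles become more numerous and more evenly distributed.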

  14. Technical guide for applications of gene expression profiling in human health risk assessment of environmental chemicals.

    Science.gov (United States)

    Bourdon-Lacombe, Julie A; Moffat, Ivy D; Deveau, Michelle; Husain, Mainul; Auerbach, Scott; Krewski, Daniel; Thomas, Russell S; Bushel, Pierre R; Williams, Andrew; Yauk, Carole L

    2015-07-01

    Toxicogenomics promises to be an important part of future human health risk assessment of environmental chemicals. The application of gene expression profiles (e.g., for hazard identification, chemical prioritization, chemical grouping, mode of action discovery, and quantitative analysis of response) is growing in the literature, but their use in formal risk assessment by regulatory agencies is relatively infrequent. Although additional validations for specific applications are required, gene expression data can be of immediate use for increasing confidence in chemical evaluations. We believe that a primary reason for the current lack of integration is the limited practical guidance available for risk assessment specialists with limited experience in genomics. The present manuscript provides basic information on gene expression profiling, along with guidance on evaluating the quality of genomic experiments and data, and interpretation of results presented in the form of heat maps, pathway analyses and other common approaches. Moreover, potential ways to integrate information from gene expression experiments into current risk assessment are presented using published studies as examples. The primary objective of this work is to facilitate integration of gene expression data into human health risk assessments of environmental chemicals. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.

  15. Haplotypes in the lipoprotein lipase gene influence fasting insulin and discovery of a new risk haplotype.

    Science.gov (United States)

    Goodarzi, Mark O; Taylor, Kent D; Guo, Xiuqing; Hokanson, John E; Haffner, Steven M; Cui, Jinrui; Chen, Yii-Der I; Wagenknecht, Lynne E; Bergman, Richard N; Rotter, Jerome I

    2007-01-01

    Prior studies of Mexican Americans described association of lipoprotein lipase (LPL) gene haplotypes with insulin sensitivity/resistance and atherosclerosis. The most common haplotype (haplotype 1) was protective, whereas the fourth most common haplotype (haplotype 4) conferred risk for insulin resistance and atherosclerosis. In this study of Hispanics in the Insulin Resistance Atherosclerosis Study Family Study, we sought to replicate LPL haplotype association with insulin sensitivity/resistance. LPL haplotypes based on 12 single nucleotide polymorphisms were analyzed for association with indexes of insulin sensitivity and other metabolic and adiposity measures. This study was conducted in the general community of San Antonio, Texas, and San Luis Valley, Colorado. Participants in this study were 978 members of 86 Hispanic families. LPL haplogenotype, metabolic phenotypes, and adiposity were measured in this study. The haplotype structure was identical with that observed in prior studies. Among 978 phenotyped subjects, haplotype 1 was associated with decreased fasting insulin (P = 0.01), and haplotype 4 was associated with increased fasting insulin (P = 0.02) and increased visceral fat mass (P = 0.002). Insulin sensitivity, derived from iv glucose tolerance testing, tended (P > 0.1) to be higher with haplotype 1 (S(I) = 1.72) and lower with haplotype 4 (S(I)=1.38). Haplotype 2 was associated with increases in fasting insulin, triglycerides (TGs), TG to high-density lipoprotein-cholesterol ratio, and apolipoprotein B (P = 0.01-0.04). This study independently replicates our prior results of LPL haplotypes 1 and 4 as associated with measures of insulin sensitivity and resistance, respectively. Haplotype 4 may confer insulin resistance by increasing visceral fat. Haplotype 2 was identified as a new risk haplotype, suggesting the complex nature of LPL's effect on features of the insulin resistance syndrome.

  16. De Novo Regulatory Motif Discovery Identifies Significant Motifs in Promoters of Five Classes of Plant Dehydrin Genes.

    Science.gov (United States)

    Zolotarov, Yevgen; Strömvik, Martina

    2015-01-01

    Plants accumulate dehydrins in response to osmotic stresses. Dehydrins are divided into five different classes, which are thought to be regulated in different manners. To better understand differences in transcriptional regulation of the five dehydrin classes, de novo motif discovery was performed on 350 dehydrin promoter sequences from a total of 51 plant genomes. Overrepresented motifs were identified in the promoters of five dehydrin classes. The Kn dehydrin promoters contain motifs linked with meristem specific expression, as well as motifs linked with cold/dehydration and abscisic acid response. KS dehydrin promoters contain a motif with a GATA core. SKn and YnSKn dehydrin promoters contain motifs that match elements connected with cold/dehydration, abscisic acid and light response. YnKn dehydrin promoters contain motifs that match abscisic acid and light response elements, but not cold/dehydration response elements. Conserved promoter motifs are present in the dehydrin classes and across different plant lineages, indicating that dehydrin gene regulation is likely also conserved.

  17. The fragile X mental retardation syndrome 20 years after the FMR1 gene discovery: an expanding universe of knowledge.

    Science.gov (United States)

    Rousseau, François; Labelle, Yves; Bussières, Johanne; Lindsay, Carmen

    2011-08-01

    The fragile X mental retardation (FXMR) syndrome is one of the most frequent causes of mental retardation. Affected individuals display a wide range of additional characteristic features including behavioural and physical phenotypes, and the extent to which individuals are affected is highly variable. For these reasons, elucidation of the pathophysiology of this disease has been an important challenge to the scientific community. 1991 marks the year of the discovery of both the FMR1 gene mutations involved in this disease, and of their dynamic nature. Although a mouse model for the disease has been available for 16 years and extensive research has been performed on the FMR1 protein (FMRP), we still understand little about how the disease develops, and no treatment has yet been shown to be effective. In this review, we summarise current knowledge on FXMR with an emphasis on the technical challenges of molecular diagnostics, on its prevalence and dynamics among populations, and on the potential of screening for FMR1 mutations.

  18. Practical Applications of the Gene Ontology Resource

    Science.gov (United States)

    Huntley, Rachael P.; Dimmer, Emily C.; Apweiler, Rolf

    The Gene Ontology (GO) is a controlled vocabulary that represents knowledge about the functional attributes of gene products in a structured manner and can be used in both computational and human analyses. This vocabulary has been used by diverse curation groups to associate functional information to individual gene products in the form of annotations. GO has proven an invaluable resource for evaluating and interpreting the biological significance of large data sets, enabling researchers to create hypotheses to direct their future research. This chapter provides an overview of the Gene Ontology, how it can be used, and tips on getting the most out of GO analyses.

  19. Industrial lab-on-a-chip: design, applications and scale-up for drug discovery and delivery.

    Science.gov (United States)

    Vladisavljević, Goran T; Khalid, Nauman; Neves, Marcos A; Kuroiwa, Takashi; Nakajima, Mitsutoshi; Uemura, Kunihiko; Ichikawa, Sosaku; Kobayashi, Isao

    2013-11-01

    Microfluidics is an emerging and promising interdisciplinary technology which offers powerful platforms for precise production of novel functional materials (e.g., emulsion droplets, microcapsules, and nanoparticles as drug delivery vehicles- and drug molecules) as well as high-throughput analyses (e.g., bioassays, detection, and diagnostics). In particular, multiphase microfluidics is a rapidly growing technology and has beneficial applications in various fields including biomedicals, chemicals, and foods. In this review, we first describe the fundamentals and latest developments in multiphase microfluidics for producing biocompatible materials that are precisely controlled in size, shape, internal morphology and composition. We next describe some microfluidic applications that synthesize drug molecules, handle biological substances and biological units, and imitate biological organs. We also highlight and discuss design, applications and scale up of droplet- and flow-based microfluidic devices used for drug discovery and delivery. © 2013.

  20. Gene editing and its application for hematological diseases.

    Science.gov (United States)

    Osborn, Mark J; Belanto, Joseph J; Tolar, Jakub; Voytas, Daniel F

    2016-07-01

    The use of precise, rationally designed gene-editing nucleases allows for targeted genome and transcriptome modification, and at present, four major classes of nucleases are being employed: zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), meganucleases (MNs), and clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9. Each reagent shares the ability to recognize and bind a target sequence of DNA. Depending on the properties of the reagent, the DNA can be cleaved on one or both strands, or epigenetic changes can be mediated. These novel properties can impact hematological disease by allowing for: (1) direct modification of hematopoietic stem/progenitor cells (HSPCs), (2) gene alteration of hematopoietic lineage committed terminal effectors, (3) genome engineering in non-hematopoietic cells with reprogramming to a hematopoietic phenotype, and (4) transcriptome modulation for gene regulation, modeling, and discovery.

  1. The application of the open pharmacological concepts triple store (open PHACTS) to support drug discovery research.

    Directory of Open Access Journals (Sweden)

    Joseline Ratnam

    Integration of open access, curated, high-quality information from multiple disciplines in the Life and Biomedical Sciences provides a holistic understanding of the domain. Additionally, the effective linking of diverse data sources can unearth hidden relationships and guide potential research strategies. However, given the lack of consistency between descriptors and identifiers used in different resources and the absence of a simple mechanism to link them, gathering and combining relevant, comprehensive information from diverse databases remains a challenge. The Open Pharmacological Concepts Triple Store (Open PHACTS) is an Innovative Medicines Initiative project that uses semantic web technology approaches to enable scientists to easily access and process data from multiple sources to solve real-world drug discovery problems. The project draws together sources of publicly-available pharmacological, physicochemical and biomolecular data, represents it in a stable infrastructure and provides well-defined information exploration and retrieval methods. Here, we highlight the utility of this platform in conjunction with workflow tools to solve pharmacological research questions that require interoperability between target, compound, and pathway data. Use cases presented herein cover (1) the comprehensive identification of chemical matter for a dopamine receptor drug discovery program, (2) the identification of compounds active against all targets in the Epidermal growth factor receptor (ErbB) signaling pathway that have a relevance to disease, and (3) the evaluation of established targets in the Vitamin D metabolism pathway to aid novel Vitamin D analogue design. The example workflows presented illustrate how the Open PHACTS Discovery Platform can be used to exploit existing knowledge and generate new hypotheses in the process of drug discovery.

  2. Voltage-gated sodium channel gene repertoire of lampreys: gene duplications, tissue-specific expression and discovery of a long-lost gene.

    Science.gov (United States)

    Zakon, Harold H; Li, Weiming; Pillai, Nisha E; Tohari, Sumanty; Shingate, Prashant; Ren, Jianfeng; Venkatesh, Byrappa

    2017-09-27

    Studies of the voltage-gated sodium (Nav) channels of extant gnathostomes have made it possible to deduce that ancestral gnathostomes possessed four voltage-gated sodium channel genes derived from a single ancestral chordate gene following two rounds of genome duplication early in vertebrates. We investigated the Nav gene family in two species of lampreys (the Japanese lamprey Lethenteron japonicum and sea lamprey Petromyzon marinus) (jawless vertebrates-agnatha) and compared them with those of basal vertebrates to better understand the origin of Nav genes in vertebrates. We noted six Nav genes in both lamprey species, but orthology with gnathostome (jawed vertebrate) channels was inconclusive. Surprisingly, the Nav2 gene, ubiquitously found in invertebrates and believed to have been lost in vertebrates, is present in lampreys, elephant shark (Callorhinchus milii) and coelacanth (Latimeria chalumnae). Despite repeated duplication of the Nav1 family in vertebrates, Nav2 is only in single copy in those vertebrates in which it is retained, and was independently lost in ray-finned fishes and tetrapods. Of the other five Nav channel genes, most were expressed in brain, one in brain and heart, and one exclusively in skeletal muscle. Invertebrates do not express Nav channel genes in muscle. Thus, early in the vertebrate lineage Nav channels began to diversify and different genes began to express in heart and muscle. © 2017 The Author(s).

  3. U.S. Geological Survey Ecosystems science strategy: advancing discovery and application through collaboration

    Science.gov (United States)

    Williams, Byron K.; Wingard, G. Lynn; Brewer, Gary; Cloern, James E.; Gelfenbaum, Guy; Jacobson, Robert B.; Kershner, Jeffrey L.; McGuire, Anthony David; Nichols, James D.; Shapiro, Carl D.; van Riper, Charles; White, Robin P.

    2013-01-01

    Ecosystem science is critical to making informed decisions about natural resources that can sustain our Nation’s economic and environmental well-being. Resource managers and policymakers are faced with countless decisions each year at local, regional, and national levels on issues as diverse as renewable and nonrenewable energy development, agriculture, forestry, water supply, and resource allocations at the urban-rural interface. The urgency for sound decisionmaking is increasing dramatically as the world is being transformed at an unprecedented pace and in uncertain directions. Environmental changes are associated with natural hazards, greenhouse gas emissions, and increasing demands for water, land, food, energy, mineral, and living resources. At risk is the Nation’s environmental capital, the goods and services provided by resilient ecosystems that are vital to the health and well-being of human societies. Ecosystem science (the study of systems of organisms interacting with their environment and the consequences of natural and human-induced change on these systems) is necessary to inform decisionmakers as they develop policies to adapt to these changes. This Ecosystems Science Strategy is built on a framework that includes basic and applied science. It highlights the critical roles that U.S. Geological Survey (USGS) scientists and partners can play in building scientific understanding and providing timely information to decisionmakers. The strategy underscores the connection between scientific discoveries and the application of new knowledge, and it integrates ecosystem science and decisionmaking, producing new scientific outcomes to assist resource managers and providing public benefits. We envision the USGS as a leader in integrating scientific information into decisionmaking processes that affect the Nation’s natural resources and human well-being. The USGS is uniquely positioned to play a pivotal role in ecosystem science. With its wide range of

  4. Identifying and quantifying heterogeneity in high content analysis: application of heterogeneity indices to drug discovery.

    Directory of Open Access Journals (Sweden)

    Albert H Gough

    One of the greatest challenges in biomedical research, drug discovery and diagnostics is understanding how seemingly identical cells can respond differently to perturbagens, including drugs for disease treatment. Although heterogeneity has become an accepted characteristic of a population of cells, in drug discovery it is not routinely evaluated or reported. The standard practice for cell-based, high content assays has been to assume a normal distribution and to report a well-to-well average value with a standard deviation. To address this important issue we sought to define a method that could be readily implemented to identify, quantify and characterize heterogeneity in cellular and small organism assays to guide decisions during drug discovery and experimental cell/tissue profiling. Our study revealed that heterogeneity can be effectively identified and quantified with three indices that indicate diversity, non-normality and percent outliers. The indices were evaluated using the induction and inhibition of STAT3 activation in five cell lines where the system response, including sample preparation and instrument performance, was well characterized and controlled. These heterogeneity indices provide a standardized method that can easily be integrated into small and large scale screening or profiling projects to guide interpretation of the biology, as well as the development of therapeutics and diagnostics. Understanding the heterogeneity in the response to perturbagens will become a critical factor in designing strategies for the development of therapeutics including targeted polypharmacology.
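
    The abstract names three indices (diversity, non-normality and percent outliers) but does not give their formulas. As a minimal sketch only, the Python below assumes a normalized Shannon entropy of a histogram for diversity, a Kolmogorov-Smirnov distance to a fitted normal for non-normality, and Tukey's 1.5 x IQR fences for outliers. All function names, the bin count and the fence factor are illustrative assumptions, not the authors' definitions.

```python
import math
import statistics
from collections import Counter


def diversity_index(values, bins=10):
    """Assumed diversity measure: Shannon entropy of a histogram,
    normalized to [0, 1] (0 = all cells in one bin, 1 = uniform)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0
    width = (hi - lo) / bins
    # Clamp the maximum value into the last bin.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(bins)


def non_normality_index(values):
    """Assumed non-normality measure: Kolmogorov-Smirnov distance between
    the empirical CDF and a normal fitted to the sample mean and SD."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        return 0.0
    dist = statistics.NormalDist(mu, sigma)
    ordered = sorted(values)
    n = len(ordered)
    return max(
        max((i + 1) / n - dist.cdf(x), dist.cdf(x) - i / n)
        for i, x in enumerate(ordered)
    )


def percent_outliers(values, k=1.5):
    """Assumed outlier rule: percentage of values outside Tukey's fences."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return 100.0 * sum(v < low or v > high for v in values) / len(values)
```

    Applied per well to single-cell intensity values, the three numbers would be reported alongside the usual mean and standard deviation rather than replacing them.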

  5. Computational drug discovery

    Institute of Scientific and Technical Information of China (English)

    Si-sheng OU-YANG; Jun-yan LU; Xiang-qian KONG; Zhong-jie LIANG; Cheng LUO; Hualiang JIANG

    2012-01-01

    Computational drug discovery is an effective strategy for accelerating and economizing the drug discovery and development process. Because of the dramatic increase in the availability of biological macromolecule and small molecule information, the applicability of computational drug discovery has been extended and broadly applied to nearly every stage in the drug discovery and development workflow, including target identification and validation, lead discovery and optimization, and preclinical tests. Over the past decades, computational drug discovery methods such as molecular docking, pharmacophore modeling and mapping, de novo design, molecular similarity calculation and sequence-based virtual screening have been greatly improved. In this review, we present an overview of these important computational methods, platforms and successful applications in this field.

  6. Characterization of Capsicum annuum genetic diversity and population structure based on parallel polymorphism discovery with a 30K unigene Pepper GeneChip.

    Science.gov (United States)

    Hill, Theresa A; Ashrafi, Hamid; Reyes-Chin-Wo, Sebastian; Yao, JiQiang; Stoffel, Kevin; Truco, Maria-Jose; Kozik, Alexander; Michelmore, Richard W; Van Deynze, Allen

    2013-01-01

    The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines; however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm which was confirmed with statistical support by principal components analysis (PCA) and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and application of genome

  7. Characterization of Capsicum annuum genetic diversity and population structure based on parallel polymorphism discovery with a 30K unigene Pepper GeneChip.

    Directory of Open Access Journals (Sweden)

    Theresa A Hill

    The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines; however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm which was confirmed with statistical support by principal components analysis (PCA) and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and

  8. Informatics for materials science and engineering data-driven discovery for accelerated experimentation and application

    CERN Document Server

    Rajan, Krishna

    2014-01-01

    Materials informatics, a 'hot topic' area in materials science, aims to combine traditionally bio-led informatics with computational methodologies, supporting more efficient research by identifying strategies for time- and cost-effective analysis. The discovery and maturation of new materials has been outpaced by the thicket of data created by new combinatorial and high throughput analytical techniques. The elaboration of this "quantitative avalanche", and the resulting complex, multi-factor analyses required to understand it, means that interest, investment, and research are revisiting in

  9. Methodologies and Application of New Target Identification, Drug Action Mechanism Investigation and New Molecular Entity Discovery

    Institute of Scientific and Technical Information of China (English)

    2011-01-01

    The group, headed by Prof. JIANG Hualiang at the CAS Shanghai Institute of Materia Medica, has been centering on basic research in pharmaceutical science, including identifying new targets, studying new drug action mechanisms and discovering new drug candidates. On the basis of new methodology development, an effective multi-disciplinary research platform for drug research and discovery has been established through the integration of computational chemistry, organic synthesis, and molecular and cellular biology. A number of creative results have been achieved in these areas.

  10. Computational systems biology in drug discovery and development: methods and applications.

    Science.gov (United States)

    Materi, Wayne; Wishart, David S

    2007-04-01

    Computational systems biology is an emerging field in biological simulation that attempts to model or simulate intra- and intercellular events using data gathered from genomic, proteomic or metabolomic experiments. The need to model complex temporal and spatiotemporal processes at many different scales has led to the emergence of numerous techniques, including systems of differential equations, Petri nets, cellular automata simulators, agent-based models and pi calculus. This review provides a brief summary and an assessment of most of these approaches. It also provides examples of how these methods are being used to facilitate drug discovery and development.

  11. Application of remote sensing information related to mineralization alteration and new discovery in geological prospecting work

    Science.gov (United States)

    Yao, F. J.; Fu, B. H.

    2017-07-01

    Remote sensing (RS) offers an intuitive approach to resource prospecting and has shown some success in practice, but achieving high resolution and accuracy remains problematic because RS methods still need to be improved. With an RS prospecting model, prospecting resolution and accuracy can be greatly improved. Different resources require different RS prospecting models. Several targets were proposed using an RS exploration model. Field examination confirmed new discoveries in hydrothermal metal mineral prospecting and sedimentary metal mineral prospecting. RS is therefore becoming a more effective prospecting method.

  12. In-silico ADME models: a general assessment of their utility in drug discovery applications.

    Science.gov (United States)

    Gleeson, M Paul; Hersey, Anne; Hannongbua, Supa

    2011-01-01

    ADME prediction is an extremely challenging area as many of the properties we try to predict are a result of multiple physiological processes. In this review we consider how in-silico predictions of ADME processes can be used to help bias medicinal chemistry into more ideal areas of property space, minimizing the number of compounds needed to be synthesized to obtain the required biochemical/physico-chemical profile. While such models are not sufficiently accurate to act as a replacement for in-vivo or in-vitro methods, in-silico methods nevertheless can help us to understand the underlying physico-chemical dependencies of the different ADME properties, and thus can give us inspiration on how to optimize them. Many global in-silico ADME models (i.e., generated on large, diverse datasets) have been reported in the literature. In this paper we selectively review representatives from each distinct class and discuss their relative utility in drug discovery. For each ADME parameter, we limit our discussion to the most recent, most predictive or most insightful examples in the literature to highlight the current state of the art. In each case we briefly summarize the different types of models available for each parameter (i.e., simple rules, physico-chemical and 3D based QSAR predictions), their overall accuracy and the underlying SAR. We also discuss the utility of the models as related to lead generation and optimization phases of discovery research.

  13. Tango assay for ligand-induced GPCR-β-arrestin2 interaction: Application in drug discovery.

    Science.gov (United States)

    Dogra, Shalini; Sona, Chandan; Kumar, Ajeet; Yadav, Prem N

    2016-01-01

    G protein-coupled receptors (GPCRs) are widely known to modulate almost all physiological functions and have been demonstrated over time as therapeutic targets for a wide gamut of diseases. The design and implementation of high-throughput GPCR-based assays that permit the efficient screening of large compound libraries to discover novel drug candidates are essential for a successful drug discovery endeavor. Usually, GPCR-based functional assays depend primarily on the measurement of G protein-mediated second messenger generation. However, with the advent of advanced molecular biology tools and increased understanding of GPCR signal transduction, many G protein-independent pathways such as β-arrestin translocation are being utilized to detect the activity of GPCRs. These assays provide additional information on functional selectivity (also known as biased agonism) of compounds that could be harnessed to develop pathway-selective drug candidates to reduce the adverse effects associated with a given GPCR target. In this chapter, we describe the basic principle, detailed methodologies and assay setup, result analysis and data interpretations of the β-arrestin2 Tango assay, and its comparison with cell-based G protein-dependent GPCR assays, which could be employed in a simple academic setup to facilitate GPCR-based drug discovery.

  14. Prolonged application of high fluid shear to chondrocytes recapitulates gene expression profiles associated with osteoarthritis.

    Directory of Open Access Journals (Sweden)

    Fei Zhu

    BACKGROUND: Excessive mechanical loading of articular cartilage producing hydrostatic stress, tensile strain and fluid flow leads to irreversible cartilage erosion and osteoarthritic (OA) disease. Since application of high fluid shear to chondrocytes recapitulates some of the earmarks of OA, we aimed to screen the gene expression profiles of shear-activated chondrocytes and assess potential similarities with OA chondrocytes. METHODOLOGY/PRINCIPAL FINDINGS: Using a cDNA microarray technology, we screened the differentially-regulated genes in human T/C-28a2 chondrocytes subjected to high fluid shear (20 dyn/cm²) for 48 h and 72 h relative to static controls. Confirmation of the expression patterns of select genes was obtained by qRT-PCR. Using significance analysis of microarrays with a 5% false discovery rate, 71 and 60 non-redundant transcripts were identified to be ≥2-fold up-regulated and ≤0.6-fold down-regulated, respectively, in sheared chondrocytes. Published data sets indicate that 42 of these genes, which are related to extracellular matrix/degradation, cell proliferation/differentiation, inflammation and cell survival/death, are differentially-regulated in OA chondrocytes. In view of the pivotal role of cyclooxygenase-2 (COX-2) in the pathogenesis and/or progression of OA in vivo and regulation of shear-induced inflammation and apoptosis in vitro, we identified a collection of genes that are either up- or down-regulated by shear-induced COX-2. COX-2 and L-prostaglandin D synthase (L-PGDS) induce reactive oxygen species production, and negatively regulate genes of the histone and cell cycle families, which may play a critical role in chondrocyte death. CONCLUSIONS/SIGNIFICANCE: Prolonged application of high fluid shear stress to chondrocytes recapitulates gene expression profiles associated with osteoarthritis. Our data suggest a potential link between exposure of chondrocytes/cartilage to abnormal mechanical loading and the pathogenesis
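
    The fold-change cut-offs quoted in this abstract (at least 2-fold for up-regulation, at most 0.6-fold for down-regulation, sheared relative to static control) amount to a simple threshold filter. The Python sketch below illustrates only that step; the function name and the example transcript labels are hypothetical, and the significance analysis of microarrays with its 5% false discovery rate is deliberately not reproduced here.

```python
def classify_fold_change(ratio, up=2.0, down=0.6):
    """Label a sheared/static expression ratio using the abstract's cut-offs."""
    if ratio <= 0:
        raise ValueError("expression ratio must be positive")
    if ratio >= up:
        return "up"
    if ratio <= down:
        return "down"
    return "unchanged"


# Hypothetical ratios for three made-up transcripts.
ratios = {"transcript_A": 3.1, "transcript_B": 0.45, "transcript_C": 1.2}
labels = {name: classify_fold_change(r) for name, r in ratios.items()}
```

    Note that the two cut-offs are not symmetric on a log scale (a 2-fold decrease would correspond to 0.5, not 0.6), which is why both thresholds are kept as separate parameters.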

  15. Empirical Discovery in Linguistics

    CERN Document Server

    Pericliev, V

    1995-01-01

    A discovery system for detecting correspondences in data is described, based on the familiar induction methods of J. S. Mill. Given a set of observations, the system induces the ``causally'' related facts in these observations. Its application to empirical linguistic discovery is described.

  16. The principal and application of gene chips

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    Today we are living in an information era. Everyone can see what a tremendous change the use of computers has brought. The advent of the computer chip lets us embed our smarts in everything from satellites to greeting cards to the internet.   It has been said that the Human Genome Project is the second "Apollo". The Human Genome Project is the international effort that is expected to unravel the structures of all 30 000 to 35 000 or so human genes by 2003. But deconstructing the genome is only the first step, like learning to pick out words in a foreign language before grappling with their meanings. Realizing the gene revolution's potential will require understanding how genes collaborate to cement memories in our brains, say, or how they malfunction to change a healthy adult into one dying of cancer. That's the goal of the second phase of the revolution, functional genomics.

  17. Unlocking biomarker discovery: large scale application of aptamer proteomic technology for early detection of lung cancer.

    Directory of Open Access Journals (Sweden)

    Rachel M Ostroff

    BACKGROUND: Lung cancer is the leading cause of cancer deaths worldwide. New diagnostics are needed to detect early stage lung cancer because it may be cured with surgery. However, most cases are diagnosed too late for curative surgery. Here we present a comprehensive clinical biomarker study of lung cancer and the first large-scale clinical application of a new aptamer-based proteomic technology to discover blood protein biomarkers in disease. METHODOLOGY/PRINCIPAL FINDINGS: We conducted a multi-center case-control study in archived serum samples from 1,326 subjects from four independent studies of non-small cell lung cancer (NSCLC) in long-term tobacco-exposed populations. Sera were collected and processed under uniform protocols. Case sera were collected from 291 patients within 8 weeks of the first biopsy-proven lung cancer and prior to tumor removal by surgery. Control sera were collected from 1,035 asymptomatic study participants with ≥ 10 pack-years of cigarette smoking. We measured 813 proteins in each sample with a new aptamer-based proteomic technology, identified 44 candidate biomarkers, and developed a 12-protein panel (cadherin-1, CD30 ligand, endostatin, HSP90α, LRIG3, MIP-4, pleiotrophin, PRKCI, RGM-C, SCF-sR, sL-selectin, and YES) that discriminates NSCLC from controls with 91% sensitivity and 84% specificity in cross-validated training and 89% sensitivity and 83% specificity in a separate verification set, with similar performance for early and late stage NSCLC. CONCLUSIONS/SIGNIFICANCE: This study is a significant advance in clinical proteomics in an area of high unmet clinical need. Our analysis exceeds the breadth and dynamic range of proteome interrogated of previously published clinical studies of broad serum proteome profiling platforms including mass spectrometry, antibody arrays, and autoantibody arrays. The sensitivity and specificity of our 12-biomarker panel improves upon published protein and gene expression panels
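
    For readers less familiar with the two performance figures quoted in this abstract, sensitivity is the fraction of true cases the panel flags and specificity is the fraction of true controls it clears. The Python sketch below shows the arithmetic; the counts in the example are hypothetical round numbers chosen to reproduce 91% and 84%, not the study's actual confusion matrix, which the abstract does not report.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual cases that test positive: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of actual controls that test negative: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts per 100 cases and 100 controls (assumed, not from the study).
example_sensitivity = sensitivity(true_pos=91, false_neg=9)
example_specificity = specificity(true_neg=84, false_pos=16)
```

    Because the two measures are computed on cases and controls separately, neither depends on the case-to-control ratio of the study population.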

  18. Reliable knowledge discovery

    CERN Document Server

    Dai, Honghua; Smirnov, Evgueni

    2012-01-01

    Reliable Knowledge Discovery focuses on theory, methods, and techniques for RKDD, a new sub-field of KDD. It studies the theory and methods to assure the reliability and trustworthiness of discovered knowledge and to maintain the stability and consistency of knowledge discovery processes. RKDD has a broad spectrum of applications, especially in critical domains like medicine, finance, and military. Reliable Knowledge Discovery also presents methods and techniques for designing robust knowledge-discovery processes. Approaches to assessing the reliability of the discovered knowledge are introduc

  19. Baculovirus-mediated Gene Delivery and RNAi Applications

    Directory of Open Access Journals (Sweden)

    Kaisa-Emilia Makkonen

    2015-04-01

    Baculoviruses are widely encountered in nature and a great deal of data is available about their safety and biology. Recently, these versatile, insect-specific viruses have demonstrated their usefulness in various biotechnological applications including protein production and gene transfer. Multiple in vitro and in vivo studies exist and support their use as gene delivery vehicles in vertebrate cells. Recently, baculoviruses have also demonstrated high potential in RNAi applications in which several advantages of the virus make it a promising tool for RNA gene transfer with high safety and wide tropism.

  20. Genetic correction using engineered nucleases for gene therapy applications.

    Science.gov (United States)

    Li, Hongmei Lisa; Nakano, Takao; Hotta, Akitsu

    2014-01-01

    Genetic mutations in humans are associated with congenital disorders and phenotypic traits. Gene therapy holds the promise to cure such genetic disorders, although it has suffered from several technical limitations for decades. Recent progress in gene editing technology using tailor-made nucleases, such as meganucleases (MNs), zinc finger nucleases (ZFNs), TAL effector nucleases (TALENs) and, more recently, CRISPR/Cas9, has significantly broadened our ability to precisely modify target sites in the human genome. In this review, we summarize recent progress in gene correction approaches of the human genome, with a particular emphasis on the clinical applications of gene therapy.

  1. The novelty of phytofurans, isofurans, dihomo-isofurans and neurofurans: Discovery, synthesis and potential application.

    Science.gov (United States)

    Cuyamendous, Claire; de la Torre, Aurélien; Lee, Yiu Yiu; Leung, Kin Sum; Guy, Alexandre; Bultel-Poncé, Valérie; Galano, Jean-Marie; Lee, Jetty Chung-Yung; Oger, Camille; Durand, Thierry

    2016-11-01

    Polyunsaturated fatty acids (PUFA) are oxidized in vivo under oxidative stress through a free radical pathway and release cyclic oxygenated metabolites, which are commonly classified as isoprostanes and isofurans. The discovery of isoprostanes goes back twenty-five years compared to fifteen years for isofurans, and a great many have been discovered. The biosynthesis, the nomenclature, the chemical synthesis of furanoids from α-linolenic acid (ALA, C18:3 n-3), arachidonic acid (AA, C20:4 n-6), adrenic acid (AdA, 22:4 n-6) and docosahexaenoic acid (DHA, 22:6 n-3) as well as their identification and implication in biological systems are highlighted in this review.

  2. Application of iChip to Grow "Uncultivable" Microorganisms and its Impact on Antibiotic Discovery.

    Science.gov (United States)

    Sherpa, Rinzhin T; Reese, Caretta J; Montazeri Aliabadi, Hamidreza

    2015-01-01

    Antibiotics have revolutionized modern medicine, allowing significant progress in healthcare and improvement in life expectancy. Development of antibiotic resistance by pathogenic bacteria is a natural phenomenon; however, the rate of antibiotic resistance emergence is increasing at an alarming rate, due to indiscriminate use of antibiotics in healthcare, agriculture and even everyday products. Traditionally, antibiotic discovery has been conducted by screening extracts of microorganisms for antimicrobial activity. However, this conventional source has been over-used to such an extent that it poses the risk of "running out" of new antibiotics. Aiming to increase access to a greater diversity of microorganisms, a new cultivation method with an in situ approach called iChip has been designed. The iChip has already isolated many novel organisms, as well as Teixobactin, a novel antibiotic with significant potency against gram-positive bacteria.

  3. A new wide field-of-view confocal imaging system and its applications in drug discovery and pathology

    Science.gov (United States)

    Li, Gang; Damaskinos, Savvas; Dixon, Arthur E.; Lee, Lucy E. J.

    2005-11-01

    Conventional widefield light microscopy and confocal scanning microscopy have been indispensable for pathology and drug discovery research. Clinical specimens from diseased tissues are examined, new drug candidates are tested on drug targets, and the morphological and molecular biological changes of cells and tissues are observed. High throughput screening of drug candidates requires highly efficient screening instruments. A standard biomedical slide is 1 by 3 inches (25.4 by 76.2 mm) in size. A typical tissue specimen is 10 mm in diameter. To form a high resolution image of the entire specimen, a conventional widefield light microscope must acquire a large number of small images of the specimen, and then tile them together, which is tedious, inefficient and error-prone. A patented new wide field-of-view confocal scanning laser imaging system has been developed for tissue imaging, which is capable of imaging an entire microscope slide without tiling. It is capable of operating in brightfield, reflection and epi-fluorescence imaging modes. Three (red, green and blue (RGB)) lasers are used to produce brightfield and reflection images, and to excite various fluorophores. This new confocal system makes examination of large biomedical specimens more efficient, and makes fluorescence examination of large specimens possible for the first time without tiling. Description of the new confocal technology and applications of the imaging system in pathology and drug discovery research, for example, imaging large tissue specimens, tissue microarrays, and zebrafish sections, are reported in this paper.

  4. Discovery Mondays

    CERN Multimedia

    2003-01-01

    Many people don't realise quite how much is going on at CERN. Would you like to gain first-hand knowledge of CERN's scientific and technological activities and their many applications? Try out some experiments for yourself, or pick the brains of the people in charge? If so, then the «Lundis Découverte» or Discovery Mondays, will be right up your street. Starting on May 5th, on every first Monday of the month you will be introduced to a different facet of the Laboratory. CERN staff, non-scientists, and members of the general public, everyone is welcome. So tell your friends and neighbours and make sure you don't miss this opportunity to satisfy your curiosity and enjoy yourself at the same time. You won't have to listen to a lecture, as the idea is to have open exchange with the expert in question and for each subject to be illustrated with experiments and demonstrations. There's no need to book, as Microcosm, CERN's interactive museum, will be open non-stop from 7.30 p.m. to 9 p.m. On the first Discovery M...

  5. Analysis of an inactive cyanobactin biosynthetic gene cluster leads to discovery of new natural products from strains of the genus Microcystis.

    Directory of Open Access Journals (Sweden)

    Niina Leikoski

    Cyanobactins are cyclic peptides assembled through the cleavage and modification of short precursor proteins. An inactive cyanobactin gene cluster has been described from the genome of Microcystis aeruginosa NIES843. Here we report the discovery of active counterparts in strains of the genus Microcystis guided by this silent cyanobactin gene cluster. The end products of the gene clusters were structurally diverse cyclic peptides, which we named piricyclamides. Some of the piricyclamides consisted solely of proteinogenic amino acids while others contained disulfide bridges and some were prenylated or geranylated. The piricyclamide gene clusters encoded between 1 and 4 precursor genes. They encoded highly diverse core peptides ranging in length from 7 to 17 amino acids with just a single conserved amino acid. Heterologous expression of the pir gene cluster from Microcystis aeruginosa PCC7005 in Escherichia coli confirmed that this gene cluster is responsible for the biosynthesis of piricyclamides. Chemical analysis demonstrated that Microcystis strains could produce an array of piricyclamides, some of which are geranylated or prenylated. The genetic diversity of piricyclamides in a bloom sample was explored and 19 different piricyclamide precursor genes were found. This study provides evidence for a stunning array of piricyclamides in Microcystis, a bloom-forming cyanobacterium that occurs worldwide.

  6. Android worksheet application based on discovery learning on students' achievement for vocational high school: Mechanical behavior of materials topics

    Science.gov (United States)

    Nanto, Dwi; Aini, Anisa Nurul; Mulhayatiah, Diah

    2017-05-01

    This research reports a study of a student worksheet based on discovery learning on Mechanical Behavior of Materials topics, delivered as an Android application (Android worksheet application), for vocational high school. The samples are class X Architecture students of SMKN 4 (a public vocational high school) in Tangerang Selatan City, Banten province, Indonesia. We formed three groups based on Intelligence Quotient (IQ): an average IQ group, a middle IQ group and a high IQ group. The research uses a quasi-experimental method with a nonequivalent control group design and purposive sampling. The instruments comprise test and non-test instruments. The test instruments are an IQ test and a test of student achievement; the achievement test (pretest and posttest) consists of 25 multiple-choice problems. The non-test instruments are questionnaire responses from the students and the teacher. Without categorizing by IQ, the results showed an effect of the Android worksheet application on student achievement based on the cognitive aspects of the Revised Bloom's Taxonomy. Viewed by IQ group, however, only the middle IQ and high IQ groups showed a significant effect of the Android worksheet application on student achievement, while the average IQ group showed no effect.

  7. MultiDK: A Multiple Descriptor Multiple Kernel Approach for Molecular Discovery and Its Application to The Discovery of Organic Flow Battery Electrolytes

    CERN Document Server

    Kim, Sung-Jin; Aspuru-Guzik, Alán

    2016-01-01

    We propose a multiple descriptor multiple kernel (MultiDK) method for efficient molecular discovery using machine learning. We show that the MultiDK method improves both the speed and the accuracy of molecular property prediction. We apply the method to the discovery of electrolyte molecules for aqueous redox flow batteries. Using multiple-type (as opposed to single-type) descriptors, more relevant features for machine learning can be obtained. Following the principle of the 'wisdom of the crowds', the combination of multiple-type descriptors significantly boosts prediction performance. Moreover, MultiDK can exploit irregularities between molecular structure and property relations better than the linear regression method by employing multiple kernels, that is, more than one kernel function for a set of the input descriptors. The multiple kernels consist of the Tanimoto similarity function and a linear kernel for a set of binary descriptors and a set of non-binary descriptors, respectively. Using MultiDK, we...
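
    The two kernels named in this abstract can be sketched in a few lines of Python. The Tanimoto similarity of two binary descriptors is the size of their shared "on" bits divided by the size of their union, and the linear kernel is a plain dot product. How MultiDK actually combines the kernels is not stated in the abstract; the weighted sum below, the weight `w` and all function names are assumptions made purely for illustration.

```python
def tanimoto_kernel(bits_a, bits_b):
    """Tanimoto similarity between two binary descriptors,
    each given as a collection of 'on' bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    # Two empty fingerprints are treated as identical (a convention, assumed).
    return len(a & b) / union if union else 1.0


def linear_kernel(x, y):
    """Dot product of two equal-length non-binary descriptor vectors."""
    if len(x) != len(y):
        raise ValueError("descriptor vectors must have equal length")
    return sum(xi * yi for xi, yi in zip(x, y))


def combined_kernel(bits_a, bits_b, x, y, w=0.5):
    """Assumed combination: a weighted sum of the two kernels.
    A non-negative weighted sum of valid kernels is itself a valid kernel."""
    return w * tanimoto_kernel(bits_a, bits_b) + (1.0 - w) * linear_kernel(x, y)


# Two hypothetical molecules: fingerprint bits plus two real-valued descriptors.
similarity = tanimoto_kernel({1, 2, 3}, {2, 3, 4})  # 2 shared / 4 total = 0.5
```

    A kernel of this form could be passed to any kernel-based regressor that accepts a precomputed Gram matrix.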

  8. Robust statistical methods for significance evaluation and applications in cancer driver detection and biomarker discovery

    DEFF Research Database (Denmark)

    Madsen, Tobias

    2017-01-01

    are used to scale the aforementioned driver detection methods to a dataset consisting of more than 2,000 cancer genomes. The sizes and dimensionalities of genomic data sets, be it a large number of genes or multiple heterogeneous data sources, pose both great statistical opportunities and challenges....... This distribution can be learned across the entire set of genes and then be used to improve inference on the level of the individual gene. A practical way to implement this insight is using empirical Bayes. This idea is one of the main statistical underpinnings of the present work. The thesis consists of three main...... manuscripts as well as two supplementary manuscripts. In the first manuscript we explore efficient significance evaluation for models defined with factor graphs. Factor graphs are a class of graphical models encompassing both Bayesian networks and Markov models. We specifically develop a saddle...

  9. Accelerating Gene Discovery by Phenotyping Whole-Genome Sequenced Multi-mutation Strains and Using the Sequence Kernel Association Test (SKAT).

    Directory of Open Access Journals (Sweden)

    Tiffany A Timbers

    2016-08-01

    Forward genetic screens represent powerful, unbiased approaches to uncover novel components in any biological process. Such screens suffer from a major bottleneck, however, namely the cloning of corresponding genes causing the phenotypic variation. Reverse genetic screens have been employed as a way to circumvent this issue, but can often be limited in scope. Here we demonstrate an innovative approach to gene discovery. Using C. elegans as a model system, we used a whole-genome sequenced multi-mutation library, from the Million Mutation Project, together with the Sequence Kernel Association Test (SKAT), to rapidly screen for and identify genes associated with a phenotype of interest, namely defects in dye-filling of ciliated sensory neurons. Such anomalies in dye-filling are often associated with the disruption of cilia, organelles which in humans are implicated in sensory physiology (including vision, smell and hearing), development and disease. Beyond identifying several well characterised dye-filling genes, our approach uncovered three genes not previously linked to ciliated sensory neuron development or function. From these putative novel dye-filling genes, we confirmed the involvement of BGNT-1.1 in ciliated sensory neuron function and morphogenesis. BGNT-1.1 functions at the trans-Golgi network of sheath cells (glia) to influence dye-filling and cilium length, in a cell non-autonomous manner. Notably, BGNT-1.1 is the orthologue of human B3GNT1/B4GAT1, a glycosyltransferase associated with Walker-Warburg syndrome (WWS). WWS is a multigenic disorder characterised by muscular dystrophy as well as brain and eye anomalies. Together, our work unveils an effective and innovative approach to gene discovery, and provides the first evidence that B3GNT1-associated Walker-Warburg syndrome may be considered a ciliopathy.

  10. Guided Discoveries.

    Science.gov (United States)

    Ehrlich, Amos

    1991-01-01

    Presented are four mathematical discoveries made by students on an arithmetical function using the Fibonacci sequence. Discussed is the nature of the role of the teacher in directing the students' discovery activities. (KR)

  11. Volatility Discovery

    DEFF Research Database (Denmark)

    Dias, Gustavo Fruet; Scherrer, Cristina; Papailias, Fotis

    The price discovery literature investigates how homogenous securities traded on different markets incorporate information into prices. We take this literature one step further and investigate how these markets contribute to stochastic volatility (volatility discovery). We formally show...... that the realized measures from homogenous securities share a fractional stochastic trend, which is a combination of the price and volatility discovery measures. Furthermore, we show that volatility discovery is associated with the way that market participants process information arrival (market sensitivity...

  12. Application of genetic algorithm for discovery of core effective formulae in TCM clinical data.

    Science.gov (United States)

    Yang, Ming; Poon, Josiah; Wang, Shaomo; Jiao, Lijing; Poon, Simon; Cui, Lizhi; Chen, Peiqi; Sze, Daniel Man-Yuen; Xu, Ling

    2013-01-01

    Research on core and effective formulae (CEF) not only summarizes traditional Chinese medicine (TCM) treatment experience, it also helps to reveal the underlying knowledge in the formulation of a TCM prescription. In this paper, CEF discovery from tumor clinical data is discussed. The concepts of confidence, support, and effectiveness of the CEF are defined. A genetic algorithm (GA) is applied to find the CEF in a lung cancer dataset with 595 records from 161 patients. The results comprised 9 CEF with positive fitness values, containing 15 distinct herbs. All of the CEF had relatively high average confidence and support. A herb-herb network was constructed, and it shows that all the herbs in the CEF are core herbs. The dataset was divided into a CEF group and a non-CEF group. The effective proportions of the former group are significantly greater than those of the latter group. A synergy index (SI) was defined to evaluate the interaction between two herbs. There were 4 pairs of herbs with high SI values, indicating synergy between the herbs. All the results agreed with TCM theory, which demonstrates the feasibility of our approach.

  13. Non a Priori Automatic Discovery of 3D Chemical Patterns: Application to Mutagenicity.

    Science.gov (United States)

    Rabatel, Julien; Fannes, Thomas; Lepailleur, Alban; Le Goff, Jérémie; Crémilleux, Bruno; Ramon, Jan; Bureau, Ronan; Cuissart, Bertrand

    2017-06-07

    This article introduces a new type of structural fragment called a geometrical pattern. Such geometrical patterns are defined as molecular graphs that include a labelling of atoms together with constraints on interatomic distances. The discovery of geometrical patterns in a chemical dataset relies on the induction of multiple decision trees combined in random forests. Each computational step corresponds to a refinement of a preceding set of constraints, extending a previous geometrical pattern. This paper focuses on the mutagenicity of chemicals via the definition of structural alerts in relation with these geometrical patterns. It follows an experimental assessment of the main geometrical patterns to show how they can efficiently originate the definition of a chemical feature related to a chemical function or a chemical property. Geometrical patterns have provided a valuable and innovative approach to bring new pieces of information for discovering and assessing structural characteristics in relation to a particular biological phenotype. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. High-throughput screening normalized to biological response: application to antiviral drug discovery.

    Science.gov (United States)

    Patel, Dhara A; Patel, Anand C; Nolan, William C; Huang, Guangming; Romero, Arthur G; Charlton, Nichole; Agapov, Eugene; Zhang, Yong; Holtzman, Michael J

    2014-01-01

    The process of conducting cell-based phenotypic screens can result in data sets from small libraries or portions of large libraries, making accurate hit picking from multiple data sets important for efficient drug discovery. Here, we describe a screen design and data analysis approach that allow for normalization not only between quadrants and plates but also between screens or batches in a robust, quantitative fashion, enabling hit selection from multiple data sets. We independently screened the MicroSource Spectrum and NCI Diversity Set II libraries using a cell-based phenotypic high-throughput screening (HTS) assay that uses an interferon-stimulated response element (ISRE)-driven luciferase-reporter assay to identify interferon (IFN) signal enhancers. Inclusion of a per-plate, per-quadrant IFN dose-response standard curve enabled conversion of ISRE activity to effective IFN concentrations. We identified 45 hits based on a combined z score ≥2.5 from the two libraries, and 25 of 35 available hits were validated in a compound concentration-response assay when tested using fresh compound. The results provide a basis for further analysis of chemical structure in relation to biological function. Together, the results establish an HTS method that can be extended to screening for any class of compounds that influence a quantifiable biological response for which a standard is available.
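
    The two numerical steps named in this abstract, converting reporter activity to an effective IFN concentration through a standard curve and selecting hits by z score, can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the log-linear interpolation, the clamping at the curve ends, and all function names are hypothetical, and the abstract does not specify how the combined z score across libraries is formed, so only a single-screen z score is shown.

```python
import math
from statistics import mean, stdev


def effective_ifn(signal, standard):
    """Map a reporter signal to an effective IFN concentration.

    `standard` is a list of (ifn_concentration, signal) pairs from one
    standard curve. Assumptions: concentrations are positive and signals
    increase strictly with concentration. Interpolation is linear in
    log10(concentration); signals outside the curve are clamped.
    """
    pts = sorted(standard, key=lambda p: p[1])
    if signal <= pts[0][1]:
        return pts[0][0]
    if signal >= pts[-1][1]:
        return pts[-1][0]
    for (c0, s0), (c1, s1) in zip(pts, pts[1:]):
        if s0 <= signal <= s1:
            frac = (signal - s0) / (s1 - s0)
            log_c = math.log10(c0) + frac * (math.log10(c1) - math.log10(c0))
            return 10 ** log_c


def z_scores(values):
    """Standard scores using the sample mean and sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def pick_hits(values, threshold=2.5):
    """Indices of compounds whose z score meets the threshold."""
    return [i for i, z in enumerate(z_scores(values)) if z >= threshold]
```

    Expressing every well in units of effective IFN concentration is what makes values comparable between plates and batches before any z score is computed.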

  15. Sparse Inverse Gaussian Process Regression with Application to Climate Network Discovery

    Data.gov (United States)

    National Aeronautics and Space Administration — Regression problems on massive data sets are ubiquitous in many application domains including the Internet, earth and space sciences, and finances. Gaussian Process...

  16. Gene discovery in EST sequences from the wheat leaf rust fungus Puccinia triticina sexual spores, asexual spores and haustoria, compared to other rust and corn smut fungi

    Directory of Open Access Journals (Sweden)

    Wynhoven Brian

    2011-03-01

    Background: Rust fungi are biotrophic basidiomycete plant pathogens that cause major diseases on plants and trees world-wide, affecting agriculture and forestry. Their biotrophic nature precludes many established molecular genetic manipulations and lines of research. The generation of genomic resources for these microbes is leading to novel insights into biology such as interactions with the hosts and guiding directions for breakthrough research in plant pathology. Results: To support gene discovery and gene model verification in the genome of the wheat leaf rust fungus, Puccinia triticina (Pt), we have generated Expressed Sequence Tags (ESTs) by sampling several life cycle stages. We focused on several spore stages and isolated haustorial structures from infected wheat, generating 17,684 ESTs. We produced sequences from both the sexual (pycniospores, aeciospores and teliospores) and asexual (germinated urediniospores) stages of the life cycle. From pycniospores and aeciospores, produced by infecting the alternate host, meadow rue (Thalictrum speciosissimum), 4,869 and 1,292 reads were generated, respectively. We generated 3,703 ESTs from teliospores produced on the senescent primary wheat host. Finally, we generated 6,817 reads from haustoria isolated from infected wheat as well as 1,003 sequences from germinated urediniospores. Along with 25,558 previously generated ESTs, we compiled a database of 13,328 non-redundant sequences (4,506 singlets and 8,822 contigs). Fungal genes were predicted using the EST version of the self-training GeneMarkS algorithm. To refine the EST database, we compared EST sequences by BLASTN to a set of 454 pyrosequencing-generated contigs and Sanger BAC-end sequences derived both from the Pt genome, and to ESTs and genome reads from wheat. A collection of 6,308 fungal genes was identified and compared to sequences of the cereal rusts, Puccinia graminis f. sp. tritici (Pgt) and stripe rust, P. striiformis f. sp

  17. A novel approach to the discovery of survival biomarkers in glioblastoma using a joint analysis of DNA methylation and gene expression.

    Science.gov (United States)

    Smith, Ashley A; Huang, Yen-Tsung; Eliot, Melissa; Houseman, E Andres; Marsit, Carmen J; Wiencke, John K; Kelsey, Karl T

    2014-06-01

    Glioblastoma multiforme (GBM) is the most aggressive of all brain tumors, with a median survival of less than 1.5 years. Recently, epigenetic alterations were found to play key roles in both glioma genesis and clinical outcome, demonstrating the need to integrate genetic and epigenetic data in predictive models. To enhance current models through discovery of novel predictive biomarkers, we employed a genome-wide, agnostic strategy to specifically capture both methylation-directed changes in gene expression and alternative associations of DNA methylation with disease survival in glioma. Human GBM-associated DNA methylation, gene expression, IDH1 mutation status, and survival data were obtained from The Cancer Genome Atlas. DNA methylation loci and expression probes were paired by gene, and their subsequent association with survival was determined by applying an accelerated failure time model to previously published alternative and expression-based association equations. Significant associations were seen in 27 unique methylation/expression pairs with expression-based, alternative, and combinatorial associations observed (10, 13, and 4 pairs, respectively). The majority of the predictive DNA methylation loci were located within CpG islands, and all but three of the locus pairs were negatively correlated with survival. This finding suggests that for most loci, methylation/expression pairs are inversely related, consistent with methylation-associated gene regulatory action. Our results indicate that changes in DNA methylation are associated with altered survival outcome through both coordinated changes in gene expression and alternative mechanisms. Furthermore, our approach offers an alternative method of biomarker discovery using a priori gene pairing and precise targeting to identify novel sites for locus-specific therapeutic intervention.

  18. Clustering Algorithms: Their Application to Gene Expression Data

    Science.gov (United States)

    Oyelade, Jelili; Isewon, Itunuoluwa; Oladipupo, Funke; Aromolaran, Olufemi; Uwoghiren, Efosa; Ameh, Faridah; Achas, Moses; Adebiyi, Ezekiel

    2016-01-01

    Gene expression data hide vital information required to understand the biological processes that take place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data offers a tremendous opportunity to strengthen the understanding of functional genomics. The complexity of biological networks and the number of genes involved increase the challenges of comprehending and interpreting the resulting mass of data, which consists of millions of measurements; these data also exhibit vagueness, imprecision, and noise. Therefore, the use of clustering techniques is a first step toward addressing these challenges, and is essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. The clustering of gene expression data has proven useful in revealing the natural structure inherent in the data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. Another benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the various clustering algorithms applicable to gene expression data in order to identify the appropriate clustering technique that will guarantee stability and a high degree of accuracy in the analysis procedure. PMID:27932867

  19. On reliable discovery of molecular signatures

    Directory of Open Access Journals (Sweden)

    Björkegren Johan

    2009-01-01

    Background: Molecular signatures are sets of genes, proteins, genetic variants or other variables that can be used as markers for a particular phenotype. Reliable signature discovery methods could yield valuable insight into cell biology and mechanisms of human disease. However, it is currently not clear how to control error rates such as the false discovery rate (FDR) in signature discovery. Moreover, signatures for cancer gene expression have been shown to be unstable, that is, difficult to replicate in independent studies, casting doubts on their reliability. Results: We demonstrate that with modern prediction methods, signatures that yield accurate predictions may still have a high FDR. Further, we show that even signatures with low FDR may fail to replicate in independent studies due to limited statistical power. Thus, neither stability nor predictive accuracy are relevant when FDR control is the primary goal. We therefore develop a general statistical hypothesis testing framework that for the first time provides FDR control for signature discovery. Our method is demonstrated to be correct in simulation studies. When applied to five cancer data sets, the method was able to discover molecular signatures with 5% FDR in three cases, while two data sets yielded no significant findings. Conclusion: Our approach enables reliable discovery of molecular signatures from genome-wide data with current sample sizes. The statistical framework developed herein is potentially applicable to a wide range of prediction problems in bioinformatics.
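
    To make the notion of FDR control concrete, the sketch below implements the standard Benjamini-Hochberg step-up procedure on a list of p-values. It is not the hypothesis testing framework developed in the paper, whose details the abstract does not give; it only shows what selecting variables at a 5% FDR level looks like once per-variable p-values are available. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices of hypotheses rejected at FDR level `alpha`.

    Benjamini-Hochberg step-up rule: sort the m p-values, find the largest
    rank k with p_(k) <= k * alpha / m, and reject the k smallest.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * alpha / m:
            k = rank  # keep the largest rank that satisfies the bound
    return sorted(order[:k])


# Hypothetical p-values for ten candidate genes.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
selected = benjamini_hochberg(example, alpha=0.05)
```

    An empty result corresponds to the situation reported for two of the five data sets: no signature can be declared at the chosen FDR level.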

  20. Knowledge discovery in databases of biomechanical variables: application to the sit to stand motor task

    Directory of Open Access Journals (Sweden)

    Benvenuti Francesco

    2004-10-01

    Background: The interpretation of data obtained in a movement analysis laboratory is a crucial issue in clinical contexts. Collection of such data in large databases might encourage the use of modern techniques of data mining to discover additional knowledge with automated methods. In order to maximise the size of the database, simple and low-cost experimental set-ups are preferable. The aim of this study was to extract knowledge inherent in the sit-to-stand task as performed by healthy adults, by searching relationships among measured and estimated biomechanical quantities. An automated method was applied to a large amount of data stored in a database. The sit-to-stand motor task was already shown to be adequate for determining the level of individual motor ability. Methods: The technique of search for association rules was chosen to discover patterns as part of a Knowledge Discovery in Databases (KDD) process applied to a sit-to-stand motor task observed with a simple experimental set-up and analysed by means of a minimum measured input model. Selected parameters and variables of a database containing data from 110 healthy adults, of both genders and of a large range of age, performing the task were considered in the analysis. Results: A set of rules and definitions were found characterising the patterns shared by the investigated subjects. Time events of the task turned out to be highly interdependent at least in their average values, showing a high level of repeatability of the timing of the performance of the task. Conclusions: The distinctive patterns of the sit-to-stand task found in this study, associated to those that could be found in similar studies focusing on subjects with pathologies, could be used as a reference for the functional evaluation of specific subjects performing the sit-to-stand motor task.
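
    An association rule search scores candidate rules "A implies B" by their support and confidence. The sketch below computes both measures on a toy set of records; the discretised labels such as "fast_rise" are invented for illustration and do not come from the study, and the function names are hypothetical.

```python
def support(records, items):
    """Fraction of records that contain every item in `items`."""
    items = set(items)
    return sum(1 for r in records if items <= set(r)) / len(records)


def confidence(records, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule A -> B."""
    s_a = support(records, antecedent)
    if s_a == 0:
        return 0.0  # the rule never applies, so report zero confidence
    return support(records, set(antecedent) | set(consequent)) / s_a


# Hypothetical discretised observations, one set of labels per subject.
records = [
    {"fast_rise", "short_seat_off"},
    {"fast_rise", "short_seat_off"},
    {"fast_rise"},
    {"slow_rise"},
]
rule_support = support(records, {"fast_rise", "short_seat_off"})
rule_confidence = confidence(records, {"fast_rise"}, {"short_seat_off"})
```

    A rule search keeps only the rules whose support and confidence both exceed user-chosen minimum thresholds.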

  1. [Ribozyme riboswitch based gene expression regulation systems for gene therapy applications: progress and challenges].

    Science.gov (United States)

    Feng, Jing-Xian; Wang, Jia-wen; Lin, Jun-sheng; Diao, Yong

    2014-11-01

    Robust and efficient control of therapeutic gene expression is needed for timing and dosing of gene therapy drugs in clinical applications. The ribozyme riboswitch provides a promising building block for ligand-controlled gene-regulatory systems, owing to its tunable gene regulation, design modularity, and target specificity. Ribozyme riboswitches can be used in various gene delivery vectors. In recent years, there have been breakthroughs in extending the application of ribozyme riboswitches from gene-expression control to cellular function and fate control. High-throughput screening platforms were established that allow not only rapid optimization of ribozyme riboswitches in a microbial host, but also straightforward transfer of selected devices exhibiting desired activities to mammalian cell lines in a predictable manner. Mathematical models were employed successfully to explore the performance of ribozyme riboswitches quantitatively and to guide their rational design. However, to progress toward applications relevant to gene therapy, both the precise rational design of regulatory circuits and the biocompatibility of the regulatory ligand remain of crucial importance.

  2. An integrative data analysis platform for gene set analysis and knowledge discovery in a data warehouse framework.

    Science.gov (United States)

    Chen, Yi-An; Tripathi, Lokesh P; Mizuguchi, Kenji

    2016-01-01

    Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org. © The Author(s) 2016. Published by Oxford University Press.

  3. The Genetics of Obsessive-Compulsive Disorder and Tourette Syndrome: An Epidemiological and Pathway-Based Approach for Gene Discovery

    Science.gov (United States)

    Grados, Marco A.

    2010-01-01

    Objective: To provide a contemporary perspective on genetic discovery methods applied to obsessive-compulsive disorder (OCD) and Tourette syndrome (TS). Method: A review of research trends in genetics research in OCD and TS is conducted, with emphasis on novel approaches. Results: Genome-wide association studies (GWAS) are now in progress in OCD…

  5. Nanotechnology in Drug Delivery and Tissue Engineering: From Discovery to Applications

    Science.gov (United States)

    Shi, Jinjun; Votruba, Alexander R.; Farokhzad, Omid C.; Langer, Robert

    2010-01-01

    The application of nanotechnology in medicine, referred to as nanomedicine, is offering numerous exciting possibilities in healthcare. Herein, we discuss two important aspects of nanomedicine—drug delivery and tissue engineering—highlighting the advances we have recently experienced, the challenges we are currently facing, and what we are likely to witness in the near future. PMID:20726522

  6. Application of massive parallel sequencing to whole genome SNP discovery in the porcine genome

    Directory of Open Access Journals (Sweden)

    Crooijmans Richard PMA

    2009-08-01

    Background: Although the Illumina 1 G Genome Analyzer generates billions of base pairs of sequence data, challenges arise in sequence selection due to the varying sequence quality. Therefore, in the framework of the International Porcine SNP Chip Consortium, this pilot study aimed to evaluate the impact of the quality level of the sequenced bases on mapping quality and identification of true SNPs on a large scale. Results: DNA pooled from five animals from a commercial boar line was digested with DraI; 150–250-bp fragments were isolated and end-sequenced using the Illumina 1 G Genome Analyzer, yielding 70,348,064 sequences 36-bp long. Rules were developed to select sequences, which were then aligned to unique positions in a reference genome. Sequences were selected based on quality, and three thresholds of sequence quality (SQ) were compared. The highest threshold of SQ allowed identification of a larger number of SNPs (17,489), distributed widely across the pig genome. In total, 3,142 SNPs were validated with a success rate of 96%. The correlation between estimated minor allele frequency (MAF) and genotyped MAF was moderate, and SNPs were highly polymorphic in other pig breeds. Lowering the SQ threshold and maintaining the same criteria for SNP identification resulted in the discovery of fewer SNPs (16,768), of which 259 were not identified using higher SQ levels. Validation of SNPs found exclusively in the lower SQ threshold had a success rate of 94% and a low correlation between estimated MAF and genotyped MAF. Base change analysis suggested that the rate of transitions in the pig genome is likely to be similar to that observed in humans. Chromosome X showed reduced nucleotide diversity relative to autosomes, as observed for other species. Conclusion: Large numbers of SNPs can be identified reliably by creating strict rules for sequence selection, which simultaneously decreases sequence ambiguity. Selection of sequences using a higher SQ

  7. An efficient genetic algorithm for structural RNA pairwise alignment and its application to non-coding RNA discovery in yeast

    Directory of Open Access Journals (Sweden)

    Taneda Akito

    2008-12-01

    Background: Aligning RNA sequences with low sequence identity has been a challenging problem, since such a computation essentially needs an algorithm of high complexity to take structural conservation into account. Although many sophisticated algorithms for the purpose have been proposed to date, further improvement in efficiency is necessary to accelerate large-scale applications, including non-coding RNA (ncRNA) discovery. Results: We developed a new genetic algorithm, Cofolga2, for simultaneously computing pairwise RNA sequence alignment and consensus folding, and benchmarked it using BRAliBase 2.1. The benchmark results showed that our new algorithm is accurate and efficient in both time and memory usage. Then, combining it with the originally trained SVM, we applied the new algorithm to novel ncRNA discovery, comparing the S. cerevisiae genome with six related genomes in a pairwise manner. By focusing our search on the relatively short regions (50 bp to 2,000 bp) sandwiched by conserved sequences, we successfully predicted 714 intergenic and 1,311 sense or antisense ncRNA candidates, which were found in the pairwise alignments with stable consensus secondary structure and low sequence identity (≤ 50%). By comparing with previous predictions, we found that > 92% of the candidates are novel. The estimated rate of false positives in the predicted candidates is 51%. Twenty-five percent of the intergenic candidates have support for expression in cells, i.e. their genomic positions overlap those of experimentally determined transcripts in the literature. Moreover, by manual inspection of the results, we obtained four multiple alignments with low sequence identity which reveal consensus structures shared by three species/sequences. Conclusion: The present method gives an efficient tool complementary to sequence-alignment-based ncRNA finders.

  8. Application of Rough Sets in Knowledge Discovery

    Institute of Scientific and Technical Information of China (English)

    杨秀芳

    2013-01-01

    Rough set theory is an effective tool for handling imprecise, inconsistent, incomplete, and otherwise imperfect information. Its strength lies partly in its mature mathematical foundation, which requires no prior knowledge, and partly in its ease of use. Because rough set theory was created to analyze and reason directly on data, uncovering implicit knowledge and revealing latent regularities, it is a natural method for knowledge discovery or data mining. This paper briefly introduces the basic concepts of rough sets and their application in knowledge discovery.
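
    The central rough set construction, the lower and upper approximation of a target set under an indiscernibility relation, can be shown in a few lines. The sketch below is a generic textbook illustration and not code from the paper; the toy decision table and all names in it are hypothetical.

```python
from collections import defaultdict


def approximations(objects, attributes, target):
    """Lower and upper approximations of `target` in the rough set sense.

    `objects` maps an object name to its attribute values. Objects with
    identical values on `attributes` are indiscernible and form one block.
    Lower approximation: union of blocks fully inside the target.
    Upper approximation: union of blocks that intersect the target.
    """
    blocks = defaultdict(set)
    for name, values in objects.items():
        blocks[tuple(values[a] for a in attributes)].add(name)
    lower, upper = set(), set()
    for block in blocks.values():
        if block <= target:
            lower |= block
        if block & target:
            upper |= block
    return lower, upper


# Hypothetical decision table: x1 and x2 cannot be told apart.
table = {
    "x1": {"temp": "high", "cough": "yes"},
    "x2": {"temp": "high", "cough": "yes"},
    "x3": {"temp": "low", "cough": "no"},
    "x4": {"temp": "high", "cough": "no"},
}
lower, upper = approximations(table, ["temp", "cough"], {"x1", "x4"})
```

    The difference between the upper and lower approximation is the boundary region: the objects that the available attributes cannot classify with certainty.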

  9. Duchenne muscular dystrophy drug discovery - the application of utrophin promoter activation screening.

    Science.gov (United States)

    Moorwood, Catherine; Khurana, Tejvir S

    2013-05-01

    Duchenne muscular dystrophy (DMD) is a devastating genetic muscle wasting disease caused by mutations in the DMD gene that in turn lead to an absence of dystrophin. Currently, there is no definitive therapy for DMD. Gene- and cell-based therapies designed to replace dystrophin have met some degree of success, as have strategies that seek to improve the dystrophic pathology independent of dystrophin. In this review the authors focus on utrophin promoter activation-based strategies and their implications on potential therapeutics for DMD. These strategies in common are designed to identify drugs/small molecules that can activate the utrophin promoter and would allow the functional substitution of dystrophin by upregulating utrophin expression in dystrophic muscle. The authors provide an overview of utrophin biology with a focus on regulation of the utrophin promoter and discuss current attempts in identifying utrophin promoter-activating molecules using high-throughput screening (HTS). The characterisation of utrophin promoter regulatory mechanisms coupled with advances in HTS have allowed researchers to undertake screens and identify a number of promising lead compounds that may prove useful for DMD. In principle, these pharmacological compounds offer significant advantages from a translational viewpoint for developing DMD therapeutics.

  10. Discovery of a 29-gene panel in peripheral blood mononuclear cells for the detection of colorectal cancer and adenomas using high throughput real-time PCR.

    Science.gov (United States)

    Ciarloni, Laura; Hosseinian, Sahar; Monnier-Benoit, Sylvain; Imaizumi, Natsuko; Dorta, Gian; Ruegg, Curzio

    2015-01-01

    Colorectal cancer (CRC) is the second leading cause of cancer-related death in developed countries. Early detection of CRC leads to decreased CRC mortality. A blood-based CRC screening test is highly desirable due to limited invasiveness and high acceptance rate among patients compared to currently used fecal occult blood testing and colonoscopy. Here we describe the discovery and validation of a 29-gene panel in peripheral blood mononuclear cells (PBMC) for the detection of CRC and adenomatous polyps (AP). Blood samples were prospectively collected from a multicenter, case-control clinical study. First, we profiled 93 samples with 667 candidate and 3 reference genes by high throughput real-time PCR (OpenArray system). After analysis, 160 genes were retained and tested again on 51 additional samples. Low expressed and unstable genes were discarded resulting in a final dataset of 144 samples profiled with 140 genes. To define which genes, alone or in combinations had the highest potential to discriminate AP and/or CRC from controls, data were analyzed by a combination of univariate and multivariate methods. A list of 29 potentially discriminant genes was compiled and evaluated for its predictive accuracy by penalized logistic regression and bootstrap. This method discriminated AP >1cm and CRC from controls with a sensitivity of 59% and 75%, respectively, with 91% specificity. The behavior of the 29-gene panel was validated with a LightCycler 480 real-time PCR platform, commonly adopted by clinical laboratories. In this work we identified a 29-gene panel expressed in PBMC that can be used for developing a novel minimally-invasive test for accurate detection of AP and CRC using a standard real-time PCR platform.
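
    The sensitivity and specificity figures quoted in this abstract follow the usual definitions, sketched below for binary labels. The labels in the example are invented to exercise the function and are not data from the study; the function name is hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = case).

    sensitivity = TP / (TP + FN): fraction of true cases that are detected.
    specificity = TN / (TN + FP): fraction of controls correctly cleared.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels: four cases followed by six controls.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
calls = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(truth, calls)
```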

  11. Design and Application of an Intelligent Agent for Web Information Discovery

    Institute of Scientific and Technical Information of China (English)

    闵君; 冯珊; 唐超; 许立达

    2003-01-01

    With the proliferation of applications on the internet, the internet has become a major information source that supplies users with valuable information. But it is hard for users to quickly acquire the right information on the web. This paper presents an intelligent agent for internet applications that retrieves and extracts web information under the user's guidance. The intelligent agent is made up of a retrieval script to identify web sources, an extraction script based on the document object model to express the extraction process, a data translator to export the extracted information into knowledge bases with frame structures, and a data reasoning module to answer users' questions. A GUI tool named Script Writer helps to generate the extraction script visually, and knowledge rule databases help to extract the wanted information and to generate answers to questions.

  12. Discovery, Molecular Mechanisms, and Industrial Applications of Cold-Active Enzymes

    Science.gov (United States)

    Santiago, Margarita; Ramírez-Sarmiento, César A.; Zamora, Ricardo A.; Parra, Loreto P.

    2016-01-01

    Cold-active enzymes constitute an attractive resource for biotechnological applications. Their high catalytic activity at temperatures below 25°C makes them excellent biocatalysts that eliminate the need of heating processes hampering the quality, sustainability, and cost-effectiveness of industrial production. Here we provide a review of the isolation and characterization of novel cold-active enzymes from microorganisms inhabiting different environments, including a revision of the latest techniques that have been used for accomplishing these paramount tasks. We address the progress made in the overexpression and purification of cold-adapted enzymes, the evolutionary and molecular basis of their high activity at low temperatures and the experimental and computational techniques used for their identification, along with protein engineering endeavors based on these observations to improve some of the properties of cold-adapted enzymes to better suit specific applications. We finally focus on examples of the evaluation of their potential use as biocatalysts under conditions that reproduce the challenges imposed by the use of solvents and additives in industrial processes and of the successful use of cold-adapted enzymes in biotechnological and industrial applications. PMID:27667987

  13. Discovery, Molecular Mechanisms and Industrial Applications of Cold-Active Enzymes

    Directory of Open Access Journals (Sweden)

    Margarita Santiago

    2016-09-01

    Cold-active enzymes constitute an attractive resource for biotechnological applications. Their high catalytic activity at temperatures below 25 °C makes them excellent biocatalysts that eliminate the need of heating processes hampering the quality, sustainability and cost-effectiveness of industrial production. Here we provide a review of the isolation and characterization of novel cold-active enzymes from microorganisms inhabiting different environments, including a revision of the latest techniques that have been used for accomplishing these paramount tasks. We address the progress made in the overexpression and purification of cold-adapted enzymes, the evolutionary and molecular basis of their high activity at low temperatures and the experimental and computational techniques used for their identification, along with protein engineering endeavors based on these observations to improve some of the properties of cold-adapted enzymes to better suit specific applications. We finally focus on examples of the evaluation of their potential use as biocatalysts under conditions that reproduce the challenges imposed by the use of solvents and additives in industrial processes and of the successful use of cold-adapted enzymes in biotechnological and industrial applications.

  14. [Functions of prostaglandin receptors in contact dermatitis and application to drug discovery].

    Science.gov (United States)

    Morimoto, Kazushi; Tsuchiya, Soken; Sugimoto, Yukihiko

    2012-01-01

    Contact dermatitis is an inflammatory skin disease caused by toxic factors that activate the skin innate immunity (irritant contact dermatitis) or by a T cell-mediated hypersensitivity reaction (allergic contact dermatitis). These inflammatory skin diseases are sometimes still not easy to control. Therefore, the development of new effective drugs with fewer side effects is anticipated. In the skin under pathophysiological conditions, multiple prostaglandins are produced and their receptors are expressed in time- and/or cell-dependent manners. However, the precise role of prostaglandins and their receptors in contact dermatitis has not been fully understood. Recently, studies using mice with a disruption of each prostaglandin receptor gene, as well as receptor-selective compounds revealed that prostaglandin receptors have manifold functions, sometimes resulting in opposite outcomes. Here, we review new advances in the roles of prostaglandin receptors in contact hypersensitivity as a cutaneous immune response model, and also discuss the clinical potentials of receptor-selective drugs.

  15. Theoretical modeling of masking DNA application in aptamer-facilitated biomarker discovery.

    Science.gov (United States)

    Cherney, Leonid T; Obrecht, Natalia M; Krylov, Sergey N

    2013-04-16

    In aptamer-facilitated biomarker discovery (AptaBiD), aptamers are selected from a library of random DNA (or RNA) sequences for their ability to specifically bind cell-surface biomarkers. The library is incubated with intact cells, and cell-bound DNA molecules are separated from those unbound and amplified by the polymerase chain reaction (PCR). The partitioning/amplification cycle is repeated multiple times while alternating target cells and control cells. Efficient aptamer selection in AptaBiD relies on the inclusion of masking DNA within the cell and library mixture. Masking DNA lacks primer regions for PCR amplification and is typically taken in excess to the library. The role of masking DNA within the selection mixture is to outcompete any nonspecific binding sequences within the initial library, thus allowing specific DNA sequences (i.e., aptamers) to be selected more efficiently. Efficient AptaBiD requires an optimum ratio of masking DNA to library DNA, at which aptamers still bind specific binding sites but nonaptamers within the library do not bind nonspecific binding sites. Here, we have developed a mathematical model that describes the binding processes taking place within the equilibrium mixture of masking DNA, library DNA, and target cells. An obtained mathematical solution allows one to estimate the concentration of masking DNA that is required to outcompete the library DNA at a desirable ratio of bound masking DNA to bound library DNA. The required concentration depends on concentrations of the library and cells as well as on unknown cell characteristics. These characteristics include the concentration of total binding sites on the cell surface, N, and equilibrium dissociation constants, K(nsL) and K(nsM), for nonspecific binding of the library DNA and masking DNA, respectively. We developed a theory that allows the determination of N, K(nsL), and K(nsM) based on measurements of EC50 values for cells mixed separately with the library and masking DNA
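
    The competition described above can be sketched with a textbook competitive-binding equilibrium. This is a simplified illustration, not the authors' model: it assumes a single class of nonspecific sites and that free DNA concentrations are approximately equal to total concentrations (DNA in large excess over sites), so the site concentration N drops out. The function names and numeric values are hypothetical.

```python
def bound_fractions(library_conc, masking_conc, k_ns_library, k_ns_masking):
    """Fractions of nonspecific sites occupied by library and masking DNA.

    Standard two-ligand competition for one site class, with free
    concentrations approximated by total concentrations (an assumption).
    """
    a = library_conc / k_ns_library
    b = masking_conc / k_ns_masking
    denom = 1.0 + a + b
    return a / denom, b / denom


def masking_dna_concentration(library_conc, k_ns_library, k_ns_masking, ratio):
    """Masking DNA concentration giving bound masking : bound library = ratio."""
    # From (M / K_nsM) / (L / K_nsL) = ratio, solved for M.
    return ratio * library_conc * k_ns_masking / k_ns_library


# Hypothetical values in arbitrary concentration units.
library = 1.0
k_nsl, k_nsm = 2.0, 4.0
target_ratio = 10.0

masking = masking_dna_concentration(library, k_nsl, k_nsm, target_ratio)
frac_library, frac_masking = bound_fractions(library, masking, k_nsl, k_nsm)
print(masking, frac_masking / frac_library)
```

    Under these assumptions the required masking DNA concentration scales linearly with the library concentration and with the ratio K(nsM)/K(nsL); the dependence on cell concentration reported in the abstract appears only when depletion of free DNA is taken into account.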

  16. Discovery and identification of candidate sex-related genes based on transcriptome sequencing of Russian sturgeon (Acipenser gueldenstaedtii) gonads.

    Science.gov (United States)

    Chen, Yadong; Xia, Yongtao; Shao, Changwei; Han, Lei; Chen, Xuejie; Yu, Mengjun; Sha, Zhenxia

    2016-07-01

    As the Russian sturgeon (Acipenser gueldenstaedtii) is an important food and is the main source of caviar, it is necessary to discover the genes associated with its sex differentiation. However, the complicated life and maturity cycles of the Russian sturgeon restrict the accurate identification of sex in early development. To generate a first look at specific sex-related genes, we sequenced the transcriptome of gonads in different development stages (1, 2, and 5 yr old stages) with next-generation RNA sequencing. We generated >60 million raw reads, and the filtered reads were assembled into 263,341 contigs, which produced 38,505 unigenes. Genes involved in signal transduction mechanisms were the most abundant, suggesting that development of sturgeon gonads is under control of signal transduction mechanisms. Differentially expressed gene analysis suggests that more genes for protein synthesis, cytochrome c oxidase subunits, and ribosomal proteins were expressed in female gonads than in male. Meanwhile, male gonads expressed more transposable element transposase, reverse transcriptase, and transposase-related genes than female. In total, 342, 782, and 7,845 genes were detected in intersex, male, and female transcriptomes, respectively. The female gonad expressed more genes than the male gonad, and more genes were involved in female gonadal development. Genes (sox9, foxl2) are differentially expressed in different sexes and may be important sex-related genes in Russian sturgeon. Sox9 genes are responsible for the development of male gonads and foxl2 for female gonads.

  17. Recent Advances and Emerging Applications in Text and Data Mining for Biomedical Discovery.

    Science.gov (United States)

    Gonzalez, Graciela H; Tahsin, Tasnia; Goodale, Britton C; Greene, Anna C; Greene, Casey S

    2016-01-01

    Precision medicine will revolutionize the way we treat and prevent disease. A major barrier to the implementation of precision medicine that clinicians and translational scientists face is understanding the underlying mechanisms of disease. We are starting to address this challenge through automatic approaches for information extraction, representation and analysis. Recent advances in text and data mining have been applied to a broad spectrum of key biomedical questions in genomics, pharmacogenomics and other fields. We present an overview of the fundamental methods for text and data mining, as well as recent advances and emerging applications toward precision medicine.

  18. Chitosan for gene delivery and orthopedic tissue engineering applications.

    Science.gov (United States)

    Raftery, Rosanne; O'Brien, Fergal J; Cryan, Sally-Ann

    2013-05-15

    Gene therapy involves the introduction of foreign genetic material into cells in order to exert a therapeutic effect. The application of gene therapy to the field of orthopaedic tissue engineering is extremely promising, as the controlled release of therapeutic proteins such as bone morphogenetic proteins has been shown to stimulate bone repair. However, there are a number of drawbacks associated with viral and synthetic non-viral gene delivery approaches. One natural polymer which has generated interest as a gene delivery vector is chitosan. Chitosan is biodegradable, biocompatible and non-toxic. Much of the appeal of chitosan is due to the presence of primary amine groups in its repeating units, which become protonated in acidic conditions. This property makes it a promising candidate for non-viral gene delivery. Chitosan-based vectors have been shown to transfect a number of cell types including human embryonic kidney cells (HEK293) and human cervical cancer cells (HeLa). Aside from its use in gene delivery, chitosan possesses a range of properties that show promise in tissue engineering applications: it is biodegradable and biocompatible, has anti-bacterial activity, and its cationic nature allows for electrostatic interaction with glycosaminoglycans and other proteoglycans. It can be used to make nano- and microparticles, sponges, gels, membranes and porous scaffolds. Chitosan has also been shown to enhance mineral deposition during osteogenic differentiation of MSCs in vitro. The purpose of this review is to critically discuss the use of chitosan as a gene delivery vector with emphasis on its application in orthopedic tissue engineering.

  19. Developing a Data Discovery Tool for Interdisciplinary Science: Leveraging a Web-based Mapping Application and Geosemantic Searching

    Science.gov (United States)

    Albeke, S. E.; Perkins, D. G.; Ewers, S. L.; Ewers, B. E.; Holbrook, W. S.; Miller, S. N.

    2015-12-01

    The sharing of data and results is paramount for advancing scientific research. The Wyoming Center for Environmental Hydrology and Geophysics (WyCEHG) is a multidisciplinary group that is driving scientific breakthroughs to help manage water resources in the Western United States. WyCEHG is mandated by the National Science Foundation (NSF) to share its data. However, the infrastructure from which to share such diverse, complex and massive amounts of data did not exist within the University of Wyoming. We developed an innovative framework to meet the data organization, sharing, and discovery requirements of WyCEHG by integrating both open and closed source software, embedded metadata tags, semantic web technologies, and a web-mapping application. The infrastructure uses a Relational Database Management System as the foundation, providing a versatile platform to store, organize, and query myriad datasets, taking advantage of both structured and unstructured formats. Detailed metadata are fundamental to the utility of datasets. We tag data with Uniform Resource Identifiers (URIs) to specify concepts with formal descriptions (i.e. semantic ontologies), thus allowing users the ability to search metadata based on the intended context rather than conventional keyword searches. Additionally, WyCEHG data are geographically referenced. Using the ArcGIS API for JavaScript, we developed a web mapping application leveraging database-linked spatial data services, providing a means to visualize and spatially query available data in an intuitive map environment. Using server-side scripting (PHP), the mapping application, in conjunction with semantic search modules, dynamically communicates with the database and file system, providing access to available datasets. Our approach provides a flexible, comprehensive infrastructure from which to store and serve WyCEHG's highly diverse research-based data. This framework has not only allowed WyCEHG to meet its data stewardship

  20. Challenges in medical applications of whole exome/genome sequencing discoveries.

    Science.gov (United States)

    Marian, Ali J

    2012-11-01

    Despite the well-documented influence of genetics on susceptibility to cardiovascular diseases, delineation of the full spectrum of the risk alleles had to await the development of modern next-generation sequencing technologies. The techniques provide unbiased approaches for identification of the DNA sequence variants (DSVs) in the entire genome (whole genome sequencing [WGS]) or the protein-coding exons (whole exome sequencing [WES]). Each genome contains approximately 4 million DSVs and each exome approximately 13,000 single nucleotide variants. The challenge facing researchers and clinicians alike is to decipher the biological and clinical significance of these variants and harness the information for the practice of medicine. The common DSVs typically exert modest effect sizes, as evidenced by the results of genome-wide association studies, and hence have modest or negligible clinical implications. The focus is on the rare variants with large effect sizes, which are expected to have stronger clinical implications, as in single gene disorders with Mendelian patterns of inheritance. However, the clinical implications of the rare variants for common complex cardiovascular diseases remain to be established. The most important contribution of WES or WGS is in delineation of the novel molecular pathways involved in the pathogenesis of the phenotype, which would be expected to provide for preventive and therapeutic opportunities.

  1. Chemoinformatics and Drug Discovery

    Directory of Open Access Journals (Sweden)

    Arnold Hagler

    2002-08-01

    This article reviews current achievements in the field of chemoinformatics and their impact on modern drug discovery processes. The main data mining approaches used in chemoinformatics, such as descriptor computations, structural similarity matrices, and classification algorithms, are outlined. The applications of chemoinformatics in drug discovery, such as compound selection, virtual library generation, virtual high throughput screening, HTS data mining, and in silico ADMET, are discussed. In conclusion, future directions of chemoinformatics are suggested.
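
    A structural similarity matrix of the kind mentioned above can be sketched as follows. The abstract does not name a similarity measure; the Tanimoto (Jaccard) coefficient on fingerprint bit sets is assumed here because it is a common choice in the field. The fingerprints are hypothetical sets of "on" bit positions, not real molecules.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    # Two empty fingerprints are treated as identical (a convention, assumed here).
    return len(a & b) / union if union else 1.0


def similarity_matrix(fingerprints):
    """Pairwise Tanimoto similarities as a square list-of-lists matrix."""
    return [[tanimoto(x, y) for y in fingerprints] for x in fingerprints]


# Hypothetical fingerprints: each set lists the bit positions that are set.
fingerprints = [{1, 2, 3, 4}, {2, 3, 4, 5}, {7, 8}]
matrix = similarity_matrix(fingerprints)

for row in matrix:
    print(["%.2f" % value for value in row])
```

    The matrix is symmetric with ones on the diagonal, so a practical implementation would compute only the upper triangle; the full computation is kept here for readability.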

  2. Discovery, application and protein engineering of Baeyer-Villiger monooxygenases for organic synthesis.

    Science.gov (United States)

    Balke, Kathleen; Kadow, Maria; Mallin, Hendrik; Sass, Stefan; Bornscheuer, Uwe T

    2012-08-21

    Baeyer-Villiger monooxygenases (BVMOs) are useful enzymes for organic synthesis as they enable the direct and highly regio- and stereoselective oxidation of ketones to esters or lactones simply with molecular oxygen. This contribution covers novel concepts such as searching in protein sequence databases using distinct motifs to discover new Baeyer-Villiger monooxygenases as well as high-throughput assays to facilitate protein engineering in order to improve BVMOs with respect to substrate range, enantioselectivity, thermostability and other properties. Recent examples for the application of BVMOs in synthetic organic chemistry illustrate the broad potential of these biocatalysts. Furthermore, methods to facilitate the more efficient use of BVMOs in organic synthesis by applying e.g. improved cofactor regeneration, substrate feed and in situ product removal or immobilization are covered in this perspective.

  3. An Overview of Knowledge Discovery and Its Application

    Institute of Scientific and Technical Information of China (English)

    黄绍君; 杨炳儒; 谢永红

    2001-01-01

    This paper first introduces the background of knowledge discovery and describes its development, as well as the knowledge types and databases used in knowledge discovery. It then introduces the application of knowledge discovery in various areas: agriculture, medicine, environmental protection, astronomy, finance, retail, military and the Internet.

  4. Induced Pluripotent Stem Cells: Applications in regenerative medicine, disease modelling and drug discovery

    Directory of Open Access Journals (Sweden)

    Vimal kishor Singh

    2015-02-01

    Recent progress in the field of induced pluripotent stem cells (iPSCs) has opened up many gateways for research in therapeutics. iPSCs are cells that are reprogrammed from somatic cells using different transcription factors. iPSCs possess unique properties of self-renewal and differentiation into many types of cell lineage. Hence, they could replace the use of embryonic stem cells and may overcome the various ethical issues regarding the use of embryos in research and clinics. The overwhelming worldwide response of researchers to the use of iPSCs has prompted many people to establish more authentic methods for iPSC generation. This would require understanding the underlying mechanism in a detailed manner. There have been a large number of reports showing the potential role of different molecules as putative regulators of iPSC generating methods. The molecular mechanisms that play a role in reprogramming to generate iPSCs from different types of somatic cell sources involve a plethora of molecules, including miRNAs, DNA modifying agents (viz. DNA methyl transferases, NANOG, etc.). While promising a number of important roles in various clinical/research studies, iPSCs could also be of great use in studying the molecular mechanisms of many diseases. Various diseases have been modelled using iPSCs for a better understanding of their etiology, which may be further utilized for developing putative treatments for these diseases. In addition, iPSCs are used for the production of patient-specific cells which can be transplanted to the site of injury or the site of tissue degeneration due to various disease conditions. The use of iPSCs may eliminate the chances of immune rejection as patient-specific cells may be used for transplantation in various engraftment processes. Moreover, iPSC technology has been employed in various diseases for disease modelling and gene therapy. The technique offers benefits over other

  5. Automation of a phospho-STAT5 staining procedure for flow cytometry for application in drug discovery

    NARCIS (Netherlands)

    Malergue, Fabrice; van Agthoven, Andreas; Scifo, Caroline; Egan, Dave; Strous, Ger J

    2015-01-01

    Drug discovery often requires the screening of compound libraries on tissue cultured cells. Some major targets in drug discovery belong to signal transduction pathways, and PerFix EXPOSE* allows easy flow cytometry phospho assays. We thus investigated the possibility to further simplify and automate

  6. American Society of Gene & Cell Therapy

    Science.gov (United States)

    ... agencies, foundations, biotechnology and pharmaceutical companies. Mission: To advance knowledge, awareness, and education leading to the discovery and clinical application of gene and cell therapies to alleviate human disease. Vision: ASGCT will serve ...

  7. Applications of Fusion Energy Sciences Research - Scientific Discoveries and New Technologies Beyond Fusion

    Energy Technology Data Exchange (ETDEWEB)

    Wendt, Amy [Univ. of Wisconsin, Madison, WI (United States); Callis, Richard [General Atomics, San Diego, CA (United States); Efthimion, Philip [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States); Foster, John [Univ. of Michigan, Ann Arbor, MI (United States); Keane, Christopher [Washington State Univ., Pullman, WA (United States); Onsager, Terry [National Oceanic and Atmospheric Administration (NOAA), Boulder, CO (United States); O' Shea, Patrick [Univ. of Maryland, College Park, MD (United States)

    2015-09-01

    Since the 1950s, scientists and engineers in the U.S. and around the world have worked hard to achieve an elusive goal on Earth: harnessing the reaction that fuels the stars, namely fusion. Practical fusion would be a source of energy that is unlimited, safe, environmentally benign, available to all nations and not dependent on climate or the whims of the weather. Significant resources, most notably from the U.S. Department of Energy (DOE) Office of Fusion Energy Sciences (FES), have been devoted to pursuing that dream, and significant progress is being made in turning it into a reality. However, that is only part of the story. The process of creating a fusion-based energy supply on Earth has led to technological and scientific achievements of far-reaching impact that touch every aspect of our lives. Those largely unanticipated advances, spanning a wide variety of fields in science and technology, are the focus of this report. There are many synergies between research in plasma physics (the study of charged particles and fluids interacting with self-consistent electric and magnetic fields), high-energy physics, and condensed matter physics dating back many decades. For instance, the formulation of a mathematical theory of solitons, solitary waves which are seen in everything from plasmas to water waves to Bose-Einstein Condensates, has led to an equal span of applications, including the fields of optics, fluid mechanics and biophysics. Another example, the development of a precise criterion for transition to chaos in Hamiltonian systems, has offered insights into a range of phenomena including planetary orbits, two-person games and changes in the weather. Seven distinct areas of fusion energy sciences were identified and reviewed which have had a recent impact on fields of science, technology and engineering not directly associated with fusion energy: Basic plasma science; Low temperature plasmas; Space and astrophysical plasmas; High energy density

  9. The Combined C—H Functionalization/Cope Rearrangement: Discovery and Applications in Organic Synthesis

    Science.gov (United States)

    Davies, Huw M. L.; Lian, Yajing

    2012-01-01

    The development of methods for the stereoselective functionalization of sp3 C–H bonds is a challenging undertaking. This Account describes the scope of the combined C–H functionalization/Cope rearrangement (CHCR), a reaction that occurs between rhodium-stabilized vinylcarbenoids and substrates containing allylic C–H bonds. Computational studies have shown that the CHCR reaction is initiated by a hydride transfer to the carbenoid from an allyl site of the substrate, which is then rapidly followed by C–C bond formation between the developing rhodium-bound allyl anion and the allyl cation. In principle, the reaction can proceed through four distinct orientations of the vinylcarbenoid and the approaching substrate. The early examples of the CHCR reaction were all highly diastereoselective, consistent with a reaction proceeding via a chair transition state with the vinylcarbenoid adopting an s-cis conformation. Recent computational studies have revealed that other transition state orientations are energetically accessible, and these results have guided the development of highly stereoselective CHCR reactions that proceed through a boat transition state with the vinylcarbenoid in an s-cis configuration. The CHCR reaction has broad applications in organic synthesis. In some new protocols, the CHCR reaction acts as a surrogate to some of the classic synthetic strategies in organic chemistry. The CHCR reaction has served as a synthetic equivalent of the Michael reaction, the vinylogous Mukaiyama aldol reaction, the tandem Claisen rearrangement/Cope rearrangement, and the tandem aldol reaction/siloxy-Cope rearrangement. In all of these cases, the products are generated with very high diastereocontrol. With a chiral dirhodium tetracarboxylate catalyst such as Rh2(S-DOSP)4 or Rh2(S-PTAD)4, researchers can achieve very high levels of asymmetric induction. Applications of the CHCR reaction include the effective enantiodifferentiation of racemic

  10. In-depth cDNA Library Sequencing Provides Quantitative Gene Expression Profiling in Cancer Biomarker Discovery

    Institute of Scientific and Technical Information of China (English)

    Wanling Yang; Dingge Ying; Yu-Lung Lau

    2009-01-01

    procedures may allow detection of many expression features for less abundant gene variants. With the reduction of sequencing cost and the emergence of new-generation sequencing technology, in-depth sequencing of cDNA pools or libraries may represent a better and more powerful tool in gene expression profiling and cancer biomarker detection. We also propose using sequence-specific subtraction to remove hundreds of the most abundant housekeeping genes to increase sequencing depth without affecting the relative expression ratio of other genes, as transcripts from as few as the 300 most abundantly expressed genes constitute about 20% of the total transcriptome. In-depth sequencing also has the unique advantage of detecting unknown forms of transcripts, such as alternative splicing variants, fusion genes, and regulatory RNAs, as well as detecting mutations and polymorphisms that may play important roles in disease pathogenesis.
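
    The arithmetic behind the proposed subtraction can be sketched briefly. If the removed transcripts account for a fraction f of the transcriptome, the same number of reads spread over the remaining transcripts raises their depth by a factor of 1/(1 - f), while the ratios among the remaining genes stay the same. The gene names and abundances below are hypothetical, and the sketch assumes that subtraction is complete and perfectly specific.

```python
def depth_gain(removed_fraction):
    """Fold increase in depth for the remaining transcripts at a fixed read count."""
    if not 0.0 <= removed_fraction < 1.0:
        raise ValueError("removed_fraction must be in [0, 1)")
    return 1.0 / (1.0 - removed_fraction)


def subtract_and_renormalize(abundances, removed_genes):
    """Drop the removed genes and rescale the remaining abundances to sum to 1."""
    kept = {g: a for g, a in abundances.items() if g not in removed_genes}
    total = sum(kept.values())
    return {g: a / total for g, a in kept.items()}


# Hypothetical transcriptome: one abundant housekeeping gene plus two others.
abundances = {"housekeeping": 0.2, "gene_a": 0.6, "gene_b": 0.2}
after = subtract_and_renormalize(abundances, {"housekeeping"})

gain = depth_gain(0.2)  # 20% of the transcriptome removed
ratio_before = abundances["gene_a"] / abundances["gene_b"]
ratio_after = after["gene_a"] / after["gene_b"]
print(gain, ratio_before, ratio_after)
```

    With the 20% figure quoted in the abstract, the gain is 1.25-fold, which shows why removing only a few hundred genes can noticeably improve coverage of rare transcripts.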

  11. Applications of animal models of infectious arthritis in drug discovery: a focus on alphaviral disease.

    Science.gov (United States)

    Herrero, Lara; Nelson, Michelle; Bettadapura, Jayaram; Gahan, Michelle E; Mahalingam, Suresh

    2011-06-01

    Animal models, which mimic human disease, are invaluable tools for understanding the mechanisms of disease pathogenesis and development of treatment strategies. In particular, animal models play important roles in the area of infectious arthritis. Alphaviruses, including Ross River virus (RRV), o'nyong-nyong virus, chikungunya virus (CHIKV), mayaro virus, Semliki Forest virus and sindbis virus, are globally distributed and cause transient illness characterized by fever, rash, myalgia, arthralgia and arthritis in humans. Severe forms of the disease result in chronic incapacitating arthralgia and arthritis. The mechanisms of how these viruses cause musculoskeletal disease are ill defined. In recent years, the use of a mouse model for RRV-induced disease has assisted in unraveling the pathobiology of infection and in discovering novel drugs to ameliorate disease. RRV as an infection model has the potential to provide key insights into such disease processes, particularly as many viruses, other than alphaviruses, are known to cause infectious arthritides. The emergence and outbreak of CHIKV in many parts of the world has necessitated the need to develop animal models of CHIKV disease. The development of non-human primate models of CHIKV disease has given insights into viral tropism and disease pathogenesis and facilitated the development of new treatment strategies. This review highlights the application of animal models of alphaviral diseases in the fundamental understanding of the mechanisms that contribute to disease and for defining the role that the immune response may have on disease pathogenesis, with the view of providing the foundation for new treatments.

  12. Quantum-Mechanics Methodologies in Drug Discovery: Applications of Docking and Scoring in Lead Optimization.

    Science.gov (United States)

    Crespo, Alejandro; Rodriguez-Granillo, Agustina; Lim, Victoria T

    2017-01-01

    The development and application of quantum mechanics (QM) methodologies in computer-aided drug design have flourished in the last 10 years. Despite the natural advantage of QM methods to predict binding affinities with a higher level of theory than those methods based on molecular mechanics (MM), there are only a few examples where diverse sets of protein-ligand targets have been evaluated simultaneously. In this work, we review recent advances in QM docking and scoring for those cases in which a systematic analysis has been performed. In addition, we introduce and validate a simplified QM/MM expression to compute protein-ligand binding energies. Overall, QM-based scoring functions are generally better at predicting ligand affinities than those based on classical mechanics. However, the agreement between experimental activities and calculated binding energies is highly dependent on the specific chemical series considered. The advantage of more accurate QM methods is evident in cases where charge transfer and polarization effects are important, for example when metals are involved in the binding process or when dispersion forces play a significant role as in the case of hydrophobic or stacking interactions. Copyright © Bentham Science Publishers.

  13. Discovery and evaluation of candidate sex-determining genes and xenobiotics in the gonads of lake sturgeon (Acipenser fulvescens).

    Science.gov (United States)

    Hale, Matthew C; Jackson, James R; Dewoody, J Andrew

    2010-07-01

    Modern pyrosequencing has the potential to uncover many interesting aspects of genome evolution, even in lineages where genomic resources are scarce. In particular, 454 pyrosequencing of nonmodel species has been used to characterize expressed sequence tags, xenobiotics, gene ontologies, and relative levels of gene expression. Herein, we use pyrosequencing to study the evolution of genes expressed in the gonads of a polyploid fish, the lake sturgeon (Acipenser fulvescens). Using 454 pyrosequencing of transcribed genes, we produced more than 125 MB of sequence data from 473,577 high-quality sequencing reads. Sequences that passed stringent quality control thresholds were assembled into 12,791 male contigs and 32,629 female contigs. Average depth of coverage was 4.2x for the male assembly and 5.5x for the female assembly. Analytical rarefaction indicates that our assemblies include most of the genes expressed in lake sturgeon gonads. Over 86,700 sequencing reads were assigned gene ontologies, many to general housekeeping genes like protein, RNA, and ion binding genes. We searched specifically for sex determining genes and documented significant sex differences in the expression of two genes involved in animal sex determination, DMRT1 and TRA-1. DMRT1 is the master sex determining gene in birds and in medaka (Oryzias latipes) whereas TRA-1 helps direct sexual differentiation in nematodes. We also searched the lake sturgeon assembly for evidence of xenobiotic organisms that may exist as endosymbionts. Our results suggest that exogenous parasites (trematodes) and pathogens (protozoans) apparently have infected lake sturgeon gonads, and the trematodes have horizontally transferred some genes to the lake sturgeon genome.

  14. 43 CFR 4.1130 - Discovery methods.

    Science.gov (United States)

    2010-10-01

    ... 43 Public Lands: Interior 1 2010-10-01 2010-10-01 false Discovery methods. 4.1130 Section 4.1130... Special Rules Applicable to Surface Coal Mining Hearings and Appeals Discovery § 4.1130 Discovery methods. Parties may obtain discovery by one or more of the following methods— (a) Depositions upon...

  15. The Effect of Different Case Definitions of Current Smoking on the Discovery of Smoking-Related Blood Gene Expression Signatures in Chronic Obstructive Pulmonary Disease.

    Science.gov (United States)

    Obeidat, Ma'en; Ding, Xiaoting; Fishbane, Nick; Hollander, Zsuzsanna; Ng, Raymond T; McManus, Bruce; Tebbutt, Scott J; Miller, Bruce E; Rennard, Stephen; Paré, Peter D; Sin, Don D

    2016-09-01

    Smoking is the number one modifiable environmental risk factor for chronic obstructive pulmonary disease (COPD). Clinical, epidemiological and increasingly "omics" studies assess or adjust for current smoking status using only self-report, which may be inaccurate. Objective measures such as exhaled carbon monoxide (eCO) may also be problematic owing to limitations in the measurements and the relatively short half-life of the molecule. In this study, we determined the impact of different case definitions of current cigarette smoking on gene expression in peripheral blood of patients with COPD. Peripheral blood gene expression from 573 former- and current-smokers with COPD in the ECLIPSE study was used to find genes whose expression was associated with smoking status. Current smoking was defined using self-report, eCO concentrations, or both. Linear regression was used to determine the association of current smoking status with gene expression adjusting for age, sex and propensity score. Pathway enrichment analyses were performed on genes with P PID1, FUCA1, GPR15) with enrichment in 40 biological pathways related to metabolic processes, response to hypoxia and hormonal stimulus. Additionally, the combined definition provided better distributions of test statistics for differential gene expression. A combined phenotype of eCO and self-report allows for better discovery of genes and pathways related to current smoking. Studies relying only on self-report of smoking status to assess or adjust for the impact of smoking may not fully capture its effect and will lead to residual confounding of results. © The Author 2016. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco.

  16. Development and application of the lux gene for environmental bioremediation

    Energy Technology Data Exchange (ETDEWEB)

    Burlage, R.S.; Yang, Z. [Oak Ridge National Lab., TN (United States). Environmental Sciences Div.; Palmer, R.J. [Univ. of Tennessee, Knoxville, TN (United States). Center for Environmental Biotechnology; Khang, Y. [Yeungnam Univ., Kyongsan (Korea, Republic of)

    1996-09-01

    Bioremediation is the use of living systems, usually microorganisms, to treat a quantity of soil or water for the presence of hazardous wastes. Bioremediation has many advantages over other remediation approaches, including cost savings, versatility, and the ability to treat the wastes in situ. In order to study the processes of microbial bioremediation, the authors have constructed bacterial strains that incorporate genetically engineered bioreporter genes. These bioreporter genes allow the bacteria to be detected during in situ processes, as manifested by their ability to bioluminesce or to fluoresce. These bioreporter microorganisms are described, along with the technology for detecting them and the projects which are benefiting from their application.

  17. Development and application of the lux gene for environmental bioremediation

    Science.gov (United States)

    Burlage, Robert S.; Yang, Zamin; Palmer, Robert J., Jr.; Sayler, Gary S.; Khang, Yongho

    1996-11-01

    Bioremediation is the use of living systems, usually microorganisms, to treat a quantity of soil or water for the presence of hazardous wastes. Bioremediation has many advantages over other remediation approaches, including cost savings, versatility, and the ability to treat the wastes in situ. In order to study the processes of microbial bioremediation, we have constructed bacterial strains that incorporate genetically engineered bioreporter genes. These bioreporter genes allow the bacteria to be detected during in situ processes, as manifested by their ability to bioluminesce or to fluoresce. These bioreporter microorganisms are described, along with the technology for detecting them and the projects which are benefiting from their application.

  18. Membrane Interactions of Phytochemicals as Their Molecular Mechanism Applicable to the Discovery of Drug Leads from Plants

    Directory of Open Access Journals (Sweden)

    Hironori Tsuchiya

    2015-10-01

    In addition to interacting with functional proteins such as receptors, ion channels, and enzymes, a variety of drugs mechanistically act on membrane lipids to change the physicochemical properties of biomembranes as reported for anesthetic, adrenergic, cholinergic, non-steroidal anti-inflammatory, analgesic, antitumor, antiplatelet, antimicrobial, and antioxidant drugs. As well as these membrane-acting drugs, bioactive plant components, phytochemicals, with amphiphilic or hydrophobic structures, are presumed to interact with biological membranes and biomimetic membranes prepared with phospholipids and cholesterol, resulting in the modification of membrane fluidity, microviscosity, order, elasticity, and permeability with the potencies being consistent with their pharmacological effects. A novel mechanistic point of view of phytochemicals would lead to a better understanding of their bioactivities, an insight into their medicinal benefits, and a strategic implication for discovering drug leads from plants. This article reviews the membrane interactions of different classes of phytochemicals by highlighting their induced changes in membrane property. The phytochemicals to be reviewed include membrane-interactive flavonoids, terpenoids, stilbenoids, capsaicinoids, phloroglucinols, naphthodianthrones, organosulfur compounds, alkaloids, anthraquinonoids, ginsenosides, pentacyclic triterpene acids, and curcuminoids. The membrane interaction’s applicability to the discovery of phytochemical drug leads is also discussed while referring to previous screening and isolating studies.

  19. Accelerating Novel Candidate Gene Discovery in Neurogenetic Disorders via Whole-Exome Sequencing of Prescreened Multiplex Consanguineous Families

    OpenAIRE

    Anas M. Alazami; Nisha Patel; Hanan E. Shamseldin; Shamsa Anazi; Mohammed S. Al-Dosari; Fatema Alzahrani; Hadia Hijazi; Muneera Alshammari; Mohammed A. Aldahmesh; Mustafa A. Salih; Eissa Faqeih; Amal Alhashem; Fahad A. Bashiri; Mohammed Al-Owain; Amal Y. Kentab

    2015-01-01

    Our knowledge of disease genes in neurological disorders is incomplete. With the aim of closing this gap, we performed whole-exome sequencing on 143 multiplex consanguineous families in whom known disease genes had been excluded by autozygosity mapping and candidate gene analysis. This prescreening step led to the identification of 69 recessive genes not previously associated with disease, of which 33 are here described (SPDL1, TUBA3E, INO80, NID1, TSEN15, DMBX1, CLHC1, C12orf4, WDR93, ST7, M...

  20. IMG-ABC: new features for bacterial secondary metabolism analysis and targeted biosynthetic gene cluster discovery in thousands of microbial genomes

    Science.gov (United States)

    Hadjithomas, Michalis; Chen, I-Min A.; Chu, Ken; Huang, Jinghua; Ratner, Anna; Palaniappan, Krishna; Andersen, Evan; Markowitz, Victor; Kyrpides, Nikos C.; Ivanova, Natalia N.

    2017-01-01

    Secondary metabolites produced by microbes have diverse biological functions, which makes them a great potential source of biotechnologically relevant compounds with antimicrobial, anti-cancer and other activities. The proteins needed to synthesize these natural products are often encoded by clusters of co-located genes called biosynthetic gene clusters (BCs). In order to advance the exploration of microbial secondary metabolism, we developed the largest publicly available database of experimentally verified and predicted BCs, the Integrated Microbial Genomes Atlas of Biosynthetic gene Clusters (IMG-ABC) (https://img.jgi.doe.gov/abc/). Here, we describe an update of IMG-ABC, which includes ClusterScout, a tool for targeted identification of custom biosynthetic gene clusters across 40 000 isolate microbial genomes, and a new search capability to query more than 700 000 BCs from isolate genomes for clusters with similar Pfam composition. Additional features enable fast exploration and analysis of BCs through two new interactive visualization features, a BC function heatmap and a BC similarity network graph. These new tools and features add to the value of IMG-ABC's vast body of BC data, facilitating their in-depth analysis and accelerating secondary metabolite discovery. PMID:27903896
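
    To make "clusters with similar Pfam composition" concrete, the sketch below ranks clusters by Jaccard similarity of their Pfam sets. This is an assumption for illustration only: the abstract does not state which similarity measure IMG-ABC uses, the cluster identifiers are hypothetical, and the Pfam accessions serve purely as opaque labels.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: shared Pfam domains over all domains in either set."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_similar(query: set, clusters: dict) -> list:
    """Return (cluster_id, similarity) pairs, most similar first."""
    scored = [(name, jaccard(query, pfams)) for name, pfams in clusters.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical clusters, each described by the set of Pfam domains it contains.
clusters = {
    "bc_1": {"PF00109", "PF02801", "PF00550"},
    "bc_2": {"PF00501", "PF00668"},
    "bc_3": {"PF00109", "PF00550"},
}
query = {"PF00109", "PF02801", "PF00550"}

ranking = rank_similar(query, clusters)
for name, score in ranking:
    print(f"{name}\t{score:.2f}")
```

    A set-based measure ignores domain order and copy number; a production search over hundreds of thousands of clusters would also need an index rather than a linear scan.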

  1. Discovery and application of peptides that bind to proteins and solid state inorganic materials

    Science.gov (United States)

    Stearns, Linda A.

    A series of three projects was undertaken on the theme of peptide-based molecular recognition. In the first project, a messenger RNA (mRNA) display selection was carried out against the II-VI semiconductors zinc sulfide (ZnS), zinc selenide (ZnSe), and cadmium sulfide (CdS). Sequence analysis of 18-mer semiconductor-binding peptides (SBPs) following four rounds of selection indicated that the amino acid sequences were enriched in polar residues compared to the naive library, suggesting that hydrogen-bonding interactions are a dominant mode of interaction between the SBPs and their cognate inorganic surfaces. Select peptides were expressed as fusions of the green fluorescent protein (GFP) to visualize their recognition of semiconductor crystals. Interpretation of the results was complicated by a high fluorescence background that was observed with certain control GFP fusions. Additional experiments, including cross-specificity binding assays, are needed to characterize the peptides that were isolated in this selection. A second project described the practical application of a known inorganic-binding and nucleating peptide. Peptide A3, which was previously isolated by phage display, was chemically conjugated to a short DNA strand using the heterobifunctional linker succinimidyl 4-[N-maleimidomethyl]cyclohexane-1-carboxylate (SMCC). The resulting peptide-DNA conjugate was hybridized to ten complementary single-stranded capture probes extending outward from the surface of an origami DNA nanotube. A gold precursor solution was added to initiate nucleation and growth of gold nanoparticles at the site of the peptide. Transmission electron microscopy (TEM) was used to visualize the gold nanoparticle-decorated nanostructures. This approach holds immense promise for organizing compositionally-diverse materials at the nanoscale. In a third project, a novel non-iterative approach to mRNA display called covalent capture was demonstrated. Using human transferrin as a target

  2. Coupled Transcriptome and Proteome Analysis of Human Lymphotropic Tumor Viruses: Insights on the Detection and Discovery of Viral Genes

    Energy Technology Data Exchange (ETDEWEB)

    Dresang, Lindsay R.; Teuton, Jeremy R.; Feng, Huichen; Jacobs, Jon M.; Camp, David G.; Purvine, Samuel O.; Gritsenko, Marina A.; Li, Zhihua; Smith, Richard D.; Sugden, Bill; Moore, Patrick S.; Chang, Yuan

    2011-12-20

    Kaposi's sarcoma-associated herpesvirus (KSHV) and Epstein-Barr virus (EBV) are related human tumor viruses that cause primary effusion lymphomas (PEL) and Burkitt's lymphomas (BL), respectively. Viral genes expressed in naturally-infected cancer cells contribute to disease pathogenesis; knowing which viral genes are expressed is critical in understanding how these viruses cause cancer. To evaluate the expression of viral genes, we used high-resolution separation and mass spectrometry coupled with custom tiling arrays to align the viral proteomes and transcriptomes of three PEL and two BL cell lines under latent and lytic culture conditions. Results: The majority of viral genes were efficiently detected at the transcript and/or protein level on manipulating the viral life cycle. Overall the correlation of expressed viral proteins and transcripts was highly complementary in both validating and providing orthogonal data with latent/lytic viral gene expression. Our approach also identified novel viral genes in both KSHV and EBV, and extends viral genome annotation. Several previously uncharacterized genes were validated at both transcript and protein levels. Conclusions: This systems biology approach coupling proteome and transcriptome measurements provides a comprehensive view of viral gene expression that could not have been attained using each methodology independently. Detection of viral proteins in combination with viral transcripts is a potentially powerful method for establishing virus-disease relationships.

  3. Expression of uncharacterized male germ cell-specific genes and discovery of novel sperm-tail proteins in mice.

    Science.gov (United States)

    Kwon, Jun Tae; Ham, Sera; Jeon, Suyeon; Kim, Youil; Oh, Seungmin; Cho, Chunghee

    2017-01-01

    The identification and characterization of germ cell-specific genes are essential if we hope to comprehensively understand the mechanisms of spermatogenesis and fertilization. Here, we searched the mouse UniGene databases and identified 13 novel genes as being putatively testis-specific or -predominant. Our in silico and in vitro analyses revealed that the expressions of these genes are testis- and germ cell-specific, and that they are regulated in a stage-specific manner during spermatogenesis. We generated antibodies against the proteins encoded by seven of the genes to facilitate their characterization in male germ cells. Immunoblotting and immunofluorescence analyses revealed that one of these proteins was expressed only in testicular germ cells, three were expressed in both testicular germ cells and testicular sperm, and the remaining three were expressed in sperm of the testicular stages and in mature sperm from the epididymis. Further analysis of the latter three proteins showed that they were all associated with cytoskeletal structures in the sperm flagellum. Among them, MORN5, which is predicted to contain three MORN motifs, is conserved between mouse and human sperm. In conclusion, we herein identify 13 authentic genes with male germ cell-specific expression, and provide comprehensive information about these genes and their encoded products. Our finding will facilitate future investigations into the functional roles of these novel genes in spermatogenesis and sperm functions.

  4. Coupled transcriptome and proteome analysis of human lymphotropic tumor viruses: insights on the detection and discovery of viral genes

    Directory of Open Access Journals (Sweden)

    Dresang Lindsay R

    2011-12-01

    Background: Kaposi's sarcoma-associated herpesvirus (KSHV) and Epstein-Barr virus (EBV) are related human tumor viruses that cause primary effusion lymphomas (PEL) and Burkitt's lymphomas (BL), respectively. Viral genes expressed in naturally-infected cancer cells contribute to disease pathogenesis; knowing which viral genes are expressed is critical in understanding how these viruses cause cancer. To evaluate the expression of viral genes, we used high-resolution separation and mass spectrometry coupled with custom tiling arrays to align the viral proteomes and transcriptomes of three PEL and two BL cell lines under latent and lytic culture conditions. Results: The majority of viral genes were efficiently detected at the transcript and/or protein level on manipulating the viral life cycle. Overall the correlation of expressed viral proteins and transcripts was highly complementary in both validating and providing orthogonal data with latent/lytic viral gene expression. Our approach also identified novel viral genes in both KSHV and EBV, and extends viral genome annotation. Several previously uncharacterized genes were validated at both transcript and protein levels. Conclusions: This systems biology approach coupling proteome and transcriptome measurements provides a comprehensive view of viral gene expression that could not have been attained using each methodology independently. Detection of viral proteins in combination with viral transcripts is a potentially powerful method for establishing virus-disease relationships.

  5. QTL mapping and candidate gene discovery in potato for resistance to the Verticillium wilt pathogen Verticillium dahliae

    Science.gov (United States)

    Verticillium wilt (VW) of potato (Solanum tuberosum), caused by fungal pathogens, Verticillium dahliae and V. albo atrum, is a disease of major significance throughout the potato growing regions in the world. In the past, researchers have focused on the Ve gene, which is a major dominant gene that c...

  6. Combining SNP discovery from next-generation sequencing data with bulked segregant analysis (BSA) to fine-map genes in polyploid wheat

    Directory of Open Access Journals (Sweden)

    Trick Martin

    2012-01-01

    Background: Next generation sequencing (NGS) technologies are providing new ways to accelerate fine-mapping and gene isolation in many species. To date, the majority of these efforts have focused on diploid organisms with readily available whole genome sequence information. In this study, as a proof of concept, we tested the use of NGS for SNP discovery in tetraploid wheat lines differing for the previously cloned grain protein content (GPC) gene GPC-B1. Bulked segregant analysis (BSA) was used to define a subset of putative SNPs within the candidate gene region, which were then used to fine-map GPC-B1. Results: We used Illumina paired end technology to sequence mRNA (RNAseq) from near isogenic lines differing across a ~30-cM interval including the GPC-B1 locus. After discriminating for SNPs between the two homoeologous wheat genomes and additional quality filtering, we identified inter-varietal SNPs in wheat unigenes between the parental lines. The relative frequency of these SNPs was examined by RNAseq in two bulked samples made up of homozygous recombinant lines differing for their GPC phenotype. SNPs that were enriched at least 3-fold in the corresponding pool (6.5% of all SNPs) were further evaluated. Marker assays were designed for a subset of the enriched SNPs and mapped using DNA from individuals of each bulk. Thirty nine new SNP markers, corresponding to 67% of the validated SNPs, mapped across a 12.2-cM interval including GPC-B1. This translated to 1 SNP marker per 0.31 cM defining the GPC-B1 gene to within 13-18 genes in syntenic cereal genomes and to a 0.4 cM interval in wheat. Conclusions: This study exemplifies the use of RNAseq for SNP discovery in polyploid species and supports the use of BSA as an effective way to target SNPs to specific genetic intervals to fine-map genes in unsequenced genomes.
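
    The 3-fold enrichment filter and the marker density quoted above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: enrichment is taken here as the ratio of a SNP allele's frequency between the two bulks, the frequency floor is an arbitrary guard against division by zero, and the SNP names and frequencies are hypothetical.

```python
def fold_enrichment(freq_a: float, freq_b: float, floor: float = 0.01) -> float:
    """Ratio of allele frequency in bulk A to bulk B (floored to avoid 0 division)."""
    return max(freq_a, floor) / max(freq_b, floor)


def enriched_snps(snps: dict, min_fold: float = 3.0) -> list:
    """Keep SNPs whose allele is at least `min_fold` enriched in either bulk."""
    return [
        name
        for name, (f_a, f_b) in snps.items()
        if fold_enrichment(f_a, f_b) >= min_fold
        or fold_enrichment(f_b, f_a) >= min_fold
    ]


# Hypothetical allele frequencies: (high-GPC bulk, low-GPC bulk).
snps = {
    "snp_a": (0.90, 0.10),  # strongly enriched in the first bulk
    "snp_b": (0.55, 0.45),  # roughly balanced, likely unlinked
    "snp_c": (0.20, 0.75),  # enriched in the second bulk
}
selected = enriched_snps(snps)
print(selected)

# Marker density from the abstract: 39 markers across a 12.2-cM interval.
cm_per_marker = 12.2 / 39
print(f"{cm_per_marker:.2f} cM per marker")
```

    SNPs linked to the target locus are expected to show skewed frequencies between bulks, whereas unlinked SNPs should sit near equal frequency in both.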

  7. Discovery and Development of Synthetic and Natural Biomaterials for Protein Therapeutics and Medical Device Applications

    Science.gov (United States)

    Keefe, Andrew J.

    Controlling nonspecific protein interactions is important for applications from medical devices to protein therapeutics. The presented work is a compilation of efforts aimed at using zwitterionic (ionic yet charge neutral) polymers to modify and stabilize the surface of sensitive biomedical and biological materials. Traditionally, when modifying the surface of a material, the stability of the underlying substrate. The materials modified in this dissertation are unique due to their unconventional amorphous characteristics which provide additional challenges. These are poly(dimethyl siloxane) (PDMS) rubber, and proteins. These materials may seem dissimilar, but both have amorphous surfaces that do not respond well to chemical modification. PDMS is a biomaterial extensively used in medical device manufacturing, but experiences unacceptably high levels of non-specific protein fouling when used with biological samples. To reduce protein fouling, surface modification is often needed. Unfortunately, conventional surface modification methods, such as poly(ethylene glycol) (PEG) coatings, do not work for PDMS due to its amorphous state. Herein, we demonstrate how a superhydrophilic zwitterionic material, poly(carboxybetaine methacrylate) (pCBMA), can provide a highly stable nonfouling coating with long term stability due to the sharp contrast in hydrophobicity between pCBMA and PDMS. Biological materials, such as proteins, also require stabilization to improve shelf life, circulation time, and bioactivity. Conjugation of proteins with PEG is often used to increase protein stability, but has a detrimental effect on bioactivity. Here we have shown that pCBMA conjugation improves stability in a similar fashion to PEG, but also retains, or even improves, binding affinity due to enhanced protein-substrate hydrophobic interactions. Recognizing that pCBMA chemically resembles the combination of lysine (K) and glutamic acid (E) amino acids, we have shown how zwitterionic

  8. Drug discovery in a multidimensional world: systems, patterns, and networks.

    Science.gov (United States)

    Dudley, Joel T; Schadt, Eric; Sirota, Marina; Butte, Atul J; Ashley, Euan

    2010-10-01

    Despite great strides in revealing and understanding the physiological and molecular bases of cardiovascular disease, efforts to translate this understanding into needed therapeutic interventions continue to lag far behind the initial discoveries. Although pharmaceutical companies continue to increase investments into research and development, the number of drugs gaining federal approval is in decline. Many factors underlie these trends, and a vast number of technological and scientific innovations are being sought through efforts to reinvigorate drug discovery pipelines. Recent advances in molecular profiling technologies and development of sophisticated computational approaches for analyzing these data are providing new, systems-oriented approaches towards drug discovery. Unlike the traditional approach to drug discovery which is typified by a one-drug-one-target mindset, systems-oriented approaches to drug discovery leverage the parallelism and high-dimensionality of the molecular data to construct more comprehensive molecular models that aim to model broader biomolecular systems. These models offer a means to explore complex molecular states (e.g., disease) where thousands to millions of molecular entities comprising multiple molecular data types (e.g., proteomics and gene expression) can be evaluated simultaneously as components of a cohesive biomolecular system. In this paper, we discuss emerging approaches towards systems-oriented drug discovery and contrast these efforts with the traditional, unidimensional approach to drug discovery. We also highlight several applications of these system-oriented approaches across various aspects of drug discovery, including target discovery, drug repositioning and drug toxicity. When available, specific applications to cardiovascular drug discovery are highlighted and discussed.

  9. Discovery of genes related to witches broom disease in Paulownia tomentosa × Paulownia fortunei by a De Novo assembled transcriptome.

    Directory of Open Access Journals (Sweden)

    Rongning Liu

    In spite of its economic importance, very little molecular genetics and genomic research has been targeted at the family Paulownia spp. The little genetic information on this plant is a big obstacle to studying the mechanisms of its ability to resist Paulownia Witches' Broom (PaWB) disease. Analysis of the Paulownia transcriptome and its expression profile data is essential to extending the genetic resources on this species, and will thus greatly improve our studies on Paulownia. In the current study, we performed the de novo assembly of a transcriptome on P. tomentosa × P. fortunei using the short-read sequencing technology (Illumina). A total of 203,664 unigenes with a mean length of 1,328 bp were obtained. Of these unigenes, 32,976 (30% of all unigenes) containing complete structures were chosen. Eukaryotic clusters of orthologous groups, gene orthology, and the Kyoto Encyclopedia of Genes and Genomes annotations were performed on these unigenes. Genes related to PaWB disease resistance were analyzed in detail. To our knowledge, this is the first study to elucidate the genetic makeup of Paulownia. This transcriptome provides a quick way to understand Paulownia, increases the number of gene sequences available for further functional genomics studies and provides clues to the identification of potential PaWB disease resistance genes. This study has provided a comprehensive insight into gene expression profiles at different states, which facilitates the study of each gene's roles in the developmental process and in PaWB disease resistance.

  10. Discovery of genes related to witches broom disease in Paulownia tomentosa × Paulownia fortunei by a De Novo assembled transcriptome.

    Science.gov (United States)

    Liu, Rongning; Dong, Yanpeng; Fan, Guoqiang; Zhao, Zhenli; Deng, Minjie; Cao, Xibing; Niu, Suyan

    2013-01-01

    In spite of its economic importance, very little molecular genetics and genomic research has been targeted at the family Paulownia spp. The little genetic information on this plant is a big obstacle to studying the mechanisms of its ability to resist Paulownia Witches' Broom (PaWB) disease. Analysis of the Paulownia transcriptome and its expression profile data is essential to extending the genetic resources on this species, and will thus greatly improve our studies on Paulownia. In the current study, we performed the de novo assembly of a transcriptome on P. tomentosa × P. fortunei using the short-read sequencing technology (Illumina). A total of 203,664 unigenes with a mean length of 1,328 bp were obtained. Of these unigenes, 32,976 (30% of all unigenes) containing complete structures were chosen. Eukaryotic clusters of orthologous groups, gene orthology, and the Kyoto Encyclopedia of Genes and Genomes annotations were performed on these unigenes. Genes related to PaWB disease resistance were analyzed in detail. To our knowledge, this is the first study to elucidate the genetic makeup of Paulownia. This transcriptome provides a quick way to understand Paulownia, increases the number of gene sequences available for further functional genomics studies and provides clues to the identification of potential PaWB disease resistance genes. This study has provided a comprehensive insight into gene expression profiles at different states, which facilitates the study of each gene's roles in the developmental process and in PaWB disease resistance.

  11. Pigmentation in sand pear (Pyrus pyrifolia) fruit: biochemical characterization, gene discovery and expression analysis with exocarp pigmentation mutant.

    Science.gov (United States)

    Wang, Yue-zhi; Zhang, Shujun; Dai, Mei-song; Shi, Ze-bin

    2014-05-01

    Exocarp color of sand pear is an important trait for fruit production and has long been of interest. Our previous study explored the differentially expressed genes between two genotypes contrasting for exocarp color, which indicated differences in suberin, cutin, wax and lignin biosynthesis between the russet and green exocarp. In this study, we carried out microscopic observation and Fourier transform infrared spectroscopy analysis to detect the differences in tissue structure and biochemical composition between the russet and green exocarp of sand pear. The green exocarp was covered with epidermis and cuticle, which was replaced by a cork layer on the surface of the russet exocarp, and the chemicals of the russet exocarp were characterized by lignin, cellulose and hemicellulose. We explored differential gene expression between the russet exocarp of 'Niitaka' and its green exocarp mutant cv. 'Suisho' using Illumina RNA-sequencing. A total of 559 unigenes showed different expression between the two types of exocarp, and 123 of them were common to the previous study. Quantitative real-time PCR analysis supported the RNA-seq-derived differentially expressed genes between the two types of exocarp and revealed preferential expression of these genes in the exocarp rather than in the mesocarp and fruit core. Gene ontology enrichment analysis revealed divergent expression of lipid metabolic process genes, transport genes, stress responsive genes and other biological process genes in the two types of exocarp. Expression changes in lignin metabolism-related genes were consistent with the different pigmentation of russet and green exocarp. Increased transcripts of putative genes involved in suberin, cutin and wax biosynthesis in 'Suisho' exocarp could facilitate deposition of these compounds and play a role in the mutant trait responsible for the green exocarp. In addition, the divergent expression of ATP-binding cassette transporters involved in the trans

  12. Meta4: a web application for sharing and annotating metagenomic gene predictions using web services.

    Science.gov (United States)

    Richardson, Emily J; Escalettes, Franck; Fotheringham, Ian; Wallace, Robert J; Watson, Mick

    2013-01-01

    Whole-genome shotgun metagenomics experiments produce DNA sequence data from entire ecosystems, and provide a huge amount of novel information. Gene discovery projects require up-to-date information about sequence homology and domain structure for millions of predicted proteins to be presented in a simple, easy-to-use system. There is a lack of simple, open, flexible tools that allow the rapid sharing of metagenomics datasets with collaborators in a format they can easily interrogate. We present Meta4, a flexible and extensible web application that can be used to share and annotate metagenomic gene predictions. Proteins and predicted domains are stored in a simple relational database, with a dynamic front-end which displays the results in an internet browser. Web services are used to provide up-to-date information about the proteins from homology searches against public databases. Information about Meta4 can be found on the project website, code is available on Github, a cloud image is available, and an example implementation can be seen at.

  13. A general co-expression network-based approach to gene expression analysis: comparison and applications

    Directory of Open Access Journals (Sweden)

    Zhang Weixiong

    2010-02-01

    Full Text Available Abstract Background: Co-expression network-based approaches have become popular in analyzing microarray data, such as for detecting functional gene modules. However, co-expression networks are often constructed by ad hoc methods, and network-based analyses have not been shown to outperform conventional cluster analyses, partially due to the lack of an unbiased evaluation metric. Results: Here, we develop a general co-expression network-based approach for analyzing both genes and samples in microarray data. Our approach consists of a simple but robust rank-based network construction method, a parameter-free module discovery algorithm and a novel reference network-based metric for module evaluation. We report some interesting topological properties of rank-based co-expression networks that are very different from those of value-based networks in the literature. Using a large set of synthetic and real microarray data, we demonstrate the superior performance of our approach over several popular existing algorithms. Applications of our approach to yeast, Arabidopsis and human cancer microarray data reveal many interesting modules, including a fatal subtype of lymphoma and a gene module regulating yeast telomere integrity, which were missed by the existing methods. Conclusions: We demonstrated that our novel approach is very effective in discovering the modular structures in microarray data, both for genes and for samples. As the method is essentially parameter-free, it may be applied to large data sets where the number of clusters is difficult to estimate. The method is also very general and can be applied to other types of data. A MATLAB implementation of our algorithm can be downloaded from http://cs.utsa.edu/~jruan/Software.html.
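
    The abstract above contrasts rank-based with value-based network construction but does not spell out the rule. As a rough illustration only, the sketch below assumes one common reading: each gene is linked to the `d` genes whose expression profiles correlate most strongly with its own, instead of applying a fixed correlation cutoff. The function names, the parameter `d` and the toy data are hypothetical and are not taken from the authors' MATLAB implementation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def rank_based_network(expr, d=1):
    """Link each gene to its d most correlated genes (assumed rule).

    expr maps a gene name to its list of expression values.
    Returns a set of undirected edges, each a frozenset of two genes.
    """
    edges = set()
    for gene in expr:
        neighbours = sorted(
            (other for other in expr if other != gene),
            key=lambda other: pearson(expr[gene], expr[other]),
            reverse=True,
        )
        for other in neighbours[:d]:
            edges.add(frozenset((gene, other)))
    return edges


# Toy data (hypothetical): g1/g2 rise together, g3/g4 fall together.
toy = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8.5],
    "g3": [4, 3, 2, 1],
    "g4": [8, 6, 4, 2.5],
}
network = rank_based_network(toy, d=1)
```

    Because every gene contributes the same number of outgoing links, weakly co-expressed modules are not lost to a global threshold; that property is the usual motivation for a rank-based rule, stated here as background and not as a claim from the abstract.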

  14. De novo assembly and discovery of genes that are involved in drought tolerance in Tibetan Sophora moorcroftiana.

    Directory of Open Access Journals (Sweden)

    Huie Li

    Full Text Available Sophora moorcroftiana, a Leguminosae shrub species that is restricted to the arid and semi-arid regions of the Qinghai-Tibet Plateau, is an ecologically important foundation species and exhibits substantial drought tolerance in the Plateau. There are no functional genomics resources in public databases for understanding the molecular mechanism underlying the drought tolerance of S. moorcroftiana. Therefore, we performed a large-scale transcriptome sequencing of this species under drought stress using the Illumina sequencing technology. A total of 62,348,602 clean reads were obtained. The assembly of the clean reads resulted in 146,943 transcripts, including 66,026 unigenes. In the assembled sequences, 1534 transcription factors were identified and classified into 23 different common families, and 9040 SSR loci, from di- to hexa-nucleotides, whose repeat number is greater than five, were presented. In addition, we performed a gene expression profiling analysis upon dehydration treatment. The results indicated significant differences in the gene expression profiles among the control, mild stress and severe stress. In total, 4687, 5648 and 5735 genes were identified from the comparison of mild versus control, severe versus control and severe versus mild stress, respectively. Based on the differentially expressed genes, a Gene Ontology annotation analysis indicated many dehydration-relevant categories, including 'response to water stimulus' and 'response to water deprivation'. Meanwhile, the Kyoto Encyclopedia of Genes and Genomes pathway analysis uncovered some important pathways, such as 'metabolic pathways' and 'plant hormone signal transduction'. In addition, the expression patterns of 25 putative genes that are involved in drought tolerance resulting from quantitative real-time PCR were consistent with their transcript abundance changes as identified by RNA-seq. The globally sequenced genes covered a considerable proportion of the S

  15. De novo assembly and discovery of genes that are involved in drought tolerance in Tibetan Sophora moorcroftiana.

    Science.gov (United States)

    Li, Huie; Yao, Weijie; Fu, Yaru; Li, Shaoke; Guo, Qiqiang

    2015-01-01

    Sophora moorcroftiana, a Leguminosae shrub species that is restricted to the arid and semi-arid regions of the Qinghai-Tibet Plateau, is an ecologically important foundation species and exhibits substantial drought tolerance in the Plateau. There are no functional genomics resources in public databases for understanding the molecular mechanism underlying the drought tolerance of S. moorcroftiana. Therefore, we performed a large-scale transcriptome sequencing of this species under drought stress using the Illumina sequencing technology. A total of 62,348,602 clean reads were obtained. The assembly of the clean reads resulted in 146,943 transcripts, including 66,026 unigenes. In the assembled sequences, 1534 transcription factors were identified and classified into 23 different common families, and 9040 SSR loci, from di- to hexa-nucleotides, whose repeat number is greater than five, were presented. In addition, we performed a gene expression profiling analysis upon dehydration treatment. The results indicated significant differences in the gene expression profiles among the control, mild stress and severe stress. In total, 4687, 5648 and 5735 genes were identified from the comparison of mild versus control, severe versus control and severe versus mild stress, respectively. Based on the differentially expressed genes, a Gene Ontology annotation analysis indicated many dehydration-relevant categories, including 'response to water stimulus' and 'response to water deprivation'. Meanwhile, the Kyoto Encyclopedia of Genes and Genomes pathway analysis uncovered some important pathways, such as 'metabolic pathways' and 'plant hormone signal transduction'. In addition, the expression patterns of 25 putative genes that are involved in drought tolerance resulting from quantitative real-time PCR were consistent with their transcript abundance changes as identified by RNA-seq. The globally sequenced genes covered a considerable proportion of the S. moorcroftiana transcriptome
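
    The SSR criterion quoted in this abstract (di- to hexa-nucleotide motifs with a repeat number greater than five) can be illustrated with a small regular-expression scan. This is a minimal sketch under that stated criterion only; the abstract does not say which tool the authors used, and the function name and test sequence below are hypothetical.

```python
import re

# A motif of 2-6 bases followed by at least five further copies,
# i.e. more than five repeats in total. The lazy quantifier makes the
# scan prefer the shortest motif that explains the repeat.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{5,}")


def find_ssrs(sequence):
    """Return (start, motif, copies) for each SSR found in a DNA string."""
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            # A run such as AAAAAAAAAAAA is a mononucleotide repeat,
            # which falls outside the di- to hexa-nucleotide range.
            continue
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


# Hypothetical sequence with one (AG)6 repeat flanked by other bases.
example = "TTT" + "AG" * 6 + "CCC"
ssrs = find_ssrs(example)
```

    Dedicated SSR finders typically add per-motif-length thresholds and handle compound repeats; the single pattern here is only meant to make the abstract's selection rule concrete.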

  16. Discovery of genes related to insecticide resistance in Bactrocera dorsalis by functional genomic analysis of a de novo assembled transcriptome.

    Directory of Open Access Journals (Sweden)

    Ju-Chun Hsu

    Full Text Available Insecticide resistance has recently become a critical concern for control of many insect pest species. Genome sequencing and global quantization of gene expression through analysis of the transcriptome can provide useful information relevant to this challenging problem. The oriental fruit fly, Bactrocera dorsalis, is one of the world's most destructive agricultural pests, and recently it has been used as a target for studies of genetic mechanisms related to insecticide resistance. However, prior to this study, the molecular data available for this species was largely limited to genes identified through homology. To provide a broader pool of gene sequences of potential interest with regard to insecticide resistance, this study uses whole transcriptome analysis developed through de novo assembly of short reads generated by next-generation sequencing (NGS). The transcriptome of B. dorsalis was initially constructed using Illumina's Solexa sequencing technology. Qualified reads were assembled into contigs and potential splicing variants (isotigs). A total of 29,067 isotigs have putative homologues in the non-redundant (nr) protein database from NCBI, and 11,073 of these correspond to distinct D. melanogaster proteins in the RefSeq database. Approximately 5,546 isotigs contain coding sequences that are at least 80% complete and appear to represent B. dorsalis genes. We observed a strong correlation between the completeness of the assembled sequences and the expression intensity of the transcripts. The assembled sequences were also used to identify large numbers of genes potentially belonging to families related to insecticide resistance. A total of 90 P450-, 42 GST- and 37 COE-related genes, representing three major enzyme families involved in insecticide metabolism and resistance, were identified. In addition, 36 isotigs were discovered to contain target site sequences related to four classes of resistance genes. Identified sequence motifs were also

  17. Discovery of Genes Related to Insecticide Resistance in Bactrocera dorsalis by Functional Genomic Analysis of a De Novo Assembled Transcriptome

    Science.gov (United States)

    Hsu, Ju-Chun; Wu, Wen-Jer; Feng, Hai-Tung; Haymer, David S.; Chen, Chien-Yu

    2012-01-01

    Insecticide resistance has recently become a critical concern for control of many insect pest species. Genome sequencing and global quantization of gene expression through analysis of the transcriptome can provide useful information relevant to this challenging problem. The oriental fruit fly, Bactrocera dorsalis, is one of the world's most destructive agricultural pests, and recently it has been used as a target for studies of genetic mechanisms related to insecticide resistance. However, prior to this study, the molecular data available for this species was largely limited to genes identified through homology. To provide a broader pool of gene sequences of potential interest with regard to insecticide resistance, this study uses whole transcriptome analysis developed through de novo assembly of short reads generated by next-generation sequencing (NGS). The transcriptome of B. dorsalis was initially constructed using Illumina's Solexa sequencing technology. Qualified reads were assembled into contigs and potential splicing variants (isotigs). A total of 29,067 isotigs have putative homologues in the non-redundant (nr) protein database from NCBI, and 11,073 of these correspond to distinct D. melanogaster proteins in the RefSeq database. Approximately 5,546 isotigs contain coding sequences that are at least 80% complete and appear to represent B. dorsalis genes. We observed a strong correlation between the completeness of the assembled sequences and the expression intensity of the transcripts. The assembled sequences were also used to identify large numbers of genes potentially belonging to families related to insecticide resistance. A total of 90 P450-, 42 GST- and 37 COE-related genes, representing three major enzyme families involved in insecticide metabolism and resistance, were identified. In addition, 36 isotigs were discovered to contain target site sequences related to four classes of resistance genes. Identified sequence motifs were also analyzed to

  18. De novo assembly, gene annotation, and marker discovery in stored-product pest Liposcelis entomophila (Enderlein) using transcriptome sequences.

    Directory of Open Access Journals (Sweden)

    Dan-Dan Wei

    Full Text Available BACKGROUND: As a major stored-product pest insect, Liposcelis entomophila has developed high levels of resistance to various insecticides in grain storage systems. However, the molecular mechanisms underlying resistance and environmental stress have not been characterized. To date, there is a lack of genomic information for this species. Therefore, studies aimed at profiling the L. entomophila transcriptome would provide a better understanding of the biological functions at the molecular level. METHODOLOGY/PRINCIPAL FINDINGS: We applied Illumina sequencing technology to sequence the transcriptome of L. entomophila. A total of 54,406,328 clean reads were obtained and de novo assembled into 54,220 unigenes, with an average length of 571 bp. Through a similarity search, 33,404 (61.61%) unigenes were matched to known proteins in the NCBI non-redundant (Nr) protein database. These unigenes were further functionally annotated with gene ontology (GO), cluster of orthologous groups of proteins (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. A large number of genes potentially involved in insecticide resistance were manually curated, including 68 putative cytochrome P450 genes, 37 putative glutathione S-transferase (GST) genes, 19 putative carboxyl/cholinesterase (CCE) genes, and 126 other transcripts containing target site sequences or encoding detoxification genes representing eight types of resistance enzymes. Furthermore, to gain insight into the molecular basis of the L. entomophila response to thermal stresses, 25 heat shock protein (Hsp) genes were identified. In addition, 1,100 SSRs and 57,757 SNPs were detected and 231 pairs of SSR primers were designed for investigating genetic diversity in the future. CONCLUSIONS/SIGNIFICANCE: We developed a comprehensive transcriptomic database for L. entomophila. These sequences and putative molecular markers would further promote our understanding of the molecular mechanisms underlying

  19. Sleeping Beauty Transposon Mutagenesis as a Tool for Gene Discovery in the NOD Mouse Model of Type 1 Diabetes

    Science.gov (United States)

    Elso, Colleen M.; Chu, Edward P. F.; Alsayb, May A.; Mackin, Leanne; Ivory, Sean T.; Ashton, Michelle P.; Bröer, Stefan; Silveira, Pablo A.; Brodnicki, Thomas C.

    2015-01-01

    A number of different strategies have been used to identify genes for which genetic variation contributes to type 1 diabetes (T1D) pathogenesis. Genetic studies in humans have identified >40 loci that affect the risk for developing T1D, but the underlying causative alleles are often difficult to pinpoint or have subtle biological effects. A complementary strategy to identifying “natural” alleles in the human population is to engineer “artificial” alleles within inbred mouse strains and determine their effect on T1D incidence. We describe the use of the Sleeping Beauty (SB) transposon mutagenesis system in the nonobese diabetic (NOD) mouse strain, which harbors a genetic background predisposed to developing T1D. Mutagenesis in this system is random, but a green fluorescent protein (GFP)-polyA gene trap within the SB transposon enables early detection of mice harboring transposon-disrupted genes. The SB transposon also acts as a molecular tag to, without additional breeding, efficiently identify mutated genes and prioritize mutant mice for further characterization. We show here that the SB transposon is functional in NOD mice and can produce a null allele in a novel candidate gene that increases diabetes incidence. We propose that SB transposon mutagenesis could be used as a complementary strategy to traditional methods to help identify genes that, when disrupted, affect T1D pathogenesis. PMID:26438296

  20. Accelerating Novel Candidate Gene Discovery in Neurogenetic Disorders via Whole-Exome Sequencing of Prescreened Multiplex Consanguineous Families

    Directory of Open Access Journals (Sweden)

    Anas M. Alazami

    2015-01-01

    Full Text Available Our knowledge of disease genes in neurological disorders is incomplete. With the aim of closing this gap, we performed whole-exome sequencing on 143 multiplex consanguineous families in whom known disease genes had been excluded by autozygosity mapping and candidate gene analysis. This prescreening step led to the identification of 69 recessive genes not previously associated with disease, of which 33 are here described (SPDL1, TUBA3E, INO80, NID1, TSEN15, DMBX1, CLHC1, C12orf4, WDR93, ST7, MATN4, SEC24D, PCDHB4, PTPN23, TAF6, TBCK, FAM177A1, KIAA1109, MTSS1L, XIRP1, KCTD3, CHAF1B, ARV1, ISCA2, PTRH2, GEMIN4, MYOCD, PDPR, DPH1, NUP107, TMEM92, EPB41L4A, and FAM120AOS). We also encountered instances in which the phenotype departed significantly from the established clinical presentation of a known disease gene. Overall, a likely causal mutation was identified in >73% of our cases. This study contributes to the global effort toward a full compendium of disease genes affecting brain function.

  1. Accelerating novel candidate gene discovery in neurogenetic disorders via whole-exome sequencing of prescreened multiplex consanguineous families.

    Science.gov (United States)

    Alazami, Anas M; Patel, Nisha; Shamseldin, Hanan E; Anazi, Shamsa; Al-Dosari, Mohammed S; Alzahrani, Fatema; Hijazi, Hadia; Alshammari, Muneera; Aldahmesh, Mohammed A; Salih, Mustafa A; Faqeih, Eissa; Alhashem, Amal; Bashiri, Fahad A; Al-Owain, Mohammed; Kentab, Amal Y; Sogaty, Sameera; Al Tala, Saeed; Temsah, Mohamad-Hani; Tulbah, Maha; Aljelaify, Rasha F; Alshahwan, Saad A; Seidahmed, Mohammed Zain; Alhadid, Adnan A; Aldhalaan, Hesham; AlQallaf, Fatema; Kurdi, Wesam; Alfadhel, Majid; Babay, Zainab; Alsogheer, Mohammad; Kaya, Namik; Al-Hassnan, Zuhair N; Abdel-Salam, Ghada M H; Al-Sannaa, Nouriya; Al Mutairi, Fuad; El Khashab, Heba Y; Bohlega, Saeed; Jia, Xiaofei; Nguyen, Henry C; Hammami, Rakad; Adly, Nouran; Mohamed, Jawahir Y; Abdulwahab, Firdous; Ibrahim, Niema; Naim, Ewa A; Al-Younes, Banan; Meyer, Brian F; Hashem, Mais; Shaheen, Ranad; Xiong, Yong; Abouelhoda, Mohamed; Aldeeri, Abdulrahman A; Monies, Dorota M; Alkuraya, Fowzan S

    2015-01-13

    Our knowledge of disease genes in neurological disorders is incomplete. With the aim of closing this gap, we performed whole-exome sequencing on 143 multiplex consanguineous families in whom known disease genes had been excluded by autozygosity mapping and candidate gene analysis. This prescreening step led to the identification of 69 recessive genes not previously associated with disease, of which 33 are here described (SPDL1, TUBA3E, INO80, NID1, TSEN15, DMBX1, CLHC1, C12orf4, WDR93, ST7, MATN4, SEC24D, PCDHB4, PTPN23, TAF6, TBCK, FAM177A1, KIAA1109, MTSS1L, XIRP1, KCTD3, CHAF1B, ARV1, ISCA2, PTRH2, GEMIN4, MYOCD, PDPR, DPH1, NUP107, TMEM92, EPB41L4A, and FAM120AOS). We also encountered instances in which the phenotype departed significantly from the established clinical presentation of a known disease gene. Overall, a likely causal mutation was identified in >73% of our cases. This study contributes to the global effort toward a full compendium of disease genes affecting brain function.

  2. Gene discovery from Jatropha curcas by sequencing of ESTs from normalized and full-length enriched cDNA library from developing seeds

    Directory of Open Access Journals (Sweden)

    Sugantham Priyanka Annabel

    2010-10-01

    Full Text Available Abstract Background: Jatropha curcas L. is promoted as an important non-edible biodiesel crop worldwide. Jatropha oil, which is a triacylglycerol, can be directly blended with petro-diesel or transesterified with methanol and used as biodiesel. Genetic improvement in jatropha is needed to increase the seed yield, oil content, drought and pest resistance, and to modify oil composition so that it becomes a technically and economically preferred source for biodiesel production. However, genetic improvement efforts in jatropha could not take advantage of genetic engineering methods due to the lack of cloned genes from this species. To overcome this hurdle, the current gene discovery project was initiated with the objective of isolating as many functional genes as possible from J. curcas by large-scale sequencing of expressed sequence tags (ESTs). Results: A normalized and full-length enriched cDNA library was constructed from developing seeds of J. curcas. The cDNA library contained about 1 × 10^6 clones and the average insert size of the clones was 2.1 kb. In total, 12,084 ESTs were sequenced to an average high-quality read length of 576 bp. Contig analysis revealed 2258 contigs and 4751 singletons. Contig size ranged from 2 to 23 ESTs and there were 7333 ESTs in the contigs. This resulted in 7009 unigenes, which were annotated by BLASTX. It showed 3982 unigenes with significant similarity to known genes and 2836 unigenes with significant similarity to genes of unknown, hypothetical and putative proteins. The remaining 191 unigenes, which did not show similarity with any genes in the public database, may encode unique genes. Functional classification revealed unigenes related to a broad range of cellular, molecular and biological functions. Among the 7009 unigenes, 6233 unigenes were identified to be potential full-length genes. Conclusions: The high-quality normalized cDNA library was constructed from developing seeds of J. curcas for the first time and 7009 unigenes coding
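
    The EST tallies in this abstract can be cross-checked with simple arithmetic. The sketch below uses only the counts quoted above; the "redundancy" figure is an illustrative derived quantity that the abstract itself does not report, and the variable names are hypothetical.

```python
# Counts quoted in the abstract.
ESTS_SEQUENCED = 12084
CONTIGS = 2258
SINGLETONS = 4751
ESTS_IN_CONTIGS = 7333
SIMILAR_TO_KNOWN = 3982
SIMILAR_TO_UNCHARACTERIZED = 2836
NO_SIMILARITY = 191


def unigene_count(contigs, singletons):
    """Each contig and each singleton counts as one unigene."""
    return contigs + singletons


unigenes = unigene_count(CONTIGS, SINGLETONS)

# Every sequenced EST is either inside a contig or left as a singleton.
ests_accounted_for = ESTS_IN_CONTIGS + SINGLETONS

# The three BLASTX annotation classes should partition the unigenes.
annotated_total = SIMILAR_TO_KNOWN + SIMILAR_TO_UNCHARACTERIZED + NO_SIMILARITY

# Illustrative only: fraction of ESTs that collapsed into existing unigenes.
redundancy = 1 - unigenes / ESTS_SEQUENCED

print(unigenes, ests_accounted_for, annotated_total, f"{redundancy:.1%}")
```

    All three totals agree with the figures in the abstract (7009 unigenes from 12,084 ESTs), which suggests the reported counts are internally consistent.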

  3. APPLICATION OF GENETIC DEAFNESS GENE CHIP FOR DETECTION OF GENE MUTATION OF DEAFNESS IN PREGNANT WOMEN

    Institute of Scientific and Technical Information of China (English)

    CHANG Liang; ZHONG Su; ZHAO Nan; LIU Ping; ZHAO Yangyu; QIAO Jie

    2014-01-01

    Objective: This study aimed to identify the carrier rate of common deafness mutations in Chinese pregnant women by detecting deafness gene mutations with a gene chip. Methods: Pregnant women attending an obstetric clinic without hearing impairment or a family history of hearing disorders were selected. Informed consent was signed. Peripheral blood was taken to extract genomic DNA. A genetic deafness gene chip was applied to detect 9 mutational hot spots in the 4 most common Chinese deafness genes, namely GJB2 (35delG, 176del16bp, 235delC, 299delAT), GJB3 (C538T), SLC26A4 (IVS7-2A>G, A2168G) and mitochondrial DNA 12S rRNA (A1555G, C1494T). Further genetic testing was provided to the spouses and newborns of the screened carriers. Results: Peripheral blood from 430 pregnant women was tested, and deafness gene mutation carriers were detected in 24 cases (4.2%), including 13 cases of GJB2 heterozygous mutation, 3 cases of SLC26A4 heterozygous mutation, 1 case of GJB3 heterozygous mutation, and 1 case of mitochondrial 12S rRNA mutation. 18 spouses and 17 newborns took further genetic tests, and 6 newborns inherited the mutation from their mother. Conclusion: Common deafness gene mutations have a high carrier rate in pregnant women; 235delC and IVS7-2A>G heterozygous mutations are common.

  4. Application of a high throughput method of biomarker discovery to improvement of the EarlyCDT(®)-Lung Test.

    Directory of Open Access Journals (Sweden)

    Isabel K Macdonald

    Full Text Available BACKGROUND: The National Lung Screening Trial showed that CT screening for lung cancer led to a 20% reduction in mortality. However, CT screening has a number of disadvantages including low specificity. A validated autoantibody assay is available commercially (EarlyCDT®-Lung) to aid in the early detection of lung cancer and risk stratification in patients with pulmonary nodules detected by CT. Recent advances in high throughput (HTP) cloning and expression methods have been developed into a discovery pipeline to identify biomarkers that detect autoantibodies. The aim of this study was to demonstrate the successful clinical application of this strategy to add to the EarlyCDT-Lung panel in order to improve its sensitivity and specificity (and hence positive predictive value, (PPV)). METHODS AND FINDINGS: Serum from two matched independent cohorts of lung cancer patients were used (n = 100 and n = 165). Sixty nine proteins were initially screened on an abridged HTP version of the autoantibody ELISA using protein prepared on small scale by a HTP expression and purification screen. Promising leads were produced in shake flask culture and tested on the full assay. These results were analyzed in combination with those from the EarlyCDT-Lung panel in order to provide a set of re-optimized cut-offs. Five proteins that still displayed cancer/normal differentiation were tested for reproducibility and validation on a second batch of protein and a separate patient cohort. Addition of these proteins resulted in an improvement in the sensitivity and specificity of the test from 38% and 86% to 49% and 93% respectively (PPV improvement from 1 in 16 to 1 in 7). CONCLUSION: This is a practical example of the value of investing resources to develop a HTP technology. Such technology may lead to improvement in the clinical utility of the EarlyCDT-Lung test, and so further aid the early detection of lung cancer.

  5. Application of a high throughput method of biomarker discovery to improvement of the EarlyCDT(®)-Lung Test.

    Science.gov (United States)

    Macdonald, Isabel K; Murray, Andrea; Healey, Graham F; Parsy-Kowalska, Celine B; Allen, Jared; McElveen, Jane; Robertson, Chris; Sewell, Herbert F; Chapman, Caroline J; Robertson, John F R

    2012-01-01

    The National Lung Screening Trial showed that CT screening for lung cancer led to a 20% reduction in mortality. However, CT screening has a number of disadvantages including low specificity. A validated autoantibody assay is available commercially (EarlyCDT®-Lung) to aid in the early detection of lung cancer and risk stratification in patients with pulmonary nodules detected by CT. Recent advances in high throughput (HTP) cloning and expression methods have been developed into a discovery pipeline to identify biomarkers that detect autoantibodies. The aim of this study was to demonstrate the successful clinical application of this strategy to add to the EarlyCDT-Lung panel in order to improve its sensitivity and specificity (and hence positive predictive value, (PPV)). Serum from two matched independent cohorts of lung cancer patients were used (n = 100 and n = 165). Sixty nine proteins were initially screened on an abridged HTP version of the autoantibody ELISA using protein prepared on small scale by a HTP expression and purification screen. Promising leads were produced in shake flask culture and tested on the full assay. These results were analyzed in combination with those from the EarlyCDT-Lung panel in order to provide a set of re-optimized cut-offs. Five proteins that still displayed cancer/normal differentiation were tested for reproducibility and validation on a second batch of protein and a separate patient cohort. Addition of these proteins resulted in an improvement in the sensitivity and specificity of the test from 38% and 86% to 49% and 93% respectively (PPV improvement from 1 in 16 to 1 in 7). This is a practical example of the value of investing resources to develop a HTP technology. Such technology may lead to improvement in the clinical utility of the EarlyCDT--Lung test, and so further aid the early detection of lung cancer.
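
    The link between the reported sensitivity/specificity figures and the "1 in 16" and "1 in 7" PPV values follows from Bayes' rule once a disease prevalence is fixed. The abstract does not state the prevalence used; the sketch below assumes a value of about 2.4%, chosen here only because it reproduces the quoted PPVs, so it should be read as a back-calculated assumption and not as a figure from the study.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Assumption: back-calculated prevalence, not stated in the abstract.
ASSUMED_PREVALENCE = 0.024

# Sensitivity and specificity before and after adding the new proteins,
# as quoted in the abstract.
ppv_before = ppv(0.38, 0.86, ASSUMED_PREVALENCE)
ppv_after = ppv(0.49, 0.93, ASSUMED_PREVALENCE)

print(f"before: about 1 in {1 / ppv_before:.0f}")
print(f"after:  about 1 in {1 / ppv_after:.0f}")
```

    At low prevalence the false-positive term dominates the denominator, which is why raising specificity from 86% to 93% contributes more to the PPV gain than the increase in sensitivity.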

  6. Representation Discovery using Harmonic Analysis

    CERN Document Server

    Mahadevan, Sridhar

    2008-01-01

    Representations are at the heart of artificial intelligence (AI). This book is devoted to the problem of representation discovery: how can an intelligent system construct representations from its experience? Representation discovery re-parameterizes the state space - prior to the application of information retrieval, machine learning, or optimization techniques - facilitating later inference processes by constructing new task-specific bases adapted to the state space geometry. This book presents a general approach to representation discovery using the framework of harmonic analysis, in particu

  7. Strigolactone biology: genes, functional genomics, epigenetics and applications.

    Science.gov (United States)

    Makhzoum, Abdullah; Yousefzadi, Morteza; Malik, Sonia; Gantet, Pascal; Tremouillaux-Guiller, Jocelyne

    2017-03-01

    Strigolactones (SLs) represent an important new plant hormone class marked by their multifunctional role in plant and rhizosphere interactions. These compounds stimulate hyphal branching in arbuscular mycorrhizal fungi (AMF) and seed germination of root parasitic plants. In addition, together with other plant hormones such as auxins and cytokinins, they are involved in the control of plant architecture by inhibiting bud outgrowth, as well as in many other morphological and developmental processes. The biosynthetic pathway of SLs, which are derived from carotenoids, was partially deciphered based on the identification of mutants from a variety of plant species. Only a few SL biosynthetic genes, SL-regulated genes and related regulatory transcription factors have been identified. However, functional genomics and epigenetic studies have begun to shed light on how SL-related genes are regulated. Since they control plant architecture and plant-rhizosphere interactions, SLs are beginning to be used in agronomical and biotechnological applications. Furthermore, the genes involved in the SL biosynthetic pathway and genes regulated by SLs constitute interesting targets for plant breeding. Therefore, it is necessary to decipher and better understand the genetic determinants of their regulation at different levels.

  8. Transcriptome analysis of the white body of the squid Euprymna tasmanica with emphasis on immune and hematopoietic gene discovery.

    Directory of Open Access Journals (Sweden)

    Karla A Salazar

    Full Text Available In the mutualistic relationship between the squid Euprymna tasmanica and the bioluminescent bacterium Vibrio fischeri, several host factors, including immune-related proteins, are known to interact and respond specifically and exclusively to the presence of the symbiont. In squid and octopus, the white body is considered to be an immune organ mainly due to the fact that blood cells, or hemocytes, are known to be present in high numbers and in different developmental stages. Hence, the white body has been described as the site of hematopoiesis in cephalopods. However, to our knowledge, there are no studies showing any molecular evidence of such functions. In this study, we performed a transcriptomic analysis of white body tissue of the Southern dumpling squid, E. tasmanica. Our primary goal was to gain insights into the functions of this tissue and to test for the presence of gene transcripts associated with hematopoietic and immune processes. Several hematopoiesis genes including CPSF1, GATA 2, TFIID, and FGFR2 were found to be expressed in the white body. In addition, transcripts associated with immune-related signal transduction pathways, such as the Toll-like receptor/NF-κB and MAPK pathways, were also found, as well as other immune genes previously identified in E. tasmanica's sister species, E. scolopes. This study is the first to analyze an immune organ within cephalopods, and to provide gene expression data supporting the white body as a hematopoietic tissue.

  9. Functional Gene-Guided Discovery of Type II Polyketides from Culturable Actinomycetes Associated with Soft Coral Scleronephthya sp

    Science.gov (United States)

    Sun, Wei; Peng, Chongsheng; Zhao, Yunyu; Li, Zhiyong

    2012-01-01

    Compared with the actinomycetes in stone corals, the phylogenetic diversity of soft coral-associated culturable actinomycetes is essentially unexplored. Meanwhile, knowledge of the natural products from coral-associated actinomycetes is very limited. In this study, thirty-two strains were isolated from the tissue of the soft coral Scleronephthya sp. in the East China Sea and grouped into eight genera by 16S rDNA phylogenetic analysis: Micromonospora, Gordonia, Mycobacterium, Nocardioides, Streptomyces, Cellulomonas, Dietzia and Rhodococcus. Six Micromonospora strains and four Streptomyces strains were found to have the potential to produce aromatic polyketides, based on analysis of the KSα (ketoacyl-synthase) gene in the PKS II (type II polyketide synthase) gene cluster. Among the six Micromonospora strains, an angucycline cyclase gene was amplified in two strains (A5-1 and A6-2), suggesting their potential for synthesizing angucyclines, e.g. jadomycin. Under the guidance of functional gene prediction, one jadomycin B analogue (7b,13-dihydro-7-O-methyl jadomycin B) was detected in the fermentation broth of Micromonospora sp. strain A5-1. This study highlights the phylogenetically diverse culturable actinomycetes associated with the tissue of the soft coral Scleronephthya sp. and the potential of coral-derived actinomycetes, especially Micromonospora, for producing aromatic polyketides. PMID:22880121

  10. Large-scale gene discovery in the oomycete Phytophthora infestans reveals likely components of phytopathogenicity shared with true fungi

    NARCIS (Netherlands)

    Randall, T.A.; Dwyer, R.A.; Huitema, E.; Beyer, K.; Cvitanich, C.; Kelkar, H.; Ah Fong, A.M.V.; Gates, K.; Roberts, S.; Yatzkan, E.; Gaffney, T.; Law, M.; Testa, A.; Torto-Alalibo, T.; Zhang Meng,; Zheng Li,; Mueller, E.; Windass, J.; Binder, A.; Birch, P.R.J.; Gisi, U.; Govers, F.; Gow, N.A.; Mauch, F.; West, van P.; Waugh, M.E.; Yu Jun,; Boller, T.; Kamoun, S.; Lam, S.T.; Judelson, H.S.

    2005-01-01

    To overview the gene content of the important pathogen Phytophthora infestans, large-scale cDNA and genomic sequencing was performed. A set of 75,757 high-quality expressed sequence tags (ESTs) from P. infestans was obtained from 20 cDNA libraries representing a broad range of growth conditions, stre

  11. A population of deletion mutants and an integrated mapping and Exome-seq pipeline for gene discovery in maize

    Science.gov (United States)

    To better understand maize endosperm filling and maturation, we developed a novel functional genomics platform that combined Bulked Segregant RNA and Exome sequencing (BSREx-seq) to map causative mutations and identify candidate genes within mapping intervals. Using gamma-irradiation of B73 maize to...

  12. Discovery of gene-gene interactions across multiple independent data sets of late onset Alzheimer disease from the Alzheimer Disease Genetics Consortium.

    Science.gov (United States)

    Hohman, Timothy J; Bush, William S; Jiang, Lan; Brown-Gentry, Kristin D; Torstenson, Eric S; Dudek, Scott M; Mukherjee, Shubhabrata; Naj, Adam; Kunkle, Brian W; Ritchie, Marylyn D; Martin, Eden R; Schellenberg, Gerard D; Mayeux, Richard; Farrer, Lindsay A; Pericak-Vance, Margaret A; Haines, Jonathan L; Thornton-Wells, Tricia A

    2016-02-01

    Late-onset Alzheimer disease (AD) has a complex genetic etiology, involving locus heterogeneity, polygenic inheritance, and gene-gene interactions; however, the investigation of interactions in recent genome-wide association studies has been limited. We used a biological knowledge-driven approach to evaluate gene-gene interactions for consistency across 13 data sets from the Alzheimer Disease Genetics Consortium. Fifteen single nucleotide polymorphism (SNP)-SNP pairs within 3 gene-gene combinations were identified: SIRT1 × ABCB1, PSAP × PEBP4, and GRIN2B × ADRA1A. In addition, we extend a previously identified interaction from an endophenotype analysis between RYR3 × CACNA1C. Finally, post hoc gene expression analyses of the implicated SNPs further implicate SIRT1 and ABCB1, and implicate CDH23 which was most recently identified as an AD risk locus in an epigenetic analysis of AD. The observed interactions in this article highlight ways in which genotypic variation related to disease may depend on the genetic context in which it occurs. Further, our results highlight the utility of evaluating genetic interactions to explain additional variance in AD risk and identify novel molecular mechanisms of AD pathogenesis.

  13. Transcriptome profiling of the testis reveals genes involved in spermatogenesis and marker discovery in the oriental fruit fly, Bactrocera dorsalis.

    Science.gov (United States)

    Wei, D; Li, H-M; Yang, W-J; Wei, D-D; Dou, W; Huang, Y; Wang, J-J

    2015-02-01

    The testis is a highly specialized tissue that plays a vital role in ensuring fertility by producing spermatozoa, which are transferred to the female during mating. Spermatogenesis is a complex process, resulting in the production of mature sperm, and involves significant structural and biochemical changes in the seminiferous epithelium of the adult testis. The identification of genes involved in spermatogenesis of Bactrocera dorsalis (Hendel) is critical for a better understanding of its reproductive development. In this study, we constructed a cDNA library of testes from male B. dorsalis adults at different ages, and performed de novo transcriptome sequencing to produce a comprehensive transcript data set, using Illumina sequencing technology. The analysis yielded 52 016 732 clean reads, including a total of 4.65 Gb of nucleotides. These reads were assembled into 47 677 contigs (average 443 bp) and then clustered into 30 516 unigenes (average 756 bp). Based on BLAST hits with known proteins in different databases, 20 921 unigenes were annotated with a cut-off E-value of 10⁻⁵. The transcriptome sequences were further annotated using the Clusters of Orthologous Groups, Gene Ontology and the Kyoto Encyclopedia of Genes and Genomes databases. Functional genes involved in spermatogenesis were analysed, including cell cycle proteins, metalloproteins, actin, and ubiquitin and antihyperthermia proteins. Several testis-specific genes were also identified. The transcripts database will help us to understand the molecular mechanisms underlying spermatogenesis in B. dorsalis. Furthermore, 2913 simple sequence repeats and 151 431 single nucleotide polymorphisms were identified, which will be useful for investigating the genetic diversity of B. dorsalis in the future. © 2014 The Royal Entomological Society.

  14. Discovery of new risk loci for IgA nephropathy implicates genes involved in immunity against intestinal pathogens

    Science.gov (United States)

    Kiryluk, Krzysztof; Li, Yifu; Scolari, Francesco; Sanna-Cherchi, Simone; Choi, Murim; Verbitsky, Miguel; Fasel, David; Lata, Sneh; Prakash, Sindhuri; Shapiro, Samantha; Fischman, Clara; Snyder, Holly J.; Appel, Gerald; Izzi, Claudia; Viola, Battista Fabio; Dallera, Nadia; Vecchio, Lucia Del; Barlassina, Cristina; Salvi, Erika; Bertinetto, Francesca Eleonora; Amoroso, Antonio; Savoldi, Silvana; Rocchietti, Marcella; Amore, Alessandro; Peruzzi, Licia; Coppo, Rosanna; Salvadori, Maurizio; Ravani, Pietro; Magistroni, Riccardo; Ghiggeri, Gian Marco; Caridi, Gianluca; Bodria, Monica; Lugani, Francesca; Allegri, Landino; Delsante, Marco; Maiorana, Mariarosa; Magnano, Andrea; Frasca, Giovanni; Boer, Emanuela; Boscutti, Giuliano; Ponticelli, Claudio; Mignani, Renzo; Marcantoni, Carmelita; Di Landro, Domenico; Santoro, Domenico; Pani, Antonello; Polci, Rosaria; Feriozzi, Sandro; Chicca, Silvana; Galliani, Marco; Gigante, Maddalena; Gesualdo, Loreto; Zamboli, Pasquale; Maixnerová, Dita; Tesar, Vladimir; Eitner, Frank; Rauen, Thomas; Floege, Jürgen; Kovacs, Tibor; Nagy, Judit; Mucha, Krzysztof; Pączek, Leszek; Zaniew, Marcin; Mizerska-Wasiak, Małgorzata; Roszkowska-Blaim, Maria; Pawlaczyk, Krzysztof; Gale, Daniel; Barratt, Jonathan; Thibaudin, Lise; Berthoux, Francois; Canaud, Guillaume; Boland, Anne; Metzger, Marie; Panzer, Ulf; Suzuki, Hitoshi; Goto, Shin; Narita, Ichiei; Caliskan, Yasar; Xie, Jingyuan; Hou, Ping; Chen, Nan; Zhang, Hong; Wyatt, Robert J.; Novak, Jan; Julian, Bruce A.; Feehally, John; Stengel, Benedicte; Cusi, Daniele; Lifton, Richard P.; Gharavi, Ali G.

    2014-01-01

    We performed a genome-wide association study (GWAS) of IgA nephropathy (IgAN), the most common form of glomerulonephritis, with discovery and follow-up in 20,612 individuals of European and East Asian ancestry. We identified six novel genome-wide significant associations, four in ITGAM-ITGAX, VAV3 and CARD9 and two new independent signals at HLA-DQB1 and DEFA. We replicated the nine previously reported signals, including known SNPs in the HLA-DQB1 and DEFA loci. The cumulative burden of risk alleles is strongly associated with age at disease onset. Most loci are either directly associated with risk of inflammatory bowel disease (IBD) or maintenance of the intestinal epithelial barrier and response to mucosal pathogens. The geo-spatial distribution of risk alleles is highly suggestive of multi-locus adaptation and the genetic risk correlates strongly with variation in local pathogens, particularly helminth diversity, suggesting a possible role for host-intestinal pathogen interactions in shaping the genetic landscape of IgAN. PMID:25305756

  15. Positron emission tomography reporter genes and reporter probes: gene and cell therapy applications.

    Science.gov (United States)

    Yaghoubi, Shahriar S; Campbell, Dean O; Radu, Caius G; Czernin, Johannes

    2012-01-01

    Positron emission tomography (PET) imaging reporter genes (IRGs) and PET reporter probes (PRPs) are amongst the most valuable tools for gene and cell therapy. PET IRGs/PRPs can be used to non-invasively monitor all aspects of the kinetics of therapeutic transgenes and cells in all types of living mammals. This technology is generalizable and can allow long-term kinetics monitoring. In gene therapy, PET IRGs/PRPs can be used for whole-body imaging of therapeutic transgene expression, monitoring variations in the magnitude of transgene expression over time. In cell or cellular gene therapy, PET IRGs/PRPs can be used for whole-body monitoring of therapeutic cell locations, quantity at all locations, survival and proliferation over time and also possibly changes in characteristics or function over time. In this review, we have classified PET IRGs/PRPs into two groups based on the source from which they were derived: human or non-human. This classification addresses the important concern of potential immunogenicity in humans, which is important for expansion of PET IRG imaging in clinical trials. We have then discussed the application of this technology in gene/cell therapy and described its use in these fields, including a summary of using PET IRGs/PRPs in gene and cell therapy clinical trials. This review concludes with a discussion of the future direction of PET IRGs/PRPs and recommends cell and gene therapists collaborate with molecular imaging experts early in their investigations to choose a PET IRG/PRP system suitable for progression into clinical trials.

  16. Positron Emission Tomography Reporter Genes and Reporter Probes: Gene and Cell Therapy Applications

    Directory of Open Access Journals (Sweden)

    Shahriar S. Yaghoubi, Dean O. Campbell, Caius G. Radu, Johannes Czernin

    2012-01-01

    Positron emission tomography (PET) imaging reporter genes (IRGs) and PET reporter probes (PRPs) are amongst the most valuable tools for gene and cell therapy. PET IRGs/PRPs can be used to non-invasively monitor all aspects of the kinetics of therapeutic transgenes and cells in all types of living mammals. This technology is generalizable and can allow long-term kinetics monitoring. In gene therapy, PET IRGs/PRPs can be used for whole-body imaging of therapeutic transgene expression, monitoring variations in the magnitude of transgene expression over time. In cell or cellular gene therapy, PET IRGs/PRPs can be used for whole-body monitoring of therapeutic cell locations, quantity at all locations, survival and proliferation over time and also possibly changes in characteristics or function over time. In this review, we have classified PET IRGs/PRPs into two groups based on the source from which they were derived: human or non-human. This classification addresses the important concern of potential immunogenicity in humans, which is important for expansion of PET IRG imaging in clinical trials. We have then discussed the application of this technology in gene/cell therapy and described its use in these fields, including a summary of using PET IRGs/PRPs in gene and cell therapy clinical trials. This review concludes with a discussion of the future direction of PET IRGs/PRPs and recommends cell and gene therapists collaborate with molecular imaging experts early in their investigations to choose a PET IRG/PRP system suitable for progression into clinical trials.

  17. Discovery of a novel glucagon-like peptide (GCGL) and its receptor (GCGLR) in chickens: evidence for the existence of GCGL and GCGLR genes in nonmammalian vertebrates.

    Science.gov (United States)

    Wang, Yajun; Meng, Fengyan; Zhong, Yu; Huang, Guian; Li, Juan

    2012-11-01

    Glucagon (GCG), glucagon-related peptides, and their receptors have been reported to play important roles including the regulation of glucose homeostasis, gastrointestinal activity, and food intake in vertebrates. In this study, we identified genes encoding a novel glucagon-like peptide (named GCGL) and its receptor (GCGLR) from adult chicken brain using RACE and/or RT-PCR. GCGL was predicted to encode a peptide of 29 amino acids (cGCGL(1-29)), which shares high amino acid sequence identity with mammalian and chicken GCG (62-66%). GCGLR is a receptor of 430 amino acids and shares relatively high amino acid sequence identity (53-55%) with the vertebrate GCG receptor (GCGR). Using a pGL3-CRE-luciferase reporter system, we demonstrated that synthetic cGCGL(1-29), but not its structurally related peptides, i.e. exendin-4 and GCG, could potently activate GCGLR (EC50: 0.10 nM) expressed in Chinese hamster ovary cells, indicating that GCGLR can function as a GCGL-specific receptor. RT-PCR assay revealed that GCGL expression is mainly restricted to several tissues including various brain regions, spinal cord, and testes, whereas GCGLR mRNA is widely expressed in adult chicken tissues with abundant expression noted in the pituitary, spinal cord, and various brain regions. Using synteny analysis, GCGL and GCGLR genes were also identified in the genomes of fugu, tetraodon, tilapia, medaka, coelacanth, and Xenopus tropicalis. As a whole, the discovery of GCGL and GCGLR genes in chickens and other nonmammalian vertebrates clearly indicates a previously unidentified role of GCGL-GCGLR in nonmammalian vertebrates and provides important clues to the evolutionary history of GCG and GCGL genes in vertebrates.

  18. An integration of genome-wide association study and gene expression profiling to prioritize the discovery of novel susceptibility Loci for osteoporosis-related traits.

    Science.gov (United States)

    Hsu, Yi-Hsiang; Zillikens, M Carola; Wilson, Scott G; Farber, Charles R; Demissie, Serkalem; Soranzo, Nicole; Bianchi, Estelle N; Grundberg, Elin; Liang, Liming; Richards, J Brent; Estrada, Karol; Zhou, Yanhua; van Nas, Atila; Moffatt, Miriam F; Zhai, Guangju; Hofman, Albert; van Meurs, Joyce B; Pols, Huibert A P; Price, Roger I; Nilsson, Olle; Pastinen, Tomi; Cupples, L Adrienne; Lusis, Aldons J; Schadt, Eric E; Ferrari, Serge; Uitterlinden, André G; Rivadeneira, Fernando; Spector, Timothy D; Karasik, David; Kiel, Douglas P

    2010-06-10

    Osteoporosis is a complex disorder and commonly leads to fractures in elderly persons. Genome-wide association studies (GWAS) have become an unbiased approach to identify variations in the genome that potentially affect health. However, the genetic variants identified so far only explain a small proportion of the heritability for complex traits. Due to the modest genetic effect size and inadequate power, true association signals may not be revealed based on a stringent genome-wide significance threshold. Here, we take advantage of SNP and transcript arrays and integrate GWAS and expression signature profiling relevant to the skeletal system in cellular and animal models to prioritize the discovery of novel candidate genes for osteoporosis-related traits, including bone mineral density (BMD) at the lumbar spine (LS) and femoral neck (FN), as well as geometric indices of the hip (femoral neck-shaft angle, NSA; femoral neck length, NL; and narrow-neck width, NW). A two-stage meta-analysis of GWAS from 7,633 Caucasian women and 3,657 men, revealed three novel loci associated with osteoporosis-related traits, including chromosome 1p13.2 (RAP1A, p = 3.6 × 10⁻⁸), 2q11.2 (TBC1D8), and 18q11.2 (OSBPL1A), and confirmed a previously reported region near TNFRSF11B/OPG gene. We also prioritized 16 suggestive genome-wide significant candidate genes based on their potential involvement in skeletal metabolism. Among them, 3 candidate genes were associated with BMD in women. Notably, 2 out of these 3 genes (GPR177, p = 2.6 × 10⁻¹³; SOX6, p = 6.4 × 10⁻¹⁰) associated with BMD in women have been successfully replicated in a large-scale meta-analysis of BMD, but none of the non-prioritized candidates (associated with BMD) did. Our results support the concept of our prioritization strategy. In the absence of direct biological support for identified genes, we highlighted the efficiency of subsequent functional characterization using publicly available expression profiling relevant to the

  19. An integration of genome-wide association study and gene expression profiling to prioritize the discovery of novel susceptibility Loci for osteoporosis-related traits.

    Directory of Open Access Journals (Sweden)

    Yi-Hsiang Hsu

    2010-06-01

    Osteoporosis is a complex disorder and commonly leads to fractures in elderly persons. Genome-wide association studies (GWAS) have become an unbiased approach to identify variations in the genome that potentially affect health. However, the genetic variants identified so far only explain a small proportion of the heritability for complex traits. Due to the modest genetic effect size and inadequate power, true association signals may not be revealed based on a stringent genome-wide significance threshold. Here, we take advantage of SNP and transcript arrays and integrate GWAS and expression signature profiling relevant to the skeletal system in cellular and animal models to prioritize the discovery of novel candidate genes for osteoporosis-related traits, including bone mineral density (BMD) at the lumbar spine (LS) and femoral neck (FN), as well as geometric indices of the hip (femoral neck-shaft angle, NSA; femoral neck length, NL; and narrow-neck width, NW). A two-stage meta-analysis of GWAS from 7,633 Caucasian women and 3,657 men, revealed three novel loci associated with osteoporosis-related traits, including chromosome 1p13.2 (RAP1A, p = 3.6 × 10⁻⁸), 2q11.2 (TBC1D8), and 18q11.2 (OSBPL1A), and confirmed a previously reported region near TNFRSF11B/OPG gene. We also prioritized 16 suggestive genome-wide significant candidate genes based on their potential involvement in skeletal metabolism. Among them, 3 candidate genes were associated with BMD in women. Notably, 2 out of these 3 genes (GPR177, p = 2.6 × 10⁻¹³; SOX6, p = 6.4 × 10⁻¹⁰) associated with BMD in women have been successfully replicated in a large-scale meta-analysis of BMD, but none of the non-prioritized candidates (associated with BMD) did. Our results support the concept of our prioritization strategy. In the absence of direct biological support for identified genes, we highlighted the efficiency of subsequent functional characterization using publicly available expression profiling relevant

  20. Discovery of a phosphor for light emitting diode applications and its structural determination, Ba(Si,Al)5(O,N)8:Eu2+.

    Science.gov (United States)

    Park, Woon Bae; Singh, Satendra Pal; Sohn, Kee-Sun

    2014-02-12

    Most of the novel phosphors that appear in the literature are either a variant of well-known materials or a hybrid material consisting of well-known materials. This situation has actually led to intellectual property (IP) complications in industry and several lawsuits have been the result. Therefore, the definition of a novel phosphor for use in light-emitting diodes should be clarified. A recent trend in phosphor-related IP applications has been to focus on the novel crystallographic structure, so that a slight composition variance and/or the hybrid of a well-known material would not qualify from either a scientific or an industrial point of view. In our previous studies, we employed a systematic materials discovery strategy combining heuristics optimization and a high-throughput process to secure the discovery of genuinely novel and brilliant phosphors that would be immediately ready for use in light emitting diodes. Despite such an achievement, this strategy requires further refinement to prove its versatility under any circumstance. To accomplish such demands, we improved our discovery strategy by incorporating an elitism-involved nondominated sorting genetic algorithm (NSGA-II) that would guarantee the discovery of truly novel phosphors in the present investigation. Using the improved discovery strategy, we discovered an Eu2+-doped AB5X8 (A = Sr or Ba, B = Si and Al, X = O and N) phosphor in an orthorhombic structure (A21am) with lattice parameters a = 9.48461(3) Å, b = 13.47194(6) Å, c = 5.77323(2) Å, α = β = γ = 90°, which cannot be found in any of the existing inorganic compound databases.

  1. Gene Discovery and Advances in Finger Millet [Eleusine coracana (L.) Gaertn.] Genomics—An Important Nutri-Cereal of Future

    Science.gov (United States)

    Sood, Salej; Kumar, Anil; Babu, B. Kalyana; Gaur, Vikram S.; Pandey, Dinesh; Kant, Lakshmi; Pattnayak, Arunava

    2016-01-01

    The rapid strides in molecular marker technologies followed by genomics, and next generation sequencing advancements in three major crops (rice, maize and wheat) of the world have given opportunities for their use in the orphan, but highly valuable future crops, including finger millet [Eleusine coracana (L.) Gaertn.]. Finger millet has many special agronomic and nutritional characteristics, which make it an indispensable crop in arid, semi-arid, hilly and tribal areas of India and Africa. The crop has proven its adaptability in harsh conditions and has shown resilience to climate change. The adaptability traits of finger millet have shown the advantage over major cereal grains under stress conditions, revealing it as a storehouse of important genomic resources for crop improvement. Although new technologies for genomic studies are now available, progress in identifying and tapping these important alleles or genes is lacking. RAPDs were the default choice for genetic diversity studies in the crop until the last decade, but the subsequent development of SSRs and comparative genomics paved the way for the marker assisted selection in finger millet. Resistance gene homologs from NBS-LRR region of finger millet for blast and sequence variants for nutritional traits from other cereals have been developed and used invariably. Population structure analysis studies exhibit 2–4 sub-populations in the finger millet gene pool with separate grouping of Indian and exotic genotypes. Recently, the omics technologies have been efficiently applied to understand the nutritional variation, drought tolerance and gene mining. Progress has also occurred with respect to transgenics development. This review presents the current biotechnological advancements along with research gaps and future perspective of genomic research in finger millet. PMID:27881984

  2. Functional gene-based discovery of phenazines from the actinobacteria associated with marine sponges in the South China Sea.

    Science.gov (United States)

    Karuppiah, Valliappan; Li, Yingxin; Sun, Wei; Feng, Guofang; Li, Zhiyong

    2015-07-01

    Phenazines represent a large group of nitrogen-containing heterocyclic compounds produced by a diverse group of bacteria including actinobacteria. In this study, a total of 197 actinobacterial strains were isolated from seven different marine sponge species in the South China Sea using five different culture media. Eighty-seven morphologically different actinobacterial strains were selected and grouped into 13 genera, including Actinoalloteichus, Kocuria, Micrococcus, Micromonospora, Mycobacterium, Nocardiopsis, Prauserella, Rhodococcus, Saccharopolyspora, Salinispora, Serinicoccus, and Streptomyces by the phylogenetic analysis of 16S rRNA gene. Based on the screening of phzE genes, ten strains, including five Streptomyces, two Nocardiopsis, one Salinispora, one Micrococcus, and one Serinicoccus were found to have potential for phenazine production. The phzE gene was highly expressed in Nocardiopsis sp. 13-33-15, 13-12-13, and Serinicoccus sp. 13-12-4 on the fifth day of fermentation. Finally, 1,6-dihydroxy phenazine (1) from Nocardiopsis sp. 13-33-15 and 13-12-13, and 1,6-dimethoxy phenazine (2) from Nocardiopsis sp. 13-33-15 were isolated and identified successfully based on ESI-MS and NMR analysis. The compounds 1 and 2 showed antibacterial activity against Bacillus mycoides SJ14, Staphylococcus aureus SJ51, Escherichia coli SJ42, and Micrococcus luteus SJ47. This study suggests that the integrated approach of gene screening and chemical analysis is an effective strategy to find the target compounds and lays the basis for the production of phenazine from the sponge-associated actinobacteria.

  3. De Novo Transcriptomic Analysis of an Oleaginous Microalga: Pathway Description and Gene Discovery for Production of Next-Generation Biofuels

    Science.gov (United States)

    Wan, LingLin; Han, Juan; Sang, Min; Li, AiFen; Wu, Hong; Yin, ShunJi; Zhang, ChengWu

    2012-01-01

    Background: Eustigmatos cf. polyphem is a yellow-green unicellular soil microalga belonging to the eustigmatophytes with high biomass and considerable production of triacylglycerols (TAGs) for biofuels, which is thus referred to as an oleaginous microalga. The paucity of microalgae genome sequences, however, limits development of gene-based biofuel feedstock optimization studies. Here we describe the sequencing and de novo transcriptome assembly for a non-model microalgae species, E. cf. polyphem, and identify pathways and genes of importance related to biofuel production. Results: We performed the de novo assembly of E. cf. polyphem transcriptome using Illumina paired-end sequencing technology. In a single run, we produced 29,199,432 sequencing reads corresponding to 2.33 Gb total nucleotides. These reads were assembled into 75,632 unigenes with a mean size of 503 bp and an N50 of 663 bp, ranging from 100 bp to >3,000 bp. Assembled unigenes were subjected to BLAST similarity searches and annotated with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology identifiers. These analyses identified the majority of carbohydrate, fatty acids, TAG and carotenoids biosynthesis and catabolism pathways in E. cf. polyphem. Conclusions: Our data provides the construction of metabolic pathways involved in the biosynthesis and catabolism of carbohydrate, fatty acids, TAG and carotenoids in E. cf. polyphem and provides a foundation for the molecular genetics and functional genomics required to direct metabolic engineering efforts that seek to enhance the quantity and character of microalgae-based biofuel feedstock. PMID:22536352

  4. Gene Discovery and Advances in Finger millet [Eleusine coracana (L.) Gaertn.] Genomics - An Important Nutri-cereal of Future

    Directory of Open Access Journals (Sweden)

    Salej Sood

    2016-11-01

    The rapid strides in molecular marker technologies followed by genomics, and next generation sequencing advancements in three major crops (rice, maize and wheat) of the world have given opportunities for their use in the orphan, but highly valuable future crops, including finger millet [Eleusine coracana (L.) Gaertn.]. Finger millet has many special agronomic and nutritional characteristics, which make it an indispensable crop in arid, semi-arid, hilly and tribal areas of India and Africa. The crop has proven its adaptability in harsh conditions and has shown resilience to climate change. The adaptability traits of finger millet have shown the advantage over major cereal grains under stress conditions, revealing it as a storehouse of important genomic resources for crop improvement. Although new technologies for genomic studies are now available, progress in identifying and tapping these important alleles or genes is lacking. RAPDs were the default choice for genetic diversity studies in the crop until the last decade, but the subsequent development of SSRs and comparative genomics paved the way for the marker assisted selection in finger millet. Resistance gene homologues from NBS-LRR region of finger millet for blast and sequence variants for nutritional traits from other cereals have been developed and used invariably. Population structure analysis studies exhibit 2-4 sub-populations in the finger millet gene pool with separate grouping of Indian and exotic genotypes. Recently, the omics technologies have been efficiently applied to understand the nutritional variation, drought tolerance and gene mining. Progress has also occurred with respect to transgenics development. This review presents the current biotechnological advancements along with research gaps and future perspective of genomic research in finger millet.

  5. De novo transcriptomic analysis of an oleaginous microalga: pathway description and gene discovery for production of next-generation biofuels.

    Directory of Open Access Journals (Sweden)

    LingLin Wan

    BACKGROUND: Eustigmatos cf. polyphem is a yellow-green unicellular soil microalga belonging to the eustigmatophytes with high biomass and considerable production of triacylglycerols (TAGs) for biofuels, which is thus referred to as an oleaginous microalga. The paucity of microalgae genome sequences, however, limits development of gene-based biofuel feedstock optimization studies. Here we describe the sequencing and de novo transcriptome assembly for a non-model microalgae species, E. cf. polyphem, and identify pathways and genes of importance related to biofuel production. RESULTS: We performed the de novo assembly of E. cf. polyphem transcriptome using Illumina paired-end sequencing technology. In a single run, we produced 29,199,432 sequencing reads corresponding to 2.33 Gb total nucleotides. These reads were assembled into 75,632 unigenes with a mean size of 503 bp and an N50 of 663 bp, ranging from 100 bp to >3,000 bp. Assembled unigenes were subjected to BLAST similarity searches and annotated with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology identifiers. These analyses identified the majority of carbohydrate, fatty acids, TAG and carotenoids biosynthesis and catabolism pathways in E. cf. polyphem. CONCLUSIONS: Our data provides the construction of metabolic pathways involved in the biosynthesis and catabolism of carbohydrate, fatty acids, TAG and carotenoids in E. cf. polyphem and provides a foundation for the molecular genetics and functional genomics required to direct metabolic engineering efforts that seek to enhance the quantity and character of microalgae-based biofuel feedstock.

  6. The U.S. Geological Survey Ecosystem Science Strategy, 2012-2022 - Advancing discovery and application through collaboration

    Science.gov (United States)

    Williams, Byron K.; Wingard, G. Lynn; Brewer, Gary; Cloern, James; Gelfenbaum, Guy; Jacobson, Robert B.; Kershner, Jeffrey L.; McGuire, Anthony David; Nichols, James D.; Shapiro, Carl D.; van Riper, Charles; White, Robin P.

    2012-01-01

    Ecosystem science is critical to making informed decisions about natural resources that can sustain our Nation's economic and environmental well-being. Resource managers and policy-makers are faced with countless decisions each year at local, state, tribal, territorial, and national levels on issues as diverse as renewable and non-renewable energy development, agriculture, forestry, water supply, and resource allocations at the urban-rural interface. The urgency for sound decision-making is increasing dramatically as the world is being transformed at an unprecedented pace and in uncertain directions. Environmental changes are associated with natural hazards, greenhouse gas emissions, and increasing demands for water, land, food, energy, mineral, and living resources. At risk is the Nation's environmental capital, the goods and services provided by resilient ecosystems that are vital to the health and well-being of human societies. Ecosystem science - the study of systems of organisms interacting with their environment and the consequences of natural and human-induced change on these systems - is necessary to inform decision-makers as they develop policies to adapt to these changes. This Ecosystems Science Strategy is built on a framework that includes basic and applied science. It highlights the critical roles that USGS scientists and partners can play in building scientific understanding and providing timely information to decision-makers. The strategy underscores the connection between scientific discoveries and the application of new knowledge. The strategy integrates ecosystem science and decision-making, producing new scientific outcomes to assist resource managers and providing public benefits. The USGS is uniquely positioned to play an important role in ecosystem science. With its wide range of expertise, the agency can bring holistic, cross-scale, interdisciplinary capabilities to the design and conduct of monitoring, research, and modeling and to new

  7. Beyond Discovery

    DEFF Research Database (Denmark)

    Korsgaard, Steffen; Sassmannshausen, Sean Patrick

    2015-01-01

    In this chapter we explore four alternatives to the dominant discovery view of entrepreneurship: the development view, the construction view, the evolutionary view, and the Neo-Austrian view. We outline the main critique points of the discovery view presented in these four alternatives, as well as their central concepts and conceptualization of the entrepreneurial function. On this basis we discuss three central themes that cut across the four alternatives: process, uncertainty, and agency. These themes provide new foci for entrepreneurship research and can help to generate new research questions...

  8. BlueGene/L Applications: Parallelism on a Massive Scale

    Energy Technology Data Exchange (ETDEWEB)

    de Supinski, B R; Schulz, M; Bulatov, V V; Cabot, W; Chan, B; Cook, A W; Draeger, E W; Glosli, J N; Greenough, J A; Henderson, K; Kubota, A; Louis, S; Miller, B J; Patel, M V; Spelce, T E; Streitz, F H; Williams, P L; Yates, R K; Yoo, A; Almasi, G; Bhanot, G; Gara, A; Gunnels, J A; Gupta, M; Moreira, J; Sexton, J; Walkup, B; Archer, C; Gygi, F; Germann, T C; Kadau, K; Lomdahl, P S; Rendleman, C; Welcome, M L; McLendon, W; Hendrickson, B; Franchetti, F; Lorenz, J; Uberhuber, C W; Chow, E; Catalyurek, U

    2006-09-08

    BlueGene/L (BG/L), developed through a partnership between IBM and Lawrence Livermore National Laboratory (LLNL), is currently the world's largest system both in terms of scale with 131,072 processors and absolute performance with a peak rate of 367 TFlop/s. BG/L has led the Top500 list the last four times with a Linpack rate of 280.6 TFlop/s for the full machine installed at LLNL and is expected to remain the fastest computer in the next few editions. However, the real value of a machine like BG/L derives from the scientific breakthroughs that real applications can produce by successfully using its unprecedented scale and computational power. In this paper, we describe our experiences with eight large scale applications on BG/L from several application domains, ranging from molecular dynamics to dislocation dynamics and turbulence simulations to searches in semantic graphs. We also discuss the challenges we faced when scaling these codes and present several successful optimization techniques. All applications show excellent scaling behavior, even at very large processor counts, with one code even achieving a sustained performance of more than 100 TFlop/s, clearly demonstrating the real success of the BG/L design.

  9. De novo transcriptome assembly of Ipomoea nil using Illumina sequencing for gene discovery and SSR marker identification.

    Science.gov (United States)

    Wei, Changhe; Tao, Xiang; Li, Ming; He, Bin; Yan, Lang; Tan, Xuemei; Zhang, Yizheng

    2015-10-01

    Ipomoea nil is widely used as an ornamental plant because of its abundance of flower colors, but the limited transcriptome and genomic data hinder research on it. Using the Illumina platform, transcriptome profiling of I. nil was performed through high-throughput sequencing, which has proven to be a rapid and cost-effective means to characterize gene content. Our goal is to use the resulting information to facilitate research on flowering and flower color formation in I. nil. In total, 268 million unique Illumina RNA-Seq reads were produced and used in the transcriptome assembly. These reads were assembled into 220,117 contigs, of which 137,307 contigs were annotated using the GO and KEGG databases. Based on the results of functional annotation, a total of 89,781 contigs were assigned 455,335 GO term annotations. Meanwhile, 17,418 contigs were identified with pathway annotation and were functionally assigned to 144 KEGG pathways. Our transcriptome revealed at least 55 contigs as probable flowering-related genes in I. nil, and we also identified 25 contigs that encode key enzymes in the phenylpropanoid biosynthesis pathway. Based on the analysis of gene expression profiles, in the phenylpropanoid biosynthesis pathway of I. nil, the repression of lignin biosynthesis might lead to the redirection of the metabolic flux into anthocyanin biosynthesis. This may be the most likely reason that I. nil has a high anthocyanin content, especially in its flowers. Additionally, 15,537 simple sequence repeats (SSRs) were detected using the MISA software, and these SSRs will undoubtedly benefit future breeding work. Moreover, the information uncovered in this study will also serve as a valuable resource for understanding the flowering and flower color formation mechanisms in I. nil.
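
    The abstract above reports SSRs detected with the MISA software. The following is only a simplified stand-in for that kind of scan, not MISA itself: it searches for perfect tandem repeats with a regular expression. The function name `find_ssrs` and the minimum-repeat thresholds are assumptions chosen for illustration; the abstract does not state which settings were used.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem copies.
# These are illustrative values, not settings reported in the abstract.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for perfect SSRs in a DNA string.

    Only perfect repeats of A/C/G/T motifs are reported; compound and
    interrupted repeats are not handled in this sketch.
    """
    seq = seq.upper()
    hits = []
    for unit_len, n in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least n - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated
            # (for example "ATAT" or "AA"); the shorter unit covers them.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence (hypothetical): an (AT)7 and an (AAG)5 repeat with flanks.
toy_seq = "GGC" + "AT" * 7 + "CCT" + "AAG" * 5 + "T"
toy_hits = find_ssrs(toy_seq)
```

    Each hit gives the 0-based start position, the repeat motif and the number of tandem copies, which is the information typically needed to design flanking primers for marker development.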

  10. Cultivation of hard-to-culture subsurface mercury-resistant bacteria and discovery of new merA gene sequences

    DEFF Research Database (Denmark)

    Rasmussen, L D; Zawadsky, C; Binnerup, S J

    2008-01-01

    Mercury-resistant bacteria may be important players in mercury biogeochemistry. To assess the potential for mercury reduction by two subsurface microbial communities, resistant subpopulations and their merA genes were characterized by a combined molecular and cultivation-dependent approach... was increased up to 2,800 times and numbers of mCFU were similar to the total number of mercury-resistant bacteria in the soils. Denaturing gradient gel electrophoresis analysis of DNA extracted from membranes suggested stimulation of growth of hard-to-culture bacteria during the preincubation. A total of 25...

  11. Insights into shell deposition in the Antarctic bivalve Laternula elliptica: gene discovery in the mantle transcriptome using 454 pyrosequencing

    Directory of Open Access Journals (Sweden)

    Power Deborah M

    2010-06-01

    Background: The Antarctic clam, Laternula elliptica, is an infaunal stenothermal bivalve mollusc with a circumpolar distribution. It plays a significant role in bentho-pelagic coupling and hence has been proposed as a sentinel species for climate change monitoring. Previous studies have shown that this mollusc displays a high level of plasticity with regard to shell deposition and damage repair against a background of genetic homogeneity. The Southern Ocean has amongst the lowest present-day CaCO3 saturation rates of any ocean region, and is predicted to be among the first to become undersaturated under current ocean acidification scenarios. Hence, this species presents as an ideal candidate for studies into the processes of calcium regulation and shell deposition in our changing ocean environments. Results: 454 sequencing of L. elliptica mantle tissue generated 18,290 contigs with an average size of 535 bp (ranging from 142 bp to 5.591 kb). BLAST sequence similarity searching assigned putative function to 17% of the data set, with a significant proportion of these transcripts being involved in binding and potentially of a secretory nature, as defined by GO molecular function and biological process classifications. These results indicated that the mantle is a transcriptionally active tissue which is actively proliferating. All transcripts were screened against an in-house database of genes shown to be involved in extracellular matrix formation and calcium homeostasis in metazoans. Putative identifications were made for a number of classical shell deposition genes, such as tyrosinase, carbonic anhydrase and metalloprotease 1, along with novel members of the family 2 G-Protein Coupled Receptors (GPCRs). A membrane transport protein (SEC61) was also characterised and this demonstrated the utility of the clam sequence data as a resource for examining cold adapted amino acid substitutions. The sequence data contained 46,235 microsatellites and 13

  12. Porting Ordinary Applications to Blue Gene/Q Supercomputers

    Energy Technology Data Exchange (ETDEWEB)

    Maheshwari, Ketan C.; Wozniak, Justin M.; Armstrong, Timothy; Katz, Daniel S.; Binkowski, T. Andrew; Zhong, Xiaoliang; Heinonen, Olle; Karpeyev, Dmitry; Wilde, Michael

    2015-08-31

    Efficiently porting ordinary applications to Blue Gene/Q supercomputers is a significant challenge. Codes are often originally developed without considering advanced architectures and related tool chains. Science needs frequently lead users to want to run large numbers of relatively small jobs (often called many-task computing, an ensemble, or a workflow), which can conflict with supercomputer configurations. In this paper, we discuss techniques developed to execute ordinary applications over leadership class supercomputers. We use the high-performance Swift parallel scripting framework and build two workflow execution techniques: sub-jobs and main-wrap. The sub-jobs technique, built on top of the IBM Blue Gene/Q resource manager Cobalt's sub-block jobs, lets users submit multiple, independent, repeated smaller jobs within a single larger resource block. The main-wrap technique is a scheme that enables C/C++ programs to be defined as functions that are wrapped by a high-performance Swift wrapper and that are invoked as a Swift script. We discuss the needs, benefits, technicalities, and current limitations of these techniques. We further discuss the real-world science enabled by these techniques and the results obtained.

  13. Alphavirus vectors: applications for DNA vaccine production and gene expression.

    Science.gov (United States)

    Lundstrom, K

    2000-01-01

    Replication-deficient alphavirus vectors have been developed for efficient high-level transgene expression. The broad host range of alphaviruses has allowed infection of a wide variety of mammalian cell lines and primary cultures. Particularly, G protein-coupled receptors have been expressed at high levels and subjected to binding and functional studies. Expression in suspension cultures has greatly facilitated production of large quantities of recombinant proteins for structural studies. Injection of recombinant alphavirus vectors into rodent brain resulted in local reporter gene expression. Highly neuron-specific expression was obtained in hippocampal slice cultures in vivo. Additionally, preliminary studies in animal models suggest that alphavirus vectors can be attractive candidates for gene therapy applications. Traditionally alphavirus vectors, either attenuated strains or replication-deficient particles, have been used to elicit efficient immune responses in animals. Recently, the application of alphaviruses has been extended to naked nucleic acids. Injection of DNA as well as RNA vectors has demonstrated efficient antigen production. In many cases, protection against lethal challenges has been obtained after immunization with alphavirus particles or nucleic acid vectors. Alphavirus vectors can therefore be considered as potentially promising vectors for vaccine production.

  14. Automated discovery of tissue-targeting enhancers and transcription factors from binding motif and gene function data.

    Directory of Open Access Journals (Sweden)

    Geetu Tuteja

    2014-01-01

    Identifying enhancers regulating gene expression remains an important and challenging task. While recent sequencing-based methods provide epigenomic characteristics that correlate well with enhancer activity, it remains onerous to comprehensively identify all enhancers across development. Here we introduce a computational framework to identify tissue-specific enhancers evolving under purifying selection. First, we incorporate high-confidence binding site predictions with target gene functional enrichment analysis to identify transcription factors (TFs) likely functioning in a particular context. We then search the genome for clusters of binding sites for these TFs, overcoming previous constraints associated with biased manual curation of TFs or enhancers. Applying our method to the placenta, we find 33 known and implicate 17 novel TFs in placental function, and discover 2,216 putative placenta enhancers. Using luciferase reporter assays, 31/36 (86%) tested candidates drive activity in placental cells. Our predictions agree well with recent epigenomic data in human and mouse, yet over half our loci, including 7/8 (87%) tested regions, are novel. Finally, we establish that our method is generalizable by applying it to 5 additional tissues: heart, pancreas, blood vessel, bone marrow, and liver.

  15. De novo characterization of the Dialeurodes citri transcriptome: mining genes involved in stress resistance and simple sequence repeats (SSRs) discovery.

    Science.gov (United States)

    Chen, E-H; Wei, D-D; Shen, G-M; Yuan, G-R; Bai, P-P; Wang, J-J

    2014-02-01

    The citrus whitefly, Dialeurodes citri (Ashmead), is one of the three economically important whitefly species that infest citrus plants around the world; however, limited genetic research has been focused on D. citri, partly because of lack of genomic resources. In this study, we performed de novo assembly of a transcriptome using Illumina paired-end sequencing technology (Illumina Inc., San Diego, CA, USA). In total, 36,766 unigenes with a mean length of 497 bp were identified. Of these unigenes, we identified 17,788 matched known proteins in the National Center for Biotechnology Information database, as determined by Blast search, with 5731, 4850 and 14,441 unigenes assigned to clusters of orthologous groups (COG), gene ontology (GO), and SwissProt, respectively. In total, 7507 unigenes were assigned to 308 known pathways. In-depth analysis of the data showed that 117 unigenes were identified as potentially involved in the detoxification of xenobiotics and 67 heat shock protein (Hsp) genes were associated with environmental stress. In addition, these enzymes were searched against the GO and COG database, and the results showed that the three major detoxification enzymes and Hsps were classified into 18 and 3, 6, and 8 annotations, respectively. In addition, 149 simple sequence repeats were detected. The results facilitate the investigation of molecular resistance mechanisms to insecticides and environmental stress, and contribute to molecular marker development. The findings greatly improve our genetic understanding of D. citri, and lay the foundation for future functional genomics studies on this species.

  16. Natural product discovery: past, present, and future.

    Science.gov (United States)

    Katz, Leonard; Baltz, Richard H

    2016-03-01

    Microorganisms have provided abundant sources of natural products which have been developed as commercial products for human medicine, animal health, and plant crop protection. In the early years of natural product discovery from microorganisms (The Golden Age), new antibiotics were found with relative ease from low-throughput fermentation and whole cell screening methods. Later, molecular genetic and medicinal chemistry approaches were applied to modify and improve the activities of important chemical scaffolds, and more sophisticated screening methods were directed at target disease states. In the 1990s, the pharmaceutical industry moved to high-throughput screening of synthetic chemical libraries against many potential therapeutic targets, including new targets identified from the human genome sequencing project, largely to the exclusion of natural products, and discovery rates dropped dramatically. Nonetheless, natural products continued to provide key scaffolds for drug development. In the current millennium, it was discovered from genome sequencing that microbes with large genomes have the capacity to produce about ten times as many secondary metabolites as was previously recognized. Indeed, the most gifted actinomycetes have the capacity to produce around 30-50 secondary metabolites. With the precipitous drop in cost for genome sequencing, it is now feasible to sequence thousands of actinomycete genomes to identify the "biosynthetic dark matter" as sources for the discovery of new and novel secondary metabolites. Advances in bioinformatics, mass spectrometry, proteomics, transcriptomics, metabolomics and gene expression are driving the new field of microbial genome mining for applications in natural product discovery and development.

  17. A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus

    Directory of Open Access Journals (Sweden)

    Alamar Santiago

    2009-09-01

    Background: Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results: We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. Conclusion: The new

  18. Whole cell strategies based on lux genes for high throughput applications toward new antimicrobials.

    Science.gov (United States)

    Galluzzi, Lorenzo; Karp, Matti

    2006-08-01

    The discovery/development of novel drug candidates has witnessed dramatic changes over the last two decades. Old methods to identify lead compounds are not suitable to screen wide libraries generated by combinatorial chemistry techniques. High throughput screening (HTS) has become irreplaceable and hundreds of different approaches have been described. Assays based on purified components are flanked by whole cell-based assays, in which reporter genes are used to monitor, directly or indirectly, the influence of a chemical over the metabolism of living cells. The most convenient and widely used reporters for real-time measurements are luciferases, light emitting enzymes from evolutionarily distant organisms. Autofluorescent proteins have also been extensively employed, but proved to be more suitable for end-point measurements, in situ applications - such as the localization of fusion proteins in specific subcellular compartments - or environmental studies on microbial populations. The trend toward miniaturization and the technical advances in detection and liquid handling systems will make ultra high throughput screening (uHTS) achievable, with 100,000 compounds routinely screened each day. Here we show how similar approaches may also be applied to the search for new and potent antimicrobial agents.

  19. First discovery of two polyketide synthase genes for mitorubrinic acid and mitorubrinol yellow pigment biosynthesis and implications in virulence of Penicillium marneffei.

    Directory of Open Access Journals (Sweden)

    Patrick C Y Woo

    BACKGROUND: The genome of P. marneffei, the most important thermal dimorphic fungus causing respiratory, skin and systemic mycosis in China and Southeast Asia, possesses 23 polyketide synthase (PKS) genes and 2 polyketide synthase nonribosomal peptide synthase hybrid (PKS-NRPS) genes, which is of high diversity compared to other thermal dimorphic pathogenic fungi. We hypothesized that the yellow pigment in the mold form of P. marneffei could also be synthesized by one or more PKS genes. METHODOLOGY/PRINCIPAL FINDINGS: All 23 PKS and 2 PKS-NRPS genes of P. marneffei were systematically knocked down. A loss of the yellow pigment was observed in the mold form of the pks11 knockdown, pks12 knockdown and pks11pks12 double knockdown mutants. Sequence analysis showed that PKS11 and PKS12 are fungal non-reducing PKSs. Ultra high performance liquid chromatography-photodiode array detector/electrospray ionization-quadrupole time of flight-mass spectrometry (MS) and MS/MS analysis of the culture filtrates of wild type P. marneffei and the pks11 knockdown, pks12 knockdown and pks11pks12 double knockdown mutants showed that the yellow pigment is composed of mitorubrinic acid and mitorubrinol. The survival of mice challenged with the pks11 knockdown, pks12 knockdown and pks11pks12 double knockdown mutants was significantly better than those challenged with wild type P. marneffei (P<0.05). There was also a statistically significant decrease in survival of pks11 knockdown, pks12 knockdown and pks11pks12 double knockdown mutants compared to wild type P. marneffei in both J774 and THP1 macrophages (P<0.05). CONCLUSIONS/SIGNIFICANCE: The yellow pigment of the mold form of P. marneffei is composed of mitorubrinol and mitorubrinic acid. This represents the first discovery of PKS genes responsible for mitorubrinol and mitorubrinic acid biosynthesis. pks12 and pks11 are probably responsible for sequential use in the biosynthesis of mitorubrinol and mitorubrinic acid

  20. MOLECULAR MODELING AND DRUG DISCOVERY OF POTENTIAL INHIBITORS FOR ANTICANCER TARGET GENE MELK (MATERNAL EMBRYONIC LEUCINE ZIPPER KINASE)

    Directory of Open Access Journals (Sweden)

    Sabitha. K

    2011-12-01

    Maternal embryonic leucine zipper kinase (MELK), a member of the AMP serine/threonine kinase family, exhibits multiple features consistent with the potential utility of this gene as an anticancer target. Reports show that MELK functions as a cancer-specific protein kinase, and that down-regulation of MELK results in growth suppression of breast cancer cells. Many inhibitors that bind to kinases are already in clinical trials. In our study we took a library of different inhibitors and docked them using GLIDE Induced Fit. From the docking results we conclude that Syk inhibitor II, Rho kinase inhibitor IV, p38 MAP Kinase Inhibitor III, HA 1004, Dihydrochloride and IKK-2 inhibitor VI have good binding affinity towards MELK and may have anticancer activity.

  1. Gene discovery in Eimeria tenella by immunoscreening cDNA expression libraries of sporozoites and schizonts with chicken intestinal antibodies.

    Science.gov (United States)

    Réfega, Susana; Girard-Misguich, Fabienne; Bourdieu, Christiane; Péry, Pierre; Labbé, Marie

    2003-04-02

    Specific antibodies were produced ex vivo from intestinal culture of Eimeria tenella infected chickens. The specificity of these intestinal antibodies was tested against different parasite stages. These antibodies were used to immunoscreen first generation schizont and sporozoite cDNA libraries permitting the identification of new E. tenella antigens. We obtained a total of 119 cDNA clones which were subjected to sequence analysis. The sequences coding for the proteins inducing local immune responses were compared with nucleotide or protein databases and with expressed sequence tags (ESTs) databases. We identified new Eimeria genes coding for heat shock proteins, a ribosomal protein, a pyruvate kinase and a pyridoxine kinase. Specific features of other sequences are discussed.

  2. Cynomolgus monkey testicular cDNAs for discovery of novel human genes in the human genome sequence

    Directory of Open Access Journals (Sweden)

    Terao Keiji

    2002-12-01

    Background: In order to contribute to the establishment of a complete map of transcribed regions of the human genome, we constructed a testicular cDNA library for the cynomolgus monkey, and attempted to find novel transcripts for identification of their human homologues. Result: The full-insert sequences of 512 cDNA clones were determined. Ultimately we found 302 non-redundant cDNAs carrying open reading frames of 300 bp-length or longer. Among them, 89 cDNAs were found not to be annotated previously in the Ensembl human database. After searching against the Ensembl mouse database, we also found 69 putative coding sequences have no homologous cDNAs in the annotated human and mouse genome sequences in Ensembl. We subsequently designed a DNA microarray including 396 non-redundant cDNAs (with and without open reading frames) to examine the expression of the full-sequenced genes. With the testicular probe and a mixture of probes of 10 other tissues, 316 of 332 effective spots showed intense hybridized signals and 75 cDNAs were shown to be expressed very highly in the cynomolgus monkey testis, but not ubiquitously. Conclusions: In this report, we determined 302 full-insert sequences of cynomolgus monkey cDNAs with enough length of open reading frames to discover novel transcripts as human homologues. Among 302 cDNA sequences, human homologues of 89 cDNAs have not been predicted in the annotated human genome sequence in the Ensembl. Additionally, we identified 75 dominantly expressed genes in testis among the full-sequenced clones by using a DNA microarray. Our cDNA clones and analytical results will be valuable resources for future functional genomic studies.
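
    The abstract above filters cDNAs by whether they carry an open reading frame of 300 bp or longer. A minimal sketch of such a filter follows. The abstract does not say how ORFs were defined, so the choices here are assumptions: an ORF runs from the first ATG to the next in-frame stop codon (stop included in the length), only the three forward-strand frames are scanned, and the names `longest_orf` and `has_long_orf` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return the longest ATG-to-stop ORF on the forward strand.

    Assumption: an ORF starts at the first ATG seen in a frame and ends
    at the next in-frame stop codon, which is included in the result.
    An empty string is returned when no complete ORF is found.
    """
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]
                if len(orf) > len(best):
                    best = orf
                start = None
    return best


def has_long_orf(seq, min_len=300):
    """True if the longest forward-strand ORF is at least min_len bases."""
    return len(longest_orf(seq)) >= min_len


# Toy sequences (hypothetical, not from the study)
short_cdna = "CC" + "ATG" + "GCT" * 3 + "TAA" + "G"
long_cdna = "ATG" + "GCT" * 98 + "TAA"
```

    A production pipeline would normally also scan the reverse complement and may treat partial ORFs at the ends of a clone differently; those cases are deliberately left out of this sketch.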

  3. From amplification to gene in thyroid cancer: A high-resolution mapped bacterial-artificial-chromosome resource for cancer chromosome aberrations guides gene discovery after comparative genome hybridization

    Energy Technology Data Exchange (ETDEWEB)

    Chen, X.N.; Gonsky, R.; Korenberg, J.R. [UCLA School of Medicine, Los Angeles, CA (United States). Cedars-Sinai Research Inst.; Knauf, J.A.; Fagin, J.A. [Univ. of Cincinnati, OH (United States). Div. of Endocrinology/Metabolism; Wang, M.; Lai, E.H. [Univ. of North Carolina, Chapel Hill, NC (United States). Dept. of Pharmacology; Chissoe, S. [Washington Univ. School of Medicine, St. Louis, MO (United States). Genome Sequencing

    1998-08-01

    Chromosome rearrangements associated with neoplasms provide a rich resource for definition of the pathways of tumorigenesis. The power of comparative genome hybridization (CGH) to identify novel genes depends on the existence of suitable markers, which are lacking throughout most of the genome. The authors now report a general approach that translates CGH data into higher-resolution genomic-clone data that are then used to define the genes located in aneuploid regions. They used CGH to study 33 thyroid-tumor DNAs and two tumor-cell-line DNAs. The results revealed amplifications of chromosome band 2p21, with less-intense amplification on 2p13, 19q13.1, and 1p36 and with least-intense amplification on 1p34, 1q42, 5q31, 5q33-34, 9q32-34, and 14q32. To define the 2p21 region amplified, a dense array of 373 FISH-mapped chromosome 2 bacterial artificial chromosomes (BACs) was constructed, and 87 of these were hybridized to a tumor-cell line. Four BACs carried genomic DNA that was amplified in these cells. The maximum amplified region was narrowed to 3-6 Mb by multicolor FISH with the flanking BACs, and the minimum amplicon size was defined by a contig of 420 kb. Sequence analysis of the amplified BAC 1D9 revealed a fragment of the gene, encoding protein kinase C epsilon (PKCε), that was then shown to be amplified and rearranged in tumor cells. In summary, CGH combined with a dense mapped resource of BACs and large-scale sequencing has led directly to the definition of PKCε as a previously unmapped candidate gene involved in thyroid tumorigenesis.

  4. Applicability of STEM-RTG and High-Power SRG Power Systems to the Discovery and Scout Mission Capabilities Expansion (DSMCE) Study of ASRG-Based Missions

    Science.gov (United States)

    Colozza, Anthony J.; Cataldo, Robert L.

    2015-01-01

    This study looks at the applicability of utilizing the Segmented Thermoelectric Modular Radioisotope Thermoelectric Generator (STEM-RTG) or a high-power radioisotope generator to replace the Advanced Stirling Radioisotope Generator (ASRG), which had been identified as the baseline power system for a number of planetary exploration mission studies. Nine different Discovery-Class missions were examined to determine the applicability of either the STEM-RTG or the high-power SRG power systems in replacing the ASRG. The nine missions covered exploration across the solar system and included orbiting spacecraft, landers and rovers. Based on the evaluation a ranking of the applicability of each alternate power system to the proposed missions was made.

  5. Discovery of miRNAs and Their Corresponding miRNA Genes in Atlantic Cod (Gadus morhua): Use of Stable miRNAs as Reference Genes Reveals Subgroups of miRNAs That Are Highly Expressed in Particular Organs.

    Directory of Open Access Journals (Sweden)

    Rune Andreassen

    Atlantic cod (Gadus morhua) is among the economically most important species in the northern Atlantic Ocean and a model species for studying development of the immune system in vertebrates. MicroRNAs (miRNAs) are an abundant class of small RNA molecules that regulate fundamental biological processes at the post-transcriptional level. Detailed knowledge about a species' miRNA repertoire is necessary to study how the miRNA transcriptome modulates gene expression. We have therefore discovered and characterized mature miRNAs and their corresponding miRNA genes in Atlantic cod. We have also performed a validation study to identify suitable reference genes for RT-qPCR analysis of miRNA expression in Atlantic cod. Finally, we utilized the newly characterized miRNA repertoire and the dedicated RT-qPCR method to reveal miRNAs that are highly expressed in certain organs. The discovery analysis revealed 490 mature miRNAs (401 unique sequences) along with precursor sequences and genomic location of the miRNA genes. Twenty-six of these were novel miRNA genes. Validation studies ranked gmo-miR-17-1-5p or the two-gene combination gmo-miR25-3p and gmo-miR210-5p as most suitable qPCR reference genes. Analysis by RT-qPCR revealed 45 miRNAs with significantly higher expression in tissues from one or a few organs. Comparisons to other vertebrates indicate that some of these miRNAs may regulate processes like growth, lipid metabolism, immune response to microbial infections and scar damage repair. Three teleost-specific and three novel Atlantic cod miRNAs were among the differentially expressed miRNAs. The number of known mature miRNAs was considerably increased by our identification of miRNAs and miRNA genes in Atlantic cod. This will benefit further functional studies of miRNA expression using deep sequencing methods. The validation study showed that stable miRNAs are suitable reference genes for RT-qPCR analysis of miRNA expression. Applying RT-qPCR we have identified

  6. Cys-loop ligand-gated ion channel gene discovery in the Locusta migratoria manilensis through the neuron transcriptome.

    Science.gov (United States)

    Wang, Xin; Meng, Xiangkun; Liu, Chuanjun; Gao, Hongli; Zhang, Yixi; Liu, Zewen

    2015-05-01

    As an ideal model, Locusta migratoria manilensis (Meyen) has been widely used in the study of endocrinological and neurobiological processes. Here we created a large transcriptome of the locust neurons, which enriched ion channels whose potential for functional genetic experiments is currently limited. With high-throughput Illumina sequencing technology, we obtained more than 50 million raw reads, which were assembled into 61,056 unique sequences with an average size of 737 bp. Among the unigenes, a total of 24,884 sequences had significant similarities with proteins in the five public databases (NR, SwissProt, GO, COG and KEGG) with a cut-off E-value of 10^-5 using BLASTx. Moreover, the potential genes of the cys-loop ligand-gated ion channels (LGICs) were manually curated, including 39 putative nicotinic acetylcholine receptors (nAChRs), 6 putative γ-aminobutyric acid (GABA) gated anion channels, 21 putative glutamate-gated chloride channels (GluCls) and 1 histamine-gated chloride channel (HisCl). In addition, the full-length sequences of 11 nAChR subunits (9 alpha and 2 beta) were obtained by the RACE technique, which should be helpful for further studies on nAChR neurochemistry and pharmacology. To our knowledge, this is the first study to characterize the locust neuron transcriptome, which will provide a useful resource especially for future studies on the neuro-function and behavior of the locust.

  7. Discovery of potential new gene variants and inflammatory cytokine associations with fibromyalgia syndrome by whole exome sequencing.

    Directory of Open Access Journals (Sweden)

    Jinong Feng

    Fibromyalgia syndrome (FMS) is a chronic musculoskeletal pain disorder affecting 2% to 5% of the general population. Both genetic and environmental factors may be involved. To ascertain in an unbiased manner which genes play a role in the disorder, we performed complete exome sequencing on a subset of FMS patients. Out of 150 nuclear families (trios), DNA from 19 probands was subjected to complete exome sequencing. Since >80,000 SNPs were found per proband, the data were further filtered, including analysis of those with stop codons, a rare frequency (<2.5%) in the 1000 Genomes database, and presence in at least 2/19 probands sequenced. Two nonsense mutations, W32X in C11orf40 and Q100X in ZNF77, among 150 FMS trios had a significantly elevated frequency of transmission to affected probands (p = 0.026 and p = 0.032, respectively) and were present in a subset of 13% and 11% of FMS patients, respectively. Among 9 patients bearing more than one of the variants we have described, 4 had onset of symptoms between the ages of 10 and 18. The subset with the C11orf40 mutation had elevated plasma levels of the inflammatory cytokines MCP-1 and IP-10 compared with unaffected controls or FMS patients with the wild-type allele. Similarly, patients with the ZNF77 mutation had elevated levels of the inflammatory cytokine IL-12 compared with controls or patients with the wild-type allele. Our results strongly implicate an inflammatory basis for FMS, as well as specific cytokine dysregulation, in at least 35% of our FMS cohort.
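
    The three filters named in the abstract (stop-codon variants, frequency below 2.5% in a reference panel, presence in at least 2 of the sequenced probands) can be sketched as a simple predicate. This is a minimal illustration, not the authors' pipeline: the `Variant` record, the `filter_candidates` function, the gene names and every number in the toy input are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str                # hypothetical gene label
    change: str              # protein change, e.g. "W10X"
    is_stop_gain: bool       # True for nonsense (stop codon) variants
    population_freq: float   # allele frequency in a reference panel (0..1)
    carriers: frozenset      # IDs of sequenced probands carrying the variant


def filter_candidates(variants, max_freq=0.025, min_carriers=2):
    """Keep rare stop-gain variants seen in at least `min_carriers` probands."""
    return [
        v for v in variants
        if v.is_stop_gain
        and v.population_freq < max_freq
        and len(v.carriers) >= min_carriers
    ]


# Toy input: all names, frequencies and carrier sets are invented.
toy = [
    Variant("GENE_1", "W10X", True, 0.010, frozenset({"P01", "P07"})),
    Variant("GENE_2", "Q20X", True, 0.020, frozenset({"P03", "P11", "P15"})),
    Variant("GENE_3", "R30X", True, 0.200, frozenset({"P02", "P04"})),  # too common
    Variant("GENE_4", "A40T", False, 0.001, frozenset({"P05", "P06"})),  # not a stop
    Variant("GENE_5", "Y50X", True, 0.001, frozenset({"P08"})),          # one carrier
]

kept = filter_candidates(toy)
print([f"{v.gene}:{v.change}" for v in kept])
```

    The thresholds are keyword arguments so that the 2.5% and 2-proband cut-offs quoted in the abstract can be varied without touching the predicate.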

  8. Developmental gene discovery in a hemimetabolous insect: de novo assembly and annotation of a transcriptome for the cricket Gryllus bimaculatus.

    Directory of Open Access Journals (Sweden)

    Victor Zeng

    Most genomic resources available for insects represent the Holometabola, which are insects that undergo complete metamorphosis like beetles and flies. In contrast, the Hemimetabola (direct developing insects), representing the basal branches of the insect tree, have very few genomic resources. We have therefore created a large and publicly available transcriptome for the hemimetabolous insect Gryllus bimaculatus (cricket), a well-developed laboratory model organism whose potential for functional genetic experiments is currently limited by the absence of genomic resources. cDNA was prepared using mRNA obtained from adult ovaries containing all stages of oogenesis, and from embryo samples on each day of embryogenesis. Using 454 Titanium pyrosequencing, we sequenced over four million raw reads, and assembled them into 21,512 isotigs (predicted transcripts) and 120,805 singletons with an average coverage per base pair of 51.3. We annotated the transcriptome manually for over 400 conserved genes involved in embryonic patterning, gametogenesis, and signaling pathways. BLAST comparison of the transcriptome against the NCBI non-redundant protein database (nr) identified significant similarity to nr sequences for 55.5% of transcriptome sequences, and suggested that the transcriptome may contain 19,874 unique transcripts. For predicted transcripts without significant similarity to known sequences, we assessed their similarity to other orthopteran sequences, and determined that these transcripts contain recognizable protein domains, largely of unknown function. We created a searchable, web-based database to allow public access to all raw, assembled and annotated data. This database is to our knowledge the largest de novo assembled and annotated transcriptome resource available for any hemimetabolous insect. We therefore anticipate that these data will contribute significantly to more effective and higher-throughput deployment of molecular analysis tools in

  9. The U.S. Geological Survey Ecosystem Science Strategy, 2012-2022 - Advancing discovery and application through collaboration

    Science.gov (United States)

    Williams, Byron K.; Wingard, G. Lynn; Brewer, Gary; Cloern, James; Gelfenbaum, Guy; Jacobson, Robert B.; Kershner, Jeffrey L.; McGuire, Anthony David; Nichols, James D.; Shapiro, Carl D.; van Riper, Charles; White, Robin P.

    2012-01-01

    Ecosystem science is critical to making informed decisions about natural resources that can sustain our Nation's economic and environmental well-being. Resource managers and policy-makers are faced with countless decisions each year at local, state, tribal, territorial, and national levels on issues as diverse as renewable and non-renewable energy development, agriculture, forestry, water supply, and resource allocations at the urban-rural interface. The urgency for sound decision-making is increasing dramatically as the world is being transformed at an unprecedented pace and in uncertain directions. Environmental changes are associated with natural hazards, greenhouse gas emissions, and increasing demands for water, land, food, energy, mineral, and living resources. At risk is the Nation's environmental capital, the goods and services provided by resilient ecosystems that are vital to the health and well-being of human societies. Ecosystem science - the study of systems of organisms interacting with their environment and the consequences of natural and human-induced change on these systems - is necessary to inform decision-makers as they develop policies to adapt to these changes. This Ecosystems Science Strategy is built on a framework that includes basic and applied science. It highlights the critical roles that USGS scientists and partners can play in building scientific understanding and providing timely information to decision-makers. The strategy underscores the connection between scientific discoveries and the application of new knowledge. The strategy integrates ecosystem science and decision-making, producing new scientific outcomes to assist resource managers and providing public benefits. The USGS is uniquely positioned to play an important role in ecosystem science. With its wide range of expertise, the agency can bring holistic, cross-scale, interdisciplinary capabilities to the design and conduct of monitoring, research, and modeling and to new

  10. Discovery of precursor and mature microRNAs and their putative gene targets using high-throughput sequencing in pineapple (Ananas comosus var. comosus).

    Science.gov (United States)

    Yusuf, Noor Hydayaty Md; Ong, Wen Dee; Redwan, Raimi Mohamed; Latip, Mariam Abd; Kumar, S Vijay

    2015-10-15

    MicroRNAs (miRNAs) are a class of small, endogenous non-coding RNAs that negatively regulate gene expression, resulting in the silencing of target mRNA transcripts through mRNA cleavage or translational inhibition. MiRNAs play significant roles in various biological and physiological processes in plants. However, the miRNA-mediated gene regulatory network in pineapple, the model tropical non-climacteric fruit, remains largely unexplored. Here, we report a complete list of pineapple mature miRNAs obtained from high-throughput small RNA sequencing and precursor miRNAs (pre-miRNAs) obtained from ESTs. Two small RNA libraries were constructed from pineapple fruits and leaves, respectively, using Illumina's Solexa technology. Sequence similarity analysis using miRBase revealed 579,179 reads homologous to 153 miRNAs from 41 miRNA families. In addition, a pineapple fruit transcriptome library consisting of approximately 30,000 EST contigs constructed using Solexa sequencing was used for the discovery of pre-miRNAs. In all, four pre-miRNAs were identified (MIR156, MIR399, MIR444 and MIR2673). Furthermore, the same pineapple transcriptome was used to dissect the function of the miRNAs in pineapple by predicting their putative targets in conjunction with their regulatory networks. In total, 23 metabolic pathways were found to be regulated by miRNAs in pineapple. The use of high-throughput sequencing in pineapples to unveil the presence of miRNAs and their regulatory pathways provides insight into the repertoire of miRNA regulation used exclusively in this non-climacteric model plant.

  11. Inverse bifurcation analysis: application to simple gene systems

    Directory of Open Access Journals (Sweden)

    Schuster Peter

    2006-07-01

    Background: Bifurcation analysis has proven to be a powerful method for understanding the qualitative behavior of gene regulatory networks. In addition to the more traditional forward problem of determining the mapping from parameter space to the space of model behavior, the inverse problem of determining model parameters to result in certain desired properties of the bifurcation diagram provides an attractive methodology for addressing important biological problems. These include understanding how the robustness of qualitative behavior arises from system design as well as providing a way to engineer biological networks with qualitative properties. Results: We demonstrate that certain inverse bifurcation problems of biological interest may be cast as optimization problems involving minimal distances of reference parameter sets to bifurcation manifolds. This formulation allows for an iterative solution procedure based on performing a sequence of eigen-system computations and one-parameter continuations of solutions, the latter being a standard capability in existing numerical bifurcation software. As applications of the proposed method, we show that the problem of maximizing regions of a given qualitative behavior as well as the reverse engineering of bistable gene switches can be modelled and efficiently solved.
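
    The idea of measuring the distance from a reference parameter set to a bifurcation manifold can be illustrated on a toy bistable system. The sketch below is not the method of the abstract: it assumes the one-variable model dx/dt = p1 + p2*x - x^3 (chosen here only because its fold curve has a closed form) and replaces the iterative eigen-system and continuation procedure with a brute-force scan. The names `fold_point` and `distance_to_fold` and the reference point are hypothetical.

```python
import math


def fold_point(s):
    """Point (p1, p2) on the fold curve of dx/dt = p1 + p2*x - x**3.

    At a fold, f = 0 and df/dx = 0 hold together at x = s, which gives
    p2 = 3*s**2 and p1 = -2*s**3.
    """
    return (-2.0 * s**3, 3.0 * s**2)


def distance_to_fold(ref, s_min=-2.0, s_max=2.0, n=40001):
    """Scan the fold curve for the point closest to the reference parameters."""
    best_s, best_d = None, math.inf
    for i in range(n):
        s = s_min + (s_max - s_min) * i / (n - 1)
        p1, p2 = fold_point(s)
        d = math.hypot(p1 - ref[0], p2 - ref[1])
        if d < best_d:
            best_s, best_d = s, d
    return best_s, best_d


# Hypothetical reference parameters inside the bistable region (27*p1**2 < 4*p2**3).
ref = (0.0, 1.0)
s_star, d_star = distance_to_fold(ref)
print(f"closest fold at x = {s_star:.3f}, distance = {d_star:.4f}")
```

    In this toy setting the returned distance acts as a robustness margin: the smallest parameter perturbation that would destroy bistability. Moving the reference point to enlarge that margin is the kind of optimization the abstract describes.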

  12. RiceGeneThresher: a web-based application for mining genes underlying QTL in rice genome.

    Science.gov (United States)

    Thongjuea, Supat; Ruanjaichon, Vinitchan; Bruskiewich, Richard; Vanavichit, Apichart

    2009-01-01

    RiceGeneThresher is a public online resource for mining genes underlying genome regions of interest or quantitative trait loci (QTL) in the rice genome. It is a compendium of rice genomic resources consisting of genetic markers, genome annotation, expressed sequence tags (ESTs), protein domains, gene ontology, plant stress-responsive genes, metabolic pathways and prediction of protein-protein interactions. The RiceGeneThresher system integrates these diverse data sources and provides powerful web-based applications and flexible tools for delivering customized sets of biological data on rice. The system supports whole-genome gene mining for QTL by querying using DNA marker intervals or genomic loci. RiceGeneThresher provides biologically supported evidence that is essential for targeting groups or networks of genes involved in controlling traits underlying QTL. Users can use it to discover and to assign the most promising candidate genes in preparation for further gene function validation analysis. The web-based application is freely available at http://rice.kps.ku.ac.th.

  13. The web server of IBM's Bioinformatics and Pattern Discovery group.

    Science.gov (United States)

    Huynh, Tien; Rigoutsos, Isidore; Parida, Laxmi; Platt, Daniel; Shibuya, Tetsuo

    2003-07-01

    We herein present and discuss the services and content which are available on the web server of IBM's Bioinformatics and Pattern Discovery group. The server is operational around the clock and provides access to a variety of methods that have been published by the group's members and collaborators. The available tools correspond to applications ranging from the discovery of patterns in streams of events and the computation of multiple sequence alignments, to the discovery of genes in nucleic acid sequences and the interactive annotation of amino acid sequences. Additionally, annotations for more than 70 archaeal, bacterial, eukaryotic and viral genomes are available on-line and can be searched interactively. The tools and code bundles can be accessed beginning at http://cbcsrv.watson.ibm.com/Tspd.html whereas the genomics annotations are available at http://cbcsrv.watson.ibm.com/Annotations/.

  14. Development of urinary pseudotargeted LC-MS-based metabolomics method and its application in hepatocellular carcinoma biomarker discovery.

    Science.gov (United States)

    Shao, Yaping; Zhu, Bin; Zheng, Ruiyin; Zhao, Xinjie; Yin, Peiyuan; Lu, Xin; Jiao, Binghua; Xu, Guowang; Yao, Zhenzhen

    2015-02-01

    Hepatocellular carcinoma (HCC) is one of the pestilent malignancies leading to cancer-related death. Discovering effective biomarkers for HCC diagnosis is an urgent demand. To identify potential metabolite biomarkers, we developed a urinary pseudotargeted method based on liquid chromatography-hybrid triple quadrupole linear ion trap mass spectrometry (LC-QTRAP MS). Compared with the nontargeted method, the pseudotargeted method can achieve better data quality, which benefits the discovery of differential metabolites. The established method was applied to cirrhosis (CIR) and HCC investigation. It was found that urinary nucleosides, bile acids, citric acid, and several amino acids were significantly changed in liver disease groups compared with the controls, featuring the dysregulation of purine metabolism, energy metabolism, and amino acid metabolism in liver diseases. Furthermore, some metabolites such as cyclic adenosine monophosphate, glutamine, and short- and medium-chain acylcarnitines were the differential metabolites of HCC and CIR. On the basis of binary logistic regression, butyrylcarnitine (carnitine C4:0) and hydantoin-5-propionic acid were defined as combinational markers to distinguish HCC from CIR. The area under the curve was 0.786 and 0.773 for discovery stage and validation stage samples, respectively. These data show that the established pseudotargeted method complements targeted and nontargeted methods for metabolomics studies.
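
    The area under the curve quoted above is the standard ROC summary: the probability that a randomly chosen case receives a higher marker score than a randomly chosen control. A minimal sketch of that pairwise definition follows. The function name `auc` and the scores are invented for illustration and are unrelated to the study's data or fitted model.

```python
def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Invented scores, standing in for the output of some combined-marker model.
hcc_scores = [0.9, 0.8, 0.4]
cir_scores = [0.7, 0.3, 0.2]
print(auc(hcc_scores, cir_scores))
```

    A value of 0.5 corresponds to a marker that ranks cases and controls no better than chance, and 1.0 to perfect separation.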

  15. Discovery through Gossip

    CERN Document Server

    Haeupler, Bernhard; Peleg, David; Rajaraman, Rajmohan; Sun, Zhifeng

    2012-01-01

    We study randomized gossip-based processes in dynamic networks that are motivated by discovery processes in large-scale distributed networks like peer-to-peer or social networks. A well-studied problem in peer-to-peer networks is the resource discovery problem. There, the goal for nodes (hosts with IP addresses) is to discover the IP addresses of all other hosts. In social networks, nodes (people) discover new nodes through exchanging contacts with their neighbors (friends). In both cases the discovery of new nodes changes the underlying network - new edges are added to the network - and the process continues in the changed network. Rigorously analyzing such dynamic (stochastic) processes with a continuously self-changing topology remains a challenging problem with obvious applications. This paper studies and analyzes two natural gossip-based discovery processes. In the push process, each node repeatedly chooses two random neighbors and puts them in contact (i.e., "pushes" their mutual information to each oth...
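
    The push process described above is easy to simulate. The sketch below is an illustration under stated assumptions rather than the paper's model in full: it assumes synchronous rounds, an undirected graph stored as adjacency sets, and a path graph as the starting topology. The helper names `push_round`, `rounds_until_complete` and `path_graph` are hypothetical.

```python
import random


def push_round(adj, rng):
    """One synchronous round: each node introduces two of its random neighbors."""
    introductions = []
    for u in adj:
        if len(adj[u]) >= 2:
            v, w = rng.sample(sorted(adj[u]), 2)
            introductions.append((v, w))
    # Apply all introductions after every node has chosen.
    for v, w in introductions:
        adj[v].add(w)
        adj[w].add(v)


def rounds_until_complete(adj, rng, max_rounds=100_000):
    """Run push rounds until every node is adjacent to every other node."""
    n = len(adj)
    rounds = 0
    while any(len(neighbors) < n - 1 for neighbors in adj.values()):
        if rounds >= max_rounds:
            raise RuntimeError("graph did not become complete in time")
        push_round(adj, rng)
        rounds += 1
    return rounds


def path_graph(n):
    """Undirected path 0 - 1 - ... - (n-1) as a dict of neighbor sets."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


rng = random.Random(0)  # fixed seed so repeated runs give the same round count
adj = path_graph(16)
print(rounds_until_complete(adj, rng))
```

    Each introduction adds an edge, so the topology changes while the process runs, which is the self-modifying behavior the abstract points to as the source of analytical difficulty.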

  16. Novel Gene Discovery of Crops in China: Status, Challenging, and Perspective

    Institute of Scientific and Technical Information of China (English)

    邱丽娟; 王建康; 万建民; 郭勇; 黎裕; 王晓波; 周国安; 刘章雄; 周时荣; 李新海; 马有志

    2011-01-01

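    The discovery of novel genes in crops is the basis for transforming crop germplasm resources into gene resources and for crop molecular breeding. This paper analyzes and reviews progress in gene discovery research in China's major crops, including rice, wheat, maize, soybean, cotton, and rapeseed, and summarizes the breakthroughs made by Chinese scientists in crop gene discovery over the past 10 years: (1) a set of distinctive materials for gene discovery has been developed, including core collections based on the genetic diversity of Chinese crops, segregating populations based on elite resources, and mutants generated by artificial mutagenesis; (2) gene discovery techniques and methods have advanced, in particular through gene discovery approaches that integrate various technologies according to the characteristics of different genes and through improved biostatistical algorithms for genes/QTL, which have raised the efficiency of gene discovery; (3) marker-based mapping of genes/QTL for important crop traits has become a routine method in crop genetics, and a number of genes/QTL related to disease and pest resistance, stress tolerance, quality, nutrient-use efficiency, and high yield have been preliminarily mapped, of which more than 500 have been fine-mapped; (4) gene cloning and functional studies in crops, with rice as the representative, have attracted international attention, and more than 300 genes have been cloned in the major crops, of which more than 70 genes for important traits have been validated in the target crops. International crop gene discovery is currently moving toward higher efficiency, larger scale, and practical application, and China's crop gene discovery has also made innovations in these respects. Nevertheless, a gap remains relative to international research: the number and quality of genes discovered in Chinese crops are still far from meeting the needs of crop molecular breeding. Specifically, progress is uneven across crops, the number of genes discovered is still relatively limited, and few of the discovered genes have major utilization value. In view of the problems facing gene discovery in China and the challenge posed by competition for genes among countries and multinational biotechnology companies, the authors propose that China's crop gene discovery should focus on improving the efficiency of gene discovery, carry out cloning of important genes and evaluation of their value, and strengthen gene discovery strategies oriented to the needs of bio-industry development.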

  17. Deep Learning in Drug Discovery.

    Science.gov (United States)

    Gawehn, Erik; Hiss, Jan A; Schneider, Gisbert

    2016-01-01

    Artificial neural networks had their first heyday in molecular informatics and drug discovery approximately two decades ago. Currently, we are witnessing renewed interest in adapting advanced neural network architectures for pharmaceutical research by borrowing from the field of "deep learning". Compared with some of the other life sciences, their application in drug discovery is still limited. Here, we provide an overview of this emerging field of molecular informatics, present the basic concepts of prominent deep learning methods and offer motivation to explore these techniques for their usefulness in computer-assisted drug discovery and design. We specifically emphasize deep neural networks, restricted Boltzmann machine networks and convolutional networks.

  18. GeneBase 1.1: a tool to summarize data from NCBI gene datasets and its application to an update of human gene statistics

    Science.gov (United States)

    Piovesan, Allison; Caracausi, Maria; Antonaros, Francesca; Pelleri, Maria Chiara; Vitale, Lorenza

    2016-01-01

    We release GeneBase 1.1, a local tool with a graphical interface useful for parsing, structuring and indexing data from the National Center for Biotechnology Information (NCBI) Gene data bank. Compared to its predecessor GeneBase (1.0), GeneBase 1.1 now allows dynamic calculation and summarization in terms of median, mean, standard deviation and total for many quantitative parameters associated with genes, gene transcripts and gene features (exons, introns, coding sequences, untranslated regions). GeneBase 1.1 thus offers the opportunity to perform analyses of the main gene structure parameters also following the search for any set of genes with the desired characteristics, allowing unique functionalities not provided by the NCBI Gene itself. In order to show the potential of our tool for local parsing, structuring and dynamic summarizing of publicly available databases for data retrieval, analysis and testing of biological hypotheses, we provide as a sample application a revised set of statistics for human nuclear genes, gene transcripts and gene features. In contrast with previous estimations strongly underestimating the length of human genes, a ‘mean’ human protein-coding gene is 67 kbp long, has eleven 309 bp long exons and ten 6355 bp long introns. Median, mean and extreme values are provided for many other features offering an updated reference source for human genome studies, data useful to set parameters for bioinformatic tools and interesting clues to the biomedical meaning of the gene features themselves. Database URL: http://apollo11.isto.unibo.it/software/ PMID:28025344
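
    The 'mean' gene figures in the abstract are mutually consistent, and the four summary statistics it names are simple to compute. The sketch below checks the arithmetic and shows one way to produce median, mean, standard deviation and total for a list of feature lengths. It does not reproduce GeneBase itself: the `summarize` helper is hypothetical, the choice of sample (rather than population) standard deviation is an assumption, and the toy exon lengths are invented.

```python
import statistics


def summarize(values):
    """Median, mean, sample standard deviation and total of a numeric list."""
    return {
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # assumption: sample SD, not population SD
        "total": sum(values),
    }


# Figures quoted in the abstract for a 'mean' human protein-coding gene.
exons, exon_bp = 11, 309
introns, intron_bp = 10, 6355
gene_bp = exons * exon_bp + introns * intron_bp
print(gene_bp)  # 66949 bp, i.e. about 67 kbp

# Hypothetical exon lengths (bp), invented purely to exercise summarize().
toy_exon_lengths = [120, 150, 309, 410, 556]
print(summarize(toy_exon_lengths))
```

    Eleven exons of 309 bp plus ten introns of 6355 bp give 3399 + 63550 = 66949 bp, which rounds to the 67 kbp stated in the abstract.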

  19. GeneBase 1.1: a tool to summarize data from NCBI gene datasets and its application to an update of human gene statistics.

    Science.gov (United States)

    Piovesan, Allison; Caracausi, Maria; Antonaros, Francesca; Pelleri, Maria Chiara; Vitale, Lorenza

    2016-01-01

    We release GeneBase 1.1, a local tool with a graphical interface useful for parsing, structuring and indexing data from the National Center for Biotechnology Information (NCBI) Gene data bank. Compared to its predecessor GeneBase (1.0), GeneBase 1.1 now allows dynamic calculation and summarization in terms of median, mean, standard deviation and total for many quantitative parameters associated with genes, gene transcripts and gene features (exons, introns, coding sequences, untranslated regions). GeneBase 1.1 thus offers the opportunity to perform analyses of the main gene structure parameters also following the search for any set of genes with the desired characteristics, allowing unique functionalities not provided by the NCBI Gene itself. In order to show the potential of our tool for local parsing, structuring and dynamic summarizing of publicly available databases for data retrieval, analysis and testing of biological hypotheses, we provide as a sample application a revised set of statistics for human nuclear genes, gene transcripts and gene features. In contrast with previous estimations strongly underestimating the length of human genes, a 'mean' human protein-coding gene is 67 kbp long, has eleven 309 bp long exons and ten 6355 bp long introns. Median, mean and extreme values are provided for many other features offering an updated reference source for human genome studies, data useful to set parameters for bioinformatic tools and interesting clues to the biomedical meaning of the gene features themselves. Database URL: http://apollo11.isto.unibo.it/software/.

  20. An interaction network predicted from public data as a discovery tool: application to the Hsp90 molecular chaperone machine.

    Directory of Open Access Journals (Sweden)

    Pablo C Echeverría

    Understanding the functions of proteins requires information about their protein-protein interactions (PPI). The collective effort of the scientific community generates far more data on any given protein than individual experimental approaches. The latter are often too limited to reveal an interactome comprehensively. We developed a workflow for parallel mining of all major PPI databases, containing data from several model organisms, and for integrating data from the literature for a protein of interest. We applied this novel approach to build the PPI network of the human Hsp90 molecular chaperone machine (Hsp90Int), for which previous efforts have yielded limited and poorly overlapping sets of interactors. We demonstrate the power of the Hsp90Int database as a discovery tool by validating the prediction that the Hsp90 co-chaperone Aha1 is involved in nucleocytoplasmic transport. Thus, we both describe how to build a custom database and introduce a powerful new resource for the scientific community.

  1. Increasing pharmacological knowledge about human neurological and psychiatric disorders through functional neuroimaging and its application in drug discovery.

    Science.gov (United States)

    Nathan, Pradeep J; Phan, K Luan; Harmer, Catherine J; Mehta, Mitul A; Bullmore, Edward T

    2014-02-01

    Functional imaging methods such as fMRI have been widely used to gain greater understanding of brain circuitry abnormalities in CNS disorders and their underlying neurochemical basis. Findings suggest that: (1) drugs with known clinical efficacy have consistent effects on disease relevant brain circuitry, (2) brain activation changes at baseline or early drug effects on brain activity can predict long-term efficacy; and (3) fMRI together with pharmacological challenges could serve as experimental models of disease phenotypes and be used for screening novel drugs. Together, these observations suggest that drug related modulation of disease relevant brain circuitry may serve as a promising biomarker/method for use in drug discovery to demonstrate target engagement, differential efficacy, dose-response relationships, and prediction of clinically relevant changes.

  2. Application of high-throughput affinity-selection mass spectrometry for screening of chemical compound libraries in lead discovery.

    Science.gov (United States)

    Zehender, Hartmut; Mayr, Lorenz M

    2007-02-01

    High-throughput screening of chemical libraries for compounds that interfere with a particular molecular target is among the most powerful methodologies applied in lead discovery at present. In this review, the authors describe a label-free, homogeneous, affinity-selection-based technology developed at Novartis, termed SpeedScreen, which is compared with similar technologies used for high-throughput screening in the pharmaceutical and biotechnology industries. The present focus of SpeedScreen is twofold: first, the technology is applied to orphan genomic targets and to targets that are not tractable by a functional assay; second, it is applied as a complement to the well-established traditional methodologies for the screening of molecular targets. In summary, the authors discuss the value of affinity-selection-based high-throughput screening as a complementary technology to the common functional screening platforms and outline the benefits as well as the limitations of this new technology.

  3. Application of an Efficient Gene Targeting System Linking Secondary Metabolites to their Biosynthetic Genes in Aspergillus terreus

    Energy Technology Data Exchange (ETDEWEB)

    Guo, Chun-Jun; Knox, Benjamin P.; Sanchez, James F.; Chiang, Yi-Ming; Bruno, Kenneth S.; Wang, Clay C.

    2013-07-19

    Nonribosomal peptides (NRPs) are natural products biosynthesized by NRP synthetases. A kusA-, pyrG- mutant strain of Aspergillus terreus NIH 2624 was developed that greatly improved gene targeting efficiency in this organism. Application of this tool allowed us to link four major types of NRP-related secondary metabolites to their responsible genes in A. terreus. In addition, an NRP-related melanin synthetase was also identified in this species.

  4. Applications of molecular docking in drug discovery

    Institute of Scientific and Technical Information of China (English)

    乔会晶; 戴子茹; 葛广波; 徐少贤; 杨凌

    2015-01-01

    Computer-aided drug design (CADD) has been developed as a novel technology since the 1970s. Molecular docking, as one of its main methods, has been widely used in many stages of drug development. In this paper, we not only introduce the basic principle, common methods and common software of molecular docking, but also highlight the applications of molecular docking in new drug research, which include virtual screening in the early stage of drug discovery, the discovery of drug targets, the study of potential mechanisms of action and the prediction of drug metabolism sites. In addition, we forecast the potential applications of molecular docking technology in the field of new drug research.

  5. Investigation of matrix effects in bioanalytical high-performance liquid chromatography/tandem mass spectrometric assays: application to drug discovery.

    Science.gov (United States)

    Mei, Hong; Hsieh, Yunsheng; Nardo, Cymbylene; Xu, Xiaoying