WorldWideScience

Sample records for comprehensive bioinformatics analysis

  1. Comprehensive decision tree models in bioinformatics.

    Directory of Open Access Journals (Sweden)

    Gregor Stiglic

    Full Text Available PURPOSE: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. METHODS: This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by so called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. RESULTS: The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expected significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. CONCLUSIONS: The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets

  2. Comprehensive decision tree models in bioinformatics.

    Science.gov (United States)

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by so called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expected significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class attributes and a high number of possibly

  3. In-depth analysis of the adipocyte proteome by mass spectrometry and bioinformatics

    DEFF Research Database (Denmark)

    Adachi, Jun; Kumar, Chanchal; Zhang, Yanling

    2007-01-01

    , mitochondria, membrane, and cytosol of 3T3-L1 adipocytes. We identified 3,287 proteins while essentially eliminating false positives, making this one of the largest high confidence proteomes reported to date. Comprehensive bioinformatics analysis revealed that the adipocyte proteome, despite its specialized...

  4. COMAN: a web server for comprehensive metatranscriptomics analysis.

    Science.gov (United States)

    Ni, Yueqiong; Li, Jun; Panagiotou, Gianni

    2016-08-11

    Microbiota-oriented studies based on metagenomic or metatranscriptomic sequencing have revolutionised our understanding on microbial ecology and the roles of both clinical and environmental microbes. The analysis of massive metatranscriptomic data requires extensive computational resources, a collection of bioinformatics tools and expertise in programming. We developed COMAN (Comprehensive Metatranscriptomics Analysis), a web-based tool dedicated to automatically and comprehensively analysing metatranscriptomic data. COMAN pipeline includes quality control of raw reads, removal of reads derived from non-coding RNA, followed by functional annotation, comparative statistical analysis, pathway enrichment analysis, co-expression network analysis and high-quality visualisation. The essential data generated by COMAN are also provided in tabular format for additional analysis and integration with other software. The web server has an easy-to-use interface and detailed instructions, and is freely available at http://sbb.hku.hk/COMAN/ CONCLUSIONS: COMAN is an integrated web server dedicated to comprehensive functional analysis of metatranscriptomic data, translating massive amount of reads to data tables and high-standard figures. It is expected to facilitate the researchers with less expertise in bioinformatics in answering microbiota-related biological questions and to increase the accessibility and interpretation of microbiota RNA-Seq data.

  5. LXtoo: an integrated live Linux distribution for the bioinformatics community.

    Science.gov (United States)

    Yu, Guangchuang; Wang, Li-Gen; Meng, Xiao-Hua; He, Qing-Yu

    2012-07-19

    Recent advances in high-throughput technologies dramatically increase biological data generation. However, many research groups lack computing facilities and specialists. This is an obstacle that remains to be addressed. Here, we present a Linux distribution, LXtoo, to provide a flexible computing platform for bioinformatics analysis. Unlike most of the existing live Linux distributions for bioinformatics limiting their usage to sequence analysis and protein structure prediction, LXtoo incorporates a comprehensive collection of bioinformatics software, including data mining tools for microarray and proteomics, protein-protein interaction analysis, and computationally complex tasks like molecular dynamics. Moreover, most of the programs have been configured and optimized for high performance computing. LXtoo aims to provide well-supported computing environment tailored for bioinformatics research, reducing duplication of efforts in building computing infrastructure. LXtoo is distributed as a Live DVD and freely available at http://bioinformatics.jnu.edu.cn/LXtoo.

  6. A Brief Review of Bioinformatics Tools for Glycosylation Analysis by Mass Spectrometry

    OpenAIRE

    Tsai, Pei-Lun; Chen, Sung-Fang

    2017-01-01

    The purpose of this review is to provide updated information regarding bioinformatic software for the use in the characterization of glycosylated structures since 2013. A comprehensive review by Woodin et al. Analyst 138: 2793?2803, 2013 (ref. 1) described two main approaches that are introduced for starting researchers in this area; analysis of released glycans and the identification of glycopeptide in enzymatic digests, respectively. Complementary to that report, this review focuses on m...

  7. Secretome Analysis of Lipid-Induced Insulin Resistance in Skeletal Muscle Cells by a Combined Experimental and Bioinformatics Workflow

    DEFF Research Database (Denmark)

    Deshmukh, Atul S; Cox, Juergen; Jensen, Lars Juhl

    2015-01-01

    , in principle, allows an unbiased and comprehensive analysis of cellular secretomes; however, the distinction of bona fide secreted proteins from proteins released upon lysis of a small fraction of dying cells remains challenging. Here we applied highly sensitive MS and streamlined bioinformatics to analyze......-resistant conditions. Our study demonstrates an efficient combined experimental and bioinformatics workflow to identify putative secreted proteins from insulin-resistant skeletal muscle cells, which could easily be adapted to other cellular models....

  8. Mathematics and evolutionary biology make bioinformatics education comprehensible

    Science.gov (United States)

    Weisstein, Anton E.

    2013-01-01

    The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes—the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software—the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers a ‘two-culture’ problem because of the lack of broad overlapping expertise between these two groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide, the curriculum too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. exploring alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce off-line downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses. PMID:23821621

  9. Mathematics and evolutionary biology make bioinformatics education comprehensible.

    Science.gov (United States)

    Jungck, John R; Weisstein, Anton E

    2013-09-01

    The patterns of variation within a molecular sequence data set result from the interplay between population genetic, molecular evolutionary and macroevolutionary processes-the standard purview of evolutionary biologists. Elucidating these patterns, particularly for large data sets, requires an understanding of the structure, assumptions and limitations of the algorithms used by bioinformatics software-the domain of mathematicians and computer scientists. As a result, bioinformatics often suffers a 'two-culture' problem because of the lack of broad overlapping expertise between these two groups. Collaboration among specialists in different fields has greatly mitigated this problem among active bioinformaticians. However, science education researchers report that much of bioinformatics education does little to bridge the cultural divide, the curriculum too focused on solving narrow problems (e.g. interpreting pre-built phylogenetic trees) rather than on exploring broader ones (e.g. exploring alternative phylogenetic strategies for different kinds of data sets). Herein, we present an introduction to the mathematics of tree enumeration, tree construction, split decomposition and sequence alignment. We also introduce off-line downloadable software tools developed by the BioQUEST Curriculum Consortium to help students learn how to interpret and critically evaluate the results of standard bioinformatics analyses.

  10. BioGPS descriptors for rational engineering of enzyme promiscuity and structure based bioinformatic analysis.

    Directory of Open Access Journals (Sweden)

    Valerio Ferrario

    Full Text Available A new bioinformatic methodology was developed founded on the Unsupervised Pattern Cognition Analysis of GRID-based BioGPS descriptors (Global Positioning System in Biological Space. The procedure relies entirely on three-dimensional structure analysis of enzymes and does not stem from sequence or structure alignment. The BioGPS descriptors account for chemical, geometrical and physical-chemical features of enzymes and are able to describe comprehensively the active site of enzymes in terms of "pre-organized environment" able to stabilize the transition state of a given reaction. The efficiency of this new bioinformatic strategy was demonstrated by the consistent clustering of four different Ser hydrolases classes, which are characterized by the same active site organization but able to catalyze different reactions. The method was validated by considering, as a case study, the engineering of amidase activity into the scaffold of a lipase. The BioGPS tool predicted correctly the properties of lipase variants, as demonstrated by the projection of mutants inside the BioGPS "roadmap".

  11. Deep learning in bioinformatics.

    Science.gov (United States)

    Min, Seonwoo; Lee, Byunghan; Yoon, Sungroh

    2017-09-01

    In the era of big data, transformation of biomedical big data into valuable knowledge has been one of the most important challenges in bioinformatics. Deep learning has advanced rapidly since the early 2000s and now demonstrates state-of-the-art performance in various fields. Accordingly, application of deep learning in bioinformatics to gain insight from data has been emphasized in both academia and industry. Here, we review deep learning in bioinformatics, presenting examples of current research. To provide a useful and comprehensive perspective, we categorize research both by the bioinformatics domain (i.e. omics, biomedical imaging, biomedical signal processing) and deep learning architecture (i.e. deep neural networks, convolutional neural networks, recurrent neural networks, emergent architectures) and present brief descriptions of each study. Additionally, we discuss theoretical and practical issues of deep learning in bioinformatics and suggest future research directions. We believe that this review will provide valuable insights and serve as a starting point for researchers to apply deep learning approaches in their bioinformatics studies. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  12. Fundamentals of bioinformatics and computational biology methods and exercises in matlab

    CERN Document Server

    Singh, Gautam B

    2015-01-01

    This book offers comprehensive coverage of all the core topics of bioinformatics, and includes practical examples completed using the MATLAB bioinformatics toolbox™. It is primarily intended as a textbook for engineering and computer science students attending advanced undergraduate and graduate courses in bioinformatics and computational biology. The book develops bioinformatics concepts from the ground up, starting with an introductory chapter on molecular biology and genetics. This chapter will enable physical science students to fully understand and appreciate the ultimate goals of applying the principles of information technology to challenges in biological data management, sequence analysis, and systems biology. The first part of the book also includes a survey of existing biological databases, tools that have become essential in today’s biotechnology research. The second part of the book covers methodologies for retrieving biological information, including fundamental algorithms for sequence compar...

  13. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis.

    Science.gov (United States)

    Römer, Michael; Eichner, Johannes; Dräger, Andreas; Wrzodek, Clemens; Wrzodek, Finja; Zell, Andreas

    2016-01-01

    Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/.

  14. BioMaS: a modular pipeline for Bioinformatic analysis of Metagenomic AmpliconS.

    Science.gov (United States)

    Fosso, Bruno; Santamaria, Monica; Marzano, Marinella; Alonso-Alemany, Daniel; Valiente, Gabriel; Donvito, Giacinto; Monaco, Alfonso; Notarangelo, Pasquale; Pesole, Graziano

    2015-07-01

    Substantial advances in microbiology, molecular evolution and biodiversity have been carried out in recent years thanks to Metagenomics, which allows to unveil the composition and functions of mixed microbial communities in any environmental niche. If the investigation is aimed only at the microbiome taxonomic structure, a target-based metagenomic approach, here also referred as Meta-barcoding, is generally applied. This approach commonly involves the selective amplification of a species-specific genetic marker (DNA meta-barcode) in the whole taxonomic range of interest and the exploration of its taxon-related variants through High-Throughput Sequencing (HTS) technologies. The accessibility to proper computational systems for the large-scale bioinformatic analysis of HTS data represents, currently, one of the major challenges in advanced Meta-barcoding projects. BioMaS (Bioinformatic analysis of Metagenomic AmpliconS) is a new bioinformatic pipeline designed to support biomolecular researchers involved in taxonomic studies of environmental microbial communities by a completely automated workflow, comprehensive of all the fundamental steps, from raw sequence data upload and cleaning to final taxonomic identification, that are absolutely required in an appropriately designed Meta-barcoding HTS-based experiment. In its current version, BioMaS allows the analysis of both bacterial and fungal environments starting directly from the raw sequencing data from either Roche 454 or Illumina HTS platforms, following two alternative paths, respectively. BioMaS is implemented into a public web service available at https://recasgateway.ba.infn.it/ and is also available in Galaxy at http://galaxy.cloud.ba.infn.it:8080 (only for Illumina data). BioMaS is a friendly pipeline for Meta-barcoding HTS data analysis specifically designed for users without particular computing skills. A comparative benchmark, carried out by using a simulated dataset suitably designed to broadly represent

  15. GProX, a User-Friendly Platform for Bioinformatics Analysis and Visualization of Quantitative Proteomics Data

    DEFF Research Database (Denmark)

    Rigbolt, Kristoffer T G; Vanselow, Jens T; Blagoev, Blagoy

    2011-01-01

    -friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data we developed the Graphical Proteomics Data Explorer (GProX)(1). The program requires no special bioinformatics training, as all functions of GProX are accessible within its graphical user-friendly interface...... such as database querying, clustering based on abundance ratios, feature enrichment tests for e.g. GO terms and pathway analysis tools. A number of plotting options for visualization of quantitative proteomics data is available and most analysis functions in GProX create customizable high quality graphical...... displays in both vector and bitmap formats. The generic import requirements allow data originating from essentially all mass spectrometry platforms, quantitation strategies and software to be analyzed in the program. GProX represents a powerful approach to proteomics data analysis providing proteomics...

  16. Bioinformatics of genomic association mapping

    NARCIS (Netherlands)

    Vaez Barzani, Ahmad

    2015-01-01

    In this thesis we present an overview of bioinformatics-based approaches for genomic association mapping, with emphasis on human quantitative traits and their contribution to complex diseases. We aim to provide a comprehensive walk-through of the classic steps of genomic association mapping

  17. 5th HUPO BPP Bioinformatics Meeting at the European Bioinformatics Institute in Hinxton, UK--Setting the analysis frame.

    Science.gov (United States)

    Stephan, Christian; Hamacher, Michael; Blüggel, Martin; Körting, Gerhard; Chamrad, Daniel; Scheer, Christian; Marcus, Katrin; Reidegeld, Kai A; Lohaus, Christiane; Schäfer, Heike; Martens, Lennart; Jones, Philip; Müller, Michael; Auyeung, Kevin; Taylor, Chris; Binz, Pierre-Alain; Thiele, Herbert; Parkinson, David; Meyer, Helmut E; Apweiler, Rolf

    2005-09-01

    The Bioinformatics Committee of the HUPO Brain Proteome Project (HUPO BPP) meets regularly to execute the post-lab analyses of the data produced in the HUPO BPP pilot studies. On July 7, 2005 the members came together for the 5th time at the European Bioinformatics Institute (EBI) in Hinxton, UK, hosted by Rolf Apweiler. As a main result, the parameter set of the semi-automated data re-analysis of MS/MS spectra has been elaborated and the subsequent work steps have been defined.

  18. Buying in to bioinformatics: an introduction to commercial sequence analysis software.

    Science.gov (United States)

    Smith, David Roy

    2015-07-01

    Advancements in high-throughput nucleotide sequencing techniques have brought with them state-of-the-art bioinformatics programs and software packages. Given the importance of molecular sequence data in contemporary life science research, these software suites are becoming an essential component of many labs and classrooms, and as such are frequently designed for non-computer specialists and marketed as one-stop bioinformatics toolkits. Although beautifully designed and powerful, user-friendly bioinformatics packages can be expensive and, as more arrive on the market each year, it can be difficult for researchers, teachers and students to choose the right software for their needs, especially if they do not have a bioinformatics background. This review highlights some of the currently available and most popular commercial bioinformatics packages, discussing their prices, usability, features and suitability for teaching. Although several commercial bioinformatics programs are arguably overpriced and overhyped, many are well designed, sophisticated and, in my opinion, worth the investment. If you are just beginning your foray into molecular sequence analysis or an experienced genomicist, I encourage you to explore proprietary software bundles. They have the potential to streamline your research, increase your productivity, energize your classroom and, if anything, add a bit of zest to the often dry detached world of bioinformatics. © The Author 2014. Published by Oxford University Press.

  19. Introduction to bioinformatics.

    Science.gov (United States)

    Can, Tolga

    2014-01-01

    Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: Collect statistics from biological data. Build a computational model. Solve a computational modeling problem. Test and evaluate a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data is usually represented as matrices and analysis of microarray data mostly involves statistics analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs and graph theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.

  20. GProX, a user-friendly platform for bioinformatics analysis and visualization of quantitative proteomics data.

    Science.gov (United States)

    Rigbolt, Kristoffer T G; Vanselow, Jens T; Blagoev, Blagoy

    2011-08-01

    Recent technological advances have made it possible to identify and quantify thousands of proteins in a single proteomics experiment. As a result of these developments, the analysis of data has become the bottleneck of proteomics experiment. To provide the proteomics community with a user-friendly platform for comprehensive analysis, inspection and visualization of quantitative proteomics data we developed the Graphical Proteomics Data Explorer (GProX)(1). The program requires no special bioinformatics training, as all functions of GProX are accessible within its graphical user-friendly interface which will be intuitive to most users. Basic features facilitate the uncomplicated management and organization of large data sets and complex experimental setups as well as the inspection and graphical plotting of quantitative data. These are complemented by readily available high-level analysis options such as database querying, clustering based on abundance ratios, feature enrichment tests for e.g. GO terms and pathway analysis tools. A number of plotting options for visualization of quantitative proteomics data is available and most analysis functions in GProX create customizable high quality graphical displays in both vector and bitmap formats. The generic import requirements allow data originating from essentially all mass spectrometry platforms, quantitation strategies and software to be analyzed in the program. GProX represents a powerful approach to proteomics data analysis providing proteomics experimenters with a toolbox for bioinformatics analysis of quantitative proteomics data. The program is released as open-source and can be freely downloaded from the project webpage at http://gprox.sourceforge.net.

  1. Bioinformatics and Microarray Data Analysis on the Cloud.

    Science.gov (United States)

    Calabrese, Barbara; Cannataro, Mario

    2016-01-01

    High-throughput platforms such as microarray, mass spectrometry, and next-generation sequencing are producing an increasing volume of omics data that needs large data storage and computing power. Cloud computing offers massive scalable computing and storage, data sharing, on-demand anytime and anywhere access to resources and applications, and thus, it may represent the key technology for facing those issues. In fact, in the recent years it has been adopted for the deployment of different bioinformatics solutions and services both in academia and in the industry. Although this, cloud computing presents several issues regarding the security and privacy of data, that are particularly important when analyzing patients data, such as in personalized medicine. This chapter reviews main academic and industrial cloud-based bioinformatics solutions; with a special focus on microarray data analysis solutions and underlines main issues and problems related to the use of such platforms for the storage and analysis of patients data.

  2. PinAPL-Py: A comprehensive web-application for the analysis of CRISPR/Cas9 screens.

    Science.gov (United States)

    Spahn, Philipp N; Bath, Tyler; Weiss, Ryan J; Kim, Jihoon; Esko, Jeffrey D; Lewis, Nathan E; Harismendy, Olivier

    2017-11-20

    Large-scale genetic screens using CRISPR/Cas9 technology have emerged as a major tool for functional genomics. With its increased popularity, experimental biologists frequently acquire large sequencing datasets for which they often do not have an easy analysis option. While a few bioinformatic tools have been developed for this purpose, their utility is still hindered either due to limited functionality or the requirement of bioinformatic expertise. To make sequencing data analysis of CRISPR/Cas9 screens more accessible to a wide range of scientists, we developed a Platform-independent Analysis of Pooled Screens using Python (PinAPL-Py), which is operated as an intuitive web-service. PinAPL-Py implements state-of-the-art tools and statistical models, assembled in a comprehensive workflow covering sequence quality control, automated sgRNA sequence extraction, alignment, sgRNA enrichment/depletion analysis and gene ranking. The workflow is set up to use a variety of popular sgRNA libraries as well as custom libraries that can be easily uploaded. Various analysis options are offered, suitable to analyze a large variety of CRISPR/Cas9 screening experiments. Analysis output includes ranked lists of sgRNAs and genes, and publication-ready plots. PinAPL-Py helps to advance genome-wide screening efforts by combining comprehensive functionality with user-friendly implementation. PinAPL-Py is freely accessible at http://pinapl-py.ucsd.edu with instructions and test datasets.

  3. Workflows in bioinformatics: meta-analysis and prototype implementation of a workflow generator

    Directory of Open Access Journals (Sweden)

    Thoraval Samuel

    2005-04-01

    Full Text Available Abstract Background Computational methods for problem solving need to interleave information access and algorithm execution in a problem-specific workflow. The structures of these workflows are defined by a scaffold of syntactic, semantic and algebraic objects capable of representing them. Despite the proliferation of GUIs (Graphic User Interfaces in bioinformatics, only some of them provide workflow capabilities; surprisingly, no meta-analysis of workflow operators and components in bioinformatics has been reported. Results We present a set of syntactic components and algebraic operators capable of representing analytical workflows in bioinformatics. Iteration, recursion, the use of conditional statements, and management of suspend/resume tasks have traditionally been implemented on an ad hoc basis and hard-coded; by having these operators properly defined it is possible to use and parameterize them as generic re-usable components. To illustrate how these operations can be orchestrated, we present GPIPE, a prototype graphic pipeline generator for PISE that allows the definition of a pipeline, parameterization of its component methods, and storage of metadata in XML formats. This implementation goes beyond the macro capacities currently in PISE. As the entire analysis protocol is defined in XML, a complete bioinformatic experiment (linked sets of methods, parameters and results can be reproduced or shared among users. Availability: http://if-web1.imb.uq.edu.au/Pise/5.a/gpipe.html (interactive, ftp://ftp.pasteur.fr/pub/GenSoft/unix/misc/Pise/ (download. Conclusion From our meta-analysis we have identified syntactic structures and algebraic operators common to many workflows in bioinformatics. The workflow components and algebraic operators can be assimilated into re-usable software components. GPIPE, a prototype implementation of this framework, provides a GUI builder to facilitate the generation of workflows and integration of heterogeneous

  4. Novel approaches for bioinformatic analysis of salivary RNA sequencing data for development.

    Science.gov (United States)

    Kaczor-Urbanowicz, Karolina Elzbieta; Kim, Yong; Li, Feng; Galeev, Timur; Kitchen, Rob R; Gerstein, Mark; Koyano, Kikuye; Jeong, Sung-Hee; Wang, Xiaoyan; Elashoff, David; Kang, So Young; Kim, Su Mi; Kim, Kyoung; Kim, Sung; Chia, David; Xiao, Xinshu; Rozowsky, Joel; Wong, David T W

    2018-01-01

    Analysis of RNA sequencing (RNA-Seq) data in human saliva is challenging. Lack of standardization and unification of the bioinformatic procedures undermines saliva's diagnostic potential. Thus, it motivated us to perform this study. We applied principal pipelines for bioinformatic analysis of small RNA-Seq data of saliva of 98 healthy Korean volunteers including either direct or indirect mapping of the reads to the human genome using Bowtie1. Analysis of alignments to exogenous genomes by another pipeline revealed that almost all of the reads map to bacterial genomes. Thus, salivary exRNA has fundamental properties that warrant the design of unique additional steps while performing the bioinformatic analysis. Our pipelines can serve as potential guidelines for processing of RNA-Seq data of human saliva. Processing and analysis results of the experimental data generated by the exceRpt (v4.6.3) small RNA-seq pipeline (github.gersteinlab.org/exceRpt) are available from exRNA atlas (exrna-atlas.org). Alignment to exogenous genomes and their quantification results were used in this paper for the analyses of small RNAs of exogenous origin. dtww@ucla.edu. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  5. Bioinformatics analysis and detection of gelatinase encoded gene in Lysinibacillussphaericus

    Science.gov (United States)

    Repin, Rul Aisyah Mat; Mutalib, Sahilah Abdul; Shahimi, Safiyyah; Khalid, Rozida Mohd.; Ayob, Mohd. Khan; Bakar, Mohd. Faizal Abu; Isa, Mohd Noor Mat

    2016-11-01

    In this study, we performed bioinformatics analysis toward genome sequence of Lysinibacillussphaericus (L. sphaericus) to determine gene encoded for gelatinase. L. sphaericus was isolated from soil and gelatinase species-specific bacterium to porcine and bovine gelatin. This bacterium offers the possibility of enzymes production which is specific to both species of meat, respectively. The main focus of this research is to identify the gelatinase encoded gene within the bacteria of L. Sphaericus using bioinformatics analysis of partially sequence genome. From the research study, three candidate gene were identified which was, gelatinase candidate gene 1 (P1), NODE_71_length_93919_cov_158.931839_21 which containing 1563 base pair (bp) in size with 520 amino acids sequence; Secondly, gelatinase candidate gene 2 (P2), NODE_23_length_52851_cov_190.061386_17 which containing 1776 bp in size with 591 amino acids sequence; and Thirdly, gelatinase candidate gene 3 (P3), NODE_106_length_32943_cov_169.147919_8 containing 1701 bp in size with 566 amino acids sequence. Three pairs of oligonucleotide primers were designed and namely as, F1, R1, F2, R2, F3 and R3 were targeted short sequences of cDNA by PCR. The amplicons were reliably results in 1563 bp in size for candidate gene P1 and 1701 bp in size for candidate gene P3. Therefore, the results of bioinformatics analysis of L. Sphaericus resulting in gene encoded gelatinase were identified.

  6. COMPARISON OF POPULAR BIOINFORMATICS DATABASES

    OpenAIRE

    Abdulganiyu Abdu Yusuf; Zahraddeen Sufyanu; Kabir Yusuf Mamman; Abubakar Umar Suleiman

    2016-01-01

    Bioinformatics is the application of computational tools to capture and interpret biological data. It has wide applications in drug development, crop improvement, agricultural biotechnology and forensic DNA analysis. There are various databases available to researchers in bioinformatics. These databases are customized for a specific need and are ranged in size, scope, and purpose. The main drawbacks of bioinformatics databases include redundant information, constant change, data spread over m...

  7. Bioinformatics

    DEFF Research Database (Denmark)

    Baldi, Pierre; Brunak, Søren

    , and medicine will be particularly affected by the new results and the increased understanding of life at the molecular level. Bioinformatics is the development and application of computer methods for analysis, interpretation, and prediction, as well as for the design of experiments. It has emerged...

  8. Taking Bioinformatics to Systems Medicine.

    Science.gov (United States)

    van Kampen, Antoine H C; Moerland, Perry D

    2016-01-01

    Systems medicine promotes a range of approaches and strategies to study human health and disease at a systems level with the aim of improving the overall well-being of (healthy) individuals, and preventing, diagnosing, or curing disease. In this chapter we discuss how bioinformatics critically contributes to systems medicine. First, we explain the role of bioinformatics in the management and analysis of data. In particular we show the importance of publicly available biological and clinical repositories to support systems medicine studies. Second, we discuss how the integration and analysis of multiple types of omics data through integrative bioinformatics may facilitate the determination of more predictive and robust disease signatures, lead to a better understanding of (patho)physiological molecular mechanisms, and facilitate personalized medicine. Third, we focus on network analysis and discuss how gene networks can be constructed from omics data and how these networks can be decomposed into smaller modules. We discuss how the resulting modules can be used to generate experimentally testable hypotheses, provide insight into disease mechanisms, and lead to predictive models. Throughout, we provide several examples demonstrating how bioinformatics contributes to systems medicine and discuss future challenges in bioinformatics that need to be addressed to enable the advancement of systems medicine.

  9. Analysis of requirements for teaching materials based on the course bioinformatics for plant metabolism

    Science.gov (United States)

    Balqis, Widodo, Lukiati, Betty; Amin, Mohamad

    2017-05-01

    A way to improve the quality of learning in the course of Plant Metabolism in the Department of Biology, State University of Malang, is to develop teaching materials. This research evaluates the needs of bioinformatics-based teaching material in the course Plant Metabolism by the Analyze, Design, Develop, Implement, and Evaluate (ADDIE) development model. Data were collected through questionnaires distributed to the students in the Plant Metabolism course of the Department of Biology, University of Malang, and analysis of the plan of lectures semester (RPS). Learning gains of this course show that it is not yet integrated into the field of bioinformatics. All respondents stated that plant metabolism books do not include bioinformatics and fail to explain the metabolism of a chemical compound of a local plant in Indonesia. Respondents thought that bioinformatics can explain examples and metabolism of a secondary metabolite analysis techniques and discuss potential medicinal compounds from local plants. As many as 65% of the respondents said that the existing metabolism book could not be used to understand secondary metabolism in lectures of plant metabolism. Therefore, the development of teaching materials including plant metabolism-based bioinformatics is important to improve the understanding of the lecture material in plant metabolism.

  10. Rapid cloning and bioinformatic analysis of spinach Y chromosome ...

    Indian Academy of Sciences (India)

    Rapid cloning and bioinformatic analysis of spinach Y chromosome- specific EST sequences. Chuan-Liang Deng, Wei-Li Zhang, Ying Cao, Shao-Jing Wang, ... Arabidopsis thaliana mRNA for mitochondrial half-ABC transporter (STA1 gene). 389 2.31E-13. 98.96. SP3−12. Betula pendula histidine kinase 3 (HK3) mRNA, ...

  11. The Revolution in Viral Genomics as Exemplified by the Bioinformatic Analysis of Human Adenoviruses

    Directory of Open Access Journals (Sweden)

    Sarah Torres

    2010-06-01

    Full Text Available Over the past 30 years, genomic and bioinformatic analysis of human adenoviruses has been achieved using a variety of DNA sequencing methods; initially with the use of restriction enzymes and more currently with the use of the GS FLX pyrosequencing technology. Following the conception of DNA sequencing in the 1970s, analysis of adenoviruses has evolved from 100 base pair mRNA fragments to entire genomes. Comparative genomics of adenoviruses made its debut in 1984 when nucleotides and amino acids of coding sequences within the hexon genes of two human adenoviruses (HAdV, HAdV–C2 and HAdV–C5, were compared and analyzed. It was determined that there were three different zones (1-393, 394-1410, 1411-2910 within the hexon gene, of which HAdV–C2 and HAdV–C5 shared zones 1 and 3 with 95% and 89.5% nucleotide identity, respectively. In 1992, HAdV-C5 became the first adenovirus genome to be fully sequenced using the Sanger method. Over the next seven years, whole genome analysis and characterization was completed using bioinformatic tools such as blastn, tblastx, ClustalV and FASTA, in order to determine key proteins in species HAdV-A through HAdV-F. The bioinformatic revolution was initiated with the introduction of a novel species, HAdV-G, that was typed and named by the use of whole genome sequencing and phylogenetics as opposed to traditional serology. HAdV bioinformatics will continue to advance as the latest sequencing technology enables scientists to add to and expand the resource databases. As a result of these advancements, how novel HAdVs are typed has changed. Bioinformatic analysis has become the revolutionary tool that has significantly accelerated the in-depth study of HAdV microevolution through comparative genomics.

  12. DincRNA: a comprehensive web-based bioinformatics toolkit for exploring disease associations and ncRNA function.

    Science.gov (United States)

    Cheng, Liang; Hu, Yang; Sun, Jie; Zhou, Meng; Jiang, Qinghua

    2018-06-01

    DincRNA aims to provide a comprehensive web-based bioinformatics toolkit to elucidate the entangled relationships among diseases and non-coding RNAs (ncRNAs) from the perspective of disease similarity. The quantitative way to illustrate relationships of pair-wise diseases always depends on their molecular mechanisms, and structures of the directed acyclic graph of Disease Ontology (DO). Corresponding methods for calculating similarity of pair-wise diseases involve Resnik's, Lin's, Wang's, PSB and SemFunSim methods. Recently, disease similarity was validated suitable for calculating functional similarities of ncRNAs and prioritizing ncRNA-disease pairs, and it has been widely applied for predicting the ncRNA function due to the limited biological knowledge from wet lab experiments of these RNAs. For this purpose, a large number of algorithms and priori knowledge need to be integrated. e.g. 'pair-wise best, pairs-average' (PBPA) and 'pair-wise all, pairs-maximum' (PAPM) methods for calculating functional similarities of ncRNAs, and random walk with restart (RWR) method for prioritizing ncRNA-disease pairs. To facilitate the exploration of disease associations and ncRNA function, DincRNA implemented all of the above eight algorithms based on DO and disease-related genes. Currently, it provides the function to query disease similarity scores, miRNA and lncRNA functional similarity scores, and the prioritization scores of lncRNA-disease and miRNA-disease pairs. http://bio-annotation.cn:18080/DincRNAClient/. biofomeng@hotmail.com or qhjiang@hit.edu.cn. Supplementary data are available at Bioinformatics online.

  13. Bioinformatics for cancer immunotherapy target discovery

    DEFF Research Database (Denmark)

    Olsen, Lars Rønn; Campos, Benito; Barnkob, Mike Stein

    2014-01-01

    therapy target discovery in a bioinformatics analysis pipeline. We describe specialized bioinformatics tools and databases for three main bottlenecks in immunotherapy target discovery: the cataloging of potentially antigenic proteins, the identification of potential HLA binders, and the selection epitopes...

  14. Bioinformatics analysis of Brucella vaccines and vaccine targets using VIOLIN.

    Science.gov (United States)

    He, Yongqun; Xiang, Zuoshuang

    2010-09-27

    Brucella spp. are Gram-negative, facultative intracellular bacteria that cause brucellosis, one of the commonest zoonotic diseases found worldwide in humans and a variety of animal species. While several animal vaccines are available, there is no effective and safe vaccine for prevention of brucellosis in humans. VIOLIN (http://www.violinet.org) is a web-based vaccine database and analysis system that curates, stores, and analyzes published data of commercialized vaccines, and vaccines in clinical trials or in research. VIOLIN contains information for 454 vaccines or vaccine candidates for 73 pathogens. VIOLIN also contains many bioinformatics tools for vaccine data analysis, data integration, and vaccine target prediction. To demonstrate the applicability of VIOLIN for vaccine research, VIOLIN was used for bioinformatics analysis of existing Brucella vaccines and prediction of new Brucella vaccine targets. VIOLIN contains many literature mining programs (e.g., Vaxmesh) that provide in-depth analysis of Brucella vaccine literature. As a result of manual literature curation, VIOLIN contains information for 38 Brucella vaccines or vaccine candidates, 14 protective Brucella antigens, and 68 host response studies to Brucella vaccines from 97 peer-reviewed articles. These Brucella vaccines are classified in the Vaccine Ontology (VO) system and used for different ontological applications. The web-based VIOLIN vaccine target prediction program Vaxign was used to predict new Brucella vaccine targets. Vaxign identified 14 outer membrane proteins that are conserved in six virulent strains from B. abortus, B. melitensis, and B. suis that are pathogenic in humans. Of the 14 membrane proteins, two proteins (Omp2b and Omp31-1) are not present in B. ovis, a Brucella species that is not pathogenic in humans. Brucella vaccine data stored in VIOLIN were compared and analyzed using the VIOLIN query system. Bioinformatics curation and ontological representation of Brucella vaccines

  15. Bioinformatics clouds for big data manipulation.

    Science.gov (United States)

    Dai, Lin; Gao, Xin; Guo, Yan; Xiao, Jingfa; Zhang, Zhang

    2012-11-28

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.

  16. Application of machine learning methods in bioinformatics

    Science.gov (United States)

    Yang, Haoyu; An, Zheng; Zhou, Haotian; Hou, Yawen

    2018-05-01

    Faced with the development of bioinformatics, high-throughput genomic technology have enabled biology to enter the era of big data. [1] Bioinformatics is an interdisciplinary, including the acquisition, management, analysis, interpretation and application of biological information, etc. It derives from the Human Genome Project. The field of machine learning, which aims to develop computer algorithms that improve with experience, holds promise to enable computers to assist humans in the analysis of large, complex data sets.[2]. This paper analyzes and compares various algorithms of machine learning and their applications in bioinformatics.

  17. Development of Bioinformatics Infrastructure for Genomics Research.

    Science.gov (United States)

    Mulder, Nicola J; Adebiyi, Ezekiel; Adebiyi, Marion; Adeyemi, Seun; Ahmed, Azza; Ahmed, Rehab; Akanle, Bola; Alibi, Mohamed; Armstrong, Don L; Aron, Shaun; Ashano, Efejiro; Baichoo, Shakuntala; Benkahla, Alia; Brown, David K; Chimusa, Emile R; Fadlelmola, Faisal M; Falola, Dare; Fatumo, Segun; Ghedira, Kais; Ghouila, Amel; Hazelhurst, Scott; Isewon, Itunuoluwa; Jung, Segun; Kassim, Samar Kamal; Kayondo, Jonathan K; Mbiyavanga, Mamana; Meintjes, Ayton; Mohammed, Somia; Mosaku, Abayomi; Moussa, Ahmed; Muhammd, Mustafa; Mungloo-Dilmohamud, Zahra; Nashiru, Oyekanmi; Odia, Trust; Okafor, Adaobi; Oladipo, Olaleye; Osamor, Victor; Oyelade, Jellili; Sadki, Khalid; Salifu, Samson Pandam; Soyemi, Jumoke; Panji, Sumir; Radouani, Fouzia; Souiai, Oussama; Tastan Bishop, Özlem

    2017-06-01

    Although pockets of bioinformatics excellence have developed in Africa, generally, large-scale genomic data analysis has been limited by the availability of expertise and infrastructure. H3ABioNet, a pan-African bioinformatics network, was established to build capacity specifically to enable H3Africa (Human Heredity and Health in Africa) researchers to analyze their data in Africa. Since the inception of the H3Africa initiative, H3ABioNet's role has evolved in response to changing needs from the consortium and the African bioinformatics community. H3ABioNet set out to develop core bioinformatics infrastructure and capacity for genomics research in various aspects of data collection, transfer, storage, and analysis. Various resources have been developed to address genomic data management and analysis needs of H3Africa researchers and other scientific communities on the continent. NetMap was developed and used to build an accurate picture of network performance within Africa and between Africa and the rest of the world, and Globus Online has been rolled out to facilitate data transfer. A participant recruitment database was developed to monitor participant enrollment, and data is being harmonized through the use of ontologies and controlled vocabularies. The standardized metadata will be integrated to provide a search facility for H3Africa data and biospecimens. Because H3Africa projects are generating large-scale genomic data, facilities for analysis and interpretation are critical. H3ABioNet is implementing several data analysis platforms that provide a large range of bioinformatics tools or workflows, such as Galaxy, the Job Management System, and eBiokits. A set of reproducible, portable, and cloud-scalable pipelines to support the multiple H3Africa data types are also being developed and dockerized to enable execution on multiple computing infrastructures. In addition, new tools have been developed for analysis of the uniquely divergent African data and for

  18. Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond.

    Science.gov (United States)

    Hiraoka, Satoshi; Yang, Ching-Chia; Iwasaki, Wataru

    2016-09-29

    Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives.

  19. Bioinformatics clouds for big data manipulation

    Directory of Open Access Journals (Sweden)

    Dai Lin

    2012-11-01

    Full Text Available Abstract As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS, Software as a Service (SaaS, Platform as a Service (PaaS, and Infrastructure as a Service (IaaS, and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.

  20. Bioinformatics clouds for big data manipulation

    KAUST Repository

    Dai, Lin

    2012-11-28

    As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics.This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. 2012 Dai et al.; licensee BioMed Central Ltd.

  1. Generalized Centroid Estimators in Bioinformatics

    Science.gov (United States)

    Hamada, Michiaki; Kiryu, Hisanori; Iwasaki, Wataru; Asai, Kiyoshi

    2011-01-01

    In a number of estimation problems in bioinformatics, accuracy measures of the target problem are usually given, and it is important to design estimators that are suitable to those accuracy measures. However, there is often a discrepancy between an employed estimator and a given accuracy measure of the problem. In this study, we introduce a general class of efficient estimators for estimation problems on high-dimensional binary spaces, which represent many fundamental problems in bioinformatics. Theoretical analysis reveals that the proposed estimators generally fit with commonly-used accuracy measures (e.g. sensitivity, PPV, MCC and F-score) as well as it can be computed efficiently in many cases, and cover a wide range of problems in bioinformatics from the viewpoint of the principle of maximum expected accuracy (MEA). It is also shown that some important algorithms in bioinformatics can be interpreted in a unified manner. Not only the concept presented in this paper gives a useful framework to design MEA-based estimators but also it is highly extendable and sheds new light on many problems in bioinformatics. PMID:21365017

  2. Swabs to genomes: a comprehensive workflow

    Directory of Open Access Journals (Sweden)

    Madison I. Dunitz

    2015-05-01

    Full Text Available The sequencing, assembly, and basic analysis of microbial genomes, once a painstaking and expensive undertaking, has become much easier for research labs with access to standard molecular biology and computational tools. However, there are a confusing variety of options available for DNA library preparation and sequencing, and inexperience with bioinformatics can pose a significant barrier to entry for many who may be interested in microbial genomics. The objective of the present study was to design, test, troubleshoot, and publish a simple, comprehensive workflow from the collection of an environmental sample (a swab to a published microbial genome; empowering even a lab or classroom with limited resources and bioinformatics experience to perform it.

  3. Open source tools and toolkits for bioinformatics: significance, and where are we?

    Science.gov (United States)

    Stajich, Jason E; Lapp, Hilmar

    2006-09-01

    This review summarizes important work in open-source bioinformatics software that has occurred over the past couple of years. The survey is intended to illustrate how programs and toolkits whose source code has been developed or released under an Open Source license have changed informatics-heavy areas of life science research. Rather than creating a comprehensive list of all tools developed over the last 2-3 years, we use a few selected projects encompassing toolkit libraries, analysis tools, data analysis environments and interoperability standards to show how freely available and modifiable open-source software can serve as the foundation for building important applications, analysis workflows and resources.

  4. iSmaRT: a toolkit for a comprehensive analysis of small RNA-Seq data.

    Science.gov (United States)

    Panero, Riccardo; Rinaldi, Antonio; Memoli, Domenico; Nassa, Giovanni; Ravo, Maria; Rizzo, Francesca; Tarallo, Roberta; Milanesi, Luciano; Weisz, Alessandro; Giurato, Giorgio

    2017-03-15

    The interest in investigating the biological roles of small non-coding RNAs (sncRNAs) is increasing, due to the pleiotropic effects of these molecules exert in many biological contexts. While several methods and tools are available to study microRNAs (miRNAs), only few focus on novel classes of sncRNAs, in particular PIWI-interacting RNAs (piRNAs). To overcome these limitations, we implemented iSmaRT ( i ntegrative Sm all R NA T ool-kit), an automated pipeline to analyze smallRNA-Seq data. iSmaRT is a collection of bioinformatics tools and own algorithms, interconnected through a Graphical User Interface (GUI). In addition to performing comprehensive analyses on miRNAs, it implements specific computational modules to analyze piRNAs, predicting novel ones and identifying their RNA targets. A smallRNA-Seq dataset generated from brain samples of Huntington's Disease patients was used here to illustrate iSmaRT performances, demonstrating how the pipeline can provide, in a rapid and user friendly way, a comprehensive analysis of different classes of sncRNAs. iSmaRT is freely available on the web at ftp://labmedmolge-1.unisa.it (User: iSmart - Password: password). aweisz@unisa.it or ggiurato@unisa.it. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  5. A decade of Web Server updates at the Bioinformatics Links Directory: 2003-2012.

    Science.gov (United States)

    Brazas, Michelle D; Yim, David; Yeung, Winston; Ouellette, B F Francis

    2012-07-01

    The 2012 Bioinformatics Links Directory update marks the 10th special Web Server issue from Nucleic Acids Research. Beginning with content from their 2003 publication, the Bioinformatics Links Directory in collaboration with Nucleic Acids Research has compiled and published a comprehensive list of freely accessible, online tools, databases and resource materials for the bioinformatics and life science research communities. The past decade has exhibited significant growth and change in the types of tools, databases and resources being put forth, reflecting both technology changes and the nature of research over that time. With the addition of 90 web server tools and 12 updates from the July 2012 Web Server issue of Nucleic Acids Research, the Bioinformatics Links Directory at http://bioinformatics.ca/links_directory/ now contains an impressive 134 resources, 455 databases and 1205 web server tools, mirroring the continued activity and efforts of our field.

  6. Crowdsourcing for bioinformatics.

    Science.gov (United States)

    Good, Benjamin M; Su, Andrew I

    2013-08-15

    Bioinformatics is faced with a variety of problems that require human involvement. Tasks like genome annotation, image analysis, knowledge-base population and protein structure determination all benefit from human input. In some cases, people are needed in vast quantities, whereas in others, we need just a few with rare abilities. Crowdsourcing encompasses an emerging collection of approaches for harnessing such distributed human intelligence. Recently, the bioinformatics community has begun to apply crowdsourcing in a variety of contexts, yet few resources are available that describe how these human-powered systems work and how to use them effectively in scientific domains. Here, we provide a framework for understanding and applying several different types of crowdsourcing. The framework considers two broad classes: systems for solving large-volume 'microtasks' and systems for solving high-difficulty 'megatasks'. Within these classes, we discuss system types, including volunteer labor, games with a purpose, microtask markets and open innovation contests. We illustrate each system type with successful examples in bioinformatics and conclude with a guide for matching problems to crowdsourcing solutions that highlights the positives and negatives of different approaches.

  7. Bioinformatics approaches to single-cell analysis in developmental biology.

    Science.gov (United States)

    Yalcin, Dicle; Hakguder, Zeynep M; Otu, Hasan H

    2016-03-01

    Individual cells within the same population show various degrees of heterogeneity, which may be better handled with single-cell analysis to address biological and clinical questions. Single-cell analysis is especially important in developmental biology as subtle spatial and temporal differences in cells have significant associations with cell fate decisions during differentiation and with the description of a particular state of a cell exhibiting an aberrant phenotype. Biotechnological advances, especially in the area of microfluidics, have led to a robust, massively parallel and multi-dimensional capturing, sorting, and lysis of single-cells and amplification of related macromolecules, which have enabled the use of imaging and omics techniques on single cells. There have been improvements in computational single-cell image analysis in developmental biology regarding feature extraction, segmentation, image enhancement and machine learning, handling limitations of optical resolution to gain new perspectives from the raw microscopy images. Omics approaches, such as transcriptomics, genomics and epigenomics, targeting gene and small RNA expression, single nucleotide and structural variations and methylation and histone modifications, rely heavily on high-throughput sequencing technologies. Although there are well-established bioinformatics methods for analysis of sequence data, there are limited bioinformatics approaches which address experimental design, sample size considerations, amplification bias, normalization, differential expression, coverage, clustering and classification issues, specifically applied at the single-cell level. In this review, we summarize biological and technological advancements, discuss challenges faced in the aforementioned data acquisition and analysis issues and present future prospects for application of single-cell analyses to developmental biology. © The Author 2015. Published by Oxford University Press on behalf of the European

  8. Is there room for ethics within bioinformatics education?

    Science.gov (United States)

    Taneri, Bahar

    2011-07-01

    When bioinformatics education is considered, several issues are addressed. At the undergraduate level, the main issue revolves around conveying information from two main and different fields: biology and computer science. At the graduate level, the main issue is bridging the gap between biology students and computer science students. However, there is an educational component that is rarely addressed within the context of bioinformatics education: the ethics component. Here, a different perspective is provided on bioinformatics education, and the current status of ethics is analyzed within the existing bioinformatics programs. Analysis of the existing undergraduate and graduate programs, in both Europe and the United States, reveals the minimal attention given to ethics within bioinformatics education. Given that bioinformaticians speedily and effectively shape the biomedical sciences and hence their implications for society, here redesigning of the bioinformatics curricula is suggested in order to integrate the necessary ethics education. Unique ethical problems awaiting bioinformaticians and bioinformatics ethics as a separate field of study are discussed. In addition, a template for an "Ethics in Bioinformatics" course is provided.

  9. A bioinformatics roadmap for the human vaccines project.

    Science.gov (United States)

    Scheuermann, Richard H; Sinkovits, Robert S; Schenkelberg, Theodore; Koff, Wayne C

    2017-06-01

    Biomedical research has become a data intensive science in which high throughput experimentation is producing comprehensive data about biological systems at an ever-increasing pace. The Human Vaccines Project is a new public-private partnership, with the goal of accelerating development of improved vaccines and immunotherapies for global infectious diseases and cancers by decoding the human immune system. To achieve its mission, the Project is developing a Bioinformatics Hub as an open-source, multidisciplinary effort with the overarching goal of providing an enabling infrastructure to support the data processing, analysis and knowledge extraction procedures required to translate high throughput, high complexity human immunology research data into biomedical knowledge, to determine the core principles driving specific and durable protective immune responses.

  10. Bioinformatics analysis identifies several intrinsically disordered human E3 ubiquitin-protein ligases

    DEFF Research Database (Denmark)

    Boomsma, Wouter Krogh; Nielsen, Sofie Vincents; Lindorff-Larsen, Kresten

    2016-01-01

    conduct a bioinformatics analysis to examine >600 human and S. cerevisiae E3 ligases to identify enzymes that are similar to San1 in terms of function and/or mechanism of substrate recognition. An initial sequence-based database search was found to detect candidates primarily based on the homology...

  11. BioRuby: bioinformatics software for the Ruby programming language.

    Science.gov (United States)

    Goto, Naohisa; Prins, Pjotr; Nakao, Mitsuteru; Bonnal, Raoul; Aerts, Jan; Katayama, Toshiaki

    2010-10-15

    The BioRuby software toolkit contains a comprehensive set of free development tools and libraries for bioinformatics and molecular biology, written in the Ruby programming language. BioRuby has components for sequence analysis, pathway analysis, protein modelling and phylogenetic analysis; it supports many widely used data formats and provides easy access to databases, external programs and public web services, including BLAST, KEGG, GenBank, MEDLINE and GO. BioRuby comes with a tutorial, documentation and an interactive environment, which can be used in the shell, and in the web browser. BioRuby is free and open source software, made available under the Ruby license. BioRuby runs on all platforms that support Ruby, including Linux, Mac OS X and Windows. And, with JRuby, BioRuby runs on the Java Virtual Machine. The source code is available from http://www.bioruby.org/. katayama@bioruby.org

  12. Bioinformatics Analysis Reveals Genes Involved in the Pathogenesis of Ameloblastoma and Keratocystic Odontogenic Tumor.

    Science.gov (United States)

    Santos, Eliane Macedo Sobrinho; Santos, Hércules Otacílio; Dos Santos Dias, Ivoneth; Santos, Sérgio Henrique; Batista de Paula, Alfredo Maurício; Feltenberger, John David; Sena Guimarães, André Luiz; Farias, Lucyana Conceição

    2016-01-01

    Pathogenesis of odontogenic tumors is not well known. It is important to identify genetic deregulations and molecular alterations. This study aimed to investigate, through bioinformatic analysis, the possible genes involved in the pathogenesis of ameloblastoma (AM) and keratocystic odontogenic tumor (KCOT). Genes involved in the pathogenesis of AM and KCOT were identified in GeneCards. Gene list was expanded, and the gene interactions network was mapped using the STRING software. "Weighted number of links" (WNL) was calculated to identify "leader genes" (highest WNL). Genes were ranked by K-means method and Kruskal-Wallis test was used (Preview data was used to corroborate the bioinformatics data. CDK1 was identified as leader gene for AM. In KCOT group, results show PCNA and TP53 . Both tumors exhibit a power law behavior. Our topological analysis suggested leader genes possibly important in the pathogenesis of AM and KCOT, by clustering coefficient calculated for both odontogenic tumors (0.028 for AM, zero for KCOT). The results obtained in the scatter diagram suggest an important relationship of these genes with the molecular processes involved in AM and KCOT. Ontological analysis for both AM and KCOT demonstrated different mechanisms. Bioinformatics analyzes were confirmed through literature review. These results may suggest the involvement of promising genes for a better understanding of the pathogenesis of AM and KCOT.

  13. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software.

    Science.gov (United States)

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians.

  14. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software

    Science.gov (United States)

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians. PMID:25996054

  15. Bioinformatic Analysis of Strawberry GSTF12 Gene

    Science.gov (United States)

    Wang, Xiran; Jiang, Leiyu; Tang, Haoru

    2018-01-01

    GSTF12 has always been known as a key factor of proanthocyanins accumulate in plant testa. Through bioinformatics analysis of the nucleotide and encoded protein sequence of GSTF12, it is more advantageous to the study of genes related to anthocyanin biosynthesis accumulation pathway. Therefore, we chosen GSTF12 gene of 11 kinds species, downloaded their nucleotide and protein sequence from NCBI as the research object, found strawberry GSTF12 gene via bioinformation analyse, constructed phylogenetic tree. At the same time, we analysed the strawberry GSTF12 gene of physical and chemical properties and its protein structure and so on. The phylogenetic tree showed that Strawberry and petunia were closest relative. By the protein prediction, we found that the protein owed one proper signal peptide without obvious transmembrane regions.

  16. Planning bioinformatics workflows using an expert system

    Science.gov (United States)

    Chen, Xiaoling; Chang, Jeffrey T.

    2017-01-01

    Abstract Motivation: Bioinformatic analyses are becoming formidably more complex due to the increasing number of steps required to process the data, as well as the proliferation of methods that can be used in each step. To alleviate this difficulty, pipelines are commonly employed. However, pipelines are typically implemented to automate a specific analysis, and thus are difficult to use for exploratory analyses requiring systematic changes to the software or parameters used. Results: To automate the development of pipelines, we have investigated expert systems. We created the Bioinformatics ExperT SYstem (BETSY) that includes a knowledge base where the capabilities of bioinformatics software is explicitly and formally encoded. BETSY is a backwards-chaining rule-based expert system comprised of a data model that can capture the richness of biological data, and an inference engine that reasons on the knowledge base to produce workflows. Currently, the knowledge base is populated with rules to analyze microarray and next generation sequencing data. We evaluated BETSY and found that it could generate workflows that reproduce and go beyond previously published bioinformatics results. Finally, a meta-investigation of the workflows generated from the knowledge base produced a quantitative measure of the technical burden imposed by each step of bioinformatics analyses, revealing the large number of steps devoted to the pre-processing of data. In sum, an expert system approach can facilitate exploratory bioinformatic analysis by automating the development of workflows, a task that requires significant domain expertise. Availability and Implementation: https://github.com/jefftc/changlab Contact: jeffrey.t.chang@uth.tmc.edu PMID:28052928

  17. A bioinformatics potpourri.

    Science.gov (United States)

    Schönbach, Christian; Li, Jinyan; Ma, Lan; Horton, Paul; Sjaugi, Muhammad Farhan; Ranganathan, Shoba

    2018-01-19

    The 16th International Conference on Bioinformatics (InCoB) was held at Tsinghua University, Shenzhen from September 20 to 22, 2017. The annual conference of the Asia-Pacific Bioinformatics Network featured six keynotes, two invited talks, a panel discussion on big data driven bioinformatics and precision medicine, and 66 oral presentations of accepted research articles or posters. Fifty-seven articles comprising a topic assortment of algorithms, biomolecular networks, cancer and disease informatics, drug-target interactions and drug efficacy, gene regulation and expression, imaging, immunoinformatics, metagenomics, next generation sequencing for genomics and transcriptomics, ontologies, post-translational modification, and structural bioinformatics are the subject of this editorial for the InCoB2017 supplement issues in BMC Genomics, BMC Bioinformatics, BMC Systems Biology and BMC Medical Genomics. New Delhi will be the location of InCoB2018, scheduled for September 26-28, 2018.

  18. Integrative cluster analysis in bioinformatics

    CERN Document Server

    Abu-Jamous, Basel; Nandi, Asoke K

    2015-01-01

    Clustering techniques are increasingly being put to use in the analysis of high-throughput biological datasets. Novel computational techniques to analyse high throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and future drug discovery. This book details the complete pathway of cluster analysis, from the basics of molecular biology to the generation of biological knowledge. The book also presents the latest clustering methods and clustering validation, thereby offering the reader a comprehensive review o

  19. Bioinformatics of cardiovascular miRNA biology.

    Science.gov (United States)

    Kunz, Meik; Xiao, Ke; Liang, Chunguang; Viereck, Janika; Pachel, Christina; Frantz, Stefan; Thum, Thomas; Dandekar, Thomas

    2015-12-01

    MicroRNAs (miRNAs) are small ~22 nucleotide non-coding RNAs and are highly conserved among species. Moreover, miRNAs regulate gene expression of a large number of genes associated with important biological functions and signaling pathways. Recently, several miRNAs have been found to be associated with cardiovascular diseases. Thus, investigating the complex regulatory effect of miRNAs may lead to a better understanding of their functional role in the heart. To achieve this, bioinformatics approaches have to be coupled with validation and screening experiments to understand the complex interactions of miRNAs with the genome. This will boost the subsequent development of diagnostic markers and our understanding of the physiological and therapeutic role of miRNAs in cardiac remodeling. In this review, we focus on and explain different bioinformatics strategies and algorithms for the identification and analysis of miRNAs and their regulatory elements to better understand cardiac miRNA biology. Starting with the biogenesis of miRNAs, we present approaches such as LocARNA and miRBase for combining sequence and structure analysis including phylogenetic comparisons as well as detailed analysis of RNA folding patterns, functional target prediction, signaling pathway as well as functional analysis. We also show how far bioinformatics helps to tackle the unprecedented level of complexity and systemic effects by miRNA, underlining the strong therapeutic potential of miRNA and miRNA target structures in cardiovascular disease. In addition, we discuss drawbacks and limitations of bioinformatics algorithms and the necessity of experimental approaches for miRNA target identification. This article is part of a Special Issue entitled 'Non-coding RNAs'. Copyright © 2014 Elsevier Ltd. All rights reserved.

  20. Systems Bioinformatics: increasing precision of computational diagnostics and therapeutics through network-based approaches.

    Science.gov (United States)

    Oulas, Anastasis; Minadakis, George; Zachariou, Margarita; Sokratous, Kleitos; Bourdakou, Marilena M; Spyrou, George M

    2017-11-27

    Systems Bioinformatics is a relatively new approach, which lies in the intersection of systems biology and classical bioinformatics. It focuses on integrating information across different levels using a bottom-up approach as in systems biology with a data-driven top-down approach as in bioinformatics. The advent of omics technologies has provided the stepping-stone for the emergence of Systems Bioinformatics. These technologies provide a spectrum of information ranging from genomics, transcriptomics and proteomics to epigenomics, pharmacogenomics, metagenomics and metabolomics. Systems Bioinformatics is the framework in which systems approaches are applied to such data, setting the level of resolution as well as the boundary of the system of interest and studying the emerging properties of the system as a whole rather than the sum of the properties derived from the system's individual components. A key approach in Systems Bioinformatics is the construction of multiple networks representing each level of the omics spectrum and their integration in a layered network that exchanges information within and between layers. Here, we provide evidence on how Systems Bioinformatics enhances computational therapeutics and diagnostics, hence paving the way to precision medicine. The aim of this review is to familiarize the reader with the emerging field of Systems Bioinformatics and to provide a comprehensive overview of its current state-of-the-art methods and technologies. Moreover, we provide examples of success stories and case studies that utilize such methods and tools to significantly advance research in the fields of systems biology and systems medicine. © The Author 2017. Published by Oxford University Press.

  1. Bioinformatics Methods for Interpreting Toxicogenomics Data: The Role of Text-Mining

    NARCIS (Netherlands)

    Hettne, K.M.; Kleinjans, J.; Stierum, R.H.; Boorsma, A.; Kors, J.A.

    2014-01-01

    This chapter concerns the application of bioinformatics methods to the analysis of toxicogenomics data. The chapter starts with an introduction covering how bioinformatics has been applied in toxicogenomics data analysis, and continues with a description of the foundations of a specific

  2. ISEV position paper: extracellular vesicle RNA analysis and bioinformatics

    Directory of Open Access Journals (Sweden)

    Andrew F. Hill

    2013-12-01

    Full Text Available Extracellular vesicles (EVs are the collective term for the various vesicles that are released by cells into the extracellular space. Such vesicles include exosomes and microvesicles, which vary by their size and/or protein and genetic cargo. With the discovery that EVs contain genetic material in the form of RNA (evRNA has come the increased interest in these vesicles for their potential use as sources of disease biomarkers and potential therapeutic agents. Rapid developments in the availability of deep sequencing technologies have enabled the study of EV-related RNA in detail. In October 2012, the International Society for Extracellular Vesicles (ISEV held a workshop on “evRNA analysis and bioinformatics.” Here, we report the conclusions of one of the roundtable discussions where we discussed evRNA analysis technologies and provide some guidelines to researchers in the field to consider when performing such analysis.

  3. G-DOC Plus - an integrative bioinformatics platform for precision medicine.

    Science.gov (United States)

    Bhuvaneshwar, Krithika; Belouali, Anas; Singh, Varun; Johnson, Robert M; Song, Lei; Alaoui, Adil; Harris, Michael A; Clarke, Robert; Weiner, Louis M; Gusev, Yuriy; Madhavan, Subha

    2016-04-30

    G-DOC Plus is a data integration and bioinformatics platform that uses cloud computing and other advanced computational tools to handle a variety of biomedical BIG DATA including gene expression arrays, NGS and medical images so that they can be analyzed in the full context of other omics and clinical information. G-DOC Plus currently holds data from over 10,000 patients selected from private and public resources including Gene Expression Omnibus (GEO), The Cancer Genome Atlas (TCGA) and the recently added datasets from REpository for Molecular BRAin Neoplasia DaTa (REMBRANDT), caArray studies of lung and colon cancer, ImmPort and the 1000 genomes data sets. The system allows researchers to explore clinical-omic data one sample at a time, as a cohort of samples; or at the level of population, providing the user with a comprehensive view of the data. G-DOC Plus tools have been leveraged in cancer and non-cancer studies for hypothesis generation and validation; biomarker discovery and multi-omics analysis, to explore somatic mutations and cancer MRI images; as well as for training and graduate education in bioinformatics, data and computational sciences. Several of these use cases are described in this paper to demonstrate its multifaceted usability. G-DOC Plus can be used to support a variety of user groups in multiple domains to enable hypothesis generation for precision medicine research. The long-term vision of G-DOC Plus is to extend this translational bioinformatics platform to stay current with emerging omics technologies and analysis methods to continue supporting novel hypothesis generation, analysis and validation for integrative biomedical research. By integrating several aspects of the disease and exposing various data elements, such as outpatient lab workup, pathology, radiology, current treatments, molecular signatures and expected outcomes over a web interface, G-DOC Plus will continue to strengthen precision medicine research. G-DOC Plus is available

  4. Video Bioinformatics Analysis of Human Embryonic Stem Cell Colony Growth

    Science.gov (United States)

    Lin, Sabrina; Fonteno, Shawn; Satish, Shruthi; Bhanu, Bir; Talbot, Prue

    2010-01-01

    Because video data are complex and are comprised of many images, mining information from video material is difficult to do without the aid of computer software. Video bioinformatics is a powerful quantitative approach for extracting spatio-temporal data from video images using computer software to perform dating mining and analysis. In this article, we introduce a video bioinformatics method for quantifying the growth of human embryonic stem cells (hESC) by analyzing time-lapse videos collected in a Nikon BioStation CT incubator equipped with a camera for video imaging. In our experiments, hESC colonies that were attached to Matrigel were filmed for 48 hours in the BioStation CT. To determine the rate of growth of these colonies, recipes were developed using CL-Quant software which enables users to extract various types of data from video images. To accurately evaluate colony growth, three recipes were created. The first segmented the image into the colony and background, the second enhanced the image to define colonies throughout the video sequence accurately, and the third measured the number of pixels in the colony over time. The three recipes were run in sequence on video data collected in a BioStation CT to analyze the rate of growth of individual hESC colonies over 48 hours. To verify the truthfulness of the CL-Quant recipes, the same data were analyzed manually using Adobe Photoshop software. When the data obtained using the CL-Quant recipes and Photoshop were compared, results were virtually identical, indicating the CL-Quant recipes were truthful. The method described here could be applied to any video data to measure growth rates of hESC or other cells that grow in colonies. In addition, other video bioinformatics recipes can be developed in the future for other cell processes such as migration, apoptosis, and cell adhesion. PMID:20495527

  5. Data mining for bioinformatics applications

    CERN Document Server

    Zengyou, He

    2015-01-01

    Data Mining for Bioinformatics Applications provides valuable information on the data mining methods have been widely used for solving real bioinformatics problems, including problem definition, data collection, data preprocessing, modeling, and validation. The text uses an example-based method to illustrate how to apply data mining techniques to solve real bioinformatics problems, containing 45 bioinformatics problems that have been investigated in recent research. For each example, the entire data mining process is described, ranging from data preprocessing to modeling and result validation. Provides valuable information on the data mining methods have been widely used for solving real bioinformatics problems Uses an example-based method to illustrate how to apply data mining techniques to solve real bioinformatics problems Contains 45 bioinformatics problems that have been investigated in recent research.

  6. PipeCraft: Flexible open-source toolkit for bioinformatics analysis of custom high-throughput amplicon sequencing data.

    Science.gov (United States)

    Anslan, Sten; Bahram, Mohammad; Hiiesalu, Indrek; Tedersoo, Leho

    2017-11-01

    High-throughput sequencing methods have become a routine analysis tool in environmental sciences as well as in public and private sector. These methods provide vast amount of data, which need to be analysed in several steps. Although the bioinformatics may be applied using several public tools, many analytical pipelines allow too few options for the optimal analysis for more complicated or customized designs. Here, we introduce PipeCraft, a flexible and handy bioinformatics pipeline with a user-friendly graphical interface that links several public tools for analysing amplicon sequencing data. Users are able to customize the pipeline by selecting the most suitable tools and options to process raw sequences from Illumina, Pacific Biosciences, Ion Torrent and Roche 454 sequencing platforms. We described the design and options of PipeCraft and evaluated its performance by analysing the data sets from three different sequencing platforms. We demonstrated that PipeCraft is able to process large data sets within 24 hr. The graphical user interface and the automated links between various bioinformatics tools enable easy customization of the workflow. All analytical steps and options are recorded in log files and are easily traceable. © 2017 John Wiley & Sons Ltd.

  7. Applying Instructional Design Theories to Bioinformatics Education in Microarray Analysis and Primer Design Workshops

    Science.gov (United States)

    Shachak, Aviv; Ophir, Ron; Rubin, Eitan

    2005-01-01

    The need to support bioinformatics training has been widely recognized by scientists, industry, and government institutions. However, the discussion of instructional methods for teaching bioinformatics is only beginning. Here we report on a systematic attempt to design two bioinformatics workshops for graduate biology students on the basis of…

  8. When process mining meets bioinformatics

    NARCIS (Netherlands)

    Jagadeesh Chandra Bose, R.P.; Aalst, van der W.M.P.; Nurcan, S.

    2011-01-01

    Process mining techniques can be used to extract non-trivial process related knowledge and thus generate interesting insights from event logs. Similarly, bioinformatics aims at increasing the understanding of biological processes through the analysis of information associated with biological

  9. Agonist Binding to Chemosensory Receptors: A Systematic Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Fabrizio Fierro

    2017-09-01

    Full Text Available Human G-protein coupled receptors (hGPCRs constitute a large and highly pharmaceutically relevant membrane receptor superfamily. About half of the hGPCRs' family members are chemosensory receptors, involved in bitter taste and olfaction, along with a variety of other physiological processes. Hence these receptors constitute promising targets for pharmaceutical intervention. Molecular modeling has been so far the most important tool to get insights on agonist binding and receptor activation. Here we investigate both aspects by bioinformatics-based predictions across all bitter taste and odorant receptors for which site-directed mutagenesis data are available. First, we observe that state-of-the-art homology modeling combined with previously used docking procedures turned out to reproduce only a limited fraction of ligand/receptor interactions inferred by experiments. This is most probably caused by the low sequence identity with available structural templates, which limits the accuracy of the protein model and in particular of the side-chains' orientations. Methods which transcend the limited sampling of the conformational space of docking may improve the predictions. As an example corroborating this, we review here multi-scale simulations from our lab and show that, for the three complexes studied so far, they significantly enhance the predictive power of the computational approach. Second, our bioinformatics analysis provides support to previous claims that several residues, including those at positions 1.50, 2.50, and 7.52, are involved in receptor activation.

  10. Bioinformatics and its application in animal health: a review | Soetan ...

    African Journals Online (AJOL)

    Bioinformatics is an interdisciplinary subject, which uses computer application, statistics, mathematics and engineering for the analysis and management of biological information. It has become an important tool for basic and applied research in veterinary sciences. Bioinformatics has brought about advancements into ...

  11. An Overview of Bioinformatics Tools and Resources in Allergy.

    Science.gov (United States)

    Fu, Zhiyan; Lin, Jing

    2017-01-01

    The rapidly increasing number of characterized allergens has created huge demands for advanced information storage, retrieval, and analysis. Bioinformatics and machine learning approaches provide useful tools for the study of allergens and epitopes prediction, which greatly complement traditional laboratory techniques. The specific applications mainly include identification of B- and T-cell epitopes, and assessment of allergenicity and cross-reactivity. In order to facilitate the work of clinical and basic researchers who are not familiar with bioinformatics, we review in this chapter the most important databases, bioinformatic tools, and methods with relevance to the study of allergens.

  12. Bioinformatics and the Undergraduate Curriculum

    Science.gov (United States)

    Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…

  13. Importance of databases of nucleic acids for bioinformatic analysis focused to genomics

    Science.gov (United States)

    Jimenez-Gutierrez, L. R.; Barrios-Hernández, C. J.; Pedraza-Ferreira, G. R.; Vera-Cala, L.; Martinez-Perez, F.

    2016-08-01

    Recently, bioinformatics has become a new field of science, indispensable in the analysis of millions of nucleic acids sequences, which are currently deposited in international databases (public or private); these databases contain information of genes, RNA, ORF, proteins, intergenic regions, including entire genomes from some species. The analysis of this information requires computer programs; which were renewed in the use of new mathematical methods, and the introduction of the use of artificial intelligence. In addition to the constant creation of supercomputing units trained to withstand the heavy workload of sequence analysis. However, it is still necessary the innovation on platforms that allow genomic analyses, faster and more effectively, with a technological understanding of all biological processes.

  14. DNA mimic proteins: functions, structures, and bioinformatic analysis.

    Science.gov (United States)

    Wang, Hao-Ching; Ho, Chun-Han; Hsu, Kai-Cheng; Yang, Jinn-Moon; Wang, Andrew H-J

    2014-05-13

    DNA mimic proteins have DNA-like negative surface charge distributions, and they function by occupying the DNA binding sites of DNA binding proteins to prevent these sites from being accessed by DNA. DNA mimic proteins control the activities of a variety of DNA binding proteins and are involved in a wide range of cellular mechanisms such as chromatin assembly, DNA repair, transcription regulation, and gene recombination. However, the sequences and structures of DNA mimic proteins are diverse, making them difficult to predict by bioinformatic search. To date, only a few DNA mimic proteins have been reported. These DNA mimics were not found by searching for functional motifs in their sequences but were revealed only by structural analysis of their charge distribution. This review highlights the biological roles and structures of 16 reported DNA mimic proteins. We also discuss approaches that might be used to discover new DNA mimic proteins.

  15. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    Science.gov (United States)

    Magana, Alejandra J.; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the…

  16. Emerging strengths in Asia Pacific bioinformatics.

    Science.gov (United States)

    Ranganathan, Shoba; Hsu, Wen-Lian; Yang, Ueng-Cheng; Tan, Tin Wee

    2008-12-12

    The 2008 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998, was organized as the 7th International Conference on Bioinformatics (InCoB), jointly with the Bioinformatics and Systems Biology in Taiwan (BIT 2008) Conference, Oct. 20-23, 2008 at Taipei, Taiwan. Besides bringing together scientists from the field of bioinformatics in this region, InCoB is actively involving researchers from the area of systems biology, to facilitate greater synergy between these two groups. Marking the 10th Anniversary of APBioNet, this InCoB 2008 meeting followed on from a series of successful annual events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea), New Delhi (India) and Hong Kong. Additionally, tutorials and the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) immediately prior to the 20th Federation of Asian and Oceanian Biochemists and Molecular Biologists (FAOBMB) Taipei Conference provided ample opportunity for inducting mainstream biochemists and molecular biologists from the region into a greater level of awareness of the importance of bioinformatics in their craft. In this editorial, we provide a brief overview of the peer-reviewed manuscripts accepted for publication herein, grouped into thematic areas. As the regional research expertise in bioinformatics matures, the papers fall into thematic areas, illustrating the specific contributions made by APBioNet to global bioinformatics efforts.

  17. Application of Bioinformatics in Chronobiology Research

    Directory of Open Access Journals (Sweden)

    Robson da Silva Lopes

    2013-01-01

    Full Text Available Bioinformatics and other well-established sciences, such as molecular biology, genetics, and biochemistry, provide a scientific approach for the analysis of data generated through “omics” projects that may be used in studies of chronobiology. The results of studies that apply these techniques demonstrate how they significantly aided the understanding of chronobiology. However, bioinformatics tools alone cannot eliminate the need for an understanding of the field of research or the data to be considered, nor can such tools replace analysts and researchers. It is often necessary to conduct an evaluation of the results of a data mining effort to determine the degree of reliability. To this end, familiarity with the field of investigation is necessary. It is evident that the knowledge that has been accumulated through chronobiology and the use of tools derived from bioinformatics has contributed to the recognition and understanding of the patterns and biological rhythms found in living organisms. The current work aims to develop new and important applications in the near future through chronobiology research.

  18. MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease

    NARCIS (Netherlands)

    L. Shen (Lishuang); M.A. Diroma (Maria Angela); M. Gonzalez (Michael); D. Navarro-Gomez (Daniel); J. Leipzig (Jeremy); M.T. Lott (Marie T.); M. van Oven (Mannis); D.C. Wallace; C.C. Muraresku (Colleen Clarke); Z. Zolkipli-Cunningham (Zarazuela); P.F. Chinnery (Patrick); M. Attimonelli (Marcella); S. Zuchner (Stephan); M.J. Falk (Marni J.); X. Gai (Xiaowu)

    2016-01-01

    textabstractMSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes,

  19. Biggest challenges in bioinformatics.

    Science.gov (United States)

    Fuller, Jonathan C; Khoueiry, Pierre; Dinkel, Holger; Forslund, Kristoffer; Stamatakis, Alexandros; Barry, Joseph; Budd, Aidan; Soldatos, Theodoros G; Linssen, Katja; Rajput, Abdul Mateen

    2013-04-01

    The third Heidelberg Unseminars in Bioinformatics (HUB) was held on 18th October 2012, at Heidelberg University, Germany. HUB brought together around 40 bioinformaticians from academia and industry to discuss the 'Biggest Challenges in Bioinformatics' in a 'World Café' style event.

  20. Biggest challenges in bioinformatics

    OpenAIRE

    Fuller, Jonathan C; Khoueiry, Pierre; Dinkel, Holger; Forslund, Kristoffer; Stamatakis, Alexandros; Barry, Joseph; Budd, Aidan; Soldatos, Theodoros G; Linssen, Katja; Rajput, Abdul Mateen

    2013-01-01

    The third Heidelberg Unseminars in Bioinformatics (HUB) was held in October at Heidelberg University in Germany. HUB brought together around 40 bioinformaticians from academia and industry to discuss the ‘Biggest Challenges in Bioinformatics' in a ‘World Café' style event.

  1. Establishing bioinformatics research in the Asia Pacific

    Directory of Open Access Journals (Sweden)

    Tammi Martti

    2006-12-01

    Full Text Available Abstract In 1998, the Asia Pacific Bioinformatics Network (APBioNet, Asia's oldest bioinformatics organisation was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet was able to gain sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-Pacific Bioinformatics Network, on Dec. 18–20, 2006 in New Delhi, India, following a series of successful events in Bangkok (Thailand, Penang (Malaysia, Auckland (New Zealand and Busan (South Korea. This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. It exemplifies a typical snapshot of the growing research excellence in bioinformatics of the region as we embark on a trajectory of establishing a solid bioinformatics research culture in the Asia Pacific that is able to contribute fully to the global bioinformatics community.

  2. Preface to Introduction to Structural Bioinformatics

    NARCIS (Netherlands)

    Feenstra, K. Anton; Abeln, Sanne

    2018-01-01

    While many good textbooks are available on Protein Structure, Molecular Simulations, Thermodynamics and Bioinformatics methods in general, there is no good introductory level book for the field of Structural Bioinformatics. This book aims to give an introduction into Structural Bioinformatics, which

  3. Expression and Bioinformatic Analysis of Ornithine Aminotransferase 
in Non-small Cell Lung Cancer

    Directory of Open Access Journals (Sweden)

    Danfei ZHOU

    2012-09-01

    Full Text Available Background and objective It has been proven that ornithine aminotransferase (OAT might play an important role in the oncogenesis and progression of numerous malignant tumors. The aim of this study is to detect the mRNA and protein expression of OAT in non-small cell lung cancer (NSCLC, as well as to analyze the bioinformatic features and binary interactions. Methods OAT mRNA expression was detected in A549 and 16HBE cell lines by reverse transcription-polymerase chain reaction. OAT protein expression was determined in 55 cases of NSCLC and 17 cases of adjacent non-tumor lung tissues by immunohistochemical staining. The bioinformatic features and binary interactions of OAT were analyzed. Gene ontology annotation and signal pathway analysis were performed. Results OAT mRNA expression in A549 cells was 2.85-fold lower than that in 16HBE cells. OAT protein expression was significantly higher in NSCLC tissues than that in adjacent non-tumor lung tissues. A significant difference of OAT protein expression was existed between squamous cell lung cancer and adenocarcinoma (P<0.05, but was not correlated with the gender, age, lymph node metastasis, tumor size, and TNM stages. Bioinformatic analysis suggested that OAT was a highly homologous and stable protein located in the mitochondria. An aminotran-3 domain and several sites of phosphorylation, which may function in signal transduction, gene transcription, and molecular transit, were found. In the 54 selected binary interactions of OAT, TNF and TRAF6 play roles in the NF-κB pathway. Conclusion OAT may play an important role in the oncogenesis and progression of NSCLC. Thus, OAT may be a novel biomarker for the diagnosis of NSCLC or a new target for its treatment.

  4. 9th International Conference on Practical Applications of Computational Biology and Bioinformatics

    CERN Document Server

    Rocha, Miguel; Fdez-Riverola, Florentino; Paz, Juan

    2015-01-01

    This proceedings presents recent practical applications of Computational Biology and  Bioinformatics. It contains the proceedings of the 9th International Conference on Practical Applications of Computational Biology & Bioinformatics held at University of Salamanca, Spain, at June 3rd-5th, 2015. The International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB) is an annual international meeting dedicated to emerging and challenging applied research in Bioinformatics and Computational Biology. Biological and biomedical research are increasingly driven by experimental techniques that challenge our ability to analyse, process and extract meaningful knowledge from the underlying data. The impressive capabilities of next generation sequencing technologies, together with novel and ever evolving distinct types of omics data technologies, have put an increasingly complex set of challenges for the growing fields of Bioinformatics and Computational Biology. The analysis o...

  5. A Bioinformatics Facility for NASA

    Science.gov (United States)

    Schweighofer, Karl; Pohorille, Andrew

    2006-01-01

    Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill strategic NASA s bioinformatics needs in astrobiology and space exploration. . As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.

  6. Computational biology and bioinformatics in Nigeria.

    Science.gov (United States)

    Fatumo, Segun A; Adoga, Moses P; Ojo, Opeolu O; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi

    2014-04-01

    Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  7. Computational biology and bioinformatics in Nigeria.

    Directory of Open Access Journals (Sweden)

    Segun A Fatumo

    2014-04-01

    Full Text Available Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. As a developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries.

  8. Establishing bioinformatics research in the Asia Pacific

    OpenAIRE

    Ranganathan, Shoba; Tammi, Martti; Gribskov, Michael; Tan, Tin Wee

    2006-01-01

    Abstract In 1998, the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation was set up to champion the advancement of bioinformatics in the Asia Pacific. By 2002, APBioNet was able to gain sufficient critical mass to initiate the first International Conference on Bioinformatics (InCoB) bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2006 Conference was organized as the 5th annual conference of the Asia-...

  9. PATRIC, the bacterial bioinformatics database and analysis resource.

    Science.gov (United States)

    Wattam, Alice R; Abraham, David; Dalay, Oral; Disz, Terry L; Driscoll, Timothy; Gabbard, Joseph L; Gillespie, Joseph J; Gough, Roger; Hix, Deborah; Kenyon, Ronald; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K; Olson, Robert; Overbeek, Ross; Pusch, Gordon D; Shukla, Maulik; Schulman, Julie; Stevens, Rick L; Sullivan, Daniel E; Vonstein, Veronika; Warren, Andrew; Will, Rebecca; Wilson, Meredith J C; Yoo, Hyun Seung; Zhang, Chengdong; Zhang, Yan; Sobral, Bruno W

    2014-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein-protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Datatypes are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10,000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue.

  10. Bioinformatics Database Tools in Analysis of Genetics of Neurodevelopmental Disorders

    Directory of Open Access Journals (Sweden)

    Dibyashree Mallik

    2017-10-01

    Full Text Available Bioinformatics tools are recently used in various sectors of biology. Many questions regarding Neurodevelopmental disorder which arises as a major health issue recently can be solved by using various bioinformatics databases. Schizophrenia is such a mental disorder which is now arises as a major threat in young age people because it is mostly seen in case of people during their late adolescence or early adulthood period. Databases like DISGENET, GWAS, PHARMGKB, and DRUGBANK have huge repository of genes associated with schizophrenia. We found a lot of genes are being associated with schizophrenia, but approximately 200 genes are found to be present in any of these databases. After further screening out process 20 genes are found to be highly associated with each other and are also a common genes in many other diseases also. It is also found that they all are serves as a common targeting gene in many antipsychotic drugs. After analysis of various biological properties, molecular function it is found that these 20 genes are mostly involved in biological regulation process and are having receptor activity. They are belonging mainly to receptor protein class. Among these 20 genes CYP2C9, CYP3A4, DRD2, HTR1A, HTR2A are shown to be a main targeting genes of most of the antipsychotic drugs and are associated with  more than 40% diseases. The basic findings of the present study enumerated that a suitable combined drug can be design by targeting these genes which can be used for the better treatment of schizophrenia.

  11. Bioinformatics and moonlighting proteins

    Directory of Open Access Journals (Sweden)

    Sergio eHernández

    2015-06-01

    Full Text Available Multitasking or moonlighting is the capability of some proteins to execute two or more biochemical functions. Usually, moonlighting proteins are experimentally revealed by serendipity. For this reason, it would be helpful that Bioinformatics could predict this multifunctionality, especially because of the large amounts of sequences from genome projects. In the present work, we analyse and describe several approaches that use sequences, structures, interactomics and current bioinformatics algorithms and programs to try to overcome this problem. Among these approaches are: a remote homology searches using Psi-Blast, b detection of functional motifs and domains, c analysis of data from protein-protein interaction databases (PPIs, d match the query protein sequence to 3D databases (i.e., algorithms as PISITE, e mutation correlation analysis between amino acids by algorithms as MISTIC. Programs designed to identify functional motif/domains detect mainly the canonical function but usually fail in the detection of the moonlighting one, Pfam and ProDom being the best methods. Remote homology search by Psi-Blast combined with data from interactomics databases (PPIs have the best performance. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can only be used in very specific situations –it requires the existence of multialigned family protein sequences - but can suggest how the evolutionary process of second function acquisition took place. The multitasking protein database MultitaskProtDB (http://wallace.uab.es/multitask/, previously published by our group, has been used as a benchmark for the all of the analyses.

  12. Analyzing the field of bioinformatics with the multi-faceted topic modeling technique.

    Science.gov (United States)

    Heo, Go Eun; Kang, Keun Young; Song, Min; Lee, Jeong-Hoon

    2017-05-31

    Bioinformatics is an interdisciplinary field at the intersection of molecular biology and computing technology. To characterize the field as convergent domain, researchers have used bibliometrics, augmented with text-mining techniques for content analysis. In previous studies, Latent Dirichlet Allocation (LDA) was the most representative topic modeling technique for identifying topic structure of subject areas. However, as opposed to revealing the topic structure in relation to metadata such as authors, publication date, and journals, LDA only displays the simple topic structure. In this paper, we adopt the Tang et al.'s Author-Conference-Topic (ACT) model to study the field of bioinformatics from the perspective of keyphrases, authors, and journals. The ACT model is capable of incorporating the paper, author, and conference into the topic distribution simultaneously. To obtain more meaningful results, we use journals and keyphrases instead of conferences and bag-of-words.. For analysis, we use PubMed to collected forty-six bioinformatics journals from the MEDLINE database. We conducted time series topic analysis over four periods from 1996 to 2015 to further examine the interdisciplinary nature of bioinformatics. We analyze the ACT Model results in each period. Additionally, for further integrated analysis, we conduct a time series analysis among the top-ranked keyphrases, journals, and authors according to their frequency. We also examine the patterns in the top journals by simultaneously identifying the topical probability in each period, as well as the top authors and keyphrases. The results indicate that in recent years diversified topics have become more prevalent and convergent topics have become more clearly represented. The results of our analysis implies that overtime the field of bioinformatics becomes more interdisciplinary where there is a steady increase in peripheral fields such as conceptual, mathematical, and system biology. These results are

  13. A bioinformatics prediction approach towards analyzing the glycosylation, co-expression and interaction patterns of epithelial membrane antigen (EMA/MUC1)

    International Nuclear Information System (INIS)

    Kalra, Rajkumar S.; Wadhwa, Renu

    2015-01-01

    Epithelial membrane antigen (EMA or MUC1) is a heavily glycosylated, type I transmembrane glycoprotein commonly expressed by epithelial cells of duct organs. It has been shown to be aberrantly glycosylated in several diseases including cancer. Protein sequence based annotation and analysis of glycosylation profile of glycoproteins by robust computational and comprehensive algorithms provides possible insights to the mechanism(s) of anomalous glycosylation. In present report, by using a number of bioinformatics applications we studied EMA/MUC1 and explored its trans-membrane structural domain sequence that is widely subjected to glycosylation. Exploration of different extracellular motifs led to prediction of N and O-linked glycosylation target sites. Based on the putative O-linked target sites, glycosylated moieties and pathways were envisaged. Furthermore, Protein network analysis demonstrated physical interaction of EMA with a number of proteins and confirmed its functional involvement in cell growth and proliferation pathways. Gene Ontology analysis suggested an involvement of EMA in a number of functions including signal transduction, protein binding, processing and transport along with glycosylation. Thus, present study explored potential of bioinformatics prediction approach in analyzing glycosylation, co-expression and interaction patterns of EMA/MUC1 glycoprotein

  14. A bioinformatics prediction approach towards analyzing the glycosylation, co-expression and interaction patterns of epithelial membrane antigen (EMA/MUC1)

    Energy Technology Data Exchange (ETDEWEB)

    Kalra, Rajkumar S., E-mail: renu-wadhwa@aist.go.jp; Wadhwa, Renu, E-mail: renu-wadhwa@aist.go.jp [Cell Proliferation Research Group and DBT-AIST International Laboratory for Advanced Biomedicine, National Institute of Advanced Industrial Science and Technology (AIST Central 4), 1-1-1 Higashi, Tsukuba, Ibaraki 305-8562 (Japan)

    2015-02-27

    Epithelial membrane antigen (EMA or MUC1) is a heavily glycosylated, type I transmembrane glycoprotein commonly expressed by epithelial cells of duct organs. It has been shown to be aberrantly glycosylated in several diseases including cancer. Protein sequence based annotation and analysis of glycosylation profile of glycoproteins by robust computational and comprehensive algorithms provides possible insights to the mechanism(s) of anomalous glycosylation. In present report, by using a number of bioinformatics applications we studied EMA/MUC1 and explored its trans-membrane structural domain sequence that is widely subjected to glycosylation. Exploration of different extracellular motifs led to prediction of N and O-linked glycosylation target sites. Based on the putative O-linked target sites, glycosylated moieties and pathways were envisaged. Furthermore, Protein network analysis demonstrated physical interaction of EMA with a number of proteins and confirmed its functional involvement in cell growth and proliferation pathways. Gene Ontology analysis suggested an involvement of EMA in a number of functions including signal transduction, protein binding, processing and transport along with glycosylation. Thus, present study explored potential of bioinformatics prediction approach in analyzing glycosylation, co-expression and interaction patterns of EMA/MUC1 glycoprotein.

  15. Transcriptomic and bioinformatics analysis of the early time-course of the response to prostaglandin F2 alpha in the bovine corpus luteum

    Directory of Open Access Journals (Sweden)

    Heather Talbott

    2017-10-01

    Full Text Available RNA expression analysis was performed on the corpus luteum tissue at five time points after prostaglandin F2 alpha treatment of midcycle cows using an Affymetrix Bovine Gene v1 Array. The normalized linear microarray data was uploaded to the NCBI GEO repository (GSE94069. Subsequent statistical analysis determined differentially expressed transcripts ± 1.5-fold change from saline control with P ≤ 0.05. Gene ontology of differentially expressed transcripts was annotated by DAVID and Panther. Physiological characteristics of the study animals are presented in a figure. Bioinformatic analysis by Ingenuity Pathway Analysis was curated, compiled, and presented in tables. A dataset comparison with similar microarray analyses was performed and bioinformatics analysis by Ingenuity Pathway Analysis, DAVID, Panther, and String of differentially expressed genes from each dataset as well as the differentially expressed genes common to all three datasets were curated, compiled, and presented in tables. Finally, a table comparing four bioinformatics tools’ predictions of functions associated with genes common to all three datasets is presented. These data have been further analyzed and interpreted in the companion article “Early transcriptome responses of the bovine mid-cycle corpus luteum to prostaglandin F2 alpha includes cytokine signaling” [1].

  16. Bioinformatics programs are 31-fold over-represented among the highest impact scientific papers of the past two decades.

    Science.gov (United States)

    Wren, Jonathan D

    2016-09-01

    To analyze the relative proportion of bioinformatics papers and their non-bioinformatics counterparts in the top 20 most cited papers annually for the past two decades. When defining bioinformatics papers as encompassing both those that provide software for data analysis or methods underlying data analysis software, we find that over the past two decades, more than a third (34%) of the most cited papers in science were bioinformatics papers, which is approximately a 31-fold enrichment relative to the total number of bioinformatics papers published. More than half of the most cited papers during this span were bioinformatics papers. Yet, the average 5-year JIF of top 20 bioinformatics papers was 7.7, whereas the average JIF for top 20 non-bioinformatics papers was 25.8, significantly higher (P papers, bioinformatics journals tended to have higher Gini coefficients, suggesting that development of novel bioinformatics resources may be somewhat 'hit or miss'. That is, relative to other fields, bioinformatics produces some programs that are extremely widely adopted and cited, yet there are fewer of intermediate success. jdwren@gmail.com Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  17. Bioinformatic analysis of Msx1 and Msx2 involved in craniofacial development.

    Science.gov (United States)

    Dai, Jiewen; Mou, Zhifang; Shen, Shunyao; Dong, Yuefu; Yang, Tong; Shen, Steve Guofang

    2014-01-01

    Msx1 and Msx2 were revealed to be candidate genes for some craniofacial deformities, such as cleft lip with/without cleft palate (CL/P) and craniosynostosis. Many other genes were demonstrated to have a cross-talk with MSX genes in causing these defects. However, there is no systematic evaluation for these MSX gene-related factors. In this study, we performed systematic bioinformatic analysis for MSX genes by combining using GeneDecks, DAVID, and STRING database, and the results showed that there were numerous genes related to MSX genes, such as Irf6, TP63, Dlx2, Dlx5, Pax3, Pax9, Bmp4, Tgf-beta2, and Tgf-beta3 that have been demonstrated to be involved in CL/P, and Fgfr2, Fgfr1, Fgfr3, and Twist1 that were involved in craniosynostosis. Many of these genes could be enriched into different gene groups involved in different signaling ways, different craniofacial deformities, and different biological process. These findings could make us analyze the function of MSX gens in a gene network. In addition, our findings showed that Sumo, a novel gene whose polymorphisms were demonstrated to be associated with nonsyndromic CL/P by genome-wide association study, has protein-protein interaction with MSX1, which may offer us an alternative method to perform bioinformatic analysis for genes found by genome-wide association study and can make us predict the disrupted protein function due to the mutation in a gene DNA sequence. These findings may guide us to perform further functional studies in the future.

  18. Bioinformatics and systems biology research update from the 15th International Conference on Bioinformatics (InCoB2016).

    Science.gov (United States)

    Schönbach, Christian; Verma, Chandra; Bond, Peter J; Ranganathan, Shoba

    2016-12-22

    The International Conference on Bioinformatics (InCoB) has been publishing peer-reviewed conference papers in BMC Bioinformatics since 2006. Of the 44 articles accepted for publication in supplement issues of BMC Bioinformatics, BMC Genomics, BMC Medical Genomics and BMC Systems Biology, 24 articles with a bioinformatics or systems biology focus are reviewed in this editorial. InCoB2017 is scheduled to be held in Shenzen, China, September 20-22, 2017.

  19. BioWarehouse: a bioinformatics database warehouse toolkit.

    Science.gov (United States)

    Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David W J; Tenenbaum, Jessica D; Karp, Peter D

    2006-03-23

    This article addresses the problem of interoperation of heterogeneous bioinformatics databases. We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. BioWarehouse embodies significant progress on the database integration problem for bioinformatics.

  20. Designing XML schemas for bioinformatics.

    Science.gov (United States)

    Bruhn, Russel Elton; Burton, Philip John

    2003-06-01

    Data interchange bioinformatics databases will, in the future, most likely take place using extensible markup language (XML). The document structure will be described by an XML Schema rather than a document type definition (DTD). To ensure flexibility, the XML Schema must incorporate aspects of Object-Oriented Modeling. This impinges on the choice of the data model, which, in turn, is based on the organization of bioinformatics data by biologists. Thus, there is a need for the general bioinformatics community to be aware of the design issues relating to XML Schema. This paper, which is aimed at a general bioinformatics audience, uses examples to describe the differences between a DTD and an XML Schema and indicates how Unified Modeling Language diagrams may be used to incorporate Object-Oriented Modeling in the design of schema.

  1. Web-based bioinformatics workflows for end-to-end RNA-seq data computation and analysis in agricultural animal species

    Science.gov (United States)

    Remarkable advances in next-generation sequencing (NGS) technologies, bioinformatics algorithms, and computational technologies have significantly accelerated genomic research. However, complicated NGS data analysis still remains as a major bottleneck. RNA-seq, as one of the major area in the NGS fi...

  2. Galaxy Workflows for Web-based Bioinformatics Analysis of Aptamer High-throughput Sequencing Data

    Directory of Open Access Journals (Sweden)

    William H Thiel

    2016-01-01

    Full Text Available Development of RNA and DNA aptamers for diagnostic and therapeutic applications is a rapidly growing field. Aptamers are identified through iterative rounds of selection in a process termed SELEX (Systematic Evolution of Ligands by EXponential enrichment. High-throughput sequencing (HTS revolutionized the modern SELEX process by identifying millions of aptamer sequences across multiple rounds of aptamer selection. However, these vast aptamer HTS datasets necessitated bioinformatics techniques. Herein, we describe a semiautomated approach to analyze aptamer HTS datasets using the Galaxy Project, a web-based open source collection of bioinformatics tools that were originally developed to analyze genome, exome, and transcriptome HTS data. Using a series of Workflows created in the Galaxy webserver, we demonstrate efficient processing of aptamer HTS data and compilation of a database of unique aptamer sequences. Additional Workflows were created to characterize the abundance and persistence of aptamer sequences within a selection and to filter sequences based on these parameters. A key advantage of this approach is that the online nature of the Galaxy webserver and its graphical interface allow for the analysis of HTS data without the need to compile code or install multiple programs.

  3. In silico cloning and bioinformatic analysis of PEPCK gene in ...

    African Journals Online (AJOL)

    Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. According to the relative conservation of homologous gene, a bioinformatics strategy was applied to clone Fusarium ...

  4. Bioinformatics-Aided Venomics

    Directory of Open Access Journals (Sweden)

    Quentin Kaas

    2015-06-01

    Full Text Available Venomics is a modern approach that combines transcriptomics and proteomics to explore the toxin content of venoms. This review will give an overview of computational approaches that have been created to classify and consolidate venomics data, as well as algorithms that have helped discovery and analysis of toxin nucleic acid and protein sequences, toxin three-dimensional structures and toxin functions. Bioinformatics is used to tackle specific challenges associated with the identification and annotations of toxins. Recognizing toxin transcript sequences among second generation sequencing data cannot rely only on basic sequence similarity because toxins are highly divergent. Mass spectrometry sequencing of mature toxins is challenging because toxins can display a large number of post-translational modifications. Identifying the mature toxin region in toxin precursor sequences requires the prediction of the cleavage sites of proprotein convertases, most of which are unknown or not well characterized. Tracing the evolutionary relationships between toxins should consider specific mechanisms of rapid evolution as well as interactions between predatory animals and prey. Rapidly determining the activity of toxins is the main bottleneck in venomics discovery, but some recent bioinformatics and molecular modeling approaches give hope that accurate predictions of toxin specificity could be made in the near future.

  5. Bioinformatics analysis of breast cancer bone metastasis related gene-CXCR4

    Institute of Scientific and Technical Information of China (English)

    Heng-Wei Zhang; Xian-Fu Sun; Ya-Ning He; Jun-Tao Li; Xu-Hui Guo; Hui Liu

    2013-01-01

    Objective: To analyze breast cancer bone metastasis related gene-CXCR4. Methods: This research screened breast cancer bone metastasis related genes by high-flux gene chip. Results:It was found that the expressions of 396 genes were different including 165 up-regulations and 231 down-regulations. The expression of chemokine receptor CXCR4 was obviously up-regulated in the tissue with breast cancer bone metastasis. Compared with the tissue without bone metastasis, there was significant difference, which indicated that CXCR4 played a vital role in breast cancer bone metastasis. Conclusions: The bioinformatics analysis of CXCR4 can provide a certain basis for the occurrence and diagnosis of breast cancer bone metastasis, target gene therapy and evaluation of prognosis.

  6. Statistical modelling in biostatistics and bioinformatics selected papers

    CERN Document Server

    Peng, Defen

    2014-01-01

    This book presents selected papers on statistical model development related mainly to the fields of Biostatistics and Bioinformatics. The coverage of the material falls squarely into the following categories: (a) Survival analysis and multivariate survival analysis, (b) Time series and longitudinal data analysis, (c) Statistical model development and (d) Applied statistical modelling. Innovations in statistical modelling are presented throughout each of the four areas, with some intriguing new ideas on hierarchical generalized non-linear models and on frailty models with structural dispersion, just to mention two examples. The contributors include distinguished international statisticians such as Philip Hougaard, John Hinde, Il Do Ha, Roger Payne and Alessandra Durio, among others, as well as promising newcomers. Some of the contributions have come from researchers working in the BIO-SI research programme on Biostatistics and Bioinformatics, centred on the Universities of Limerick and Galway in Ireland and fu...

  7. Interdisciplinary Introductory Course in Bioinformatics

    Science.gov (United States)

    Kortsarts, Yana; Morris, Robert W.; Utell, Janine M.

    2010-01-01

    Bioinformatics is a relatively new interdisciplinary field that integrates computer science, mathematics, biology, and information technology to manage, analyze, and understand biological, biochemical and biophysical information. We present our experience in teaching an interdisciplinary course, Introduction to Bioinformatics, which was developed…

  8. PATRIC, the bacterial bioinformatics database and analysis resource

    Science.gov (United States)

    Wattam, Alice R.; Abraham, David; Dalay, Oral; Disz, Terry L.; Driscoll, Timothy; Gabbard, Joseph L.; Gillespie, Joseph J.; Gough, Roger; Hix, Deborah; Kenyon, Ronald; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K.; Olson, Robert; Overbeek, Ross; Pusch, Gordon D.; Shukla, Maulik; Schulman, Julie; Stevens, Rick L.; Sullivan, Daniel E.; Vonstein, Veronika; Warren, Andrew; Will, Rebecca; Wilson, Meredith J.C.; Yoo, Hyun Seung; Zhang, Chengdong; Zhang, Yan; Sobral, Bruno W.

    2014-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein–protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Datatypes are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10 000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue. PMID:24225323

  9. BioWarehouse: a bioinformatics database warehouse toolkit

    Directory of Open Access Journals (Sweden)

    Stringer-Calvert David WJ

    2006-03-01

    Full Text Available Abstract Background This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion BioWarehouse embodies significant progress on the

  10. Green Fluorescent Protein-Focused Bioinformatics Laboratory Experiment Suitable for Undergraduates in Biochemistry Courses

    Science.gov (United States)

    Rowe, Laura

    2017-01-01

    An introductory bioinformatics laboratory experiment focused on protein analysis has been developed that is suitable for undergraduate students in introductory biochemistry courses. The laboratory experiment is designed to be potentially used as a "stand-alone" activity in which students are introduced to basic bioinformatics tools and…

  11. Navigating the changing learning landscape: perspective from bioinformatics.ca.

    Science.gov (United States)

    Brazas, Michelle D; Ouellette, B F Francis

    2013-09-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable in the learning continuum. Bioinformatics.ca, which hosts the Canadian Bioinformatics Workshops, has blended more traditional learning styles with current online and social learning styles. Here we share our growing experiences over the past 12 years and look toward what the future holds for bioinformatics training programs.

  12. The Cytotoxicity Mechanism of 6-Shogaol-Treated HeLa Human Cervical Cancer Cells Revealed by Label-Free Shotgun Proteomics and Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Qun Liu

    2012-01-01

    Full Text Available Cervical cancer is one of the most common cancers among women in the world. 6-Shogaol is a natural compound isolated from the rhizome of ginger (Zingiber officinale. In this paper, we demonstrated that 6-shogaol induced apoptosis and G2/M phase arrest in human cervical cancer HeLa cells. Endoplasmic reticulum stress and mitochondrial pathway were involved in 6-shogaol-mediated apoptosis. Proteomic analysis based on label-free strategy by liquid chromatography chip quadrupole time-of-flight mass spectrometry was subsequently proposed to identify, in a non-target-biased manner, the molecular changes in cellular proteins in response to 6-shogaol treatment. A total of 287 proteins were differentially expressed in response to 24 h treatment with 15 μM 6-shogaol in HeLa cells. Significantly changed proteins were subjected to functional pathway analysis by multiple analyzing software. Ingenuity pathway analysis (IPA suggested that 14-3-3 signaling is a predominant canonical pathway involved in networks which may be significantly associated with the process of apoptosis and G2/M cell cycle arrest induced by 6-shogaol. In conclusion, this work developed an unbiased protein analysis strategy by shotgun proteomics and bioinformatics analysis. Data observed provide a comprehensive analysis of the 6-shogaol-treated HeLa cell proteome and reveal protein alterations that are associated with its anticancer mechanism.

  13. Identification of differentially expressed genes and signaling pathways in ovarian cancer by integrated bioinformatics analysis

    Directory of Open Access Journals (Sweden)

    Yang X

    2018-03-01

    Full Text Available Xiao Yang,1 Shaoming Zhu,2 Li Li,3 Li Zhang,1 Shu Xian,1 Yanqing Wang,1 Yanxiang Cheng1 1Department of Obstetrics and Gynecology, 2Department of Urology, Renmin Hospital of Wuhan University, 3Department of Pharmacology, Wuhan University Health Science Center, Wuhan, Hubei, People’s Republic of China Background: The mortality rate associated with ovarian cancer ranks the highest among gynecological malignancies. However, the cause and underlying molecular events of ovarian cancer are not clear. Here, we applied integrated bioinformatics to identify key pathogenic genes involved in ovarian cancer and reveal potential molecular mechanisms. Results: The expression profiles of GDS3592, GSE54388, and GSE66957 were downloaded from the Gene Expression Omnibus (GEO database, which contained 115 samples, including 85 cases of ovarian cancer samples and 30 cases of normal ovarian samples. The three microarray datasets were integrated to obtain differentially expressed genes (DEGs and were deeply analyzed by bioinformatics methods. The gene ontology (GO and Kyoto Encyclopedia of Genes and Genomes (KEGG pathway enrichments of DEGs were performed by DAVID and KOBAS online analyses, respectively. The protein–protein interaction (PPI networks of the DEGs were constructed from the STRING database. A total of 190 DEGs were identified in the three GEO datasets, of which 99 genes were upregulated and 91 genes were downregulated. GO analysis showed that the biological functions of DEGs focused primarily on regulating cell proliferation, adhesion, and differentiation and intracellular signal cascades. The main cellular components include cell membranes, exosomes, the cytoskeleton, and the extracellular matrix. The molecular functions include growth factor activity, protein kinase regulation, DNA binding, and oxygen transport activity. KEGG pathway analysis showed that these DEGs were mainly involved in the Wnt signaling pathway, amino acid metabolism, and the

  14. Rising Strengths Hong Kong SAR in Bioinformatics.

    Science.gov (United States)

    Chakraborty, Chiranjib; George Priya Doss, C; Zhu, Hailong; Agoramoorthy, Govindasamy

    2017-06-01

    Hong Kong's bioinformatics sector is attaining new heights in combination with its economic boom and the predominance of the working-age group in its population. Factors such as a knowledge-based and free-market economy have contributed towards a prominent position on the world map of bioinformatics. In this review, we have considered the educational measures, landmark research activities and the achievements of bioinformatics companies and the role of the Hong Kong government in the establishment of bioinformatics as strength. However, several hurdles remain. New government policies will assist computational biologists to overcome these hurdles and further raise the profile of the field. There is a high expectation that bioinformatics in Hong Kong will be a promising area for the next generation.

  15. EURASIP journal on bioinformatics & systems biology

    National Research Council Canada - National Science Library

    2006-01-01

    "The overall aim of "EURASIP Journal on Bioinformatics and Systems Biology" is to publish research results related to signal processing and bioinformatics theories and techniques relevant to a wide...

  16. Aligning experimental design with bioinformatics analysis to meet discovery research objectives.

    Science.gov (United States)

    Kane, Michael D

    2002-01-01

    The utility of genomic technology and bioinformatic analytical support to provide new and needed insight into the molecular basis of disease, development, and diversity continues to grow as more research model systems and populations are investigated. Yet deriving results that meet a specific set of research objectives requires aligning or coordinating the design of the experiment, the laboratory techniques, and the data analysis. The following paragraphs describe several important interdependent factors that need to be considered to generate high quality data from the microarray platform. These factors include aligning oligonucleotide probe design with the sample labeling strategy if oligonucleotide probes are employed, recognizing that compromises are inherent in different sample procurement methods, normalizing 2-color microarray raw data, and distinguishing the difference between gene clustering and sample clustering. These factors do not represent an exhaustive list of technical variables in microarray-based research, but this list highlights those variables that span both experimental execution and data analysis. Copyright 2001 Wiley-Liss, Inc.

  17. Identification of new serum markers of pathological states by bioinformatic tools for the analysis of serum proteomics expression profiles

    International Nuclear Information System (INIS)

    Malorni, A.; Facchiano, A.

    2009-01-01

    We have developed new bioinformatic tools and strategies, aimed to the identification and characterization of proteins as markers of pathological states, for the analysis of data derived from protein expression profiles obtained by mass spectrometry techniques, for the study of structural and functional properties of the proteins, and for the analysis of data from omics approaches

  18. Virtual Bioinformatics Distance Learning Suite

    Science.gov (United States)

    Tolvanen, Martti; Vihinen, Mauno

    2004-01-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material…

  19. Modern bioinformatics meets traditional Chinese medicine.

    Science.gov (United States)

    Gu, Peiqin; Chen, Huajun

    2014-11-01

    Traditional Chinese medicine (TCM) is gaining increasing attention with the emergence of integrative medicine and personalized medicine, characterized by pattern differentiation on individual variance and treatments based on natural herbal synergism. Investigating the effectiveness and safety of the potential mechanisms of TCM and the combination principles of drug therapies will bridge the cultural gap with Western medicine and improve the development of integrative medicine. Dealing with rapidly growing amounts of biomedical data and their heterogeneous nature are two important tasks among modern biomedical communities. Bioinformatics, as an emerging interdisciplinary field of computer science and biology, has become a useful tool for easing the data deluge pressure by automating the computation processes with informatics methods. Using these methods to retrieve, store and analyze the biomedical data can effectively reveal the associated knowledge hidden in the data, and thus promote the discovery of integrated information. Recently, these techniques of bioinformatics have been used for facilitating the interactional effects of both Western medicine and TCM. The analysis of TCM data using computational technologies provides biological evidence for the basic understanding of TCM mechanisms, safety and efficacy of TCM treatments. At the same time, the carrier and targets associated with TCM remedies can inspire the rethinking of modern drug development. This review summarizes the significant achievements of applying bioinformatics techniques to many aspects of the research in TCM, such as analysis of TCM-related '-omics' data and techniques for analyzing biological processes and pharmaceutical mechanisms of TCM, which have shown certain potential of bringing new thoughts to both sides. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  20. Systematic Identification and Bioinformatic Analysis of MicroRNAs in Response to Infections of Coxsackievirus A16 and Enterovirus 71.

    Science.gov (United States)

    Zhu, Zheng; Qi, Yuhua; Fan, Huan; Cui, Lunbiao; Shi, Zhiyang

    2016-01-01

    Hand, foot, and mouth disease (HFMD), mainly caused by coxsackievirus A16 (CVA16) and enterovirus 71 (EV71) infections, remains a serious public health issue with thousands of newly diagnostic cases each year since 2008 in China. The mechanisms underlying viral infection, however, are elusive to date. In the present study, we systematically investigated the host cellular microRNA (miRNA) expression patterns in response to CVA16 and EV71 infections. Through microarray examination, 27 miRNAs (15 upregulated and 12 downregulated) were found to be coassociated with the replication process of two viruses, while the expression levels of 15 and 5 miRNAs were significantly changed in CVA16- and EV71-infected cells, respectively. A great number of target genes of 27 common differentially expressed miRNAs were predicted by combined use of two computational target prediction algorithms, TargetScan and MiRanda. Comprehensive bioinformatic analysis of target genes in GO categories and KEGG pathways indicated the involvement of diverse biological functions and signaling pathways during viral infection. These results provide an overview of the roles of miRNAs in virus-host interaction, which will contribute to further understanding of HFMD pathological mechanisms.

  1. The 2016 Bioinformatics Open Source Conference (BOSC).

    Science.gov (United States)

    Harris, Nomi L; Cock, Peter J A; Chapman, Brad; Fields, Christopher J; Hokamp, Karsten; Lapp, Hilmar; Muñoz-Torres, Monica; Wiencko, Heather

    2016-01-01

    Message from the ISCB: The Bioinformatics Open Source Conference (BOSC) is a yearly meeting organized by the Open Bioinformatics Foundation (OBF), a non-profit group dedicated to promoting the practice and philosophy of Open Source software development and Open Science within the biological research community. BOSC has been run since 2000 as a two-day Special Interest Group (SIG) before the annual ISMB conference. The 17th annual BOSC ( http://www.open-bio.org/wiki/BOSC_2016) took place in Orlando, Florida in July 2016. As in previous years, the conference was preceded by a two-day collaborative coding event open to the bioinformatics community. The conference brought together nearly 100 bioinformatics researchers, developers and users of open source software to interact and share ideas about standards, bioinformatics software development, and open and reproducible science.

  2. Bioinformatics Analysis of MAPKKK Family Genes in Medicago truncatula

    Directory of Open Access Journals (Sweden)

    Wei Li

    2016-04-01

    Full Text Available Mitogen‐activated protein kinase kinase kinase (MAPKKK is a component of the MAPK cascade pathway that plays an important role in plant growth, development, and response to abiotic stress, the functions of which have been well characterized in several plant species, such as Arabidopsis, rice, and maize. In this study, we performed genome‐wide and systemic bioinformatics analysis of MAPKKK family genes in Medicago truncatula. In total, there were 73 MAPKKK family members identified by search of homologs, and they were classified into three subfamilies, MEKK, ZIK, and RAF. Based on the genomic duplication function, 72 MtMAPKKK genes were located throughout all chromosomes, but they cluster in different chromosomes. Using microarray data and high‐throughput sequencing‐data, we assessed their expression profiles in growth and development processes; these results provided evidence for exploring their important functions in developmental regulation, especially in the nodulation process. Furthermore, we investigated their expression in abiotic stresses by RNA‐seq, which confirmed their critical roles in signal transduction and regulation processes under stress. In summary, our genome‐wide, systemic characterization and expressional analysis of MtMAPKKK genes will provide insights that will be useful for characterizing the molecular functions of these genes in M. truncatula.

  3. An innovative approach for testing bioinformatics programs using metamorphic testing

    Directory of Open Access Journals (Sweden)

    Liu Huai

    2009-01-01

    Full Text Available Abstract Background Recent advances in experimental and computational technologies have fueled the development of many sophisticated bioinformatics programs. The correctness of such programs is crucial as incorrectly computed results may lead to wrong biological conclusion or misguide downstream experimentation. Common software testing procedures involve executing the target program with a set of test inputs and then verifying the correctness of the test outputs. However, due to the complexity of many bioinformatics programs, it is often difficult to verify the correctness of the test outputs. Therefore our ability to perform systematic software testing is greatly hindered. Results We propose to use a novel software testing technique, metamorphic testing (MT, to test a range of bioinformatics programs. Instead of requiring a mechanism to verify whether an individual test output is correct, the MT technique verifies whether a pair of test outputs conform to a set of domain specific properties, called metamorphic relations (MRs, thus greatly increases the number and variety of test cases that can be applied. To demonstrate how MT is used in practice, we applied MT to test two open-source bioinformatics programs, namely GNLab and SeqMap. In particular we show that MT is simple to implement, and is effective in detecting faults in a real-life program and some artificially fault-seeded programs. Further, we discuss how MT can be applied to test programs from various domains of bioinformatics. Conclusion This paper describes the application of a simple, effective and automated technique to systematically test a range of bioinformatics programs. We show how MT can be implemented in practice through two real-life case studies. Since many bioinformatics programs, particularly those for large scale simulation and data analysis, are hard to test systematically, their developers may benefit from using MT as part of the testing strategy. Therefore our work

  4. Bringing Web 2.0 to bioinformatics.

    Science.gov (United States)

    Zhang, Zhang; Cheung, Kei-Hoi; Townsend, Jeffrey P

    2009-01-01

    Enabling deft data integration from numerous, voluminous and heterogeneous data sources is a major bioinformatic challenge. Several approaches have been proposed to address this challenge, including data warehousing and federated databasing. Yet despite the rise of these approaches, integration of data from multiple sources remains problematic and toilsome. These two approaches follow a user-to-computer communication model for data exchange, and do not facilitate a broader concept of data sharing or collaboration among users. In this report, we discuss the potential of Web 2.0 technologies to transcend this model and enhance bioinformatics research. We propose a Web 2.0-based Scientific Social Community (SSC) model for the implementation of these technologies. By establishing a social, collective and collaborative platform for data creation, sharing and integration, we promote a web services-based pipeline featuring web services for computer-to-computer data exchange as users add value. This pipeline aims to simplify data integration and creation, to realize automatic analysis, and to facilitate reuse and sharing of data. SSC can foster collaboration and harness collective intelligence to create and discover new knowledge. In addition to its research potential, we also describe its potential role as an e-learning platform in education. We discuss lessons from information technology, predict the next generation of Web (Web 3.0), and describe its potential impact on the future of bioinformatics studies.

  5. Genomics and bioinformatics resources for translational science in Rosaceae.

    Science.gov (United States)

    Jung, Sook; Main, Dorrie

    2014-01-01

    Recent technological advances in biology promise unprecedented opportunities for rapid and sustainable advancement of crop quality. Following this trend, the Rosaceae research community continues to generate large amounts of genomic, genetic and breeding data. These include annotated whole genome sequences, transcriptome and expression data, proteomic and metabolomic data, genotypic and phenotypic data, and genetic and physical maps. Analysis, storage, integration and dissemination of these data using bioinformatics tools and databases are essential to provide utility of the data for basic, translational and applied research. This review discusses the currently available genomics and bioinformatics resources for the Rosaceae family.

  6. EDAM: an ontology of bioinformatics operations, types of data and identifiers, topics and formats

    Science.gov (United States)

    Ison, Jon; Kalaš, Matúš; Jonassen, Inge; Bolser, Dan; Uludag, Mahmut; McWilliam, Hamish; Malone, James; Lopez, Rodrigo; Pettifer, Steve; Rice, Peter

    2013-01-01

    Motivation: Advancing the search, publication and integration of bioinformatics tools and resources demands consistent machine-understandable descriptions. A comprehensive ontology allowing such descriptions is therefore required. Results: EDAM is an ontology of bioinformatics operations (tool or workflow functions), types of data and identifiers, application domains and data formats. EDAM supports semantic annotation of diverse entities such as Web services, databases, programmatic libraries, standalone tools, interactive applications, data schemas, datasets and publications within bioinformatics. EDAM applies to organizing and finding suitable tools and data and to automating their integration into complex applications or workflows. It includes over 2200 defined concepts and has successfully been used for annotations and implementations. Availability: The latest stable version of EDAM is available in OWL format from http://edamontology.org/EDAM.owl and in OBO format from http://edamontology.org/EDAM.obo. It can be viewed online at the NCBO BioPortal and the EBI Ontology Lookup Service. For documentation and license please refer to http://edamontology.org. This article describes version 1.2 available at http://edamontology.org/EDAM_1.2.owl. Contact: jison@ebi.ac.uk PMID:23479348

  7. Bioinformatics in translational drug discovery.

    Science.gov (United States)

    Wooller, Sarah K; Benstead-Hume, Graeme; Chen, Xiangrong; Ali, Yusuf; Pearl, Frances M G

    2017-08-31

    Bioinformatics approaches are becoming ever more essential in translational drug discovery both in academia and within the pharmaceutical industry. Computational exploitation of the increasing volumes of data generated during all phases of drug discovery is enabling key challenges of the process to be addressed. Here, we highlight some of the areas in which bioinformatics resources and methods are being developed to support the drug discovery pipeline. These include the creation of large data warehouses, bioinformatics algorithms to analyse 'big data' that identify novel drug targets and/or biomarkers, programs to assess the tractability of targets, and prediction of repositioning opportunities that use licensed drugs to treat additional indications. © 2017 The Author(s).

  8. New Link in Bioinformatics Services Value Chain: Position, Organization and Business Model

    Directory of Open Access Journals (Sweden)

    Mladen Čudanov

    2012-11-01

    Full Text Available This paper presents development in the bioinformatics services industry value chain, based on cloud computing paradigm. As genome sequencing costs per Megabase exponentially drop, industry needs to adopt. Paper has two parts: theoretical analysis and practical example of Seven Bridges Genomics Company. We are focused on explaining organizational, business and financial aspects of new business model in bioinformatics services, rather than technical side of the problem. In the light of that we present twofold business model fit for core bioinformatics research and Information and Communication Technologie (ICT support in the new environment, with higher level of capital utilization and better resistance to business risks.

  9. Cloning and bioinformatic analysis of lovastatin biosynthesis regulatory gene lovE.

    Science.gov (United States)

    Huang, Xin; Li, Hao-ming

    2009-08-05

    Lovastatin is an effective drug for treatment of hyperlipidemia. This study aimed to clone lovastatin biosynthesis regulatory gene lovE and analyze the structure and function of its encoding protein. According to the lovastatin synthase gene sequence from genebank, primers were designed to amplify and clone the lovastatin biosynthesis regulatory gene lovE from Aspergillus terrus genomic DNA. Bioinformatic analysis of lovE and its encoding animo acid sequence was performed through internet resources and software like DNAMAN. Target fragment lovE, almost 1500 bp in length, was amplified from Aspergillus terrus genomic DNA and the secondary and three-dimensional structures of LovE protein were predicted. In the lovastatin biosynthesis process lovE is a regulatory gene and LovE protein is a GAL4-like transcriptional factor.

  10. Navigating the changing learning landscape: perspective from bioinformatics.ca

    OpenAIRE

    Brazas, Michelle D.; Ouellette, B. F. Francis

    2013-01-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable...

  11. A Review of Recent Advances in Translational Bioinformatics: Bridges from Biology to Medicine.

    Science.gov (United States)

    Vamathevan, J; Birney, E

    2017-08-01

    Objectives: To highlight and provide insights into key developments in translational bioinformatics between 2014 and 2016. Methods: This review describes some of the most influential bioinformatics papers and resources that have been published between 2014 and 2016 as well as the national genome sequencing initiatives that utilize these resources to routinely embed genomic medicine into healthcare. Also discussed are some applications of the secondary use of patient data followed by a comprehensive view of the open challenges and emergent technologies. Results: Although data generation can be performed routinely, analyses and data integration methods still require active research and standardization to improve streamlining of clinical interpretation. The secondary use of patient data has resulted in the development of novel algorithms and has enabled a refined understanding of cellular and phenotypic mechanisms. New data storage and data sharing approaches are required to enable diverse biomedical communities to contribute to genomic discovery. Conclusion: The translation of genomics data into actionable knowledge for use in healthcare is transforming the clinical landscape in an unprecedented way. Exciting and innovative models that bridge the gap between clinical and academic research are set to open up the field of translational bioinformatics for rapid growth in a digital era. Georg Thieme Verlag KG Stuttgart.

  12. MEDICAL DIAGNOSTICS BY MICROSTRUCTURAL ANALYSIS OF BIOLOGICAL LIQUID DRIED PATTERNS AS A PROBLEM OF BIOINFORMATICS

    Directory of Open Access Journals (Sweden)

    Petr Vladimirovich Lebedev-Stepanov, Dr.

    2018-02-01

    Full Text Available Motivation: It is important to develop the high-precision computerized methods for medical rapid diagnostic which is generalizing the unique clinical experience obtained in the past decade as specialized solutions for diagnostic problems of control of specific diseases and, potentially, for a wide health monitoring of virtually healthy population, identify the reserves of human health and take the actions to prevent of these reserves depletion. In this work we present one of the new directions in bioinformatics, i.e. medical diagnostics by automated expert system on basis of morphology analysis of digital image of biological liquid dried pattern. Results: Proposed method is combination of bioinformatics and biochemistry approaches for obtaining diagnostic information from a morphological analysis of standardized dried patterns of biological liquid sessile drop. We have carried out own research in collaboration with medical diagnostic centers and formed the electronic database for recognition the following types of diseases: candidiasis; neoplasms; diabetes mellitus; diseases of the circulatory system; cerebrovascular disease; diseases of the digestive system; diseases of the genitourinary system; infectious diseases; factors relevant to the work; factors associated with environmental pollution; factors related to lifestyle. The laboratory setup for diagnostics of the human body in pathology states is developed. The diagnostic results are considered. Availability: Access to testing the software can be obtained on request to the contact email below.

  13. Bioinformatics for Exploration

    Science.gov (United States)

    Johnson, Kathy A.

    2006-01-01

    For the purpose of this paper, bioinformatics is defined as the application of computer technology to the management of biological information. It can be thought of as the science of developing computer databases and algorithms to facilitate and expedite biological research. This is a crosscutting capability that supports nearly all human health areas ranging from computational modeling, to pharmacodynamics research projects, to decision support systems within autonomous medical care. Bioinformatics serves to increase the efficiency and effectiveness of the life sciences research program. It provides data, information, and knowledge capture which further supports management of the bioastronautics research roadmap - identifying gaps that still remain and enabling the determination of which risks have been addressed.

  14. The eBioKit, a stand-alone educational platform for bioinformatics.

    Science.gov (United States)

    Hernández-de-Diego, Rafael; de Villiers, Etienne P; Klingström, Tomas; Gourlé, Hadrien; Conesa, Ana; Bongcam-Rudloff, Erik

    2017-09-01

    Bioinformatics skills have become essential for many research areas; however, the availability of qualified researchers is usually lower than the demand and training to increase the number of able bioinformaticians is an important task for the bioinformatics community. When conducting training or hands-on tutorials, the lack of control over the analysis tools and repositories often results in undesirable situations during training, as unavailable online tools or version conflicts may delay, complicate, or even prevent the successful completion of a training event. The eBioKit is a stand-alone educational platform that hosts numerous tools and databases for bioinformatics research and allows training to take place in a controlled environment. A key advantage of the eBioKit over other existing teaching solutions is that all the required software and databases are locally installed on the system, significantly reducing the dependence on the internet. Furthermore, the architecture of the eBioKit has demonstrated itself to be an excellent balance between portability and performance, not only making the eBioKit an exceptional educational tool but also providing small research groups with a platform to incorporate bioinformatics analysis in their research. As a result, the eBioKit has formed an integral part of training and research performed by a wide variety of universities and organizations such as the Pan African Bioinformatics Network (H3ABioNet) as part of the initiative Human Heredity and Health in Africa (H3Africa), the Southern Africa Network for Biosciences (SAnBio) initiative, the Biosciences eastern and central Africa (BecA) hub, and the International Glossina Genome Initiative.

  15. The GMOD Drupal bioinformatic server framework.

    Science.gov (United States)

    Papanicolaou, Alexie; Heckel, David G

    2010-12-15

    Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com.

  16. Analyses of Brucella Pathogenesis, Host Immunity, and Vaccine Targets using Systems Biology and Bioinformatics

    OpenAIRE

    He, Yongqun

    2012-01-01

    Brucella is a Gram-negative, facultative intracellular bacterium that causes zoonotic brucellosis in humans and various animals. Out of 10 classified Brucella species, B. melitensis, B. abortus, B. suis, and B. canis are pathogenic to humans. In the past decade, the mechanisms of Brucella pathogenesis and host immunity have been extensively investigated using the cutting edge systems biology and bioinformatics approaches. This article provides a comprehensive review of the applications of Omi...

  17. A global perspective on evolving bioinformatics and data science training needs.

    Science.gov (United States)

    Attwood, Teresa K; Blackford, Sarah; Brazas, Michelle D; Davies, Angela; Schneider, Maria Victoria

    2017-08-29

    Bioinformatics is now intrinsic to life science research, but the past decade has witnessed a continuing deficiency in this essential expertise. Basic data stewardship is still taught relatively rarely in life science education programmes, creating a chasm between theory and practice, and fuelling demand for bioinformatics training across all educational levels and career roles. Concerned by this, surveys have been conducted in recent years to monitor bioinformatics and computational training needs worldwide. This article briefly reviews the principal findings of a number of these studies. We see that there is still a strong appetite for short courses to improve expertise and confidence in data analysis and interpretation; strikingly, however, the most urgent appeal is for bioinformatics to be woven into the fabric of life science degree programmes. Satisfying the relentless training needs of current and future generations of life scientists will require a concerted response from stakeholders across the globe, who need to deliver sustainable solutions capable of both transforming education curricula and cultivating a new cadre of trainer scientists. © The Author 2017. Published by Oxford University Press.

  18. Relax with CouchDB - Into the non-relational DBMS era of Bioinformatics

    Science.gov (United States)

    Manyam, Ganiraju; Payton, Michelle A.; Roth, Jack A.; Abruzzo, Lynne V.; Coombes, Kevin R.

    2012-01-01

    With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. PMID:22609849

  19. The 2015 Bioinformatics Open Source Conference (BOSC 2015).

    Science.gov (United States)

    Harris, Nomi L; Cock, Peter J A; Lapp, Hilmar; Chapman, Brad; Davey, Rob; Fields, Christopher; Hokamp, Karsten; Munoz-Torres, Monica

    2016-02-01

    The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included "Data Science;" "Standards and Interoperability;" "Open Science and Reproducibility;" "Translational Bioinformatics;" "Visualization;" and "Bioinformatics Open Source Project Updates". In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled "Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community," that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule.

  20. mORCA: sailing bioinformatics world with mobile devices.

    Science.gov (United States)

    Díaz-Del-Pino, Sergio; Falgueras, Juan; Perez-Wohlfeil, Esteban; Trelles, Oswaldo

    2018-03-01

    Nearly 10 years have passed since the first mobile apps appeared. Given the fact that bioinformatics is a web-based world and that mobile devices are endowed with web-browsers, it seemed natural that bioinformatics would transit from personal computers to mobile devices but nothing could be further from the truth. The transition demands new paradigms, designs and novel implementations. Throughout an in-depth analysis of requirements of existing bioinformatics applications we designed and deployed an easy-to-use web-based lightweight mobile client. Such client is able to browse, select, compose automatically interface parameters, invoke services and monitor the execution of Web Services using the service's metadata stored in catalogs or repositories. mORCA is available at http://bitlab-es.com/morca/app as a web-app. It is also available in the App store by Apple and Play Store by Google. The software will be available for at least 2 years. ortrelles@uma.es. Source code, final web-app, training material and documentation is available at http://bitlab-es.com/morca. © The Author(s) 2017. Published by Oxford University Press.

  1. p3d--Python module for structural bioinformatics.

    Science.gov (United States)

    Fufezan, Christian; Specht, Michael

    2009-08-21

    High-throughput bioinformatic analysis tools are needed to mine the large amount of structural data via knowledge based approaches. The development of such tools requires a robust interface to access the structural data in an easy way. For this the Python scripting language is the optimal choice since its philosophy is to write an understandable source code. p3d is an object oriented Python module that adds a simple yet powerful interface to the Python interpreter to process and analyse three dimensional protein structure files (PDB files). p3d's strength arises from the combination of a) very fast spatial access to the structural data due to the implementation of a binary space partitioning (BSP) tree, b) set theory and c) functions that allow to combine a and b and that use human readable language in the search queries rather than complex computer language. All these factors combined facilitate the rapid development of bioinformatic tools that can perform quick and complex analyses of protein structures. p3d is the perfect tool to quickly develop tools for structural bioinformatics using the Python scripting language.

  2. Best practices in bioinformatics training for life scientists.

    KAUST Repository

    Via, Allegra

    2013-06-25

    The mountains of data thrusting from the new landscape of modern high-throughput biology are irrevocably changing biomedical research and creating a near-insatiable demand for training in data management and manipulation and data mining and analysis. Among life scientists, from clinicians to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes. In this context, this article discusses various pragmatic criteria for identifying training needs and learning objectives, for selecting suitable trainees and trainers, for developing and maintaining training skills and evaluating training quality. Adherence to these criteria may help not only to guide course organizers and trainers on the path towards bioinformatics training excellence but, importantly, also to improve the training experience for life scientists.

  3. Component-Based Approach for Educating Students in Bioinformatics

    Science.gov (United States)

    Poe, D.; Venkatraman, N.; Hansen, C.; Singh, G.

    2009-01-01

    There is an increasing need for an effective method of teaching bioinformatics. Increased progress and availability of computer-based tools for educating students have led to the implementation of a computer-based system for teaching bioinformatics as described in this paper. Bioinformatics is a recent, hybrid field of study combining elements of…

  4. Biowep: a workflow enactment portal for bioinformatics applications.

    Science.gov (United States)

    Romano, Paolo; Bartocci, Ezio; Bertolini, Guglielmo; De Paoli, Flavio; Marra, Domenico; Mauri, Giancarlo; Merelli, Emanuela; Milanesi, Luciano

    2007-03-08

    The huge amount of biological information, its distribution over the Internet and the heterogeneity of available software tools makes the adoption of new data integration and analysis network tools a necessity in bioinformatics. ICT standards and tools, like Web Services and Workflow Management Systems (WMS), can support the creation and deployment of such systems. Many Web Services are already available and some WMS have been proposed. They assume that researchers know which bioinformatics resources can be reached through a programmatic interface and that they are skilled in programming and building workflows. Therefore, they are not viable to the majority of unskilled researchers. A portal enabling these to take profit from new technologies is still missing. We designed biowep, a web based client application that allows for the selection and execution of a set of predefined workflows. The system is available on-line. Biowep architecture includes a Workflow Manager, a User Interface and a Workflow Executor. The task of the Workflow Manager is the creation and annotation of workflows. These can be created by using either the Taverna Workbench or BioWMS. Enactment of workflows is carried out by FreeFluo for Taverna workflows and by BioAgent/Hermes, a mobile agent-based middleware, for BioWMS ones. Main workflows' processing steps are annotated on the basis of their input and output, elaboration type and application domain by using a classification of bioinformatics data and tasks. The interface supports users authentication and profiling. Workflows can be selected on the basis of users' profiles and can be searched through their annotations. Results can be saved. We developed a web system that support the selection and execution of predefined workflows, thus simplifying access for all researchers. The implementation of Web Services allowing specialized software to interact with an exhaustive set of biomedical databases and analysis software and the creation of

  5. Biowep: a workflow enactment portal for bioinformatics applications

    Directory of Open Access Journals (Sweden)

    Romano Paolo

    2007-03-01

    Full Text Available Abstract Background The huge amount of biological information, its distribution over the Internet and the heterogeneity of available software tools makes the adoption of new data integration and analysis network tools a necessity in bioinformatics. ICT standards and tools, like Web Services and Workflow Management Systems (WMS, can support the creation and deployment of such systems. Many Web Services are already available and some WMS have been proposed. They assume that researchers know which bioinformatics resources can be reached through a programmatic interface and that they are skilled in programming and building workflows. Therefore, they are not viable to the majority of unskilled researchers. A portal enabling these to take profit from new technologies is still missing. Results We designed biowep, a web based client application that allows for the selection and execution of a set of predefined workflows. The system is available on-line. Biowep architecture includes a Workflow Manager, a User Interface and a Workflow Executor. The task of the Workflow Manager is the creation and annotation of workflows. These can be created by using either the Taverna Workbench or BioWMS. Enactment of workflows is carried out by FreeFluo for Taverna workflows and by BioAgent/Hermes, a mobile agent-based middleware, for BioWMS ones. Main workflows' processing steps are annotated on the basis of their input and output, elaboration type and application domain by using a classification of bioinformatics data and tasks. The interface supports users authentication and profiling. Workflows can be selected on the basis of users' profiles and can be searched through their annotations. Results can be saved. Conclusion We developed a web system that support the selection and execution of predefined workflows, thus simplifying access for all researchers. The implementation of Web Services allowing specialized software to interact with an exhaustive set of biomedical

  6. Phylogenetic trees in bioinformatics

    Energy Technology Data Exchange (ETDEWEB)

    Burr, Tom L [Los Alamos National Laboratory

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use ofprobabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that fmding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.

  7. The GMOD Drupal Bioinformatic Server Framework

    Science.gov (United States)

    Papanicolaou, Alexie; Heckel, David G.

    2010-01-01

    Motivation: Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). Results: We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. Conclusion: Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. Availability and implementation: Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com Contact: alexie@butterflybase.org PMID:20971988

  8. libcov: A C++ bioinformatic library to manipulate protein structures, sequence alignments and phylogeny

    OpenAIRE

    Butt, Davin; Roger, Andrew J; Blouin, Christian

    2005-01-01

    Background An increasing number of bioinformatics methods are considering the phylogenetic relationships between biological sequences. Implementing new methodologies using the maximum likelihood phylogenetic framework can be a time consuming task. Results The bioinformatics library libcov is a collection of C++ classes that provides a high and low-level interface to maximum likelihood phylogenetics, sequence analysis and a data structure for structural biological methods. libcov can be used ...

  9. Analysis of microRNA profile of Anopheles sinensis by deep sequencing and bioinformatic approaches.

    Science.gov (United States)

    Feng, Xinyu; Zhou, Xiaojian; Zhou, Shuisen; Wang, Jingwen; Hu, Wei

    2018-03-12

    microRNAs (miRNAs) are small non-coding RNAs widely identified in many mosquitoes. They are reported to play important roles in development, differentiation and innate immunity. However, miRNAs in Anopheles sinensis, one of the Chinese malaria mosquitoes, remain largely unknown. We investigated the global miRNA expression profile of An. sinensis using Illumina Hiseq 2000 sequencing. Meanwhile, we applied a bioinformatic approach to identify potential miRNAs in An. sinensis. The identified miRNA profiles were compared and analyzed by two approaches. The selected miRNAs from the sequencing result and the bioinformatic approach were confirmed with qRT-PCR. Moreover, target prediction, GO annotation and pathway analysis were carried out to understand the role of miRNAs in An. sinensis. We identified 49 conserved miRNAs and 12 novel miRNAs by next-generation high-throughput sequencing technology. In contrast, 43 miRNAs were predicted by the bioinformatic approach, of which two were assigned as novel. Comparative analysis of miRNA profiles by two approaches showed that 21 miRNAs were shared between them. Twelve novel miRNAs did not match any known miRNAs of any organism, indicating that they are possibly species-specific. Forty miRNAs were found in many mosquito species, indicating that these miRNAs are evolutionally conserved and may have critical roles in the process of life. Both the selected known and novel miRNAs (asi-miR-281, asi-miR-184, asi-miR-14, asi-miR-nov5, asi-miR-nov4, asi-miR-9383, and asi-miR-2a) could be detected by quantitative real-time PCR (qRT-PCR) in the sequenced sample, and the expression patterns of these miRNAs measured by qRT-PCR were in concordance with the original miRNA sequencing data. The predicted targets for the known and the novel miRNAs covered many important biological roles and pathways indicating the diversity of miRNA functions. We also found 21 conserved miRNAs and eight counterparts of target immune pathway genes in An. sinensis

  10. A Mathematical Optimization Problem in Bioinformatics

    Science.gov (United States)

    Heyer, Laurie J.

    2008-01-01

    This article describes the sequence alignment problem in bioinformatics. Through examples, we formulate sequence alignment as an optimization problem and show how to compute the optimal alignment with dynamic programming. The examples and sample exercises have been used by the author in a specialized course in bioinformatics, but could be adapted…

  11. Genetic Markers Analyses and Bioinformatic Approaches to Distinguish Between Olive Tree (Olea europaea L.) Cultivars.

    Science.gov (United States)

    Ben Ayed, Rayda; Ben Hassen, Hanen; Ennouri, Karim; Rebai, Ahmed

    2016-12-01

    The genetic diversity of 22 olive tree cultivars (Olea europaea L.) sampled from different Mediterranean countries was assessed using 5 SNP markers (FAD2.1; FAD2.3; CALC; SOD and ANTHO3) located in four different genes. The genotyping analysis of the 22 cultivars with 5 SNP loci revealed 11 alleles (average 2.2 per allele). The dendrogram based on cultivar genotypes revealed three clusters consistent with the cultivars classification. Besides, the results obtained with the five SNPs were compared to those obtained with the SSR markers using bioinformatic analyses and by computing a cophenetic correlation coefficient, indicating the usefulness of the UPGMA method for clustering plant genotypes. Based on principal coordinate analysis using a similarity matrix, the first two coordinates, revealed 54.94 % of the total variance. This work provides a more comprehensive explanation of the diversity available in Tunisia olive cultivars, and an important contribution for olive breeding and olive oil authenticity.

  12. MLH1 Promoter Methylation and Prediction/Prognosis of Gastric Cancer: A Systematic Review and Meta and Bioinformatic Analysis.

    Science.gov (United States)

    Shen, Shixuan; Chen, Xiaohui; Li, Hao; Sun, Liping; Yuan, Yuan

    2018-01-01

    Background: The promoter methylation of MLH1 gene and gastric cancer (GC)has been investigated previously. To get a more credible conclusion, we performed a systematic review and meta and bioinformatic analysis to clarify the role of MLH1 methylation in the prediction and prognosis of GC. Methods: Eligible studies were targeted after searching the PubMed, Web of Science, Embase, BIOSIS, CNKI and Wanfang Data to collect the information of MLH1 methylation and GC. The link strength between the two was estimated by odds ratio with its 95% confidence interval. The Newcastle-Ottawa scale was used for quantity assessment . Subgroup and sensitivity analysis were conducted to explore sources of heterogeneity. The Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) were employed for bioinformatics analysis on the correlation between MLH1 methylation and GC risk, clinicopathological behavior as well as prognosis. Results: 2365 GC and 1563 controls were included in the meta-analysis. The pooled OR of MLH1 methylation in GC was 4.895 (95% CI: 3.149-7.611, PMLH1 methylation enhanced GC risk but might not related with GC clinicopathological features and prognosis. Conclusion: MLH1 methylation is an alive biomarker for the prediction of GC and it might not affect GC behavior. Further study could be conducted to verify the impact of MLH1 methylation on GC prognosis.

  13. The growing need for microservices in bioinformatics

    Directory of Open Access Journals (Sweden)

    Christopher L Williams

    2016-01-01

    Full Text Available Objective: Within the information technology (IT industry, best practices and standards are constantly evolving and being refined. In contrast, computer technology utilized within the healthcare industry often evolves at a glacial pace, with reduced opportunities for justified innovation. Although the use of timely technology refreshes within an enterprise′s overall technology stack can be costly, thoughtful adoption of select technologies with a demonstrated return on investment can be very effective in increasing productivity and at the same time, reducing the burden of maintenance often associated with older and legacy systems. In this brief technical communication, we introduce the concept of microservices as applied to the ecosystem of data analysis pipelines. Microservice architecture is a framework for dividing complex systems into easily managed parts. Each individual service is limited in functional scope, thereby conferring a higher measure of functional isolation and reliability to the collective solution. Moreover, maintenance challenges are greatly simplified by virtue of the reduced architectural complexity of each constitutive module. This fact notwithstanding, rendered overall solutions utilizing a microservices-based approach provide equal or greater levels of functionality as compared to conventional programming approaches. Bioinformatics, with its ever-increasing demand for performance and new testing algorithms, is the perfect use-case for such a solution. Moreover, if promulgated within the greater development community as an open-source solution, such an approach holds potential to be transformative to current bioinformatics software development. Context: Bioinformatics relies on nimble IT framework which can adapt to changing requirements. Aims: To present a well-established software design and deployment strategy as a solution for current challenges within bioinformatics Conclusions: Use of the microservices framework

  14. The growing need for microservices in bioinformatics.

    Science.gov (United States)

    Williams, Christopher L; Sica, Jeffrey C; Killen, Robert T; Balis, Ulysses G J

    2016-01-01

    Within the information technology (IT) industry, best practices and standards are constantly evolving and being refined. In contrast, computer technology utilized within the healthcare industry often evolves at a glacial pace, with reduced opportunities for justified innovation. Although the use of timely technology refreshes within an enterprise's overall technology stack can be costly, thoughtful adoption of select technologies with a demonstrated return on investment can be very effective in increasing productivity and at the same time, reducing the burden of maintenance often associated with older and legacy systems. In this brief technical communication, we introduce the concept of microservices as applied to the ecosystem of data analysis pipelines. Microservice architecture is a framework for dividing complex systems into easily managed parts. Each individual service is limited in functional scope, thereby conferring a higher measure of functional isolation and reliability to the collective solution. Moreover, maintenance challenges are greatly simplified by virtue of the reduced architectural complexity of each constitutive module. This fact notwithstanding, rendered overall solutions utilizing a microservices-based approach provide equal or greater levels of functionality as compared to conventional programming approaches. Bioinformatics, with its ever-increasing demand for performance and new testing algorithms, is the perfect use-case for such a solution. Moreover, if promulgated within the greater development community as an open-source solution, such an approach holds potential to be transformative to current bioinformatics software development. Bioinformatics relies on nimble IT framework which can adapt to changing requirements. To present a well-established software design and deployment strategy as a solution for current challenges within bioinformatics. Use of the microservices framework is an effective methodology for the fabrication and

  15. The growing need for microservices in bioinformatics

    Science.gov (United States)

    Williams, Christopher L.; Sica, Jeffrey C.; Killen, Robert T.; Balis, Ulysses G. J.

    2016-01-01

    Objective: Within the information technology (IT) industry, best practices and standards are constantly evolving and being refined. In contrast, computer technology utilized within the healthcare industry often evolves at a glacial pace, with reduced opportunities for justified innovation. Although the use of timely technology refreshes within an enterprise's overall technology stack can be costly, thoughtful adoption of select technologies with a demonstrated return on investment can be very effective in increasing productivity and at the same time, reducing the burden of maintenance often associated with older and legacy systems. In this brief technical communication, we introduce the concept of microservices as applied to the ecosystem of data analysis pipelines. Microservice architecture is a framework for dividing complex systems into easily managed parts. Each individual service is limited in functional scope, thereby conferring a higher measure of functional isolation and reliability to the collective solution. Moreover, maintenance challenges are greatly simplified by virtue of the reduced architectural complexity of each constitutive module. This fact notwithstanding, rendered overall solutions utilizing a microservices-based approach provide equal or greater levels of functionality as compared to conventional programming approaches. Bioinformatics, with its ever-increasing demand for performance and new testing algorithms, is the perfect use-case for such a solution. Moreover, if promulgated within the greater development community as an open-source solution, such an approach holds potential to be transformative to current bioinformatics software development. Context: Bioinformatics relies on nimble IT framework which can adapt to changing requirements. Aims: To present a well-established software design and deployment strategy as a solution for current challenges within bioinformatics Conclusions: Use of the microservices framework is an effective

  16. Bioinformatics and Cancer

    Science.gov (United States)

    Researchers take on challenges and opportunities to mine "Big Data" for answers to complex biological questions. Learn how bioinformatics uses advanced computing, mathematics, and technological platforms to store, manage, analyze, and understand data.

  17. Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

    Energy Technology Data Exchange (ETDEWEB)

    Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad; Freyermuth, Sharyn K.; Bailey, Cheryl; Britton, Robert A.; Gordon, Stuart G.; Heinhorst, Sabine; Reed, Kelynne; Xu, Zhaohui; Sanders-Lorenz, Erin R.; Axen, Seth; Kim, Edwin; Johns, Mitrick; Scott, Kathleen; Kerfeld, Cheryl A.

    2011-08-01

    into courses or independent research projects requires infrastructure for organizing and assessing student work. Here, we present a new platform for faculty to keep current with the rapidly changing field of bioinformatics, the Integrated Microbial Genomes Annotation Collaboration Toolkit (IMG-ACT). It was developed by instructors from both research-intensive and predominately undergraduate institutions in collaboration with the Department of Energy-Joint Genome Institute (DOE-JGI) as a means to innovate and update undergraduate education and faculty development. The IMG-ACT program provides a cadre of tools, including access to a clearinghouse of genome sequences, bioinformatics databases, data storage, instructor course management, and student notebooks for organizing the results of their bioinformatic investigations. In the process, IMG-ACT makes it feasible to provide undergraduate research opportunities to a greater number and diversity of students, in contrast to the traditional mentor-to-student apprenticeship model for undergraduate research, which can be too expensive and time-consuming to provide for every undergraduate. The IMG-ACT serves as the hub for the network of faculty and students that use the system for microbial genome analysis. Open access of the IMG-ACT infrastructure to participating schools ensures that all types of higher education institutions can utilize it. With the infrastructure in place, faculty can focus their efforts on the pedagogy of bioinformatics, involvement of students in research, and use of this tool for their own research agenda. What the original faculty members of the IMG-ACT development team present here is an overview of how the IMG-ACT program has affected our development in terms of teaching and research with the hopes that it will inspire more faculty to get involved.

  18. Open discovery: An integrated live Linux platform of Bioinformatics tools.

    Science.gov (United States)

    Vetrivel, Umashankar; Pilla, Kalabharath

    2008-01-01

    Historically, live linux distributions for Bioinformatics have paved way for portability of Bioinformatics workbench in a platform independent manner. Moreover, most of the existing live Linux distributions limit their usage to sequence analysis and basic molecular visualization programs and are devoid of data persistence. Hence, open discovery - a live linux distribution has been developed with the capability to perform complex tasks like molecular modeling, docking and molecular dynamics in a swift manner. Furthermore, it is also equipped with complete sequence analysis environment and is capable of running windows executable programs in Linux environment. Open discovery portrays the advanced customizable configuration of fedora, with data persistency accessible via USB drive or DVD. The Open Discovery is distributed free under Academic Free License (AFL) and can be downloaded from http://www.OpenDiscovery.org.in.

  19. Combining medical informatics and bioinformatics toward tools for personalized medicine.

    Science.gov (United States)

    Sarachan, B D; Simmons, M K; Subramanian, P; Temkin, J M

    2003-01-01

    Key bioinformatics and medical informatics research areas need to be identified to advance knowledge and understanding of disease risk factors and molecular disease pathology in the 21 st century toward new diagnoses, prognoses, and treatments. Three high-impact informatics areas are identified: predictive medicine (to identify significant correlations within clinical data using statistical and artificial intelligence methods), along with pathway informatics and cellular simulations (that combine biological knowledge with advanced informatics to elucidate molecular disease pathology). Initial predictive models have been developed for a pilot study in Huntington's disease. An initial bioinformatics platform has been developed for the reconstruction and analysis of pathways, and work has begun on pathway simulation. A bioinformatics research program has been established at GE Global Research Center as an important technology toward next generation medical diagnostics. We anticipate that 21 st century medical research will be a combination of informatics tools with traditional biology wet lab research, and that this will translate to increased use of informatics techniques in the clinic.

  20. GOBLET: the Global Organisation for Bioinformatics Learning, Education and Training.

    Science.gov (United States)

    Attwood, Teresa K; Atwood, Teresa K; Bongcam-Rudloff, Erik; Brazas, Michelle E; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M; Schneider, Maria Victoria; van Gelder, Celia W G

    2015-04-01

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy--paradoxically, many are actually closing "niche" bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all.

  1. GOBLET: The Global Organisation for Bioinformatics Learning, Education and Training

    Science.gov (United States)

    Atwood, Teresa K.; Bongcam-Rudloff, Erik; Brazas, Michelle E.; Corpas, Manuel; Gaudet, Pascale; Lewitter, Fran; Mulder, Nicola; Palagi, Patricia M.; Schneider, Maria Victoria; van Gelder, Celia W. G.

    2015-01-01

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy—paradoxically, many are actually closing “niche” bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all. PMID:25856076

  2. Biology in 'silico': The Bioinformatics Revolution.

    Science.gov (United States)

    Bloom, Mark

    2001-01-01

    Explains the Human Genome Project (HGP) and efforts to sequence the human genome. Describes the role of bioinformatics in the project and considers it the genetics Swiss Army Knife, which has many different uses, for use in forensic science, medicine, agriculture, and environmental sciences. Discusses the use of bioinformatics in the high school…

  3. Bioinformatics research in the Asia Pacific: a 2007 update.

    Science.gov (United States)

    Ranganathan, Shoba; Gribskov, Michael; Tan, Tin Wee

    2008-01-01

    We provide a 2007 update on the bioinformatics research in the Asia-Pacific from the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation set up in 1998. From 2002, APBioNet has organized the first International Conference on Bioinformatics (InCoB) bringing together scientists working in the field of bioinformatics in the region. This year, the InCoB2007 Conference was organized as the 6th annual conference of the Asia-Pacific Bioinformatics Network, on Aug. 27-30, 2007 at Hong Kong, following a series of successful events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea) and New Delhi (India). Besides a scientific meeting at Hong Kong, satellite events organized are a pre-conference training workshop at Hanoi, Vietnam and a post-conference workshop at Nansha, China. This Introduction provides a brief overview of the peer-reviewed manuscripts accepted for publication in this Supplement. We have organized the papers into thematic areas, highlighting the growing contribution of research excellence from this region, to global bioinformatics endeavours.

  4. Nispero: a cloud-computing based Scala tool specially suited for bioinformatics data processing

    OpenAIRE

    Evdokim Kovach; Alexey Alekhin; Eduardo Pareja Tobes; Raquel Tobes; Eduardo Pareja; Marina Manrique

    2014-01-01

    Nowadays it is widely accepted that the bioinformatics data analysis is a real bottleneck in many research activities related to life sciences. High-throughput technologies like Next Generation Sequencing (NGS) have completely reshaped the biology and bioinformatics landscape. Undoubtedly NGS has allowed important progress in many life-sciences related fields but has also presented interesting challenges in terms of computation capabilities and algorithms. Many kinds of tasks related with NGS...

  5. Adaptation of a Bioinformatics Microarray Analysis Workflow for a Toxicogenomic Study in Rainbow Trout.

    Directory of Open Access Journals (Sweden)

    Sophie Depiereux

    Full Text Available Sex steroids play a key role in triggering sex differentiation in fish, the use of exogenous hormone treatment leading to partial or complete sex reversal. This phenomenon has attracted attention since the discovery that even low environmental doses of exogenous steroids can adversely affect gonad morphology (ovotestis development and induce reproductive failure. Modern genomic-based technologies have enhanced opportunities to find out mechanisms of actions (MOA and identify biomarkers related to the toxic action of a compound. However, high throughput data interpretation relies on statistical analysis, species genomic resources, and bioinformatics tools. The goals of this study are to improve the knowledge of feminisation in fish, by the analysis of molecular responses in the gonads of rainbow trout fry after chronic exposure to several doses (0.01, 0.1, 1 and 10 μg/L of ethynylestradiol (EE2 and to offer target genes as potential biomarkers of ovotestis development. We successfully adapted a bioinformatics microarray analysis workflow elaborated on human data to a toxicogenomic study using rainbow trout, a fish species lacking accurate functional annotation and genomic resources. The workflow allowed to obtain lists of genes supposed to be enriched in true positive differentially expressed genes (DEGs, which were subjected to over-representation analysis methods (ORA. Several pathways and ontologies, mostly related to cell division and metabolism, sexual reproduction and steroid production, were found significantly enriched in our analyses. Moreover, two sets of potential ovotestis biomarkers were selected using several criteria. The first group displayed specific potential biomarkers belonging to pathways/ontologies highlighted in the experiment. Among them, the early ovarian differentiation gene foxl2a was overexpressed. The second group, which was highly sensitive but not specific, included the DEGs presenting the highest fold change and

  6. The Comprehension Problems of Children with Poor Reading Comprehension despite Adequate Decoding: A Meta-Analysis.

    Science.gov (United States)

    Spencer, Mercedes; Wagner, Richard K

    2018-06-01

    The purpose of this meta-analysis was to examine the comprehension problems of children who have a specific reading comprehension deficit (SCD), which is characterized by poor reading comprehension despite adequate decoding. The meta-analysis included 86 studies of children with SCD who were assessed in reading comprehension and oral language (vocabulary, listening comprehension, storytelling ability, and semantic and syntactic knowledge). Results indicated that children with SCD had deficits in oral language ( d = -0.78, 95% CI [-0.89, -0.68], but these deficits were not as severe as their deficit in reading comprehension ( d = -2.78, 95% CI [-3.01, -2.54]). When compared to reading comprehension age-matched normal readers, the oral language skills of the two groups were comparable ( d = 0.32, 95% CI [-0.49, 1.14]), which suggests that the oral language weaknesses of children with SCD represent a developmental delay rather than developmental deviance. Theoretical and practical implications of these findings are discussed.

  7. Challenge: A Multidisciplinary Degree Program in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Mudasser Fraz Wyne

    2006-06-01

    Full Text Available Bioinformatics is a new field that is poorly served by any of the traditional science programs in Biology, Computer science or Biochemistry. Known to be a rapidly evolving discipline, Bioinformatics has emerged from experimental molecular biology and biochemistry as well as from the artificial intelligence, database, pattern recognition, and algorithms disciplines of computer science. While institutions are responding to this increased demand by establishing graduate programs in bioinformatics, entrance barriers for these programs are high, largely due to the significant prerequisite knowledge which is required, both in the fields of biochemistry and computer science. Although many schools currently have or are proposing graduate programs in bioinformatics, few are actually developing new undergraduate programs. In this paper I explore the blend of a multidisciplinary approach, discuss the response of academia and highlight challenges faced by this emerging field.

  8. Bioinformatics analysis of the phytoene synthase gene in cabbage (Brassica oleracea var. capitata)

    Science.gov (United States)

    Sun, Bo; Jiang, Min; Xue, Shengling; Zheng, Aihong; Zhang, Fen; Tang, Haoru

    2018-04-01

    Phytoene Synthase (PSY) is an important enzyme in carotenoid biosynthesis. Here, the Brassica oleracea var. capitata PSY (BocPSY) gene sequences were obtained from Brassica database (BRAD), and preformed for bioinformatics analysis. The BocPSY1, BocPSY2 and BocPSY3 genes mapped to chromosomes 2,3 and 9, and contains an open reading frame of 1,248 bp, 1,266 bp and 1,275 bp that encodes a 415, 421, 424 amino acid protein, respectively. Subcellular localization predicted all BocPSY genes were in the chloroplast. The conserved domain of the BocPSY protein is PLN02632. Homology analysis indicates that the levels of identity among BocPSYs were all more than 85%, and the PSY protein is apparently conserved during plant evolution. The findings of the present study provide a molecular basis for the elucidation of PSY gene function in cabbage.

  9. OPTSDNA: Performance evaluation of an efficient distributed bioinformatics system for DNA sequence analysis.

    Science.gov (United States)

    Khan, Mohammad Ibrahim; Sheel, Chotan

    2013-01-01

    Storage of sequence data is a big concern as the amount of data generated is exponential in nature at several locations. Therefore, there is a need to develop techniques to store data using compression algorithm. Here we describe optimal storage algorithm (OPTSDNA) for storing large amount of DNA sequences of varying length. This paper provides performance analysis of optimal storage algorithm (OPTSDNA) of a distributed bioinformatics computing system for analysis of DNA sequences. OPTSDNA algorithm is used for storing various sizes of DNA sequences into database. DNA sequences of different lengths were stored by using this algorithm. These input DNA sequences are varied in size from very small to very large. Storage size is calculated by this algorithm. Response time is also calculated in this work. The efficiency and performance of the algorithm is high (in size calculation with percentage) when compared with other known with sequential approach.

  10. Relax with CouchDB--into the non-relational DBMS era of bioinformatics.

    Science.gov (United States)

    Manyam, Ganiraju; Payton, Michelle A; Roth, Jack A; Abruzzo, Lynne V; Coombes, Kevin R

    2012-07-01

    With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. Copyright © 2012 Elsevier Inc. All rights reserved.

  11. When cloud computing meets bioinformatics: a review.

    Science.gov (United States)

    Zhou, Shuigeng; Liao, Ruiqi; Guan, Jihong

    2013-10-01

    In the past decades, with the rapid development of high-throughput technologies, biology research has generated an unprecedented amount of data. In order to store and process such a great amount of data, cloud computing and MapReduce were applied to many fields of bioinformatics. In this paper, we first introduce the basic concepts of cloud computing and MapReduce, and their applications in bioinformatics. We then highlight some problems challenging the applications of cloud computing and MapReduce to bioinformatics. Finally, we give a brief guideline for using cloud computing in biology research.

  12. MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease

    OpenAIRE

    Shen, Lishuang; Diroma, Maria Angela; Gonzalez, Michael; Navarro-Gomez, Daniel; Leipzig, Jeremy; Lott, Marie T.; Oven, Mannis; Wallace, D.C.; Muraresku, Colleen Clarke; Zolkipli-Cunningham, Zarazuela; Chinnery, Patrick; Attimonelli, Marcella; Zuchner, Stephan; Falk, Marni J.; Gai, Xiaowu

    2016-01-01

    textabstractMSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes, genes, and variants. A central Web portal (https://mseqdr.org) integrates community knowledge from expert-curated databases with genomic and phenotype data shared by clinicians and researchers. MSeqDR ...

  13. Toward Personalized Pressure Ulcer Care Planning: Development of a Bioinformatics System for Individualized Prioritization of Clinical Pratice Guideline

    Science.gov (United States)

    2016-10-01

    AWARD NUMBER: W81XWH-15-1-0342 TITLE: Toward Personalized Pressure Ulcer Care Planning: Development of a Bioinformatics System for Individualized...Planning: Development of a Bioinformatics System for Individualized Prioritization of Clinical Pratice Guideline 5a. CONTRACT NUMBER 5b. GRANT...recommendations of CPG has been identified by experts in the field. We will use bioinformatics to enable data extraction, storage, and analysis to support

  14. An overview of topic modeling and its current applications in bioinformatics.

    Science.gov (United States)

    Liu, Lin; Tang, Lin; Dong, Wen; Yao, Shaowen; Zhou, Wei

    2016-01-01

    With the rapid accumulation of biological datasets, machine learning methods designed to automate data analysis are urgently needed. In recent years, so-called topic models that originated from the field of natural language processing have been receiving much attention in bioinformatics because of their interpretability. Our aim was to review the application and development of topic models for bioinformatics. This paper starts with the description of a topic model, with a focus on the understanding of topic modeling. A general outline is provided on how to build an application in a topic model and how to develop a topic model. Meanwhile, the literature on application of topic models to biological data was searched and analyzed in depth. According to the types of models and the analogy between the concept of document-topic-word and a biological object (as well as the tasks of a topic model), we categorized the related studies and provided an outlook on the use of topic models for the development of bioinformatics applications. Topic modeling is a useful method (in contrast to the traditional means of data reduction in bioinformatics) and enhances researchers' ability to interpret biological information. Nevertheless, due to the lack of topic models optimized for specific biological data, the studies on topic modeling in biological data still have a long and challenging road ahead. We believe that topic models are a promising method for various applications in bioinformatics research.

  15. Bioinformatics Training: A Review of Challenges, Actions and Support Requirements

    DEFF Research Database (Denmark)

    Schneider, M.V.; Watson, J.; Attwood, T.

    2010-01-01

    As bioinformatics becomes increasingly central to research in the molecular life sciences, the need to train non-bioinformaticians to make the most of bioinformatics resources is growing. Here, we review the key challenges and pitfalls to providing effective training for users of bioinformatics...... services, and discuss successful training strategies shared by a diverse set of bioinformatics trainers. We also identify steps that trainers in bioinformatics could take together to advance the state of the art in current training practices. The ideas presented in this article derive from the first...

  16. An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics

    International Nuclear Information System (INIS)

    Taylor, Ronald C.

    2010-01-01

    Bioinformatics researchers are increasingly confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employ Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date.

  17. 2nd Colombian Congress on Computational Biology and Bioinformatics

    CERN Document Server

    Cristancho, Marco; Isaza, Gustavo; Pinzón, Andrés; Rodríguez, Juan

    2014-01-01

    This volume compiles accepted contributions for the 2nd Edition of the Colombian Computational Biology and Bioinformatics Congress CCBCOL, after a rigorous review process in which 54 papers were accepted for publication from 119 submitted contributions. Bioinformatics and Computational Biology are areas of knowledge that have emerged due to advances that have taken place in the Biological Sciences and its integration with Information Sciences. The expansion of projects involving the study of genomes has led the way in the production of vast amounts of sequence data which needs to be organized, analyzed and stored to understand phenomena associated with living organisms related to their evolution, behavior in different ecosystems, and the development of applications that can be derived from this analysis.  .

  18. Bioinformatics Training Network (BTN): a community resource for bioinformatics trainers

    DEFF Research Database (Denmark)

    Schneider, Maria V.; Walter, Peter; Blatter, Marie-Claude

    2012-01-01

    and clearly tagged in relation to target audiences, learning objectives, etc. Ideally, they would also be peer reviewed, and easily and efficiently accessible for downloading. Here, we present the Bioinformatics Training Network (BTN), a new enterprise that has been initiated to address these needs and review...

  19. Genome-wide analysis of regulatory proteases sequences identified through bioinformatics data mining in Taenia solium.

    Science.gov (United States)

    Yan, Hong-Bin; Lou, Zhong-Zi; Li, Li; Brindley, Paul J; Zheng, Yadong; Luo, Xuenong; Hou, Junling; Guo, Aijiang; Jia, Wan-Zhong; Cai, Xuepeng

    2014-06-04

    Cysticercosis remains a major neglected tropical disease of humanity in many regions, especially in sub-Saharan Africa, Central America and elsewhere. Owing to the emerging drug resistance and the inability of current drugs to prevent re-infection, identification of novel vaccines and chemotherapeutic agents against Taenia solium and related helminth pathogens is a public health priority. The T. solium genome and the predicted proteome were reported recently, providing a wealth of information from which new interventional targets might be identified. In order to characterize and classify the entire repertoire of protease-encoding genes of T. solium, which act fundamental biological roles in all life processes, we analyzed the predicted proteins of this cestode through a combination of bioinformatics tools. Functional annotation was performed to yield insights into the signaling processes relevant to the complex developmental cycle of this tapeworm and to highlight a suite of the proteases as potential intervention targets. Within the genome of this helminth parasite, we identified 200 open reading frames encoding proteases from five clans, which correspond to 1.68% of the 11,902 protein-encoding genes predicted to be present in its genome. These proteases include calpains, cytosolic, mitochondrial signal peptidases, ubiquitylation related proteins, and others. Many not only show significant similarity to proteases in the Conserved Domain Database but have conserved active sites and catalytic domains. KEGG Automatic Annotation Server (KAAS) analysis indicated that ~60% of these proteases share strong sequence identities with proteins of the KEGG database, which are involved in human disease, metabolic pathways, genetic information processes, cellular processes, environmental information processes and organismal systems. Also, we identified signal peptides and transmembrane helices through comparative analysis with classes of important regulatory proteases

  20. Continuing Education Workshops in Bioinformatics Positively Impact Research and Careers.

    Science.gov (United States)

    Brazas, Michelle D; Ouellette, B F Francis

    2016-06-01

    Bioinformatics.ca has been hosting continuing education programs in introductory and advanced bioinformatics topics in Canada since 1999 and has trained more than 2,000 participants to date. These workshops have been adapted over the years to keep pace with advances in both science and technology as well as the changing landscape in available learning modalities and the bioinformatics training needs of our audience. Post-workshop surveys have been a mandatory component of each workshop and are used to ensure appropriate adjustments are made to workshops to maximize learning. However, neither bioinformatics.ca nor others offering similar training programs have explored the long-term impact of bioinformatics continuing education training. Bioinformatics.ca recently initiated a look back on the impact its workshops have had on the career trajectories, research outcomes, publications, and collaborations of its participants. Using an anonymous online survey, bioinformatics.ca analyzed responses from those surveyed and discovered its workshops have had a positive impact on collaborations, research, publications, and career progression.

  1. Bioinformatics education dissemination with an evolutionary problem solving perspective.

    Science.gov (United States)

    Jungck, John R; Donovan, Samuel S; Weisstein, Anton E; Khiripet, Noppadon; Everse, Stephen J

    2010-11-01

    Bioinformatics is central to biology education in the 21st century. With the generation of terabytes of data per day, the application of computer-based tools to stored and distributed data is fundamentally changing research and its application to problems in medicine, agriculture, conservation and forensics. In light of this 'information revolution,' undergraduate biology curricula must be redesigned to prepare the next generation of informed citizens as well as those who will pursue careers in the life sciences. The BEDROCK initiative (Bioinformatics Education Dissemination: Reaching Out, Connecting and Knitting together) has fostered an international community of bioinformatics educators. The initiative's goals are to: (i) Identify and support faculty who can take leadership roles in bioinformatics education; (ii) Highlight and distribute innovative approaches to incorporating evolutionary bioinformatics data and techniques throughout undergraduate education; (iii) Establish mechanisms for the broad dissemination of bioinformatics resource materials and teaching models; (iv) Emphasize phylogenetic thinking and problem solving; and (v) Develop and publish new software tools to help students develop and test evolutionary hypotheses. Since 2002, BEDROCK has offered more than 50 faculty workshops around the world, published many resources and supported an environment for developing and sharing bioinformatics education approaches. The BEDROCK initiative builds on the established pedagogical philosophy and academic community of the BioQUEST Curriculum Consortium to assemble the diverse intellectual and human resources required to sustain an international reform effort in undergraduate bioinformatics education.

  2. p3d – Python module for structural bioinformatics

    Directory of Open Access Journals (Sweden)

    Fufezan Christian

    2009-08-01

    Full Text Available Abstract Background High-throughput bioinformatic analysis tools are needed to mine the large amount of structural data via knowledge based approaches. The development of such tools requires a robust interface to access the structural data in an easy way. For this the Python scripting language is the optimal choice since its philosophy is to write an understandable source code. Results p3d is an object oriented Python module that adds a simple yet powerful interface to the Python interpreter to process and analyse three dimensional protein structure files (PDB files. p3d's strength arises from the combination of a very fast spatial access to the structural data due to the implementation of a binary space partitioning (BSP tree, b set theory and c functions that allow to combine a and b and that use human readable language in the search queries rather than complex computer language. All these factors combined facilitate the rapid development of bioinformatic tools that can perform quick and complex analyses of protein structures. Conclusion p3d is the perfect tool to quickly develop tools for structural bioinformatics using the Python scripting language.

  3. Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics

    Science.gov (United States)

    Zhang, Xiaorong

    2009-01-01

    This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR…

  4. Bioinformatic analysis of functional differences between the immunoproteasome and the constitutive proteasome

    DEFF Research Database (Denmark)

    Kesmir, Can; van Noort, V.; de Boer, R.J.

    2003-01-01

    not yet been quantified how different the specificity of two forms of the proteasome are. The main question, which still lacks direct evidence, is whether the immunoproteasome generates more MHC ligands. Here we use bioinformatics tools to quantify these differences and show that the immunoproteasome...

  5. Bioinformatics analysis of transcriptome dynamics during growth in angus cattle longissimus muscle.

    Science.gov (United States)

    Moisá, Sonia J; Shike, Daniel W; Graugnard, Daniel E; Rodriguez-Zas, Sandra L; Everts, Robin E; Lewin, Harris A; Faulkner, Dan B; Berger, Larry L; Loor, Juan J

    2013-01-01

    Transcriptome dynamics in the longissimus muscle (LM) of young Angus cattle were evaluated at 0, 60, 120, and 220 days from early-weaning. Bioinformatic analysis was performed using the dynamic impact approach (DIA) by means of Kyoto Encyclopedia of Genes and Genomes (KEGG) and Database for Annotation, Visualization and Integrated Discovery (DAVID) databases. Between 0 to 120 days (growing phase) most of the highly-impacted pathways (eg, ascorbate and aldarate metabolism, drug metabolism, cytochrome P450 and Retinol metabolism) were inhibited. The phase between 120 to 220 days (finishing phase) was characterized by the most striking differences with 3,784 differentially expressed genes (DEGs). Analysis of those DEGs revealed that the most impacted KEGG canonical pathway was glycosylphosphatidylinositol (GPI)-anchor biosynthesis, which was inhibited. Furthermore, inhibition of calpastatin and activation of tyrosine aminotransferase ubiquitination at 220 days promotes proteasomal degradation, while the concurrent activation of ribosomal proteins promotes protein synthesis. Therefore, the balance of these processes likely results in a steady-state of protein turnover during the finishing phase. Results underscore the importance of transcriptome dynamics in LM during growth.

  6. Bioinformatics Identification of Modules of Transcription Factor Binding Sites in Alzheimer's Disease-Related Genes by In Silico Promoter Analysis and Microarrays

    Directory of Open Access Journals (Sweden)

    Regina Augustin

    2011-01-01

    Full Text Available The molecular mechanisms and genetic risk factors underlying Alzheimer's disease (AD pathogenesis are only partly understood. To identify new factors, which may contribute to AD, different approaches are taken including proteomics, genetics, and functional genomics. Here, we used a bioinformatics approach and found that distinct AD-related genes share modules of transcription factor binding sites, suggesting a transcriptional coregulation. To detect additional coregulated genes, which may potentially contribute to AD, we established a new bioinformatics workflow with known multivariate methods like support vector machines, biclustering, and predicted transcription factor binding site modules by using in silico analysis and over 400 expression arrays from human and mouse. Two significant modules are composed of three transcription factor families: CTCF, SP1F, and EGRF/ZBPF, which are conserved between human and mouse APP promoter sequences. The specific combination of in silico promoter and multivariate analysis can identify regulation mechanisms of genes involved in multifactorial diseases.

  7. Identification of the intestinal type gastric adenocarcinoma transcriptomic markers using bioinformatic and gene expression analysis

    Directory of Open Access Journals (Sweden)

    V. V. Volkomorov

    2017-01-01

    Full Text Available Introduction. Searching for specific and sensitive molecular tumor markers is one of the important tasks of modern oncology. These markers can be used for early tumor diagnosis and prognosis as well as for prediction of therapeutic response, estimation of tumor volume or to assess disease recurrence through monitoring. Gene expression data base mining followed by experimental validation of results obtained is one of the promising approaches for searching of that kind.Objective: to identify several membrane proteins which can be used for serum diagnosis of intestinal type of gastric adenocarcinoma.Materials and methods. We used bioinformatic-driven search using Gene Ontology and The Cancer Genome Atlas (TCGA data to identify mRNA up-regulated in gastric cancer (GC. Then, the expression levels of the mRNAs in 55 pare clinical specimens were investigated using reverse transcription polymerase chain reaction.Results. Comparative analysis of the mRNA levels in normal and tumor tissues using a new bioinformatics algorithm allowed to identify 3 high-copy transcripts (SULF1, PMEPA1 and SPARC, intracellular content of which markedly increased in GC. Expression analysis of these genes in clinical specimens showed significantly higher mRNA levels of PMEPA1 and SPARC in tumor as compared to normal gastric tissue. Interestingly more than twofold increase in expression level of these genes was observed in 75 % of intestinal-type GC. The same results were found only in 25 and 38 % of diffuse-type GC respectively.Conclusions. As a result of original bioinforamtic analysis using TCGA data base two genes (PMEPA1 and SPARC were shown to be significantly upregulated in intestinal-type gastric adenocarcinoma. The findings show the importance of further investigation to clarify the clinical value of their expression level in stomach tumors as well as their role in carcinogenesis.

  8. Fuzzy Logic in Medicine and Bioinformatics

    Directory of Open Access Journals (Sweden)

    Angela Torres

    2006-01-01

    Full Text Available The purpose of this paper is to present a general view of the current applications of fuzzy logic in medicine and bioinformatics. We particularly review the medical literature using fuzzy logic. We then recall the geometrical interpretation of fuzzy sets as points in a fuzzy hypercube and present two concrete illustrations in medicine (drug addictions and in bioinformatics (comparison of genomes.

  9. Online Bioinformatics Tutorials | Office of Cancer Genomics

    Science.gov (United States)

    Bioinformatics is a scientific discipline that applies computer science and information technology to help understand biological processes. The NIH provides a list of free online bioinformatics tutorials, either generated by the NIH Library or other institutes, which includes introductory lectures and "how to" videos on using various tools.

  10. Bioinformatics analysis identify novel OB fold protein coding genes in C. elegans.

    Directory of Open Access Journals (Sweden)

    Daryanaz Dargahi

    Full Text Available BACKGROUND: The C. elegans genome has been extensively annotated by the WormBase consortium that uses state of the art bioinformatics pipelines, functional genomics and manual curation approaches. As a result, the identification of novel genes in silico in this model organism is becoming more challenging requiring new approaches. The Oligonucleotide-oligosaccharide binding (OB fold is a highly divergent protein family, in which protein sequences, in spite of having the same fold, share very little sequence identity (5-25%. Therefore, evidence from sequence-based annotation may not be sufficient to identify all the members of this family. In C. elegans, the number of OB-fold proteins reported is remarkably low (n=46 compared to other evolutionary-related eukaryotes, such as yeast S. cerevisiae (n=344 or fruit fly D. melanogaster (n=84. Gene loss during evolution or differences in the level of annotation for this protein family, may explain these discrepancies. METHODOLOGY/PRINCIPAL FINDINGS: This study examines the possibility that novel OB-fold coding genes exist in the worm. We developed a bioinformatics approach that uses the most sensitive sequence-sequence, sequence-profile and profile-profile similarity search methods followed by 3D-structure prediction as a filtering step to eliminate false positive candidate sequences. We have predicted 18 coding genes containing the OB-fold that have remarkably partially been characterized in C. elegans. CONCLUSIONS/SIGNIFICANCE: This study raises the possibility that the annotation of highly divergent protein fold families can be improved in C. elegans. Similar strategies could be implemented for large scale analysis by the WormBase consortium when novel versions of the genome sequence of C. elegans, or other evolutionary related species are being released. This approach is of general interest to the scientific community since it can be used to annotate any genome.

  11. BioQueue: a novel pipeline framework to accelerate bioinformatics analysis.

    Science.gov (United States)

    Yao, Li; Wang, Heming; Song, Yuanyuan; Sui, Guangchao

    2017-10-15

    With the rapid development of Next-Generation Sequencing, a large amount of data is now available for bioinformatics research. Meanwhile, the presence of many pipeline frameworks makes it possible to analyse these data. However, these tools concentrate mainly on their syntax and design paradigms, and dispatch jobs based on users' experience about the resources needed by the execution of a certain step in a protocol. As a result, it is difficult for these tools to maximize the potential of computing resources, and avoid errors caused by overload, such as memory overflow. Here, we have developed BioQueue, a web-based framework that contains a checkpoint before each step to automatically estimate the system resources (CPU, memory and disk) needed by the step and then dispatch jobs accordingly. BioQueue possesses a shell command-like syntax instead of implementing a new script language, which means most biologists without computer programming background can access the efficient queue system with ease. BioQueue is freely available at https://github.com/liyao001/BioQueue. The extensive documentation can be found at http://bioqueue.readthedocs.io. li_yao@outlook.com or gcsui@nefu.edu.cn. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  12. Comprehensive analysis of transport aircraft flight performance

    Science.gov (United States)

    Filippone, Antonio

    2008-04-01

    This paper reviews the state-of-the art in comprehensive performance codes for fixed-wing aircraft. The importance of system analysis in flight performance is discussed. The paper highlights the role of aerodynamics, propulsion, flight mechanics, aeroacoustics, flight operation, numerical optimisation, stochastic methods and numerical analysis. The latter discipline is used to investigate the sensitivities of the sub-systems to uncertainties in critical state parameters or functional parameters. The paper discusses critically the data used for performance analysis, and the areas where progress is required. Comprehensive analysis codes can be used for mission fuel planning, envelope exploration, competition analysis, a wide variety of environmental studies, marketing analysis, aircraft certification and conceptual aircraft design. A comprehensive program that uses the multi-disciplinary approach for transport aircraft is presented. The model includes a geometry deck, a separate engine input deck with the main parameters, a database of engine performance from an independent simulation, and an operational deck. The comprehensive code has modules for deriving the geometry from bitmap files, an aerodynamics model for all flight conditions, a flight mechanics model for flight envelopes and mission analysis, an aircraft noise model and engine emissions. The model is validated at different levels. Validation of the aerodynamic model is done against the scale models DLR-F4 and F6. A general model analysis and flight envelope exploration are shown for the Boeing B-777-300 with GE-90 turbofan engines with intermediate passenger capacity (394 passengers in 2 classes). Validation of the flight model is done by sensitivity analysis on the wetted area (or profile drag), on the specific air range, the brake-release gross weight and the aircraft noise. A variety of results is shown, including specific air range charts, take-off weight-altitude charts, payload-range performance

  13. Bioinformatics for whole-genome shotgun sequencing of microbial communities.

    Directory of Open Access Journals (Sweden)

    Kevin Chen

    2005-07-01

    Full Text Available The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of data poses a wide variety of exciting and difficult bioinformatics problems. The aim of this review is to introduce the bioinformatics community to this emerging field by surveying existing techniques and promising new approaches for several of the most interesting of these computational problems.

  14. A Critical Analysis of Assessment Quality in Genomics and Bioinformatics Education Research

    Science.gov (United States)

    Campbell, Chad E.; Nehm, Ross H.

    2013-01-01

    The growing importance of genomics and bioinformatics methods and paradigms in biology has been accompanied by an explosion of new curricula and pedagogies. An important question to ask about these educational innovations is whether they are having a meaningful impact on students' knowledge, attitudes, or skills. Although assessments are…

  15. The development and application of bioinformatics core competencies to improve bioinformatics training and education.

    Science.gov (United States)

    Mulder, Nicola; Schwartz, Russell; Brazas, Michelle D; Brooksbank, Cath; Gaeta, Bruno; Morgan, Sarah L; Pauley, Mark A; Rosenwald, Anne; Rustici, Gabriella; Sierk, Michael; Warnow, Tandy; Welch, Lonnie

    2018-02-01

    Bioinformatics is recognized as part of the essential knowledge base of numerous career paths in biomedical research and healthcare. However, there is little agreement in the field over what that knowledge entails or how best to provide it. These disagreements are compounded by the wide range of populations in need of bioinformatics training, with divergent prior backgrounds and intended application areas. The Curriculum Task Force of the International Society of Computational Biology (ISCB) Education Committee has sought to provide a framework for training needs and curricula in terms of a set of bioinformatics core competencies that cut across many user personas and training programs. The initial competencies developed based on surveys of employers and training programs have since been refined through a multiyear process of community engagement. This report describes the current status of the competencies and presents a series of use cases illustrating how they are being applied in diverse training contexts. These use cases are intended to demonstrate how others can make use of the competencies and engage in the process of their continuing refinement and application. The report concludes with a consideration of remaining challenges and future plans.

  16. The development and application of bioinformatics core competencies to improve bioinformatics training and education

    Science.gov (United States)

    Brooksbank, Cath; Morgan, Sarah L.; Rosenwald, Anne; Warnow, Tandy; Welch, Lonnie

    2018-01-01

    Bioinformatics is recognized as part of the essential knowledge base of numerous career paths in biomedical research and healthcare. However, there is little agreement in the field over what that knowledge entails or how best to provide it. These disagreements are compounded by the wide range of populations in need of bioinformatics training, with divergent prior backgrounds and intended application areas. The Curriculum Task Force of the International Society of Computational Biology (ISCB) Education Committee has sought to provide a framework for training needs and curricula in terms of a set of bioinformatics core competencies that cut across many user personas and training programs. The initial competencies developed based on surveys of employers and training programs have since been refined through a multiyear process of community engagement. This report describes the current status of the competencies and presents a series of use cases illustrating how they are being applied in diverse training contexts. These use cases are intended to demonstrate how others can make use of the competencies and engage in the process of their continuing refinement and application. The report concludes with a consideration of remaining challenges and future plans. PMID:29390004

  17. The structural bioinformatics library: modeling in biomolecular science and beyond.

    Science.gov (United States)

    Cazals, Frédéric; Dreyfus, Tom

    2017-04-01

    Software in structural bioinformatics has mainly been application driven. To favor practitioners seeking off-the-shelf applications, but also developers seeking advanced building blocks to develop novel applications, we undertook the design of the Structural Bioinformatics Library ( SBL , http://sbl.inria.fr ), a generic C ++/python cross-platform software library targeting complex problems in structural bioinformatics. Its tenet is based on a modular design offering a rich and versatile framework allowing the development of novel applications requiring well specified complex operations, without compromising robustness and performances. The SBL involves four software components (1-4 thereafter). For end-users, the SBL provides ready to use, state-of-the-art (1) applications to handle molecular models defined by unions of balls, to deal with molecular flexibility, to model macro-molecular assemblies. These applications can also be combined to tackle integrated analysis problems. For developers, the SBL provides a broad C ++ toolbox with modular design, involving core (2) algorithms , (3) biophysical models and (4) modules , the latter being especially suited to develop novel applications. The SBL comes with a thorough documentation consisting of user and reference manuals, and a bugzilla platform to handle community feedback. The SBL is available from http://sbl.inria.fr. Frederic.Cazals@inria.fr. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  18. 4273π: bioinformatics education on low cost ARM hardware.

    Science.gov (United States)

    Barker, Daniel; Ferrier, David Ek; Holland, Peter Wh; Mitchell, John Bo; Plaisier, Heleen; Ritchie, Michael G; Smart, Steven D

    2013-08-12

    Teaching bioinformatics at universities is complicated by typical computer classroom settings. As well as running software locally and online, students should gain experience of systems administration. For a future career in biology or bioinformatics, the installation of software is a useful skill. We propose that this may be taught by running the course on GNU/Linux running on inexpensive Raspberry Pi computer hardware, for which students may be granted full administrator access. We release 4273π, an operating system image for Raspberry Pi based on Raspbian Linux. This includes minor customisations for classroom use and includes our Open Access bioinformatics course, 4273π Bioinformatics for Biologists. This is based on the final-year undergraduate module BL4273, run on Raspberry Pi computers at the University of St Andrews, Semester 1, academic year 2012-2013. 4273π is a means to teach bioinformatics, including systems administration tasks, to undergraduates at low cost.

  19. OpenHelix: bioinformatics education outside of a different box.

    Science.gov (United States)

    Williams, Jennifer M; Mangan, Mary E; Perreault-Micale, Cynthia; Lathe, Scott; Sirohi, Neeraj; Lathe, Warren C

    2010-11-01

    The amount of biological data is increasing rapidly, and will continue to increase as new rapid technologies are developed. Professionals in every area of bioscience will have data management needs that require publicly available bioinformatics resources. Not all scientists desire a formal bioinformatics education but would benefit from more informal educational sources of learning. Effective bioinformatics education formats will address a broad range of scientific needs, will be aimed at a variety of user skill levels, and will be delivered in a number of different formats to address different learning styles. Informal sources of bioinformatics education that are effective are available, and will be explored in this review.

  20. In Silico Identification, Phylogenetic and Bioinformatic Analysis of Argonaute Genes in Plants

    Directory of Open Access Journals (Sweden)

    Khaled Mirzaei

    2014-01-01

    Full Text Available Argonaute protein family is the key players in pathways of gene silencing and small regulatory RNAs in different organisms. Argonaute proteins can bind small noncoding RNAs and control protein synthesis, affect messenger RNA stability, and even participate in the production of new forms of small RNAs. The aim of this study was to characterize and perform bioinformatic analysis of Argonaute proteins in 32 plant species that their genome was sequenced. A total of 437 Argonaute genes were identified and were analyzed based on lengths, gene structure, and protein structure. Results showed that Argonaute proteins were highly conserved across plant kingdom. Phylogenic analysis divided plant Argonautes into three classes. Argonaute proteins have three conserved domains PAZ, MID and PIWI. In addition to three conserved domains namely, PAZ, MID, and PIWI, we identified few more domains in AGO of some plant species. Expression profile analysis of Argonaute proteins showed that expression of these genes varies in most of tissues, which means that these proteins are involved in regulation of most pathways of the plant system. Numbers of alternative transcripts of Argonaute genes were highly variable among the plants. A thorough analysis of large number of putative Argonaute genes revealed several interesting aspects associated with this protein and brought novel information with promising usefulness for both basic and biotechnological applications.

  1. Keemei: cloud-based validation of tabular bioinformatics file formats in Google Sheets.

    Science.gov (United States)

    Rideout, Jai Ram; Chase, John H; Bolyen, Evan; Ackermann, Gail; González, Antonio; Knight, Rob; Caporaso, J Gregory

    2016-06-13

    Bioinformatics software often requires human-generated tabular text files as input and has specific requirements for how those data are formatted. Users frequently manage these data in spreadsheet programs, which is convenient for researchers who are compiling the requisite information because the spreadsheet programs can easily be used on different platforms including laptops and tablets, and because they provide a familiar interface. It is increasingly common for many different researchers to be involved in compiling these data, including study coordinators, clinicians, lab technicians and bioinformaticians. As a result, many research groups are shifting toward using cloud-based spreadsheet programs, such as Google Sheets, which support the concurrent editing of a single spreadsheet by different users working on different platforms. Most of the researchers who enter data are not familiar with the formatting requirements of the bioinformatics programs that will be used, so validating and correcting file formats is often a bottleneck prior to beginning bioinformatics analysis. We present Keemei, a Google Sheets Add-on, for validating tabular files used in bioinformatics analyses. Keemei is available free of charge from Google's Chrome Web Store. Keemei can be installed and run on any web browser supported by Google Sheets. Keemei currently supports the validation of two widely used tabular bioinformatics formats, the Quantitative Insights into Microbial Ecology (QIIME) sample metadata mapping file format and the Spatially Referenced Genetic Data (SRGD) format, but is designed to easily support the addition of others. Keemei will save researchers time and frustration by providing a convenient interface for tabular bioinformatics file format validation. By allowing everyone involved with data entry for a project to easily validate their data, it will reduce the validation and formatting bottlenecks that are commonly encountered when human-generated data files are

  2. GENEASE: Real time bioinformatics tool for multi-omics and disease ontology exploration, analysis and visualization.

    Science.gov (United States)

    Ghandikota, Sudhir; Hershey, Gurjit K Khurana; Mersha, Tesfaye B

    2018-03-24

    Advances in high-throughput sequencing technologies have made it possible to generate multiple omics data at an unprecedented rate and scale. The accumulation of these omics data far outpaces the rate at which biologists can mine and generate new hypothesis to test experimentally. There is an urgent need to develop a myriad of powerful tools to efficiently and effectively search and filter these resources to address specific post-GWAS functional genomics questions. However, to date, these resources are scattered across several databases and often lack a unified portal for data annotation and analytics. In addition, existing tools to analyze and visualize these databases are highly fragmented, resulting researchers to access multiple applications and manual interventions for each gene or variant in an ad hoc fashion until all the questions are answered. In this study, we present GENEASE, a web-based one-stop bioinformatics tool designed to not only query and explore multi-omics and phenotype databases (e.g., GTEx, ClinVar, dbGaP, GWAS Catalog, ENCODE, Roadmap Epigenomics, KEGG, Reactome, Gene and Phenotype Ontology) in a single web interface but also to perform seamless post genome-wide association downstream functional and overlap analysis for non-coding regulatory variants. GENEASE accesses over 50 different databases in public domain including model organism-specific databases to facilitate gene/variant and disease exploration, enrichment and overlap analysis in real time. It is a user-friendly tool with point-and-click interface containing links for support information including user manual and examples. GENEASE can be accessed freely at http://research.cchmc.org/mershalab/genease_new/login.html. Tesfaye.Mersha@cchmc.org, Sudhir.Ghandikota@cchmc.org. Supplementary data are available at Bioinformatics online.

  3. Intrageneric Primer Design: Bringing Bioinformatics Tools to the Class

    Science.gov (United States)

    Lima, Andre O. S.; Garces, Sergio P. S.

    2006-01-01

    Bioinformatics is one of the fastest growing scientific areas over the last decade. It focuses on the use of informatics tools for the organization and analysis of biological data. An example of their importance is the availability nowadays of dozens of software programs for genomic and proteomic studies. Thus, there is a growing field (private…

  4. Translational Bioinformatics and Clinical Research (Biomedical) Informatics.

    Science.gov (United States)

    Sirintrapun, S Joseph; Zehir, Ahmet; Syed, Aijazuddin; Gao, JianJiong; Schultz, Nikolaus; Cheng, Donavan T

    2015-06-01

    Translational bioinformatics and clinical research (biomedical) informatics are the primary domains related to informatics activities that support translational research. Translational bioinformatics focuses on computational techniques in genetics, molecular biology, and systems biology. Clinical research (biomedical) informatics involves the use of informatics in discovery and management of new knowledge relating to health and disease. This article details 3 projects that are hybrid applications of translational bioinformatics and clinical research (biomedical) informatics: The Cancer Genome Atlas, the cBioPortal for Cancer Genomics, and the Memorial Sloan Kettering Cancer Center clinical variants and results database, all designed to facilitate insights into cancer biology and clinical/therapeutic correlations. Copyright © 2015 Elsevier Inc. All rights reserved.

  5. Development of a cloud-based Bioinformatics Training Platform.

    Science.gov (United States)

    Revote, Jerico; Watson-Haigh, Nathan S; Quenette, Steve; Bethwaite, Blair; McGrath, Annette; Shang, Catherine A

    2017-05-01

    The Bioinformatics Training Platform (BTP) has been developed to provide access to the computational infrastructure required to deliver sophisticated hands-on bioinformatics training courses. The BTP is a cloud-based solution that is in active use for delivering next-generation sequencing training to Australian researchers at geographically dispersed locations. The BTP was built to provide an easy, accessible, consistent and cost-effective approach to delivering workshops at host universities and organizations with a high demand for bioinformatics training but lacking the dedicated bioinformatics training suites required. To support broad uptake of the BTP, the platform has been made compatible with multiple cloud infrastructures. The BTP is an open-source and open-access resource. To date, 20 training workshops have been delivered to over 700 trainees at over 10 venues across Australia using the BTP. © The Author 2016. Published by Oxford University Press.

  6. Advanced complex analysis a comprehensive course in analysis, part 2b

    CERN Document Server

    Simon, Barry

    2015-01-01

    A Comprehensive Course in Analysis by Poincaré Prize winner Barry Simon is a five-volume set that can serve as a graduate-level analysis textbook with a lot of additional bonus information, including hundreds of problems and numerous notes that extend the text and provide important historical background. Depth and breadth of exposition make this set a valuable reference source for almost all areas of classical analysis. Part 2B provides a comprehensive look at a number of subjects of complex analysis not included in Part 2A. Presented in this volume are the theory of conformal metrics (includ

  7. H3ABioNet, a sustainable pan-African bioinformatics network for human heredity and health in Africa

    Science.gov (United States)

    Mulder, Nicola J.; Adebiyi, Ezekiel; Alami, Raouf; Benkahla, Alia; Brandful, James; Doumbia, Seydou; Everett, Dean; Fadlelmola, Faisal M.; Gaboun, Fatima; Gaseitsiwe, Simani; Ghazal, Hassan; Hazelhurst, Scott; Hide, Winston; Ibrahimi, Azeddine; Jaufeerally Fakim, Yasmina; Jongeneel, C. Victor; Joubert, Fourie; Kassim, Samar; Kayondo, Jonathan; Kumuthini, Judit; Lyantagaye, Sylvester; Makani, Julie; Mansour Alzohairy, Ahmed; Masiga, Daniel; Moussa, Ahmed; Nash, Oyekanmi; Ouwe Missi Oukem-Boyer, Odile; Owusu-Dabo, Ellis; Panji, Sumir; Patterton, Hugh; Radouani, Fouzia; Sadki, Khalid; Seghrouchni, Fouad; Tastan Bishop, Özlem; Tiffin, Nicki; Ulenga, Nzovu

    2016-01-01

    The application of genomics technologies to medicine and biomedical research is increasing in popularity, made possible by new high-throughput genotyping and sequencing technologies and improved data analysis capabilities. Some of the greatest genetic diversity among humans, animals, plants, and microbiota occurs in Africa, yet genomic research outputs from the continent are limited. The Human Heredity and Health in Africa (H3Africa) initiative was established to drive the development of genomic research for human health in Africa, and through recognition of the critical role of bioinformatics in this process, spurred the establishment of H3ABioNet, a pan-African bioinformatics network for H3Africa. The limitations in bioinformatics capacity on the continent have been a major contributory factor to the lack of notable outputs in high-throughput biology research. Although pockets of high-quality bioinformatics teams have existed previously, the majority of research institutions lack experienced faculty who can train and supervise bioinformatics students. H3ABioNet aims to address this dire need, specifically in the area of human genetics and genomics, but knock-on effects are ensuring this extends to other areas of bioinformatics. Here, we describe the emergence of genomics research and the development of bioinformatics in Africa through H3ABioNet. PMID:26627985

  8. Structural and Phylogenetic Analysis of Laccases from Trichoderma: A Bioinformatic Approach

    Science.gov (United States)

    Cázares-García, Saila Viridiana; Vázquez-Garcidueñas, Ma. Soledad; Vázquez-Marrufo, Gerardo

    2013-01-01

    The genus Trichoderma includes species of great biotechnological value, both for their mycoparasitic activities and for their ability to produce extracellular hydrolytic enzymes. Although activity of extracellular laccase has previously been reported in Trichoderma spp., the possible number of isoenzymes is still unknown, as are the structural and functional characteristics of both the genes and the putative proteins. In this study, the system of laccases sensu stricto in the Trichoderma species, the genomes of which are publicly available, were analyzed using bioinformatic tools. The intron/exon structure of the genes and the identification of specific motifs in the sequence of amino acids of the proteins generated in silico allow for clear differentiation between extracellular and intracellular enzymes. Phylogenetic analysis suggests that the common ancestor of the genus possessed a functional gene for each one of these enzymes, which is a characteristic preserved in T. atroviride and T. virens. This analysis also reveals that T. harzianum and T. reesei only retained the intracellular activity, whereas T. asperellum added an extracellular isoenzyme acquired through horizontal gene transfer during the mycoparasitic process. The evolutionary analysis shows that in general, extracellular laccases are subjected to purifying selection, and intracellular laccases show neutral evolution. The data provided by the present study will enable the generation of experimental approximations to better understand the physiological role of laccases in the genus Trichoderma and to increase their biotechnological potential. PMID:23383142

  9. Extending Asia Pacific bioinformatics into new realms in the "-omics" era.

    Science.gov (United States)

    Ranganathan, Shoba; Eisenhaber, Frank; Tong, Joo Chuan; Tan, Tin Wee

    2009-12-03

    The 2009 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation dating back to 1998, was organized as the 8th International Conference on Bioinformatics (InCoB), Sept. 7-11, 2009 at Biopolis, Singapore. Besides bringing together scientists from the field of bioinformatics in this region, InCoB has actively engaged clinicians and researchers from the area of systems biology, to facilitate greater synergy between these two groups. InCoB2009 followed on from a series of successful annual events in Bangkok (Thailand), Penang (Malaysia), Auckland (New Zealand), Busan (South Korea), New Delhi (India), Hong Kong and Taipei (Taiwan), with InCoB2010 scheduled to be held in Tokyo, Japan, Sept. 26-28, 2010. The Workshop on Education in Bioinformatics and Computational Biology (WEBCB) and symposia on Clinical Bioinformatics (CBAS), the Singapore Symposium on Computational Biology (SYMBIO) and training tutorials were scheduled prior to the scientific meeting, and provided ample opportunity for in-depth learning and special interest meetings for educators, clinicians and students. We provide a brief overview of the peer-reviewed bioinformatics manuscripts accepted for publication in this supplement, grouped into thematic areas. In order to facilitate scientific reproducibility and accountability, we have, for the first time, introduced minimum information criteria for our pubilcations, including compliance to a Minimum Information about a Bioinformatics Investigation (MIABi). As the regional research expertise in bioinformatics matures, we have delineated a minimum set of bioinformatics skills required for addressing the computational challenges of the "-omics" era.

  10. Microsoft Biology Initiative: .NET Bioinformatics Platform and Tools

    Science.gov (United States)

    Diaz Acosta, B.

    2011-01-01

    The Microsoft Biology Initiative (MBI) is an effort in Microsoft Research to bring new technology and tools to the area of bioinformatics and biology. This initiative is comprised of two primary components, the Microsoft Biology Foundation (MBF) and the Microsoft Biology Tools (MBT). MBF is a language-neutral bioinformatics toolkit built as an extension to the Microsoft .NET Framework—initially aimed at the area of Genomics research. Currently, it implements a range of parsers for common bioinformatics file formats; a range of algorithms for manipulating DNA, RNA, and protein sequences; and a set of connectors to biological web services such as NCBI BLAST. MBF is available under an open source license, and executables, source code, demo applications, documentation and training materials are freely downloadable from http://research.microsoft.com/bio. MBT is a collection of tools that enable biology and bioinformatics researchers to be more productive in making scientific discoveries.

  11. Concepts and introduction to RNA bioinformatics

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Hofacker, Ivo L.; Ruzzo, Walter L.

    2014-01-01

    RNA bioinformatics and computational RNA biology have emerged from implementing methods for predicting the secondary structure of single sequences. The field has evolved to exploit multiple sequences to take evolutionary information into account, such as compensating (and structure preserving) base...... for interactions between RNA and proteins.Here, we introduce the basic concepts of predicting RNA secondary structure relevant to the further analyses of RNA sequences. We also provide pointers to methods addressing various aspects of RNA bioinformatics and computational RNA biology....

  12. Bioinformatics Analysis of NBS-LRR Encoding Resistance Genes in Setaria italica.

    Science.gov (United States)

    Zhao, Yan; Weng, Qiaoyun; Song, Jinhui; Ma, Hailian; Yuan, Jincheng; Dong, Zhiping; Liu, Yinghui

    2016-06-01

    In plants, resistance (R) genes are involved in pathogen recognition and subsequent activation of innate immune responses. The nucleotide-binding site-leucine-rich repeat (NBS-LRR) genes family forms the largest R-gene family among plant genomes and play an important role in plant disease resistance. In this paper, comprehensive analysis of NBS-encoding genes is performed in the whole Setaria italica genome. A total of 96 NBS-LRR genes are identified, and comprehensive overview of the NBS-LRR genes is undertaken, including phylogenetic analysis, chromosome locations, conserved motifs of proteins, and gene expression. Based on the domain, these genes are divided into two groups and distributed in all Setaria italica chromosomes. Most NBS-LRR genes are located at the distal tip of the long arms of the chromosomes. Setaria italica NBS-LRR proteins share at least one nucleotide-biding domain and one leucine-rich repeat domain. Our results also show the duplication of NBS-LRR genes in Setaria italica is related to their gene structure.

  13. Adapting bioinformatics curricula for big data

    Science.gov (United States)

    Greene, Anna C.; Giffin, Kristine A.; Greene, Casey S.

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  14. Bioinformatics analysis of the ς-carotene desaturase gene in cabbage (Brassica oleracea var. capitata)

    Science.gov (United States)

    Sun, Bo; Zheng, Aihong; Jiang, Min; Xue, Shengling; Zhang, Fen; Tang, Haoru

    2018-04-01

    ς-carotene desaturase (ZDS) is an important enzyme in carotenoid biosynthesis. Here, the Brassica oleracea var. capitata ZDS (BocZDS) gene sequences were obtained from Brassica database (BRAD), and preformed for bioinformatics analysis. The BocZDS gene mapped to Scaffold000363, and contains an open reading frame of 1,686 bp that encodes a 561-amino acid protein with a calculated molecular mass of 62.00 kD and an isoelectric point (pI) of 8.2. Subcellular localization predicted the BocZDS gene was in the chloroplast. The conserved domain of the BocZDS protein is PLN02487, indicating that it belongs the member of zeta-carotene desaturase. Homology analysis indicates that the ZDS protein is apparently conserved during plant evolution and is most closely related to B. oleracea var. oleracea, B. napus, and B. rapa. The findings of the present study provide a molecular basis for the elucidation of ZDS gene function in cabbage.

  15. 6th International Conference on Practical Applications of Computational Biology & Bioinformatics

    CERN Document Server

    Luscombe, Nicholas; Fdez-Riverola, Florentino; Rodríguez, Juan; Practical Applications of Computational Biology & Bioinformatics

    2012-01-01

    The growth in the Bioinformatics and Computational Biology fields over the last few years has been remarkable.. The analysis of the datasets of Next Generation Sequencing needs new algorithms and approaches from fields such as Databases, Statistics, Data Mining, Machine Learning, Optimization, Computer Science and Artificial Intelligence. Also Systems Biology has also been emerging as an alternative to the reductionist view that dominated biological research in the last decades. This book presents the results of the  6th International Conference on Practical Applications of Computational Biology & Bioinformatics held at University of Salamanca, Spain, 28-30th March, 2012 which brought together interdisciplinary scientists that have a strong background in the biological and computational sciences.

  16. Developing library bioinformatics services in context: the Purdue University Libraries bioinformationist program.

    Science.gov (United States)

    Rein, Diane C

    2006-07-01

    Purdue University is a major agricultural, engineering, biomedical, and applied life science research institution with an increasing focus on bioinformatics research that spans multiple disciplines and campus academic units. The Purdue University Libraries (PUL) hired a molecular biosciences specialist to discover, engage, and support bioinformatics needs across the campus. After an extended period of information needs assessment and environmental scanning, the specialist developed a week of focused bioinformatics instruction (Bioinformatics Week) to launch system-wide, library-based bioinformatics services. The specialist employed a two-tiered approach to assess user information requirements and expectations. The first phase involved careful observation and collection of information needs in-context throughout the campus, attending laboratory meetings, interviewing department chairs and individual researchers, and engaging in strategic planning efforts. Based on the information gathered during the integration phase, several survey instruments were developed to facilitate more critical user assessment and the recovery of quantifiable data prior to planning. Given information gathered while working with clients and through formal needs assessments, as well as the success of instructional approaches used in Bioinformatics Week, the specialist is developing bioinformatics support services for the Purdue community. The specialist is also engaged in training PUL faculty librarians in bioinformatics to provide a sustaining culture of library-based bioinformatics support and understanding of Purdue's bioinformatics-related decision and policy making.

  17. miRToolsGallery: a tag-based and rankable microRNA bioinformatics resources database portal

    Science.gov (United States)

    Chen, Liang; Heikkinen, Liisa; Wang, ChangLiang; Yang, Yang; Knott, K Emily

    2018-01-01

    Abstract Hundreds of bioinformatics tools have been developed for MicroRNA (miRNA) investigations including those used for identification, target prediction, structure and expression profile analysis. However, finding the correct tool for a specific application requires the tedious and laborious process of locating, downloading, testing and validating the appropriate tool from a group of nearly a thousand. In order to facilitate this process, we developed a novel database portal named miRToolsGallery. We constructed the portal by manually curating > 950 miRNA analysis tools and resources. In the portal, a query to locate the appropriate tool is expedited by being searchable, filterable and rankable. The ranking feature is vital to quickly identify and prioritize the more useful from the obscure tools. Tools are ranked via different criteria including the PageRank algorithm, date of publication, number of citations, average of votes and number of publications. miRToolsGallery provides links and data for the comprehensive collection of currently available miRNA tools with a ranking function which can be adjusted using different criteria according to specific requirements. Database URL: http://www.mirtoolsgallery.org PMID:29688355

  18. Robust Bioinformatics Recognition with VLSI Biochip Microsystem

    Science.gov (United States)

    Lue, Jaw-Chyng L.; Fang, Wai-Chi

    2006-01-01

    A microsystem architecture for real-time, on-site, robust bioinformatic patterns recognition and analysis has been proposed. This system is compatible with on-chip DNA analysis means such as polymerase chain reaction (PCR)amplification. A corresponding novel artificial neural network (ANN) learning algorithm using new sigmoid-logarithmic transfer function based on error backpropagation (EBP) algorithm is invented. Our results show the trained new ANN can recognize low fluorescence patterns better than the conventional sigmoidal ANN does. A differential logarithmic imaging chip is designed for calculating logarithm of relative intensities of fluorescence signals. The single-rail logarithmic circuit and a prototype ANN chip are designed, fabricated and characterized.

  19. Concepts Of Bioinformatics And Its Application In Veterinary ...

    African Journals Online (AJOL)

    Bioinformatics has advanced the course of research and future veterinary vaccines development because it has provided new tools for identification of vaccine targets from sequenced biological data of organisms. In Nigeria, there is lack of bioinformatics training in the universities, expect for short training courses in which ...

  20. Bioinformatics for Next Generation Sequencing Data

    Directory of Open Access Journals (Sweden)

    Alberto Magi

    2010-09-01

    Full Text Available The emergence of next-generation sequencing (NGS platforms imposes increasing demands on statistical methods and bioinformatic tools for the analysis and the management of the huge amounts of data generated by these technologies. Even at the early stages of their commercial availability, a large number of softwares already exist for analyzing NGS data. These tools can be fit into many general categories including alignment of sequence reads to a reference, base-calling and/or polymorphism detection, de novo assembly from paired or unpaired reads, structural variant detection and genome browsing. This manuscript aims to guide readers in the choice of the available computational tools that can be used to face the several steps of the data analysis workflow.

  1. Tools and data services registry: a community effort to document bioinformatics resources

    Science.gov (United States)

    Ison, Jon; Rapacki, Kristoffer; Ménager, Hervé; Kalaš, Matúš; Rydza, Emil; Chmura, Piotr; Anthon, Christian; Beard, Niall; Berka, Karel; Bolser, Dan; Booth, Tim; Bretaudeau, Anthony; Brezovsky, Jan; Casadio, Rita; Cesareni, Gianni; Coppens, Frederik; Cornell, Michael; Cuccuru, Gianmauro; Davidsen, Kristian; Vedova, Gianluca Della; Dogan, Tunca; Doppelt-Azeroual, Olivia; Emery, Laura; Gasteiger, Elisabeth; Gatter, Thomas; Goldberg, Tatyana; Grosjean, Marie; Grüning, Björn; Helmer-Citterich, Manuela; Ienasescu, Hans; Ioannidis, Vassilios; Jespersen, Martin Closter; Jimenez, Rafael; Juty, Nick; Juvan, Peter; Koch, Maximilian; Laibe, Camille; Li, Jing-Woei; Licata, Luana; Mareuil, Fabien; Mičetić, Ivan; Friborg, Rune Møllegaard; Moretti, Sebastien; Morris, Chris; Möller, Steffen; Nenadic, Aleksandra; Peterson, Hedi; Profiti, Giuseppe; Rice, Peter; Romano, Paolo; Roncaglia, Paola; Saidi, Rabie; Schafferhans, Andrea; Schwämmle, Veit; Smith, Callum; Sperotto, Maria Maddalena; Stockinger, Heinz; Vařeková, Radka Svobodová; Tosatto, Silvio C.E.; de la Torre, Victor; Uva, Paolo; Via, Allegra; Yachdav, Guy; Zambelli, Federico; Vriend, Gert; Rost, Burkhard; Parkinson, Helen; Løngreen, Peter; Brunak, Søren

    2016-01-01

    Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR—the European infrastructure for biological information—that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools. PMID:26538599

  2. Host-parasite interactions and ecology of the malaria parasite-a bioinformatics approach.

    Science.gov (United States)

    Izak, Dariusz; Klim, Joanna; Kaczanowski, Szymon

    2018-04-25

    Malaria remains one of the highest mortality infectious diseases. Malaria is caused by parasites from the genus Plasmodium. Most deaths are caused by infections involving Plasmodium falciparum, which has a complex life cycle. Malaria parasites are extremely well adapted for interactions with their host and their host's immune system and are able to suppress the human immune system, erase immunological memory and rapidly alter exposed antigens. Owing to this rapid evolution, parasites develop drug resistance and express novel forms of antigenic proteins that are not recognized by the host immune system. There is an emerging need for novel interventions, including novel drugs and vaccines. Designing novel therapies requires knowledge about host-parasite interactions, which is still limited. However, significant progress has recently been achieved in this field through the application of bioinformatics analysis of parasite genome sequences. In this review, we describe the main achievements in 'malarial' bioinformatics and provide examples of successful applications of protein sequence analysis. These examples include the prediction of protein functions based on homology and the prediction of protein surface localization via domain and motif analysis. Additionally, we describe PlasmoDB, a database that stores accumulated experimental data. This tool allows data mining of the stored information and will play an important role in the development of malaria science. Finally, we illustrate the application of bioinformatics in the development of population genetics research on malaria parasites, an approach referred to as reverse ecology.

  3. The Use of Next Generation Sequencing and Junction Sequence Analysis Bioinformatics to Achieve Molecular Characterization of Crops Improved Through Modern Biotechnology

    Directory of Open Access Journals (Sweden)

    David Kovalic

    2012-11-01

    Full Text Available The assessment of genetically modified (GM crops for regulatory approval currently requires a detailed molecular characterization of the DNA sequence and integrity of the transgene locus. In addition, molecular characterization is a critical component of event selection and advancement during product development. Typically, molecular characterization has relied on Southern blot analysis to establish locus and copy number along with targeted sequencing of polymerase chain reaction products spanning any inserted DNA to complete the characterization process. Here we describe the use of next generation (NexGen sequencing and junction sequence analysis bioinformatics in a new method for achieving full molecular characterization of a GM event without the need for Southern blot analysis. In this study, we examine a typical GM soybean [ (L. Merr.] line and demonstrate that this new method provides molecular characterization equivalent to the current Southern blot-based method. We also examine an event containing in vivo DNA rearrangement of multiple transfer DNA inserts to demonstrate that the new method is effective at identifying complex cases. Next generation sequencing and bioinformatics offers certain advantages over current approaches, most notably the simplicity, efficiency, and consistency of the method, and provides a viable alternative for efficiently and robustly achieving molecular characterization of GM crops.

  4. Molecular and bioinformatic analysis of the FB-NOF transposable element.

    Science.gov (United States)

    Badal, Martí; Portela, Anna; Xamena, Noel; Cabré, Oriol

    2006-04-12

    The Drosophila melanogaster transposable element FB-NOF is known to play a role in genome plasticity through the generation of all sort of genomic rearrangements. Moreover, several insertional mutants due to FB mobilizations have been reported. Its structure and sequence, however, have been poorly studied mainly as a consequence of the long, complex and repetitive sequence of FB inverted repeats. This repetitive region is composed of several 154 bp blocks, each with five almost identical repeats. In this paper, we report the sequencing process of 2 kb long FB inverted repeats of a complete FB-NOF element, with high precision and reliability. This achievement has been possible using a new map of the FB repetitive region, which identifies unambiguously each repeat with new features that can be used as landmarks. With this new vision of the element, a list of FB-NOF in the D. melanogaster genomic clones has been done, improving previous works that used only bioinformatic algorithms. The availability of many FB and FB-NOF sequences allowed an analysis of the FB insertion sequences that showed no sequence specificity, but a preference for A/T rich sequences. The position of NOF into FB is also studied, revealing that it is always located after a second repeat in a random block. With the results of this analysis, we propose a model of transposition in which NOF jumps from FB to FB, using an unidentified transposase enzyme that should specifically recognize the second repeat end of the FB blocks.

  5. Genetic and bioinformatic analysis of 41C and the 2R heterochromatin of Drosophila melanogaster: a window on the heterochromatin-euchromatin junction.

    OpenAIRE

    Myster, Steven H; Wang, Fei; Cavallo, Robert; Christian, Whitney; Bhotika, Seema; Anderson, Charles T; Peifer, Mark

    2004-01-01

    Genomic sequences provide powerful new tools in genetic analysis, making it possible to combine classical genetics with genomics to characterize the genes in a particular chromosome region. These approaches have been applied successfully to the euchromatin, but analysis of the heterochromatin has lagged somewhat behind. We describe a combined genetic and bioinformatics approach to the base of the right arm of the Drosophila melanogaster second chromosome, at the boundary between pericentric h...

  6. Bioinformatics approaches for identifying new therapeutic bioactive peptides in food

    Directory of Open Access Journals (Sweden)

    Nora Khaldi

    2012-10-01

    Full Text Available ABSTRACT:The traditional methods for mining foods for bioactive peptides are tedious and long. Similar to the drug industry, the length of time to identify and deliver a commercial health ingredient that reduces disease symptoms can take anything between 5 to 10 years. Reducing this time and effort is crucial in order to create new commercially viable products with clear and important health benefits. In the past few years, bioinformatics, the science that brings together fast computational biology, and efficient genome mining, is appearing as the long awaited solution to this problem. By quickly mining food genomes for characteristics of certain food therapeutic ingredients, researchers can potentially find new ones in a matter of a few weeks. Yet, surprisingly, very little success has been achieved so far using bioinformatics in mining for food bioactives.The absence of food specific bioinformatic mining tools, the slow integration of both experimental mining and bioinformatics, and the important difference between different experimental platforms are some of the reasons for the slow progress of bioinformatics in the field of functional food and more specifically in bioactive peptide discovery.In this paper I discuss some methods that could be easily translated, using a rational peptide bioinformatics design, to food bioactive peptide mining. I highlight the need for an integrated food peptide database. I also discuss how to better integrate experimental work with bioinformatics in order to improve the mining of food for bioactive peptides, therefore achieving a higher success rates.

  7. Comprehensive analysis of the N-glycan biosynthetic pathway using bioinformatics to generate UniCorn: A theoretical N-glycan structure database.

    Science.gov (United States)

    Akune, Yukie; Lin, Chi-Hung; Abrahams, Jodie L; Zhang, Jingyu; Packer, Nicolle H; Aoki-Kinoshita, Kiyoko F; Campbell, Matthew P

    2016-08-05

    Glycan structures attached to proteins are comprised of diverse monosaccharide sequences and linkages that are produced from precursor nucleotide-sugars by a series of glycosyltransferases. Databases of these structures are an essential resource for the interpretation of analytical data and the development of bioinformatics tools. However, with no template to predict what structures are possible the human glycan structure databases are incomplete and rely heavily on the curation of published, experimentally determined, glycan structure data. In this work, a library of 45 human glycosyltransferases was used to generate a theoretical database of N-glycan structures comprised of 15 or less monosaccharide residues. Enzyme specificities were sourced from major online databases including Kyoto Encyclopedia of Genes and Genomes (KEGG) Glycan, Consortium for Functional Glycomics (CFG), Carbohydrate-Active enZymes (CAZy), GlycoGene DataBase (GGDB) and BRENDA. Based on the known activities, more than 1.1 million theoretical structures and 4.7 million synthetic reactions were generated and stored in our database called UniCorn. Furthermore, we analyzed the differences between the predicted glycan structures in UniCorn and those contained in UniCarbKB (www.unicarbkb.org), a database which stores experimentally described glycan structures reported in the literature, and demonstrate that UniCorn can be used to aid in the assignment of ambiguous structures whilst also serving as a discovery database. Copyright © 2016 Elsevier Ltd. All rights reserved.

  8. Vertical and Horizontal Integration of Bioinformatics Education: A Modular, Interdisciplinary Approach

    Science.gov (United States)

    Furge, Laura Lowe; Stevens-Truss, Regina; Moore, D. Blaine; Langeland, James A.

    2009-01-01

    Bioinformatics education for undergraduates has been approached primarily in two ways: introduction of new courses with largely bioinformatics focus or introduction of bioinformatics experiences into existing courses. For small colleges such as Kalamazoo, creation of new courses within an already resource-stretched setting has not been an option.…

  9. Adapting bioinformatics curricula for big data.

    Science.gov (United States)

    Greene, Anna C; Giffin, Kristine A; Greene, Casey S; Moore, Jason H

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. © The Author 2015. Published by Oxford University Press.

  10. Promoting synergistic research and education in genomics and bioinformatics.

    Science.gov (United States)

    Yang, Jack Y; Yang, Mary Qu; Zhu, Mengxia Michelle; Arabnia, Hamid R; Deng, Youping

    2008-01-01

    Bioinformatics and Genomics are closely related disciplines that hold great promises for the advancement of research and development in complex biomedical systems, as well as public health, drug design, comparative genomics, personalized medicine and so on. Research and development in these two important areas are impacting the science and technology.High throughput sequencing and molecular imaging technologies marked the beginning of a new era for modern translational medicine and personalized healthcare. The impact of having the human sequence and personalized digital images in hand has also created tremendous demands of developing powerful supercomputing, statistical learning and artificial intelligence approaches to handle the massive bioinformatics and personalized healthcare data, which will obviously have a profound effect on how biomedical research will be conducted toward the improvement of human health and prolonging of human life in the future. The International Society of Intelligent Biological Medicine (http://www.isibm.org) and its official journals, the International Journal of Functional Informatics and Personalized Medicine (http://www.inderscience.com/ijfipm) and the International Journal of Computational Biology and Drug Design (http://www.inderscience.com/ijcbdd) in collaboration with International Conference on Bioinformatics and Computational Biology (Biocomp), touch tomorrow's bioinformatics and personalized medicine throughout today's efforts in promoting the research, education and awareness of the upcoming integrated inter/multidisciplinary field. The 2007 international conference on Bioinformatics and Computational Biology (BIOCOMP07) was held in Las Vegas, the United States of American on June 25-28, 2007. The conference attracted over 400 papers, covering broad research areas in the genomics, biomedicine and bioinformatics. The Biocomp 2007 provides a common platform for the cross fertilization of ideas, and to help shape knowledge and

  11. Improvements to PATRIC, the all-bacterial Bioinformatics Database and Analysis Resource Center

    Science.gov (United States)

    Wattam, Alice R.; Davis, James J.; Assaf, Rida; Boisvert, Sébastien; Brettin, Thomas; Bun, Christopher; Conrad, Neal; Dietrich, Emily M.; Disz, Terry; Gabbard, Joseph L.; Gerdes, Svetlana; Henry, Christopher S.; Kenyon, Ronald W.; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K.; Olsen, Gary J.; Murphy-Olson, Daniel E.; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D.; Shukla, Maulik; Vonstein, Veronika; Warren, Andrew; Xia, Fangfang; Yoo, Hyunseung; Stevens, Rick L.

    2017-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the bacterial Bioinformatics Resource Center (https://www.patricbrc.org). Recent changes to PATRIC include a redesign of the web interface and some new services that provide users with a platform that takes them from raw reads to an integrated analysis experience. The redesigned interface allows researchers direct access to tools and data, and the emphasis has changed to user-created genome-groups, with detailed summaries and views of the data that researchers have selected. Perhaps the biggest change has been the enhanced capability for researchers to analyze their private data and compare it to the available public data. Researchers can assemble their raw sequence reads and annotate the contigs using RASTtk. PATRIC also provides services for RNA-Seq, variation, model reconstruction and differential expression analysis, all delivered through an updated private workspace. Private data can be compared by ‘virtual integration’ to any of PATRIC's public data. The number of genomes available for comparison in PATRIC has expanded to over 80 000, with a special emphasis on genomes with antimicrobial resistance data. PATRIC uses this data to improve both subsystem annotation and k-mer classification, and tags new genomes as having signatures that indicate susceptibility or resistance to specific antibiotics. PMID:27899627

  12. Personalized cloud-based bioinformatics services for research and education: use cases and the elasticHPC package.

    Science.gov (United States)

    El-Kalioby, Mohamed; Abouelhoda, Mohamed; Krüger, Jan; Giegerich, Robert; Sczyrba, Alexander; Wall, Dennis P; Tonellato, Peter

    2012-01-01

    Bioinformatics services have been traditionally provided in the form of a web-server that is hosted at institutional infrastructure and serves multiple users. This model, however, is not flexible enough to cope with the increasing number of users, increasing data size, and new requirements in terms of speed and availability of service. The advent of cloud computing suggests a new service model that provides an efficient solution to these problems, based on the concepts of "resources-on-demand" and "pay-as-you-go". However, cloud computing has not yet been introduced within bioinformatics servers due to the lack of usage scenarios and software layers that address the requirements of the bioinformatics domain. In this paper, we provide different use case scenarios for providing cloud computing based services, considering both the technical and financial aspects of the cloud computing service model. These scenarios are for individual users seeking computational power as well as bioinformatics service providers aiming at provision of personalized bioinformatics services to their users. We also present elasticHPC, a software package and a library that facilitates the use of high performance cloud computing resources in general and the implementation of the suggested bioinformatics scenarios in particular. Concrete examples that demonstrate the suggested use case scenarios with whole bioinformatics servers and major sequence analysis tools like BLAST are presented. Experimental results with large datasets are also included to show the advantages of the cloud model. Our use case scenarios and the elasticHPC package are steps towards the provision of cloud based bioinformatics services, which would help in overcoming the data challenge of recent biological research. All resources related to elasticHPC and its web-interface are available at http://www.elasticHPC.org.

  13. Analyses of Brucella pathogenesis, host immunity, and vaccine targets using systems biology and bioinformatics

    Directory of Open Access Journals (Sweden)

    Yongqun eHe

    2012-02-01

    Full Text Available Brucella is a Gram-negative, facultative intracellular bacterium that causes zoonotic brucellosis in humans and various animals. Out of ten classified Brucella species, B. melitensis, B. abortus, B. suis, and B. canis are pathogenic to humans. In the past decade, the mechanisms of Brucella pathogenesis and host immunity have been extensively investigated using the cutting edge systems biology and bioinformatics approaches. This article provides a comprehensive review of the applications of Omics (including genomics, transcriptomics, and proteomics and bioinformatics technologies for the analysis of Brucella pathogenesis, host immune responses, and vaccine targets. Based on more than 30 sequenced Brucella genomes, comparative genomics is able to identify gene variations among Brucella strains that help to explain host specificity and virulence differences among Brucella species. Diverse transcriptomics and proteomics gene expression studies have been conducted to analyze gene expression profiles of wild type Brucella strains and mutants under different laboratory conditions. High throughput Omics analyses of host responses to infections with virulent or attenuated Brucella strains have been focused on responses by mouse and cattle macrophages, bovine trophoblastic cells, mouse and boar splenocytes, and ram buffy coat. Differential serum responses in humans and rams to Brucella infections have been analyzed using high throughput serum antibody screening technology. The Vaxign reverse vaccinology has been used to predict many Brucella vaccine targets. More than 180 Brucella virulence factors and their gene interaction networks have been identified using advanced literature mining methods. The recent development of community-based Vaccine Ontology and Brucellosis Ontology provides an efficient way for Brucella data integration, exchange, and computer-assisted automated reasoning.

  14. Analyses of Brucella Pathogenesis, Host Immunity, and Vaccine Targets using Systems Biology and Bioinformatics

    Science.gov (United States)

    He, Yongqun

    2011-01-01

    Brucella is a Gram-negative, facultative intracellular bacterium that causes zoonotic brucellosis in humans and various animals. Out of 10 classified Brucella species, B. melitensis, B. abortus, B. suis, and B. canis are pathogenic to humans. In the past decade, the mechanisms of Brucella pathogenesis and host immunity have been extensively investigated using the cutting edge systems biology and bioinformatics approaches. This article provides a comprehensive review of the applications of Omics (including genomics, transcriptomics, and proteomics) and bioinformatics technologies for the analysis of Brucella pathogenesis, host immune responses, and vaccine targets. Based on more than 30 sequenced Brucella genomes, comparative genomics is able to identify gene variations among Brucella strains that help to explain host specificity and virulence differences among Brucella species. Diverse transcriptomics and proteomics gene expression studies have been conducted to analyze gene expression profiles of wild type Brucella strains and mutants under different laboratory conditions. High throughput Omics analyses of host responses to infections with virulent or attenuated Brucella strains have been focused on responses by mouse and cattle macrophages, bovine trophoblastic cells, mouse and boar splenocytes, and ram buffy coat. Differential serum responses in humans and rams to Brucella infections have been analyzed using high throughput serum antibody screening technology. The Vaxign reverse vaccinology has been used to predict many Brucella vaccine targets. More than 180 Brucella virulence factors and their gene interaction networks have been identified using advanced literature mining methods. The recent development of community-based Vaccine Ontology and Brucellosis Ontology provides an efficient way for Brucella data integration, exchange, and computer-assisted automated reasoning. PMID:22919594

  15. Molecular cloing and bioinformatics analysis of lactate dehydrogenase from Taenia multiceps.

    Science.gov (United States)

    Guo, Cheng; Wang, Yu; Huang, Xing; Wang, Ning; Yan, Ming; He, Ran; Gu, Xiaobin; Xie, Yue; Lai, Weimin; Jing, Bo; Peng, Xuerong; Yang, Guangyou

    2017-10-01

    Coenurus cerebralis, the larval stage (metacestode or coenurus) of Taenia multiceps, parasitizes sheep, goats, and other ruminants and causes coenurosis. In this study, we isolated and characterized complementary DNAs that encode lactate dehydrogenase A (Tm-LDHA) and B (Tm-LDHB) from the transcriptome of T. multiceps and expressed recombinant Tm-LDHB (rTm-LDHB) in Escherichia coli. Bioinformatic analysis showed that both Tm-LDH genes (LDHA and LDHB) contain a 996-bp open reading frame and encode a protein of 331 amino acids. After determination of the immunogenicity of the recombinant Tm-LDHB, an indirect enzyme-linked immunosorbent assay (ELISA) was developed for preliminary evaluation of the serodiagnostic potential of rTm-LDHB in goats. However, the rTm-LDHB-based indirect ELISA developed here exhibited specificity of only 71.42% (10/14) and sensitivity of 1:3200 in detection of goats infected with T. multiceps in the field. This study is the first to describe LDHA and LDHB of T. multiceps; meanwhile, our results indicate that rTm-LDHB is not a specific antigen candidate for immunodiagnosis of T. multiceps infection in goats.

  16. Skate Genome Project: Cyber-Enabled Bioinformatics Collaboration

    Science.gov (United States)

    Vincent, J.

    2011-01-01

    The Skate Genome Project, a pilot project of the North East Cyber infrastructure Consortium, aims to produce a draft genome sequence of Leucoraja erinacea, the Little Skate. The pilot project was designed to also develop expertise in large scale collaborations across the NECC region. An overview of the bioinformatics and infrastructure challenges faced during the first year of the project will be presented. Results to date and lessons learned from the perspective of a bioinformatics core will be highlighted.

  17. Design and bioinformatics analysis of novel biomimetic peptides as nanocarriers for gene transfer

    Directory of Open Access Journals (Sweden)

    Asia Majidi

    2015-01-01

    Full Text Available Objective(s: The introduction of nucleic acids into cells for therapeutic objectives is significantly hindered by the size and charge of these molecules and therefore requires efficient vectors that assist cellular uptake. For several years great efforts have been devoted to the study of development of recombinant vectors based on biological domains with potential applications in gene therapy. Such vectors have been synthesized in genetically engineered approach, resulting in biomacromolecules with new properties that are not present in nature. Materials and Methods: In this study, we have designed new peptides using homology modeling with the purpose of overcoming the cell barriers for successful gene delivery through Bioinformatics tools. Three different carriers were designed and one of those with better score through Bioinformatics tools was cloned, expressed and its affinity for pDNA was monitored. Results: The resultszz demonstrated that the vector can effectively condense pDNAinto nanoparticles with the average sizes about 100 nm. Conclusion: We hope these peptides can overcome the biological barriers associated with gene transfer, and mediate efficient gene delivery.

  18. Bioinformatics tools for quantitative and functional metagenome and metatranscriptome data analysis in microbes.

    Science.gov (United States)

    Niu, Sheng-Yong; Yang, Jinyu; McDermaid, Adam; Zhao, Jing; Kang, Yu; Ma, Qin

    2017-05-08

    Metagenomic and metatranscriptomic sequencing approaches are more frequently being used to link microbiota to important diseases and ecological changes. Many analyses have been used to compare the taxonomic and functional profiles of microbiota across habitats or individuals. While a large portion of metagenomic analyses focus on species-level profiling, some studies use strain-level metagenomic analyses to investigate the relationship between specific strains and certain circumstances. Metatranscriptomic analysis provides another important insight into activities of genes by examining gene expression levels of microbiota. Hence, combining metagenomic and metatranscriptomic analyses will help understand the activity or enrichment of a given gene set, such as drug-resistant genes among microbiome samples. Here, we summarize existing bioinformatics tools of metagenomic and metatranscriptomic data analysis, the purpose of which is to assist researchers in deciding the appropriate tools for their microbiome studies. Additionally, we propose an Integrated Meta-Function mapping pipeline to incorporate various reference databases and accelerate functional gene mapping procedures for both metagenomic and metatranscriptomic analyses. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  19. Functional and bioinformatics analysis of an exopolysaccharide-related gene (epsN) from Lactobacillus kefiranofaciens ZW3.

    Science.gov (United States)

    Wang, Jingrui; Tang, Wei; Zheng, Yongna; Xing, Zhuqing; Wang, Yanping

    2016-09-01

    A novel lactic acid bacteria strain Lactobacillus kefiranofaciens ZW3 exhibited the characteristics of high production of exopolysaccharide (EPS). The epsN gene, located in the eps gene cluster of this strain, is associated with EPS biosynthesis. Bioinformatics analysis of this gene was performed. The conserved domain analysis showed that the EpsN protein contained MATE-Wzx-like domains. Then the epsN gene was amplified to construct the recombinant expression vector pMG36e-epsN. The results showed that the EPS yields of the recombinants were significantly improved. By determining the yields of EPS and intracellular polysaccharide, it was considered that epsN gene could play its Wzx flippase role in the EPS biosynthesis. This is the first time to prove the effect of EpsN on L. kefiranofaciens EPS biosynthesis and further prove its functional property.

  20. Evaluating an Inquiry-Based Bioinformatics Course Using Q Methodology

    Science.gov (United States)

    Ramlo, Susan E.; McConnell, David; Duan, Zhong-Hui; Moore, Francisco B.

    2008-01-01

    Faculty at a Midwestern metropolitan public university recently developed a course on bioinformatics that emphasized collaboration and inquiry. Bioinformatics, essentially the application of computational tools to biological data, is inherently interdisciplinary. Thus part of the challenge of creating this course was serving the needs and…

  1. Report on the EMBER Project--A European Multimedia Bioinformatics Educational Resource

    Science.gov (United States)

    Attwood, Terri K.; Selimas, Ioannis; Buis, Rob; Altenburg, Ruud; Herzog, Robert; Ledent, Valerie; Ghita, Viorica; Fernandes, Pedro; Marques, Isabel; Brugman, Marc

    2005-01-01

    EMBER was a European project aiming to develop bioinformatics teaching materials on the Web and CD-ROM to help address the recognised skills shortage in bioinformatics. The project grew out of pilot work on the development of an interactive web-based bioinformatics tutorial and the desire to repackage that resource with the help of a professional…

  2. CHESS (CgHExpreSS): a comprehensive analysis tool for the analysis of genomic alterations and their effects on the expression profile of the genome.

    Science.gov (United States)

    Lee, Mikyung; Kim, Yangseok

    2009-12-16

    Genomic alterations frequently occur in many cancer patients and play important mechanistic roles in the pathogenesis of cancer. Furthermore, they can modify the expression level of genes due to altered copy number in the corresponding region of the chromosome. An accumulating body of evidence supports the possibility that strong genome-wide correlation exists between DNA content and gene expression. Therefore, more comprehensive analysis is needed to quantify the relationship between genomic alteration and gene expression. A well-designed bioinformatics tool is essential to perform this kind of integrative analysis. A few programs have already been introduced for integrative analysis. However, there are many limitations in their performance of comprehensive integrated analysis using published software because of limitations in implemented algorithms and visualization modules. To address this issue, we have implemented the Java-based program CHESS to allow integrative analysis of two experimental data sets: genomic alteration and genome-wide expression profile. CHESS is composed of a genomic alteration analysis module and an integrative analysis module. The genomic alteration analysis module detects genomic alteration by applying a threshold based method or SW-ARRAY algorithm and investigates whether the detected alteration is phenotype specific or not. On the other hand, the integrative analysis module measures the genomic alteration's influence on gene expression. It is divided into two separate parts. The first part calculates overall correlation between comparative genomic hybridization ratio and gene expression level by applying following three statistical methods: simple linear regression, Spearman rank correlation and Pearson's correlation. In the second part, CHESS detects the genes that are differentially expressed according to the genomic alteration pattern with three alternative statistical approaches: Student's t-test, Fisher's exact test and Chi square

  3. Bioinformatics Approach Based Research of Profile Protein Carbonic Anhydrase II Analysis as a Potential Candidate Cause Autism for The Variation of Learning Subjects Biotechnology

    Directory of Open Access Journals (Sweden)

    Dian Eka A. F. Ningrum

    2017-03-01

    Full Text Available This study aims to determine the needs of learning variations on Biotechnology courses using bioinformatics approaches. One example of applied use of bioinformatics in biotechnology course is the analysis of protein profiles carbonic anhydrase II as a potential cause of autism candidate. This research is a qualitative descriptive study consisted of two phases. The first phase of the data obtained from observations of learning, student questionnaires, and questionnaires lecturer. Results from the first phase, namely the need for variations learning in Biotechnology course using bioinformatics. Collecting data on the second stage uses three webserver to predict the target protein and scientific articles. Visualization of proteins using PyMOL software. 3 based webserver which is used, the candidate of target proteins associated with autism is carbonic anhydrase II. The survey results revealed that the protein carbonic anhydrase II as a potential candidate for the cause of autism classified metaloenzim are able to bind with heavy metals. The content of heavy metals in autistic patients high that affect metabolism. This prediction of protein candidate cause autism is applied use to solve the problem in society, so that can achieve the learning outcome in biotechnology course.

  4. A Critical Analysis of Assessment Quality in Genomics and Bioinformatics Education Research

    Science.gov (United States)

    Campbell, Chad E.; Nehm, Ross H.

    2013-01-01

    The growing importance of genomics and bioinformatics methods and paradigms in biology has been accompanied by an explosion of new curricula and pedagogies. An important question to ask about these educational innovations is whether they are having a meaningful impact on students’ knowledge, attitudes, or skills. Although assessments are necessary tools for answering this question, their outputs are dependent on their quality. Our study 1) reviews the central importance of reliability and construct validity evidence in the development and evaluation of science assessments and 2) examines the extent to which published assessments in genomics and bioinformatics education (GBE) have been developed using such evidence. We identified 95 GBE articles (out of 226) that contained claims of knowledge increases, affective changes, or skill acquisition. We found that 1) the purpose of most of these studies was to assess summative learning gains associated with curricular change at the undergraduate level, and 2) a minority (<10%) of studies provided any reliability or validity evidence, and only one study out of the 95 sampled mentioned both validity and reliability. Our findings raise concerns about the quality of evidence derived from these instruments. We end with recommendations for improving assessment quality in GBE. PMID:24006400

  5. Recent developments in life sciences research: Role of bioinformatics

    African Journals Online (AJOL)

    Life sciences research and development has opened up new challenges and opportunities for bioinformatics. The contribution of bioinformatics advances made possible the mapping of the entire human genome and genomes of many other organisms in just over a decade. These discoveries, along with current efforts to ...

  6. Current status and future perspectives of bioinformatics in Tanzania ...

    African Journals Online (AJOL)

    The main bottleneck in advancing genomics in present times is the lack of expertise in using bioinformatics tools and approaches for data mining in raw DNA sequences generated by modern high throughput technologies such as next generation sequencing. Although bioinformatics has been making major progress and ...

  7. Ergatis: a web interface and scalable software system for bioinformatics workflows

    Science.gov (United States)

    Orvis, Joshua; Crabtree, Jonathan; Galens, Kevin; Gussman, Aaron; Inman, Jason M.; Lee, Eduardo; Nampally, Sreenath; Riley, David; Sundaram, Jaideep P.; Felix, Victor; Whitty, Brett; Mahurkar, Anup; Wortman, Jennifer; White, Owen; Angiuoli, Samuel V.

    2010-01-01

    Motivation: The growth of sequence data has been accompanied by an increasing need to analyze data on distributed computer clusters. The use of these systems for routine analysis requires scalable and robust software for data management of large datasets. Software is also needed to simplify data management and make large-scale bioinformatics analysis accessible and reproducible to a wide class of target users. Results: We have developed a workflow management system named Ergatis that enables users to build, execute and monitor pipelines for computational analysis of genomics data. Ergatis contains preconfigured components and template pipelines for a number of common bioinformatics tasks such as prokaryotic genome annotation and genome comparisons. Outputs from many of these components can be loaded into a Chado relational database. Ergatis was designed to be accessible to a broad class of users and provides a user friendly, web-based interface. Ergatis supports high-throughput batch processing on distributed compute clusters and has been used for data management in a number of genome annotation and comparative genomics projects. Availability: Ergatis is an open-source project and is freely available at http://ergatis.sourceforge.net Contact: jorvis@users.sourceforge.net PMID:20413634

  8. [Bioinformatics Analysis of Clustered Regularly Interspaced Short Palindromic Repeats in the Genomes of Shigella].

    Science.gov (United States)

    Wang, Pengfei; Wang, Yingfang; Duan, Guangcai; Xue, Zerun; Wang, Linlin; Guo, Xiangjiao; Yang, Haiyan; Xi, Yuanlin

    2015-04-01

    This study was aimed to explore the features of clustered regularly interspaced short palindromic repeats (CRISPR) structures in Shigella by using bioinformatics. We used bioinformatics methods, including BLAST, alignment and RNA structure prediction, to analyze the CRISPR structures of Shigella genomes. The results showed that the CRISPRs existed in the four groups of Shigella, and the flanking sequences of upstream CRISPRs could be classified into the same group with those of the downstream. We also found some relatively conserved palindromic motifs in the leader sequences. Repeat sequences had the same group with corresponding flanking sequences, and could be classified into two different types by their RNA secondary structures, which contain "stem" and "ring". Some spacers were found to homologize with part sequences of plasmids or phages. The study indicated that there were correlations between repeat sequences and flanking sequences, and the repeats might act as a kind of recognition mechanism to mediate the interaction between foreign genetic elements and Cas proteins.

  9. Assessment of a Bioinformatics across Life Science Curricula Initiative

    Science.gov (United States)

    Howard, David R.; Miskowski, Jennifer A.; Grunwald, Sandra K.; Abler, Michael L.

    2007-01-01

    At the University of Wisconsin-La Crosse, we have undertaken a program to integrate the study of bioinformatics across the undergraduate life science curricula. Our efforts have included incorporating bioinformatics exercises into courses in the biology, microbiology, and chemistry departments, as well as coordinating the efforts of faculty within…

  10. Functional proteomics with new mass spectrometric and bioinformatics tools

    International Nuclear Information System (INIS)

    Kesners, P.W.A.

    2001-01-01

    A comprehensive range of mass spectrometric tools is required to investigate todays life science applications and a strong focus is on addressing the needs of functional proteomics. Application examples are given showing the streamlined process of protein identification from low femtomole amounts of digests. Sample preparation is achieved with a convertible robot for automated 2D gel picking, and MALDI target dispensing. MALDI-TOF or ESI-MS subsequent to enzymatic digestion. A choice of mass spectrometers including Q-q-TOF with multipass capability, MALDI-MS/MS with unsegmented PSD, Ion Trap and FT-MS are discussed for their respective strengths and applications. Bioinformatics software that allows both database work and novel peptide mass spectra interpretation is reviewed. The automated database searching uses either entire digest LC-MS n ESI Ion Trap data or MALDI MS and MS/MS spectra. It is shown how post translational modifications are interactively uncovered and de-novo sequencing of peptides is facilitated

  11. Bioinformatics, interaction network analysis, and neural networks to characterize gene expression of radicular cyst and periapical granuloma.

    Science.gov (United States)

    Poswar, Fabiano de Oliveira; Farias, Lucyana Conceição; Fraga, Carlos Alberto de Carvalho; Bambirra, Wilson; Brito-Júnior, Manoel; Sousa-Neto, Manoel Damião; Santos, Sérgio Henrique Souza; de Paula, Alfredo Maurício Batista; D'Angelo, Marcos Flávio Silveira Vasconcelos; Guimarães, André Luiz Sena

    2015-06-01

    Bioinformatics has emerged as an important tool to analyze the large amount of data generated by research in different diseases. In this study, gene expression for radicular cysts (RCs) and periapical granulomas (PGs) was characterized based on a leader gene approach. A validated bioinformatics algorithm was applied to identify leader genes for RCs and PGs. Genes related to RCs and PGs were first identified in PubMed, GenBank, GeneAtlas, and GeneCards databases. The Web-available STRING software (The European Molecular Biology Laboratory [EMBL], Heidelberg, Baden-Württemberg, Germany) was used in order to build the interaction map among the identified genes by a significance score named weighted number of links. Based on the weighted number of links, genes were clustered using k-means. The genes in the highest cluster were considered leader genes. Multilayer perceptron neural network analysis was used as a complementary supplement for gene classification. For RCs, the suggested leader genes were TP53 and EP300, whereas PGs were associated with IL2RG, CCL2, CCL4, CCL5, CCR1, CCR3, and CCR5 genes. Our data revealed different gene expression for RCs and PGs, suggesting that not only the inflammatory nature but also other biological processes might differentiate RCs and PGs. Copyright © 2015 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.

  12. Genetic and bioinformatics analysis of four novel GCK missense variants detected in Caucasian families with GCK-MODY phenotype.

    Science.gov (United States)

    Costantini, S; Malerba, G; Contreas, G; Corradi, M; Marin Vargas, S P; Giorgetti, A; Maffeis, C

    2015-05-01

    Heterozygous loss-of-function mutations in the glucokinase (GCK) gene cause maturity-onset diabetes of the young (MODY) subtype GCK (GCK-MODY/MODY2). GCK sequencing revealed 16 distinct mutations (13 missense, 1 nonsense, 1 splice site, and 1 frameshift-deletion) co-segregating with hyperglycaemia in 23 GCK-MODY families. Four missense substitutions (c.718A>G/p.Asn240Asp, c.757G>T/p.Val253Phe, c.872A>C/p.Lys291Thr, and c.1151C>T/p.Ala384Val) were novel and a founder effect for the nonsense mutation (c.76C>T/p.Gln26*) was supposed. We tested whether an accurate bioinformatics approach could strengthen family-genetic evidence for missense variant pathogenicity in routine diagnostics, where wet-lab functional assays are generally unviable. In silico analyses of the novel missense variants, including orthologous sequence conservation, amino acid substitution (AAS)-pathogenicity predictors, structural modeling and splicing predictors, suggested that the AASs and/or the underlying nucleotide changes are likely to be pathogenic. This study shows how a careful bioinformatics analysis could provide effective suggestions to help molecular-genetic diagnosis in absence of wet-lab validations. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  13. Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community.

    Science.gov (United States)

    Krampis, Konstantinos; Booth, Tim; Chapman, Brad; Tiwari, Bela; Bicak, Mesude; Field, Dawn; Nelson, Karen E

    2012-03-19

    A steep drop in the cost of next-generation sequencing during recent years has made the technology affordable to the majority of researchers, but downstream bioinformatic analysis still poses a resource bottleneck for smaller laboratories and institutes that do not have access to substantial computational resources. Sequencing instruments are typically bundled with only the minimal processing and storage capacity required for data capture during sequencing runs. Given the scale of sequence datasets, scientific value cannot be obtained from acquiring a sequencer unless it is accompanied by an equal investment in informatics infrastructure. Cloud BioLinux is a publicly accessible Virtual Machine (VM) that enables scientists to quickly provision on-demand infrastructures for high-performance bioinformatics computing using cloud platforms. Users have instant access to a range of pre-configured command line and graphical software applications, including a full-featured desktop interface, documentation and over 135 bioinformatics packages for applications including sequence alignment, clustering, assembly, display, editing, and phylogeny. Each tool's functionality is fully described in the documentation directly accessible from the graphical interface of the VM. Besides the Amazon EC2 cloud, we have started instances of Cloud BioLinux on a private Eucalyptus cloud installed at the J. Craig Venter Institute, and demonstrated access to the bioinformatic tools interface through a remote connection to EC2 instances from a local desktop computer. Documentation for using Cloud BioLinux on EC2 is available from our project website, while a Eucalyptus cloud image and VirtualBox Appliance is also publicly available for download and use by researchers with access to private clouds. Cloud BioLinux provides a platform for developing bioinformatics infrastructures on the cloud. An automated and configurable process builds Virtual Machines, allowing the development of highly

  14. Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community

    Science.gov (United States)

    2012-01-01

    Background A steep drop in the cost of next-generation sequencing during recent years has made the technology affordable to the majority of researchers, but downstream bioinformatic analysis still poses a resource bottleneck for smaller laboratories and institutes that do not have access to substantial computational resources. Sequencing instruments are typically bundled with only the minimal processing and storage capacity required for data capture during sequencing runs. Given the scale of sequence datasets, scientific value cannot be obtained from acquiring a sequencer unless it is accompanied by an equal investment in informatics infrastructure. Results Cloud BioLinux is a publicly accessible Virtual Machine (VM) that enables scientists to quickly provision on-demand infrastructures for high-performance bioinformatics computing using cloud platforms. Users have instant access to a range of pre-configured command line and graphical software applications, including a full-featured desktop interface, documentation and over 135 bioinformatics packages for applications including sequence alignment, clustering, assembly, display, editing, and phylogeny. Each tool's functionality is fully described in the documentation directly accessible from the graphical interface of the VM. Besides the Amazon EC2 cloud, we have started instances of Cloud BioLinux on a private Eucalyptus cloud installed at the J. Craig Venter Institute, and demonstrated access to the bioinformatic tools interface through a remote connection to EC2 instances from a local desktop computer. Documentation for using Cloud BioLinux on EC2 is available from our project website, while a Eucalyptus cloud image and VirtualBox Appliance is also publicly available for download and use by researchers with access to private clouds. Conclusions Cloud BioLinux provides a platform for developing bioinformatics infrastructures on the cloud. An automated and configurable process builds Virtual Machines, allowing the

  15. Bioinformatics analysis of differentially expressed proteins in prostate cancer based on proteomics data

    Directory of Open Access Journals (Sweden)

    Chen C

    2016-03-01

    Full Text Available Chen Chen,1 Li-Guo Zhang,1 Jian Liu,1 Hui Han,1 Ning Chen,1 An-Liang Yao,1 Shao-San Kang,1 Wei-Xing Gao,1 Hong Shen,2 Long-Jun Zhang,1 Ya-Peng Li,1 Feng-Hong Cao,1 Zhi-Guo Li3 1Department of Urology, North China University of Science and Technology Affiliated Hospital, 2Department of Modern Technology and Education Center, 3Department of Medical Research Center, International Science and Technology Cooperation Base of Geriatric Medicine, North China University of Science and Technology, Tangshan, People’s Republic of China Abstract: We mined the literature for proteomics data to examine the occurrence and metastasis of prostate cancer (PCa through a bioinformatics analysis. We divided the differentially expressed proteins (DEPs into two groups: the group consisting of PCa and benign tissues (P&b and the group presenting both high and low PCa metastatic tendencies (H&L. In the P&b group, we found 320 DEPs, 20 of which were reported more than three times, and DES was the most commonly reported. Among these DEPs, the expression levels of FGG, GSN, SERPINC1, TPM1, and TUBB4B have not yet been correlated with PCa. In the H&L group, we identified 353 DEPs, 13 of which were reported more than three times. Among these DEPs, MDH2 and MYH9 have not yet been correlated with PCa metastasis. We further confirmed that DES was differentially expressed between 30 cancer and 30 benign tissues. In addition, DEPs associated with protein transport, regulation of actin cytoskeleton, and the extracellular matrix (ECM–receptor interaction pathway were prevalent in the H&L group and have not yet been studied in detail in this context. Proteins related to homeostasis, the wound-healing response, focal adhesions, and the complement and coagulation pathways were overrepresented in both groups. Our findings suggest that the repeatedly reported DEPs in the two groups may function as potential biomarkers for detecting PCa and predicting its aggressiveness. Furthermore

  16. Consensus strategy in genes prioritization and combined bioinformatics analysis for preeclampsia pathogenesis.

    Science.gov (United States)

    Tejera, Eduardo; Cruz-Monteagudo, Maykel; Burgos, Germán; Sánchez, María-Eugenia; Sánchez-Rodríguez, Aminael; Pérez-Castillo, Yunierkis; Borges, Fernanda; Cordeiro, Maria Natália Dias Soeiro; Paz-Y-Miño, César; Rebelo, Irene

    2017-08-08

    Preeclampsia is a multifactorial disease with unknown pathogenesis. Even when recent studies explored this disease using several bioinformatics tools, the main objective was not directed to pathogenesis. Additionally, consensus prioritization was proved to be highly efficient in the recognition of genes-disease association. However, not information is available about the consensus ability to early recognize genes directly involved in pathogenesis. Therefore our aim in this study is to apply several theoretical approaches to explore preeclampsia; specifically those genes directly involved in the pathogenesis. We firstly evaluated the consensus between 12 prioritization strategies to early recognize pathogenic genes related to preeclampsia. A communality analysis in the protein-protein interaction network of previously selected genes was done including further enrichment analysis. The enrichment analysis includes metabolic pathways as well as gene ontology. Microarray data was also collected and used in order to confirm our results or as a strategy to weight the previously enriched pathways. The consensus prioritized gene list was rationally filtered to 476 genes using several criteria. The communality analysis showed an enrichment of communities connected with VEGF-signaling pathway. This pathway is also enriched considering the microarray data. Our result point to VEGF, FLT1 and KDR as relevant pathogenic genes, as well as those connected with NO metabolism. Our results revealed that consensus strategy improve the detection and initial enrichment of pathogenic genes, at least in preeclampsia condition. Moreover the combination of the first percent of the prioritized genes with protein-protein interaction network followed by communality analysis reduces the gene space. This approach actually identifies well known genes related with pathogenesis. However, genes like HSP90, PAK2, CD247 and others included in the first 1% of the prioritized list need to be further

  17. mockrobiota: a Public Resource for Microbiome Bioinformatics Benchmarking.

    Science.gov (United States)

    Bokulich, Nicholas A; Rideout, Jai Ram; Mercurio, William G; Shiffer, Arron; Wolfe, Benjamin; Maurice, Corinne F; Dutton, Rachel J; Turnbaugh, Peter J; Knight, Rob; Caporaso, J Gregory

    2016-01-01

    Mock communities are an important tool for validating, optimizing, and comparing bioinformatics methods for microbial community analysis. We present mockrobiota, a public resource for sharing, validating, and documenting mock community data resources, available at http://caporaso-lab.github.io/mockrobiota/. The materials contained in mockrobiota include data set and sample metadata, expected composition data (taxonomy or gene annotations or reference sequences for mock community members), and links to raw data (e.g., raw sequence data) for each mock community data set. mockrobiota does not supply physical sample materials directly, but the data set metadata included for each mock community indicate whether physical sample materials are available. At the time of this writing, mockrobiota contains 11 mock community data sets with known species compositions, including bacterial, archaeal, and eukaryotic mock communities, analyzed by high-throughput marker gene sequencing. IMPORTANCE The availability of standard and public mock community data will facilitate ongoing method optimizations, comparisons across studies that share source data, and greater transparency and access and eliminate redundancy. These are also valuable resources for bioinformatics teaching and training. This dynamic resource is intended to expand and evolve to meet the changing needs of the omics community.

  18. [BIOINFORMATIC SEARCH AND PHYLOGENETIC ANALYSIS OF THE CELLULOSE SYNTHASE GENES OF FLAX (LINUM USITATISSIMUM)].

    Science.gov (United States)

    Pydiura, N A; Bayer, G Ya; Galinousky, D V; Yemets, A I; Pirko, Ya V; Podvitski, T A; Anisimova, N V; Khotyleva, L V; Kilchevsky, A V; Blume, Ya B

    2015-01-01

    A bioinformatic search of sequences encoding cellulose synthase genes in the flax genome, and their comparison to dicots orthologs was carried out. The analysis revealed 32 cellulose synthase gene candidates, 16 of which are highly likely to encode cellulose synthases, and the remaining 16--cellulose synthase-like proteins (Csl). Phylogenetic analysis of gene products of cellulose synthase genes allowed distinguishing 6 groups of cellulose synthase genes of different classes: CesA1/10, CesA3, CesA4, CesA5/6/2/9, CesA7 and CesA8. Paralogous sequences within classes CesA1/10 and CesA5/6/2/9 which are associated with the primary cell wall formation are characterized by a greater similarity within these classes than orthologous sequences. Whereas the genes controlling the biosynthesis of secondary cell wall cellulose form distinct clades: CesA4, CesA7, and CesA8. The analysis of 16 identified flax cellulose synthase gene candidates shows the presence of at least 12 different cellulose synthase gene variants in flax genome which are represented in all six clades of cellulose synthase genes. Thus, at this point genes of all ten known cellulose synthase classes are identify in flax genome, but their correct classification requires additional research.

  19. Bioconductor: open software development for computational biology and bioinformatics

    DEFF Research Database (Denmark)

    Gentleman, R.C.; Carey, V.J.; Bates, D.M.

    2004-01-01

    The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisci......The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry...... into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples....

  20. Cluster Flow: A user-friendly bioinformatics workflow tool [version 1; referees: 3 approved

    Directory of Open Access Journals (Sweden)

    Philip Ewels

    2016-12-01

    Full Text Available Pipeline tools are becoming increasingly important within the field of bioinformatics. Using a pipeline manager to manage and run workflows comprised of multiple tools reduces workload and makes analysis results more reproducible. Existing tools require significant work to install and get running, typically needing pipeline scripts to be written from scratch before running any analysis. We present Cluster Flow, a simple and flexible bioinformatics pipeline tool designed to be quick and easy to install. Cluster Flow comes with 40 modules for common NGS processing steps, ready to work out of the box. Pipelines are assembled using these modules with a simple syntax that can be easily modified as required. Core helper functions automate many common NGS procedures, making running pipelines simple. Cluster Flow is available with an GNU GPLv3 license on GitHub. Documentation, examples and an online demo are available at http://clusterflow.io.

  1. Peer Mentoring for Bioinformatics presentation

    OpenAIRE

    Budd, Aidan

    2014-01-01

    A handout used in a HUB (Heidelberg Unseminars in Bioinformatics) meeting focused on career development for bioinformaticians. It describes an activity for use to help introduce the idea of peer mentoring, potnetially acting as an opportunity to create peer-mentoring groups.

  2. PubData: search engine for bioinformatics databases worldwide

    OpenAIRE

    Vand, Kasra; Wahlestedt, Thor; Khomtchouk, Kelly; Sayed, Mohammed; Wahlestedt, Claes; Khomtchouk, Bohdan

    2016-01-01

    We propose a search engine and file retrieval system for all bioinformatics databases worldwide. PubData searches biomedical data in a user-friendly fashion similar to how PubMed searches biomedical literature. PubData is built on novel network programming, natural language processing, and artificial intelligence algorithms that can patch into the file transfer protocol servers of any user-specified bioinformatics database, query its contents, retrieve files for download, and adapt to the use...

  3. A Bioinformatics Analysis Reveals a Group of MocR Bacterial Transcriptional Regulators Linked to a Family of Genes Coding for Membrane Proteins

    Directory of Open Access Journals (Sweden)

    Teresa Milano

    2016-01-01

    Full Text Available The MocR bacterial transcriptional regulators are characterized by an N-terminal domain, 60 residues long on average, possessing the winged-helix-turn-helix (wHTH architecture responsible for DNA recognition and binding, linked to a large C-terminal domain (350 residues on average that is homologous to fold type-I pyridoxal 5′-phosphate (PLP dependent enzymes like aspartate aminotransferase (AAT. These regulators are involved in the expression of genes taking part in several metabolic pathways directly or indirectly connected to PLP chemistry, many of which are still uncharacterized. A bioinformatics analysis is here reported that studied the features of a distinct group of MocR regulators predicted to be functionally linked to a family of homologous genes coding for integral membrane proteins of unknown function. This group occurs mainly in the Actinobacteria and Gammaproteobacteria phyla. An analysis of the multiple sequence alignments of their wHTH and AAT domains suggested the presence of specificity-determining positions (SDPs. Mapping of SDPs onto a homology model of the AAT domain hinted at possible structural/functional roles in effector recognition. Likewise, SDPs in wHTH domain suggested the basis of specificity of Transcription Factor Binding Site recognition. The results reported represent a framework for rational design of experiments and for bioinformatics analysis of other MocR subgroups.

  4. Real analysis a comprehensive course in analysis, part 1

    CERN Document Server

    Simon, Barry

    2015-01-01

    A Comprehensive Course in Analysis by Poincaré Prize winner Barry Simon is a five-volume set that can serve as a graduate-level analysis textbook with a lot of additional bonus information, including hundreds of problems and numerous notes that extend the text and provide important historical background. Depth and breadth of exposition make this set a valuable reference source for almost all areas of classical analysis. Part 1 is devoted to real analysis. From one point of view, it presents the infinitesimal calculus of the twentieth century with the ultimate integral calculus (measure theory)

  5. Bioinformatic and Comparative Localization of Rab Proteins Reveals Functional Insights into the Uncharacterized GTPases Ypt10p and Ypt11p†

    OpenAIRE

    Buvelot Frei, Stéphanie; Rahl, Peter B.; Nussbaum, Maria; Briggs, Benjamin J.; Calero, Monica; Janeczko, Stephanie; Regan, Andrew D.; Chen, Catherine Z.; Barral, Yves; Whittaker, Gary R.; Collins, Ruth N.

    2006-01-01

    A striking characteristic of a Rab protein is its steady-state localization to the cytosolic surface of a particular subcellular membrane. In this study, we have undertaken a combined bioinformatic and experimental approach to examine the evolutionary conservation of Rab protein localization. A comprehensive primary sequence classification shows that 10 out of the 11 Rab proteins identified in the yeast (Saccharomyces cerevisiae) genome can be grouped within a major subclass, each comprising ...

  6. Assessment of Data Reliability of Wireless Sensor Network for Bioinformatics

    Directory of Open Access Journals (Sweden)

    Ting Dong

    2017-09-01

    Full Text Available As a focal point of biotechnology, bioinformatics integrates knowledge from biology, mathematics, physics, chemistry, computer science and information science. It generally deals with genome informatics, protein structure and drug design. However, the data or information thus acquired from the main areas of bioinformatics may not be effective. Some researchers combined bioinformatics with wireless sensor network (WSN into biosensor and other tools, and applied them to such areas as fermentation, environmental monitoring, food engineering, clinical medicine and military. In the combination, the WSN is used to collect data and information. The reliability of the WSN in bioinformatics is the prerequisite to effective utilization of information. It is greatly influenced by factors like quality, benefits, service, timeliness and stability, some of them are qualitative and some are quantitative. Hence, it is necessary to develop a method that can handle both qualitative and quantitative assessment of information. A viable option is the fuzzy linguistic method, especially 2-tuple linguistic model, which has been extensively used to cope with such issues. As a result, this paper introduces 2-tuple linguistic representation to assist experts in giving their opinions on different WSNs in bioinformatics that involve multiple factors. Moreover, the author proposes a novel way to determine attribute weights and uses the method to weigh the relative importance of different influencing factors which can be considered as attributes in the assessment of the WSN in bioinformatics. Finally, an illustrative example is given to provide a reasonable solution for the assessment.

  7. The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows

    NARCIS (Netherlands)

    Katayama, T.; Arakawa, K.; Nakao, M.; Prins, J.C.P.

    2010-01-01

    Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands for efficient systems without the need to transfer entire databases for every step of an analysis. However,

  8. Reproducible Bioinformatics Research for Biologists

    Science.gov (United States)

    This book chapter describes the current Big Data problem in Bioinformatics and the resulting issues with performing reproducible computational research. The core of the chapter provides guidelines and summaries of current tools/techniques that a noncomputational researcher would need to learn to pe...

  9. Bioinformatic tools for PCR Primer design

    African Journals Online (AJOL)

    ES

    Bioinformatics is an emerging scientific discipline that uses information ... complex biological questions. ... and computer programs for various purposes of primer ..... polymerase chain reaction: Human Immunodeficiency Virus 1 model studies.

  10. PayDIBI: Pay-as-you-go data integration for bioinformatics

    NARCIS (Netherlands)

    Wanders, B.

    2012-01-01

    Background: Scientific research in bio-informatics is often data-driven and supported by biolog- ical databases. In a growing number of research projects, researchers like to ask questions that require the combination of information from more than one database. Most bio-informatics papers do not

  11. Integration of Proteomics, Bioinformatics, and Systems Biology in Traumatic Brain Injury Biomarker Discovery

    Science.gov (United States)

    Guingab-Cagmat, J.D.; Cagmat, E.B.; Hayes, R.L.; Anagli, J.

    2013-01-01

    Traumatic brain injury (TBI) is a major medical crisis without any FDA-approved pharmacological therapies that have been demonstrated to improve functional outcomes. It has been argued that discovery of disease-relevant biomarkers might help to guide successful clinical trials for TBI. Major advances in mass spectrometry (MS) have revolutionized the field of proteomic biomarker discovery and facilitated the identification of several candidate markers that are being further evaluated for their efficacy as TBI biomarkers. However, several hurdles have to be overcome even during the discovery phase which is only the first step in the long process of biomarker development. The high-throughput nature of MS-based proteomic experiments generates a massive amount of mass spectral data presenting great challenges in downstream interpretation. Currently, different bioinformatics platforms are available for functional analysis and data mining of MS-generated proteomic data. These tools provide a way to convert data sets to biologically interpretable results and functional outcomes. A strategy that has promise in advancing biomarker development involves the triad of proteomics, bioinformatics, and systems biology. In this review, a brief overview of how bioinformatics and systems biology tools analyze, transform, and interpret complex MS datasets into biologically relevant results is discussed. In addition, challenges and limitations of proteomics, bioinformatics, and systems biology in TBI biomarker discovery are presented. A brief survey of researches that utilized these three overlapping disciplines in TBI biomarker discovery is also presented. Finally, examples of TBI biomarkers and their applications are discussed. PMID:23750150

  12. Generative Topic Modeling in Image Data Mining and Bioinformatics Studies

    Science.gov (United States)

    Chen, Xin

    2012-01-01

    Probabilistic topic models have been developed for applications in various domains such as text mining, information retrieval and computer vision and bioinformatics domain. In this thesis, we focus on developing novel probabilistic topic models for image mining and bioinformatics studies. Specifically, a probabilistic topic-connection (PTC) model…

  13. Taking Bioinformatics to Systems Medicine

    NARCIS (Netherlands)

    van Kampen, Antoine H. C.; Moerland, Perry D.

    2016-01-01

    Systems medicine promotes a range of approaches and strategies to study human health and disease at a systems level with the aim of improving the overall well-being of (healthy) individuals, and preventing, diagnosing, or curing disease. In this chapter we discuss how bioinformatics critically

  14. Bioinformatics Education in Pathology Training: Current Scope and Future Direction

    Directory of Open Access Journals (Sweden)

    Michael R Clay

    2017-04-01

    Full Text Available Training anatomic and clinical pathology residents in the principles of bioinformatics is a challenging endeavor. Most residents receive little to no formal exposure to bioinformatics during medical education, and most of the pathology training is spent interpreting histopathology slides using light microscopy or focused on laboratory regulation, management, and interpretation of discrete laboratory data. At a minimum, residents should be familiar with data structure, data pipelines, data manipulation, and data regulations within clinical laboratories. Fellowship-level training should incorporate advanced principles unique to each subspecialty. Barriers to bioinformatics education include the clinical apprenticeship training model, ill-defined educational milestones, inadequate faculty expertise, and limited exposure during medical training. Online educational resources, case-based learning, and incorporation into molecular genomics education could serve as effective educational strategies. Overall, pathology bioinformatics training can be incorporated into pathology resident curricula, provided there is motivation to incorporate, institutional support, educational resources, and adequate faculty expertise.

  15. Best practices in bioinformatics training for life scientists

    DEFF Research Database (Denmark)

    Via, Allegra; Blicher, Thomas; Bongcam-Rudloff, Erik

    2013-01-01

    their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource-centric demos, using interactivity, problem-solving exercises and cooperative learning to substantially enhance training quality and learning outcomes...... to environmental researchers, a common theme is the need not just to use, and gain familiarity with, bioinformatics tools and resources but also to understand their underlying fundamental theoretical and practical concepts. Providing bioinformatics training to empower life scientists to handle and analyse...

  16. Teaching Bioinformatics and Neuroinformatics by Using Free Web-Based Tools

    Science.gov (United States)

    Grisham, William; Schottler, Natalie A.; Valli-Marill, Joanne; Beck, Lisa; Beatty, Jackson

    2010-01-01

    This completely computer-based module's purpose is to introduce students to bioinformatics resources. We present an easy-to-adopt module that weaves together several important bioinformatic tools so students can grasp how these tools are used in answering research questions. Students integrate information gathered from websites dealing with…

  17. Bioinformatics meets user-centred design: a perspective.

    Directory of Open Access Journals (Sweden)

    Katrina Pavelin

    Full Text Available Designers have a saying that "the joy of an early release lasts but a short time. The bitterness of an unusable system lasts for years." It is indeed disappointing to discover that your data resources are not being used to their full potential. Not only have you invested your time, effort, and research grant on the project, but you may face costly redesigns if you want to improve the system later. This scenario would be less likely if the product was designed to provide users with exactly what they need, so that it is fit for purpose before its launch. We work at EMBL-European Bioinformatics Institute (EMBL-EBI, and we consult extensively with life science researchers to find out what they need from biological data resources. We have found that although users believe that the bioinformatics community is providing accurate and valuable data, they often find the interfaces to these resources tricky to use and navigate. We believe that if you can find out what your users want even before you create the first mock-up of a system, the final product will provide a better user experience. This would encourage more people to use the resource and they would have greater access to the data, which could ultimately lead to more scientific discoveries. In this paper, we explore the need for a user-centred design (UCD strategy when designing bioinformatics resources and illustrate this with examples from our work at EMBL-EBI. Our aim is to introduce the reader to how selected UCD techniques may be successfully applied to software design for bioinformatics.

  18. [Expression profiles and bioinformatic analysis of miRNA in human dental pulp cells during endothelial differentiation].

    Science.gov (United States)

    Gong, Qimei; Jiang, Hongwei; Wang, Jinming; Ling, Junqi

    2014-05-01

    To investigate the differential expression profile and bioinformatic analysis of microRNA (miRNA) in human dental pulp cells (DPC) during endothelial differentiation. DPC were cultured in endothelial induction medium (50 µg/L vascular endothelial growth factor, 10 µg/L basic fibroblast growth factor and 2% fetal calf serum) for 7 days. Meanwhile non-induced DPC were used as control.Quantitative real-time PCR (qRT-PCR) was applied to detect vascular endothelial marker genes [CD31, von Willebrand factor (vWF) and vascular endothelial-cadherin (VE-cadherin)] and in vitro tube formation on matrigel was used to analyze the angiogenic ability of differentiated cells. And then miRNA expression profiles of DPC were examined using miRNA microarray and then the differentially expressed miRNA were validated by qRT-PCR. Furthermore, bioinformatic analysis was employed to predict the target genes of miRNA and to analyze the possible biological functions and signaling pathways that were involved in DPC after induction. The relative mRNA level of CD31, vWF and VE-cadherin in the control group were (3.48 ± 0.22) ×10(-4), (3.13 ± 0.31) ×10(-4) and (39.60 ± 2.36) ×10(-4), and (19.57 ± 2.20) ×10(-4), (48.13 ± 0.54) ×10(-4) and (228.00 ± 8.89) ×10(-4) in the induced group. The expressions of CD31, vWF and VE-cadherin were increased significantly in endothelial induced DPC compared to the control group (P functions, such as the regulation of transcription, cell motion, blood vessel morphogenesis, angiogenesis and cytoskeletal protein, and signaling pathways including the mitogen-activated protein kinase (MAPK) and the Wnt signaling pathway. The differential miRNA expression identified in this study may be involved in governing DPC endothelial differentiation, thus contributing to the future research on regulatory mechanisms in dental pulp angiogenesis.

  19. Bioinformatics on the Cloud Computing Platform Azure

    Science.gov (United States)

    Shanahan, Hugh P.; Owen, Anne M.; Harrison, Andrew P.

    2014-01-01

    We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk through to demonstrate explicitly how Azure can be used to perform these analyses in Appendix S1 and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development. PMID:25050811

  20. The Comprehension Problems for Second-Language Learners with Poor Reading Comprehension Despite Adequate Decoding: A Meta-Analysis

    Science.gov (United States)

    Spencer, Mercedes; Wagner, Richard K.

    2017-01-01

    We conducted a meta-analysis of 16 existing studies to examine the nature of the comprehension problems for children who were second-language learners with poor reading comprehension despite adequate decoding. Results indicated that these children had deficits in oral language (d = -0.80), but these deficits were not as severe as their reading…

  1. Harmonic analysis a comprehensive course in analysis, part 3

    CERN Document Server

    Simon, Barry

    2015-01-01

    A Comprehensive Course in Analysis by Poincaré Prize winner Barry Simon is a five-volume set that can serve as a graduate-level analysis textbook with a lot of additional bonus information, including hundreds of problems and numerous notes that extend the text and provide important historical background. Depth and breadth of exposition make this set a valuable reference source for almost all areas of classical analysis. Part 3 returns to the themes of Part 1 by discussing pointwise limits (going beyond the usual focus on the Hardy-Littlewood maximal function by including ergodic theorems and m

  2. Databases and Associated Bioinformatic Tools in Studies of Food Allergens, Epitopes and Haptens – a Review

    Directory of Open Access Journals (Sweden)

    Bucholska Justyna

    2018-06-01

    Full Text Available Allergies and/or food intolerances are a growing problem of the modern world. Diffi culties associated with the correct diagnosis of food allergies result in the need to classify the factors causing allergies and allergens themselves. Therefore, internet databases and other bioinformatic tools play a special role in deepening knowledge of biologically-important compounds. Internet repositories, as a source of information on different chemical compounds, including those related to allergy and intolerance, are increasingly being used by scientists. Bioinformatic methods play a signifi cant role in biological and medical sciences, and their importance in food science is increasing. This study aimed at presenting selected databases and tools of bioinformatic analysis useful in research on food allergies, allergens (11 databases, epitopes (7 databases, and haptens (2 databases. It also presents examples of the application of computer methods in studies related to allergies.

  3. BioShaDock: a community driven bioinformatics shared Docker-based tools registry.

    Science.gov (United States)

    Moreews, François; Sallou, Olivier; Ménager, Hervé; Le Bras, Yvan; Monjeaud, Cyril; Blanchet, Christophe; Collin, Olivier

    2015-01-01

    Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images needed. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry on authentication and permissions management, that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. The registry also includes user defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, the BioShaDock registry will synchronize with the registry to create a new description in the Elixir registry, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community.

  4. Influenza research database: an integrated bioinformatics resource for influenza virus research

    Science.gov (United States)

    The Influenza Research Database (IRD) is a U.S. National Institute of Allergy and Infectious Diseases (NIAID)-sponsored Bioinformatics Resource Center dedicated to providing bioinformatics support for influenza virus research. IRD facilitates the research and development of vaccines, diagnostics, an...

  5. BioStar: an online question & answer resource for the bioinformatics community

    Science.gov (United States)

    Although the era of big data has produced many bioinformatics tools and databases, using them effectively often requires specialized knowledge. Many groups lack bioinformatics expertise, and frequently find that software documentation is inadequate and local colleagues may be overburdened or unfamil...

  6. Experimental Design and Bioinformatics Analysis for the Application of Metagenomics in Environmental Sciences and Biotechnology.

    Science.gov (United States)

    Ju, Feng; Zhang, Tong

    2015-11-03

    Recent advances in DNA sequencing technologies have prompted the widespread application of metagenomics for the investigation of novel bioresources (e.g., industrial enzymes and bioactive molecules) and unknown biohazards (e.g., pathogens and antibiotic resistance genes) in natural and engineered microbial systems across multiple disciplines. This review discusses the rigorous experimental design and sample preparation in the context of applying metagenomics in environmental sciences and biotechnology. Moreover, this review summarizes the principles, methodologies, and state-of-the-art bioinformatics procedures, tools and database resources for metagenomics applications and discusses two popular strategies (analysis of unassembled reads versus assembled contigs/draft genomes) for quantitative or qualitative insights of microbial community structure and functions. Overall, this review aims to facilitate more extensive application of metagenomics in the investigation of uncultured microorganisms, novel enzymes, microbe-environment interactions, and biohazards in biotechnological applications where microbial communities are engineered for bioenergy production, wastewater treatment, and bioremediation.

  7. Bioinformatics analyses of Shigella CRISPR structure and spacer classification.

    Science.gov (United States)

    Wang, Pengfei; Zhang, Bing; Duan, Guangcai; Wang, Yingfang; Hong, Lijuan; Wang, Linlin; Guo, Xiangjiao; Xi, Yuanlin; Yang, Haiyan

    2016-03-01

    Clustered regularly interspaced short palindromic repeats (CRISPR) are inheritable genetic elements of a variety of archaea and bacteria and indicative of the bacterial ecological adaptation, conferring acquired immunity against invading foreign nucleic acids. Shigella is an important pathogen for anthroponosis. This study aimed to analyze the features of Shigella CRISPR structure and classify the spacers through bioinformatics approach. Among 107 Shigella, 434 CRISPR structure loci were identified with two to seven loci in different strains. CRISPR-Q1, CRISPR-Q4 and CRISPR-Q5 were widely distributed in Shigella strains. Comparison of the first and last repeats of CRISPR1, CRISPR2 and CRISPR3 revealed several base variants and different stem-loop structures. A total of 259 cas genes were found among these 107 Shigella strains. The cas gene deletions were discovered in 88 strains. However, there is one strain that does not contain cas gene. Intact clusters of cas genes were found in 19 strains. From comprehensive analysis of sequence signature and BLAST and CRISPRTarget score, the 708 spacers were classified into three subtypes: Type I, Type II and Type III. Of them, Type I spacer referred to those linked with one gene segment, Type II spacer linked with two or more different gene segments, and Type III spacer undefined. This study examined the diversity of CRISPR/cas system in Shigella strains, demonstrated the main features of CRISPR structure and spacer classification, which provided critical information for elucidation of the mechanisms of spacer formation and exploration of the role the spacers play in the function of the CRISPR/cas system.

  8. Bioinformatic analysis of phage AB3, a phiKMV-like virus infecting Acinetobacter baumannii.

    Science.gov (United States)

    Zhang, J; Liu, X; Li, X-J

    2015-01-16

    The phages of Acinetobacter baumannii has drawn increasing attention because of the multi-drug resistance of A. baumanni. The aim of this study was to sequence Acinetobacter baumannii phage AB3 and conduct bioinformatic analysis to lay a foundation for genome remodeling and phage therapy. We isolated and sequenced A. baumannii phage AB3 and attempted to annotate and analyze its genome. The results showed that the genome is a double-stranded DNA with a total length of 31,185 base pairs (bp) and 97 open reading frames greater than 100 bp. The genome includes 28 predicted genes, of which 24 are homologous to phage AB1. The entire coding sequence is located on the negative strand, representing 90.8% of the total length. The G+C mol% was 39.18%, without areas of high G+C content over 200 bp in length. No GC island, tRNA gene, or repeated sequence was identified. Gene lengths were 120-3099 bp, with an average of 1011 bp. Six genes were found to be greater than 2000 bp in length. Genomic alignment and phylogenetic analysis of the RNA polymerase gene showed that similar to phage AB1, phage AB3 is a phiKMV-like virus in the T7 phage family.

  9. A Portable Bioinformatics Course for Upper-Division Undergraduate Curriculum in Sciences

    Science.gov (United States)

    Floraino, Wely B.

    2008-01-01

    This article discusses the challenges that bioinformatics education is facing and describes a bioinformatics course that is successfully taught at the California State Polytechnic University, Pomona, to the fourth year undergraduate students in biological sciences, chemistry, and computer science. Information on lecture and computer practice…

  10. Computer Programming and Biomolecular Structure Studies: A Step beyond Internet Bioinformatics

    Science.gov (United States)

    Likic, Vladimir A.

    2006-01-01

    This article describes the experience of teaching structural bioinformatics to third year undergraduate students in a subject titled "Biomolecular Structure and Bioinformatics." Students were introduced to computer programming and used this knowledge in a practical application as an alternative to the well established Internet bioinformatics…

  11. BpWrapper: BioPerl-based sequence and tree utilities for rapid prototyping of bioinformatics pipelines.

    Science.gov (United States)

    Hernández, Yözen; Bernstein, Rocky; Pagan, Pedro; Vargas, Levy; McCaig, William; Ramrattan, Girish; Akther, Saymon; Larracuente, Amanda; Di, Lia; Vieira, Filipe G; Qiu, Wei-Gang

    2018-03-02

    Automated bioinformatics workflows are more robust, easier to maintain, and results more reproducible when built with command-line utilities than with custom-coded scripts. Command-line utilities further benefit by relieving bioinformatics developers to learn the use of, or to interact directly with, biological software libraries. There is however a lack of command-line utilities that leverage popular Open Source biological software toolkits such as BioPerl ( http://bioperl.org ) to make many of the well-designed, robust, and routinely used biological classes available for a wider base of end users. Designed as standard utilities for UNIX-family operating systems, BpWrapper makes functionality of some of the most popular BioPerl modules readily accessible on the command line to novice as well as to experienced bioinformatics practitioners. The initial release of BpWrapper includes four utilities with concise command-line user interfaces, bioseq, bioaln, biotree, and biopop, specialized for manipulation of molecular sequences, sequence alignments, phylogenetic trees, and DNA polymorphisms, respectively. Over a hundred methods are currently available as command-line options and new methods are easily incorporated. Performance of BpWrapper utilities lags that of precompiled utilities while equivalent to that of other utilities based on BioPerl. BpWrapper has been tested on BioPerl Release 1.6, Perl versions 5.10.1 to 5.25.10, and operating systems including Apple macOS, Microsoft Windows, and GNU/Linux. Release code is available from the Comprehensive Perl Archive Network (CPAN) at https://metacpan.org/pod/Bio::BPWrapper . Source code is available on GitHub at https://github.com/bioperl/p5-bpwrapper . BpWrapper improves on existing sequence utilities by following the design principles of Unix text utilities such including a concise user interface, extensive command-line options, and standard input/output for serialized operations. Further, dozens of novel methods for

  12. Single-Cell Transcriptomics Bioinformatics and Computational Challenges

    Directory of Open Access Journals (Sweden)

    Lana Garmire

    2016-09-01

    Full Text Available The emerging single-cell RNA-Seq (scRNA-Seq technology holds the promise to revolutionize our understanding of diseases and associated biological processes at an unprecedented resolution. It opens the door to reveal the intercellular heterogeneity and has been employed to a variety of applications, ranging from characterizing cancer cells subpopulations to elucidating tumor resistance mechanisms. Parallel to improving experimental protocols to deal with technological issues, deriving new analytical methods to reveal the complexity in scRNA-Seq data is just as challenging. Here we review the current state-of-the-art bioinformatics tools and methods for scRNA-Seq analysis, as well as addressing some critical analytical challenges that the field faces.

  13. Bioinformatic evaluation of L-arginine catabolic pathways in 24 cyanobacteria and transcriptional analysis of genes encoding enzymes of L-arginine catabolism in the cyanobacterium Synechocystis sp. PCC 6803

    Directory of Open Access Journals (Sweden)

    Pistorius Elfriede K

    2007-11-01

    Full Text Available Abstract Background So far very limited knowledge exists on L-arginine catabolism in cyanobacteria, although six major L-arginine-degrading pathways have been described for prokaryotes. Thus, we have performed a bioinformatic analysis of possible L-arginine-degrading pathways in cyanobacteria. Further, we chose Synechocystis sp. PCC 6803 for a more detailed bioinformatic analysis and for validation of the bioinformatic predictions on L-arginine catabolism with a transcript analysis. Results We have evaluated 24 cyanobacterial genomes of freshwater or marine strains for the presence of putative L-arginine-degrading enzymes. We identified an L-arginine decarboxylase pathway in all 24 strains. In addition, cyanobacteria have one or two further pathways representing either an arginase pathway or L-arginine deiminase pathway or an L-arginine oxidase/dehydrogenase pathway. An L-arginine amidinotransferase pathway as a major L-arginine-degrading pathway is not likely but can not be entirely excluded. A rather unusual finding was that the cyanobacterial L-arginine deiminases are substantially larger than the enzymes in non-photosynthetic bacteria and that they are membrane-bound. A more detailed bioinformatic analysis of Synechocystis sp. PCC 6803 revealed that three different L-arginine-degrading pathways may in principle be functional in this cyanobacterium. These are (i an L-arginine decarboxylase pathway, (ii an L-arginine deiminase pathway, and (iii an L-arginine oxidase/dehydrogenase pathway. A transcript analysis of cells grown either with nitrate or L-arginine as sole N-source and with an illumination of 50 μmol photons m-2 s-1 showed that the transcripts for the first enzyme(s of all three pathways were present, but that the transcript levels for the L-arginine deiminase and the L-arginine oxidase/dehydrogenase were substantially higher than that of the three isoenzymes of L-arginine decarboxylase. Conclusion The evaluation of 24

  14. Evaluating Dynamic Analysis Techniques for Program Comprehension

    NARCIS (Netherlands)

    Cornelissen, S.G.M.

    2009-01-01

    Program comprehension is an essential part of software development and software maintenance, as software must be sufficiently understood before it can be properly modified. One of the common approaches in getting to understand a program is the study of its execution, also known as dynamic analysis.

  15. Integrative Bioinformatic Analysis of Transcriptomic Data Identifies Conserved Molecular Pathways Underlying Ionizing Radiation-Induced Bystander Effects (RIBE

    Directory of Open Access Journals (Sweden)

    Constantinos Yeles

    2017-11-01

    Full Text Available Ionizing radiation-induced bystander effects (RIBE encompass a number of effects with potential for a plethora of damages in adjacent non-irradiated tissue. The cascade of molecular events is initiated in response to the exposure to ionizing radiation (IR, something that may occur during diagnostic or therapeutic medical applications. In order to better investigate these complex response mechanisms, we employed a unified framework integrating statistical microarray analysis, signal normalization, and translational bioinformatics functional analysis techniques. This approach was applied to several microarray datasets from Gene Expression Omnibus (GEO related to RIBE. The analysis produced lists of differentially expressed genes, contrasting bystander and irradiated samples versus sham-irradiated controls. Furthermore, comparative molecular analysis through BioInfoMiner, which integrates advanced statistical enrichment and prioritization methodologies, revealed discrete biological processes, at the cellular level. For example, the negative regulation of growth, cellular response to Zn2+-Cd2+, and Wnt and NIK/NF-kappaB signaling, thus refining the description of the phenotypic landscape of RIBE. Our results provide a more solid understanding of RIBE cell-specific response patterns, especially in the case of high-LET radiations, like α-particles and carbon-ions.

  16. Bioinformatics analysis of RNA-seq data revealed critical genes in colon adenocarcinoma.

    Science.gov (United States)

    Xi, W-D; Liu, Y-J; Sun, X-B; Shan, J; Yi, L; Zhang, T-T

    2017-07-01

    RNA-seq data of colon adenocarcinoma (COAD) were analyzed with bioinformatics tools to discover critical genes in the disease. Relevant small molecule drugs, transcription factors (TFs) and microRNAs (miRNAs) were also investigated. RNA-seq data of COAD were downloaded from The Cancer Genome Atlas (TCGA). Differential analysis was performed with package edgeR. False positive discovery (FDR) 1 were set as the cut-offs to screen out differentially expressed genes (DEGs). Gene coexpression network was constructed with package Ebcoexpress. GO enrichment analysis was performed for the DEGs in the gene coexpression network with DAVID. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis was also performed for the genes with KOBASS 2.0. Modules were identified with MCODE of Cytoscape. Relevant small molecules drugs were predicted by Connectivity map. Relevant miRNAs and TFs were searched by WebGestalt. A total of 457 DEGs, including 255 up-regulated and 202 down-regulated genes, were identified from 437 COAD and 39 control samples. A gene coexpression network was constructed containing 40 DEGs and 101 edges. The genes were mainly associated with collagen fibril organization, extracellular matrix organization and translation. Two modules were identified from the gene coexpression network, which were implicated in muscle contraction and extracellular matrix organization, respectively. Several critical genes were disclosed, such as MYH11, COL5A2 and ribosomal proteins. Nine relevant small molecule drugs were identified, such as scriptaid and STOCK1N-35874. Accordingly, a total of 17 TFs and 10 miRNAs related to COAD were acquired, such as ETS2, NFAT, AP4, miR-124A, MiR-9, miR-96 and let-7. Several critical genes and relevant drugs, TFs and miRNAs were revealed in COAD. These findings could advance the understanding of the disease and benefit therapy development.

  17. Chapter 16: text mining for translational bioinformatics.

    Science.gov (United States)

    Cohen, K Bretonnel; Hunter, Lawrence E

    2013-04-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research-translating basic science results into new interventions-and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.

  18. BOWS (bioinformatics open web services) to centralize bioinformatics tools in web services.

    Science.gov (United States)

    Velloso, Henrique; Vialle, Ricardo A; Ortega, J Miguel

    2015-06-02

    Bioinformaticians face a range of difficulties to get locally-installed tools running and producing results; they would greatly benefit from a system that could centralize most of the tools, using an easy interface for input and output. Web services, due to their universal nature and widely known interface, constitute a very good option to achieve this goal. Bioinformatics open web services (BOWS) is a system based on generic web services produced to allow programmatic access to applications running on high-performance computing (HPC) clusters. BOWS intermediates the access to registered tools by providing front-end and back-end web services. Programmers can install applications in HPC clusters in any programming language and use the back-end service to check for new jobs and their parameters, and then to send the results to BOWS. Programs running in simple computers consume the BOWS front-end service to submit new processes and read results. BOWS compiles Java clients, which encapsulate the front-end web service requisitions, and automatically creates a web page that disposes the registered applications and clients. Bioinformatics open web services registered applications can be accessed from virtually any programming language through web services, or using standard java clients. The back-end can run in HPC clusters, allowing bioinformaticians to remotely run high-processing demand applications directly from their machines.

  19. SPECIES DATABASES AND THE BIOINFORMATICS REVOLUTION.

    Science.gov (United States)

    Biological databases are having a growth spurt. Much of this results from research in genetics and biodiversity, coupled with fast-paced developments in information technology. The revolution in bioinformatics, defined by Sugden and Pennisi (2000) as the "tools and techniques for...

  20. BioSmalltalk: a pure object system and library for bioinformatics.

    Science.gov (United States)

    Morales, Hernán F; Giovambattista, Guillermo

    2013-09-15

    We have developed BioSmalltalk, a new environment system for pure object-oriented bioinformatics programming. Adaptive end-user programming systems tend to become more important for discovering biological knowledge, as is demonstrated by the emergence of open-source programming toolkits for bioinformatics in the past years. Our software is intended to bridge the gap between bioscientists and rapid software prototyping while preserving the possibility of scaling to whole-system biology applications. BioSmalltalk performs better in terms of execution time and memory usage than Biopython and BioPerl for some classical situations. BioSmalltalk is cross-platform and freely available (MIT license) through the Google Project Hosting at http://code.google.com/p/biosmalltalk hernan.morales@gmail.com Supplementary data are available at Bioinformatics online.

  1. Bioclipse: an open source workbench for chemo- and bioinformatics

    Directory of Open Access Journals (Sweden)

    Wagener Johannes

    2007-02-01

    Full Text Available Abstract Background There is a need for software applications that provide users with a complete and extensible toolkit for chemo- and bioinformatics accessible from a single workbench. Commercial packages are expensive and closed source, hence they do not allow end users to modify algorithms and add custom functionality. Existing open source projects are more focused on providing a framework for integrating existing, separately installed bioinformatics packages, rather than providing user-friendly interfaces. No open source chemoinformatics workbench has previously been published, and no sucessful attempts have been made to integrate chemo- and bioinformatics into a single framework. Results Bioclipse is an advanced workbench for resources in chemo- and bioinformatics, such as molecules, proteins, sequences, spectra, and scripts. It provides 2D-editing, 3D-visualization, file format conversion, calculation of chemical properties, and much more; all fully integrated into a user-friendly desktop application. Editing supports standard functions such as cut and paste, drag and drop, and undo/redo. Bioclipse is written in Java and based on the Eclipse Rich Client Platform with a state-of-the-art plugin architecture. This gives Bioclipse an advantage over other systems as it can easily be extended with functionality in any desired direction. Conclusion Bioclipse is a powerful workbench for bio- and chemoinformatics as well as an advanced integration platform. The rich functionality, intuitive user interface, and powerful plugin architecture make Bioclipse the most advanced and user-friendly open source workbench for chemo- and bioinformatics. Bioclipse is released under Eclipse Public License (EPL, an open source license which sets no constraints on external plugin licensing; it is totally open for both open source plugins as well as commercial ones. Bioclipse is freely available at http://www.bioclipse.net.

  2. A generally applicable lightweight method for calculating a value structure for tools and services in bioinformatics infrastructure projects.

    Science.gov (United States)

    Mayer, Gerhard; Quast, Christian; Felden, Janine; Lange, Matthias; Prinz, Manuel; Pühler, Alfred; Lawerenz, Chris; Scholz, Uwe; Glöckner, Frank Oliver; Müller, Wolfgang; Marcus, Katrin; Eisenacher, Martin

    2017-10-30

    Sustainable noncommercial bioinformatics infrastructures are a prerequisite to use and take advantage of the potential of big data analysis for research and economy. Consequently, funders, universities and institutes as well as users ask for a transparent value model for the tools and services offered. In this article, a generally applicable lightweight method is described by which bioinformatics infrastructure projects can estimate the value of tools and services offered without determining exactly the total costs of ownership. Five representative scenarios for value estimation from a rough estimation to a detailed breakdown of costs are presented. To account for the diversity in bioinformatics applications and services, the notion of service-specific 'service provision units' is introduced together with the factors influencing them and the main underlying assumptions for these 'value influencing factors'. Special attention is given on how to handle personnel costs and indirect costs such as electricity. Four examples are presented for the calculation of the value of tools and services provided by the German Network for Bioinformatics Infrastructure (de.NBI): one for tool usage, one for (Web-based) database analyses, one for consulting services and one for bioinformatics training events. Finally, from the discussed values, the costs of direct funding and the costs of payment of services by funded projects are calculated and compared. © The Author 2017. Published by Oxford University Press.

  3. Using Bioinformatics to Develop and Test Hypotheses: E. coli-Specific Virulence Determinants

    Directory of Open Access Journals (Sweden)

    Joanna R. Klein

    2012-09-01

    Full Text Available Bioinformatics, the use of computer resources to understand biological information, is an important tool in research, and can be easily integrated into the curriculum of undergraduate courses. Such an example is provided in this series of four activities that introduces students to the field of bioinformatics as they design PCR based tests for pathogenic E. coli strains. A variety of computer tools are used including BLAST searches at NCBI, bacterial genome searches at the Integrated Microbial Genomes (IMG database, protein analysis at Pfam and literature research at PubMed. In the process, students also learn about virulence factors, enzyme function and horizontal gene transfer. Some or all of the four activities can be incorporated into microbiology or general biology courses taken by students at a variety of levels, ranging from high school through college. The activities build on one another as they teach and reinforce knowledge and skills, promote critical thinking, and provide for student collaboration and presentation. The computer-based activities can be done either in class or outside of class, thus are appropriate for inclusion in online or blended learning formats. Assessment data showed that students learned general microbiology concepts related to pathogenesis and enzyme function, gained skills in using tools of bioinformatics and molecular biology, and successfully developed and tested a scientific hypothesis.

  4. Comprehensive adaptive mesh refinement in wrinkling prediction analysis

    NARCIS (Netherlands)

    Selman, A.; Meinders, Vincent T.; Huetink, Han; van den Boogaard, Antonius H.

    2002-01-01

    Discretisation errors indicator, contact free wrinkling and wrinkling with contact indicators are, in a challenging task, brought together and used in a comprehensive approach to wrinkling prediction analysis in thin sheet metal forming processes.

  5. Towards understanding the lifespan extension by reduced insulin signaling: bioinformatics analysis of DAF-16/FOXO direct targets in Caenorhabditis elegans.

    Science.gov (United States)

    Li, Yan-Hui; Zhang, Gai-Gai

    2016-04-12

    DAF-16, the C. elegans FOXO transcription factor, is an important determinant in aging and longevity. In this work, we manually curated FOXODB http://lyh.pkmu.cn/foxodb/, a database of FOXO direct targets. It now covers 208 genes. Bioinformatics analysis on 109 DAF-16 direct targets in C. elegans found interesting results. (i) DAF-16 and transcription factor PQM-1 co-regulate some targets. (ii) Seventeen targets directly regulate lifespan. (iii) Four targets are involved in lifespan extension induced by dietary restriction. And (iv) DAF-16 direct targets might play global roles in lifespan regulation.

  6. Bioinformatics in New Generation Flavivirus Vaccines

    Directory of Open Access Journals (Sweden)

    Penelope Koraka

    2010-01-01

    Full Text Available Flavivirus infections are the most prevalent arthropod-borne infections world wide, often causing severe disease especially among children, the elderly, and the immunocompromised. In the absence of effective antiviral treatment, prevention through vaccination would greatly reduce morbidity and mortality associated with flavivirus infections. Despite the success of the empirically developed vaccines against yellow fever virus, Japanese encephalitis virus and tick-borne encephalitis virus, there is an increasing need for a more rational design and development of safe and effective vaccines. Several bioinformatic tools are available to support such rational vaccine design. In doing so, several parameters have to be taken into account, such as safety for the target population, overall immunogenicity of the candidate vaccine, and efficacy and longevity of the immune responses triggered. Examples of how bio-informatics is applied to assist in the rational design and improvements of vaccines, particularly flavivirus vaccines, are presented and discussed.

  7. [Gene cloning and bioinformatics analysis of SABATH methyltransferase in Lonicera japonica var. chinensis].

    Science.gov (United States)

    Yu, Xiao-Dan; Jiang, Chao; Huang, Lu-Qi; Qin, Shuang-Shuang; Zeng, Xiang-Mei; Chen, Ping; Yuan, Yuan

    2013-08-01

    To clone SABATH methyltransferase (rLjSABATHMT) gene in Lonicera japonica var. chinensis, and compare the gene expression and intron sequence of SABATH methyltransferase orthologous in L. japonica with L. japonica var. chinensis. It provide a basis for gene regulate the formation of L. japonica floral scents. The cDNA and genome sequences of LjSABATHMT from L. japonica var. chinensis were cloned according to the gene fragments in cDNA library. The LjSABATHMT protein was characterized by bioinformatics analysis. SABATH family phylogenetic tree were built by MEGA 5.0. The transcripted level of SABATHMT orthologous were analyzed in different organs and different flower periods of L. japonica and L. japonica var. chinensis using RT-PCR analysis. Intron sequences of SABATHMT orthologous were also analyzied. The cDNA of LjSABATHMT was 1 251 bp, had a complete coding frame with 365 amino acids. The protein had the conservative SABATHMT domain, and phylogenetic tree showed that it may be a salicylic acid/benzoic acid methyltransferase. Higher expression of SABATH methyltransferase orthologous was found in flower. The intron sequence of L. japonica and L. japonica var. chinensis had rich polymorphism, and two SNP are unique genotype of L. japonica var. chinensis. The motif elements in two orthologous genes were significant differences. The intron difference of SABATH methyltransferase orthologous could be inducing to difference of gene expression between L. japonica and L. japonica var. chinensis. These results will provide important base on regulating active compounds of L. japonica.

  8. Basic complex analysis a comprehensive course in analysis, part 2a

    CERN Document Server

    Simon, Barry

    2015-01-01

    A Comprehensive Course in Analysis by Poincaré Prize winner Barry Simon is a five-volume set that can serve as a graduate-level analysis textbook with a lot of additional bonus information, including hundreds of problems and numerous notes that extend the text and provide important historical background. Depth and breadth of exposition make this set a valuable reference source for almost all areas of classical analysis. Part 2A is devoted to basic complex analysis. It interweaves three analytic threads associated with Cauchy, Riemann, and Weierstrass, respectively. Cauchy's view focuses on th

  9. BIRCH: A user-oriented, locally-customizable, bioinformatics system

    Science.gov (United States)

    Fristensky, Brian

    2007-01-01

    Background Molecular biologists need sophisticated analytical tools which often demand extensive computational resources. While finding, installing, and using these tools can be challenging, pipelining data from one program to the next is particularly awkward, especially when using web-based programs. At the same time, system administrators tasked with maintaining these tools do not always appreciate the needs of research biologists. Results BIRCH (Biological Research Computing Hierarchy) is an organizational framework for delivering bioinformatics resources to a user group, scaling from a single lab to a large institution. The BIRCH core distribution includes many popular bioinformatics programs, unified within the GDE (Genetic Data Environment) graphic interface. Of equal importance, BIRCH provides the system administrator with tools that simplify the job of managing a multiuser bioinformatics system across different platforms and operating systems. These include tools for integrating locally-installed programs and databases into BIRCH, and for customizing the local BIRCH system to meet the needs of the user base. BIRCH can also act as a front end to provide a unified view of already-existing collections of bioinformatics software. Documentation for the BIRCH and locally-added programs is merged in a hierarchical set of web pages. In addition to manual pages for individual programs, BIRCH tutorials employ step by step examples, with screen shots and sample files, to illustrate both the important theoretical and practical considerations behind complex analytical tasks. Conclusion BIRCH provides a versatile organizational framework for managing software and databases, and making these accessible to a user base. Because of its network-centric design, BIRCH makes it possible for any user to do any task from anywhere. PMID:17291351

  10. BIRCH: A user-oriented, locally-customizable, bioinformatics system

    Directory of Open Access Journals (Sweden)

    Fristensky Brian

    2007-02-01

    Full Text Available Abstract Background Molecular biologists need sophisticated analytical tools which often demand extensive computational resources. While finding, installing, and using these tools can be challenging, pipelining data from one program to the next is particularly awkward, especially when using web-based programs. At the same time, system administrators tasked with maintaining these tools do not always appreciate the needs of research biologists. Results BIRCH (Biological Research Computing Hierarchy is an organizational framework for delivering bioinformatics resources to a user group, scaling from a single lab to a large institution. The BIRCH core distribution includes many popular bioinformatics programs, unified within the GDE (Genetic Data Environment graphic interface. Of equal importance, BIRCH provides the system administrator with tools that simplify the job of managing a multiuser bioinformatics system across different platforms and operating systems. These include tools for integrating locally-installed programs and databases into BIRCH, and for customizing the local BIRCH system to meet the needs of the user base. BIRCH can also act as a front end to provide a unified view of already-existing collections of bioinformatics software. Documentation for the BIRCH and locally-added programs is merged in a hierarchical set of web pages. In addition to manual pages for individual programs, BIRCH tutorials employ step by step examples, with screen shots and sample files, to illustrate both the important theoretical and practical considerations behind complex analytical tasks. Conclusion BIRCH provides a versatile organizational framework for managing software and databases, and making these accessible to a user base. Because of its network-centric design, BIRCH makes it possible for any user to do any task from anywhere.

  11. Bioinformatic tools for PCR Primer design

    African Journals Online (AJOL)

    ES

    reaction (PCR), oligo hybridization and DNA sequencing. Proper primer design is actually one of the most important factors/steps in successful DNA sequencing. Various bioinformatics programs are available for selection of primer pairs from a template sequence. The plethora programs for PCR primer design reflects the.

  12. BATMAN-TCM: a Bioinformatics Analysis Tool for Molecular mechANism of Traditional Chinese Medicine

    Science.gov (United States)

    Liu, Zhongyang; Guo, Feifei; Wang, Yong; Li, Chun; Zhang, Xinlei; Li, Honglei; Diao, Lihong; Gu, Jiangyong; Wang, Wei; Li, Dong; He, Fuchu

    2016-02-01

    Traditional Chinese Medicine (TCM), with a history of thousands of years of clinical practice, is gaining more and more attention and application worldwide. And TCM-based new drug development, especially for the treatment of complex diseases is promising. However, owing to the TCM’s diverse ingredients and their complex interaction with human body, it is still quite difficult to uncover its molecular mechanism, which greatly hinders the TCM modernization and internationalization. Here we developed the first online Bioinformatics Analysis Tool for Molecular mechANism of TCM (BATMAN-TCM). Its main functions include 1) TCM ingredients’ target prediction; 2) functional analyses of targets including biological pathway, Gene Ontology functional term and disease enrichment analyses; 3) the visualization of ingredient-target-pathway/disease association network and KEGG biological pathway with highlighted targets; 4) comparison analysis of multiple TCMs. Finally, we applied BATMAN-TCM to Qishen Yiqi dripping Pill (QSYQ) and combined with subsequent experimental validation to reveal the functions of renin-angiotensin system responsible for QSYQ’s cardioprotective effects for the first time. BATMAN-TCM will contribute to the understanding of the “multi-component, multi-target and multi-pathway” combinational therapeutic mechanism of TCM, and provide valuable clues for subsequent experimental validation, accelerating the elucidation of TCM’s molecular mechanism. BATMAN-TCM is available at http://bionet.ncpsb.org/batman-tcm.

  13. Introductory Bioinformatics Exercises Utilizing Hemoglobin and Chymotrypsin to Reinforce the Protein Sequence-Structure-Function Relationship

    Science.gov (United States)

    Inlow, Jennifer K.; Miller, Paige; Pittman, Bethany

    2007-01-01

    We describe two bioinformatics exercises intended for use in a computer laboratory setting in an upper-level undergraduate biochemistry course. To introduce students to bioinformatics, the exercises incorporate several commonly used bioinformatics tools, including BLAST, that are freely available online. The exercises build upon the students'…

  14. OralCard: a bioinformatic tool for the study of oral proteome.

    Science.gov (United States)

    Arrais, Joel P; Rosa, Nuno; Melo, José; Coelho, Edgar D; Amaral, Diana; Correia, Maria José; Barros, Marlene; Oliveira, José Luís

    2013-07-01

    The molecular complexity of the human oral cavity can only be clarified through identification of components that participate within it. However current proteomic techniques produce high volumes of information that are dispersed over several online databases. Collecting all of this data and using an integrative approach capable of identifying unknown associations is still an unsolved problem. This is the main motivation for this work. We present the online bioinformatic tool OralCard, which comprises results from 55 manually curated articles reflecting the oral molecular ecosystem (OralPhysiOme). It comprises experimental information available from the oral proteome both of human (OralOme) and microbial origin (MicroOralOme) structured in protein, disease and organism. This tool is a key resource for researchers to understand the molecular foundations implicated in biology and disease mechanisms of the oral cavity. The usefulness of this tool is illustrated with the analysis of the oral proteome associated with diabetes melitus type 2. OralCard is available at http://bioinformatics.ua.pt/oralcard. Copyright © 2013 Elsevier Ltd. All rights reserved.

  15. Identification of key genes and molecular mechanisms associated with dedifferentiated liposarcoma based on bioinformatic methods

    Directory of Open Access Journals (Sweden)

    Yu H

    2017-06-01

    . Furthermore, the dysregulated PPI network of DDLPS was constructed, and 14 hub genes were identified. Characteristic of DDLPS, the genes CDK4 and MDM2 were universally found to be up-regulated and amplified in gene copy number.Conclusion: This study used bioinformatics to comprehensively mine DDLPS microarray data in order to obtain a deeper understanding of the molecular mechanism of DDLPS. Keywords: dedifferentiated liposarcoma, molecular mechanisms, microarray, bioinformatic methods

  16. Using miscue analysis to assess comprehension in deaf college readers.

    Science.gov (United States)

    Albertini, John; Mayer, Connie

    2011-01-01

    For over 30 years, teachers have used miscue analysis as a tool to assess and evaluate the reading abilities of hearing students in elementary and middle schools and to design effective literacy programs. More recently, teachers of deaf and hard-of-hearing students have also reported its usefulness for diagnosing word- and phrase-level reading difficulties and for planning instruction. To our knowledge, miscue analysis has not been used with older, college-age deaf students who might also be having difficulty decoding and understanding text at the word level. The goal of this study was to determine whether such an analysis would be helpful in identifying the source of college students' reading comprehension difficulties. After analyzing the miscues of 10 college-age readers and the results of other comprehension-related tasks, we concluded that comprehension of basic grade school-level passages depended on the ability to recognize and comprehend key words and phrases in these texts. We also concluded that these diagnostic procedures provided useful information about the reading abilities and strategies of each reader that had implications for designing more effective interventions.

  17. Advance in structural bioinformatics

    CERN Document Server

    Wei, Dongqing; Zhao, Tangzhen; Dai, Hao

    2014-01-01

    This text examines in detail mathematical and physical modeling, computational methods and systems for obtaining and analyzing biological structures, using pioneering research cases as examples. As such, it emphasizes programming and problem-solving skills. It provides information on structure bioinformatics at various levels, with individual chapters covering introductory to advanced aspects, from fundamental methods and guidelines on acquiring and analyzing genomics and proteomics sequences, the structures of protein, DNA and RNA, to the basics of physical simulations and methods for conform

  18. Toward a comprehensive and systematic methylome signature in colorectal cancers.

    Science.gov (United States)

    Ashktorab, Hassan; Rahi, Hamed; Wansley, Daniel; Varma, Sudhir; Shokrani, Babak; Lee, Edward; Daremipouran, Mohammad; Laiyemo, Adeyinka; Goel, Ajay; Carethers, John M; Brim, Hassan

    2013-08-01

    CpG Island Methylator Phenotype (CIMP) is one of the underlying mechanisms in colorectal cancer (CRC). This study aimed to define a methylome signature in CRC through a methylation microarray analysis and a compilation of promising CIMP markers from the literature. Illumina HumanMethylation27 (IHM27) array data was generated and analyzed based on statistical differences in methylation data (1st approach) or based on overall differences in methylation percentages using lower 95% CI (2nd approach). Pyrosequencing was performed for the validation of nine genes. A meta-analysis was used to identify CIMP and non-CIMP markers that were hypermethylated in CRC but did not yet make it to the CIMP genes' list. Our 1st approach for array data analysis demonstrated the limitations in selecting genes for further validation, highlighting the need for the 2nd bioinformatics approach to adequately select genes with differential aberrant methylation. A more comprehensive list, which included non-CIMP genes, such as APC, EVL, CD109, PTEN, TWIST1, DCC, PTPRD, SFRP1, ICAM5, RASSF1A, EYA4, 30ST2, LAMA1, KCNQ5, ADHEF1, and TFPI2, was established. Array data are useful to categorize and cluster colonic lesions based on their global methylation profiles; however, its usefulness in identifying robust methylation markers is limited and rely on the data analysis method. We have identified 16 non-CIMP-panel genes for which we provide rationale for inclusion in a more comprehensive characterization of CIMP+ CRCs. The identification of a definitive list for methylome specific genes in CRC will contribute to better clinical management of CRC patients.

  19. Bioinformatics in the Netherlands: the value of a nationwide community.

    Science.gov (United States)

    van Gelder, Celia W G; Hooft, Rob W W; van Rijswijk, Merlijn N; van den Berg, Linda; Kok, Ruben G; Reinders, Marcel; Mons, Barend; Heringa, Jaap

    2017-09-15

    This review provides a historical overview of the inception and development of bioinformatics research in the Netherlands. Rooted in theoretical biology by foundational figures such as Paulien Hogeweg (at Utrecht University since the 1970s), the developments leading to organizational structures supporting a relatively large Dutch bioinformatics community will be reviewed. We will show that the most valuable resource that we have built over these years is the close-knit national expert community that is well engaged in basic and translational life science research programmes. The Dutch bioinformatics community is accustomed to facing the ever-changing landscape of data challenges and working towards solutions together. In addition, this community is the stable factor on the road towards sustainability, especially in times where existing funding models are challenged and change rapidly. © The Author 2017. Published by Oxford University Press.

  20. What is bioinformatics? A proposed definition and overview of the field.

    Science.gov (United States)

    Luscombe, N M; Greenbaum, D; Gerstein, M

    2001-01-01

    The recent flood of data from genome sequences and functional genomics has given rise to new field, bioinformatics, which combines elements of biology and computer science. Here we propose a definition for this new field and review some of the research that is being pursued, particularly in relation to transcriptional regulatory systems. Our definition is as follows: Bioinformatics is conceptualizing biology in terms of macromolecules (in the sense of physical-chemistry) and then applying "informatics" techniques (derived from disciplines such as applied maths, computer science, and statistics) to understand and organize the information associated with these molecules, on a large-scale. Analyses in bioinformatics predominantly focus on three types of large datasets available in molecular biology: macromolecular structures, genome sequences, and the results of functional genomics experiments (e.g. expression data). Additional information includes the text of scientific papers and "relationship data" from metabolic pathways, taxonomy trees, and protein-protein interaction networks. Bioinformatics employs a wide range of computational techniques including sequence and structural alignment, database design and data mining, macromolecular geometry, phylogenetic tree construction, prediction of protein structure and function, gene finding, and expression data clustering. The emphasis is on approaches integrating a variety of computational methods and heterogeneous data sources. Finally, bioinformatics is a practical discipline. We survey some representative applications, such as finding homologues, designing drugs, and performing large-scale censuses. Additional information pertinent to the review is available over the web at http://bioinfo.mbb.yale.edu/what-is-it.

  1. "Extreme Programming" in a Bioinformatics Class

    Science.gov (United States)

    Kelley, Scott; Alger, Christianna; Deutschman, Douglas

    2009-01-01

    The importance of Bioinformatics tools and methodology in modern biological research underscores the need for robust and effective courses at the college level. This paper describes such a course designed on the principles of cooperative learning based on a computer software industry production model called "Extreme Programming" (EP).…

  2. Bioinformatics in Undergraduate Education: Practical Examples

    Science.gov (United States)

    Boyle, John A.

    2004-01-01

    Bioinformatics has emerged as an important research tool in recent years. The ability to mine large databases for relevant information has become increasingly central to many different aspects of biochemistry and molecular biology. It is important that undergraduates be introduced to the available information and methodologies. We present a…

  3. Bioinformatic approaches reveal metagenomic characterization of soil microbial community.

    Directory of Open Access Journals (Sweden)

    Zhuofei Xu

    Full Text Available As is well known, soil is a complex ecosystem harboring the most prokaryotic biodiversity on the Earth. In recent years, the advent of high-throughput sequencing techniques has greatly facilitated the progress of soil ecological studies. However, how to effectively understand the underlying biological features of large-scale sequencing data is a new challenge. In the present study, we used 33 publicly available metagenomes from diverse soil sites (i.e. grassland, forest soil, desert, Arctic soil, and mangrove sediment and integrated some state-of-the-art computational tools to explore the phylogenetic and functional characterizations of the microbial communities in soil. Microbial composition and metabolic potential in soils were comprehensively illustrated at the metagenomic level. A spectrum of metagenomic biomarkers containing 46 taxa and 33 metabolic modules were detected to be significantly differential that could be used as indicators to distinguish at least one of five soil communities. The co-occurrence associations between complex microbial compositions and functions were inferred by network-based approaches. Our results together with the established bioinformatic pipelines should provide a foundation for future research into the relation between soil biodiversity and ecosystem function.

  4. How the strengths of Lisp-family languages facilitate building complex and flexible bioinformatics applications.

    Science.gov (United States)

    Khomtchouk, Bohdan B; Weitz, Edmund; Karp, Peter D; Wahlestedt, Claes

    2016-12-31

    We present a rationale for expanding the presence of the Lisp family of programming languages in bioinformatics and computational biology research. Put simply, Lisp-family languages enable programmers to more quickly write programs that run faster than in other languages. Languages such as Common Lisp, Scheme and Clojure facilitate the creation of powerful and flexible software that is required for complex and rapidly evolving domains like biology. We will point out several important key features that distinguish languages of the Lisp family from other programming languages, and we will explain how these features can aid researchers in becoming more productive and creating better code. We will also show how these features make these languages ideal tools for artificial intelligence and machine learning applications. We will specifically stress the advantages of domain-specific languages (DSLs): languages that are specialized to a particular area, and thus not only facilitate easier research problem formulation, but also aid in the establishment of standards and best programming practices as applied to the specific research field at hand. DSLs are particularly easy to build in Common Lisp, the most comprehensive Lisp dialect, which is commonly referred to as the 'programmable programming language'. We are convinced that Lisp grants programmers unprecedented power to build increasingly sophisticated artificial intelligence systems that may ultimately transform machine learning and artificial intelligence research in bioinformatics and computational biology. © The Author 2016. Published by Oxford University Press.

  5. The Bioinformatics of Integrative Medical Insights: Proposals for an International Psycho-Social and Cultural Bioinformatics Project

    Directory of Open Access Journals (Sweden)

    Ernest Rossi

    2006-01-01

    Full Text Available We propose the formation of an International Psycho-Social and Cultural Bioinformatics Project (IPCBP to explore the research foundations of Integrative Medical Insights (IMI on all levels from the molecular-genomic to the psychological, cultural, social, and spiritual. Just as The Human Genome Project identified the molecular foundations of modern medicine with the new technology of sequencing DNA during the past decade, the IPCBP would extend and integrate this neuroscience knowledge base with the technology of gene expression via DNA/proteomic microarray research and brain imaging in development, stress, healing, rehabilitation, and the psychotherapeutic facilitation of existentional wellness. We anticipate that the IPCBP will require a unique international collaboration of, academic institutions, researchers, and clinical practioners for the creation of a new neuroscience of mind-body communication, brain plasticity, memory, learning, and creative processing during optimal experiential states of art, beauty, and truth. We illustrate this emerging integration of bioinformatics with medicine with a videotape of the classical 4-stage creative process in a neuroscience approach to psychotherapy.

  6. The Bioinformatics of Integrative Medical Insights: Proposals for an International PsychoSocial and Cultural Bioinformatics Project

    Directory of Open Access Journals (Sweden)

    Ernest Rossi

    2006-01-01

    Full Text Available We propose the formation of an International PsychoSocial and Cultural Bioinformatics Project (IPCBP to explore the research foundations of Integrative Medical Insights (IMI on all levels from the molecular-genomic to the psychological, cultural, social, and spiritual. Just as The Human Genome Project identified the molecular foundations of modern medicine with the new technology of sequencing DNA during the past decade, the IPCBP would extend and integrate this neuroscience knowledge base with the technology of gene expression via DNA/proteomic microarray research and brain imaging in development, stress, healing, rehabilitation, and the psychotherapeutic facilitation of existentional wellness. We anticipate that the IPCBP will require a unique international collaboration of, academic institutions, researchers, and clinical practioners for the creation of a new neuroscience of mind-body communication, brain plasticity, memory, learning, and creative processing during optimal experiential states of art, beauty, and truth. We illustrate this emerging integration of bioinformatics with medicine with a videotape of the classical 4-stage creative process in a neuroscience approach to psychotherapy.

  7. Incorporating a Collaborative Web-Based Virtual Laboratory in an Undergraduate Bioinformatics Course

    Science.gov (United States)

    Weisman, David

    2010-01-01

    Face-to-face bioinformatics courses commonly include a weekly, in-person computer lab to facilitate active learning, reinforce conceptual material, and teach practical skills. Similarly, fully-online bioinformatics courses employ hands-on exercises to achieve these outcomes, although students typically perform this work offsite. Combining a…

  8. A Summer Program Designed to Educate College Students for Careers in Bioinformatics

    Science.gov (United States)

    Krilowicz, Beverly; Johnston, Wendie; Sharp, Sandra B.; Warter-Perez, Nancy; Momand, Jamil

    2007-01-01

    A summer program was created for undergraduates and graduate students that teaches bioinformatics concepts, offers skills in professional development, and provides research opportunities in academic and industrial institutions. We estimate that 34 of 38 graduates (89%) are in a career trajectory that will use bioinformatics. Evidence from…

  9. Bioinformatics tools for the analysis of NMR metabolomics studies focused on the identification of clinically relevant biomarkers.

    Science.gov (United States)

    Puchades-Carrasco, Leonor; Palomino-Schätzlein, Martina; Pérez-Rambla, Clara; Pineda-Lucena, Antonio

    2016-05-01

    Metabolomics, a systems biology approach focused on the global study of the metabolome, offers a tremendous potential in the analysis of clinical samples. Among other applications, metabolomics enables mapping of biochemical alterations involved in the pathogenesis of diseases, and offers the opportunity to noninvasively identify diagnostic, prognostic and predictive biomarkers that could translate into early therapeutic interventions. Particularly, metabolomics by Nuclear Magnetic Resonance (NMR) has the ability to simultaneously detect and structurally characterize an abundance of metabolic components, even when their identities are unknown. Analysis of the data generated using this experimental approach requires the application of statistical and bioinformatics tools for the correct interpretation of the results. This review focuses on the different steps involved in the metabolomics characterization of biofluids for clinical applications, ranging from the design of the study to the biological interpretation of the results. Particular emphasis is devoted to the specific procedures required for the processing and interpretation of NMR data with a focus on the identification of clinically relevant biomarkers. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  10. Impact of comprehensive two-dimensional gas chromatography with mass spectrometry on food analysis.

    Science.gov (United States)

    Tranchida, Peter Q; Purcaro, Giorgia; Maimone, Mariarosa; Mondello, Luigi

    2016-01-01

    Comprehensive two-dimensional gas chromatography with mass spectrometry has been on the separation-science scene for about 15 years. This three-dimensional method has made a great positive impact on various fields of research, and among these that related to food analysis is certainly at the forefront. The present critical review is based on the use of comprehensive two-dimensional gas chromatography with mass spectrometry in the untargeted (general qualitative profiling and fingerprinting) and targeted analysis of food volatiles; attention is focused not only on its potential in such applications, but also on how recent advances in comprehensive two-dimensional gas chromatography with mass spectrometry will potentially be important for food analysis. Additionally, emphasis is devoted to the many instances in which straightforward gas chromatography with mass spectrometry is a sufficiently-powerful analytical tool. Finally, possible future scenarios in the comprehensive two-dimensional gas chromatography with mass spectrometry food analysis field are discussed. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  11. Bioinformatic identification and characterization of human endothelial cell-restricted genes

    Directory of Open Access Journals (Sweden)

    Keskin Derin B

    2010-05-01

    Full Text Available Abstract Background In this study, we used a systematic bioinformatics analysis approach to elucidate genes that exhibit an endothelial cell (EC restricted expression pattern, and began to define their regulation, tissue distribution, and potential biological role. Results Using a high throughput microarray platform, a primary set of 1,191 transcripts that are enriched in different primary ECs compared to non-ECs was identified (LCB >3, FDR Conclusion The study provides an initial catalogue of EC-restricted genes most of which are ubiquitously expressed in different endothelial cells.

  12. Microbial bioinformatics 2020.

    Science.gov (United States)

    Pallen, Mark J

    2016-09-01

    Microbial bioinformatics in 2020 will remain a vibrant, creative discipline, adding value to the ever-growing flood of new sequence data, while embracing novel technologies and fresh approaches. Databases and search strategies will struggle to cope and manual curation will not be sustainable during the scale-up to the million-microbial-genome era. Microbial taxonomy will have to adapt to a situation in which most microorganisms are discovered and characterised through the analysis of sequences. Genome sequencing will become a routine approach in clinical and research laboratories, with fresh demands for interpretable user-friendly outputs. The "internet of things" will penetrate healthcare systems, so that even a piece of hospital plumbing might have its own IP address that can be integrated with pathogen genome sequences. Microbiome mania will continue, but the tide will turn from molecular barcoding towards metagenomics. Crowd-sourced analyses will collide with cloud computing, but eternal vigilance will be the price of preventing the misinterpretation and overselling of microbial sequence data. Output from hand-held sequencers will be analysed on mobile devices. Open-source training materials will address the need for the development of a skilled labour force. As we boldly go into the third decade of the twenty-first century, microbial sequence space will remain the final frontier! © 2016 The Author. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.

  13. LipSpin: A New Bioinformatics Tool for Quantitative 1H NMR Lipid Profiling.

    Science.gov (United States)

    Barrilero, Rubén; Gil, Miriam; Amigó, Núria; Dias, Cintia B; Wood, Lisa G; Garg, Manohar L; Ribalta, Josep; Heras, Mercedes; Vinaixa, Maria; Correig, Xavier

    2018-02-06

    The structural similarity among lipid species and the low sensitivity and spectral resolution of nuclear magnetic resonance (NMR) have traditionally hampered the routine use of 1 H NMR lipid profiling of complex biological samples in metabolomics, which remains mostly manual and lacks freely available bioinformatics tools. However, 1 H NMR lipid profiling provides fast quantitative screening of major lipid classes (fatty acids, glycerolipids, phospholipids, and sterols) and some individual species and has been used in several clinical and nutritional studies, leading to improved risk prediction models. In this Article, we present LipSpin, a free and open-source bioinformatics tool for quantitative 1 H NMR lipid profiling. LipSpin implements a constrained line shape fitting algorithm based on voigt profiles and spectral templates from spectra of lipid standards, which automates the analysis of severely overlapped spectral regions and lipid signals with complex coupling patterns. LipSpin provides the most detailed quantification of fatty acid families and choline phospholipids in serum lipid samples by 1 H NMR to date. Moreover, analytical and clinical results using LipSpin quantifications conform with other techniques commonly used for lipid analysis.

  14. Opportunities and challenges provided by cloud repositories for bioinformatics-enabled drug discovery.

    Science.gov (United States)

    Dalpé, Gratien; Joly, Yann

    2014-09-01

    Healthcare-related bioinformatics databases are increasingly offering the possibility to maintain, organize, and distribute DNA sequencing data. Different national and international institutions are currently hosting such databases that offer researchers website platforms where they can obtain sequencing data on which they can perform different types of analysis. Until recently, this process remained mostly one-dimensional, with most analysis concentrated on a limited amount of data. However, newer genome sequencing technology is producing a huge amount of data that current computer facilities are unable to handle. An alternative approach has been to start adopting cloud computing services for combining the information embedded in genomic and model system biology data, patient healthcare records, and clinical trials' data. In this new technological paradigm, researchers use virtual space and computing power from existing commercial or not-for-profit cloud service providers to access, store, and analyze data via different application programming interfaces. Cloud services are an alternative to the need of larger data storage; however, they raise different ethical, legal, and social issues. The purpose of this Commentary is to summarize how cloud computing can contribute to bioinformatics-based drug discovery and to highlight some of the outstanding legal, ethical, and social issues that are inherent in the use of cloud services. © 2014 Wiley Periodicals, Inc.

  15. Application of bioinformatics tools and databases in microbial dehalogenation research (a review).

    Science.gov (United States)

    Satpathy, R; Konkimalla, V B; Ratha, J

    2015-01-01

    Microbial dehalogenation is a biochemical process in which the halogenated substances are catalyzed enzymatically in to their non-halogenated form. The microorganisms have a wide range of organohalogen degradation ability both explicit and non-specific in nature. Most of these halogenated organic compounds being pollutants need to be remediated; therefore, the current approaches are to explore the potential of microbes at a molecular level for effective biodegradation of these substances. Several microorganisms with dehalogenation activity have been identified and characterized. In this aspect, the bioinformatics plays a key role to gain deeper knowledge in this field of dehalogenation. To facilitate the data mining, many tools have been developed to annotate these data from databases. Therefore, with the discovery of a microorganism one can predict a gene/protein, sequence analysis, can perform structural modelling, metabolic pathway analysis, biodegradation study and so on. This review highlights various methods of bioinformatics approach that describes the application of various databases and specific tools in the microbial dehalogenation fields with special focus on dehalogenase enzymes. Attempts have also been made to decipher some recent applications of in silico modeling methods that comprise of gene finding, protein modelling, Quantitative Structure Biodegradibility Relationship (QSBR) study and reconstruction of metabolic pathways employed in dehalogenation research area.

  16. KNIME4NGS: a comprehensive toolbox for next generation sequencing analysis.

    Science.gov (United States)

    Hastreiter, Maximilian; Jeske, Tim; Hoser, Jonathan; Kluge, Michael; Ahomaa, Kaarin; Friedl, Marie-Sophie; Kopetzky, Sebastian J; Quell, Jan-Dominik; Mewes, H Werner; Küffner, Robert

    2017-05-15

    Analysis of Next Generation Sequencing (NGS) data requires the processing of large datasets by chaining various tools with complex input and output formats. In order to automate data analysis, we propose to standardize NGS tasks into modular workflows. This simplifies reliable handling and processing of NGS data, and corresponding solutions become substantially more reproducible and easier to maintain. Here, we present a documented, linux-based, toolbox of 42 processing modules that are combined to construct workflows facilitating a variety of tasks such as DNAseq and RNAseq analysis. We also describe important technical extensions. The high throughput executor (HTE) helps to increase the reliability and to reduce manual interventions when processing complex datasets. We also provide a dedicated binary manager that assists users in obtaining the modules' executables and keeping them up to date. As basis for this actively developed toolbox we use the workflow management software KNIME. See http://ibisngs.github.io/knime4ngs for nodes and user manual (GPLv3 license). robert.kueffner@helmholtz-muenchen.de. Supplementary data are available at Bioinformatics online.

  17. A Quick Guide for Building a Successful Bioinformatics Community

    Science.gov (United States)

    Budd, Aidan; Corpas, Manuel; Brazas, Michelle D.; Fuller, Jonathan C.; Goecks, Jeremy; Mulder, Nicola J.; Michaut, Magali; Ouellette, B. F. Francis; Pawlik, Aleksandra; Blomberg, Niklas

    2015-01-01

    “Scientific community” refers to a group of people collaborating together on scientific-research-related activities who also share common goals, interests, and values. Such communities play a key role in many bioinformatics activities. Communities may be linked to a specific location or institute, or involve people working at many different institutions and locations. Education and training is typically an important component of these communities, providing a valuable context in which to develop skills and expertise, while also strengthening links and relationships within the community. Scientific communities facilitate: (i) the exchange and development of ideas and expertise; (ii) career development; (iii) coordinated funding activities; (iv) interactions and engagement with professionals from other fields; and (v) other activities beneficial to individual participants, communities, and the scientific field as a whole. It is thus beneficial at many different levels to understand the general features of successful, high-impact bioinformatics communities; how individual participants can contribute to the success of these communities; and the role of education and training within these communities. We present here a quick guide to building and maintaining a successful, high-impact bioinformatics community, along with an overview of the general benefits of participating in such communities. This article grew out of contributions made by organizers, presenters, panelists, and other participants of the ISMB/ECCB 2013 workshop “The ‘How To Guide’ for Establishing a Successful Bioinformatics Network” at the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and the 12th European Conference on Computational Biology (ECCB). PMID:25654371

  18. Bioinformatics in High School Biology Curricula: A Study of State Science Standards

    Science.gov (United States)

    Wefer, Stephen H.; Sheppard, Keith

    2008-01-01

    The proliferation of bioinformatics in modern biology marks a modern revolution in science that promises to influence science education at all levels. This study analyzed secondary school science standards of 49 U.S. states (Iowa has no science framework) and the District of Columbia for content related to bioinformatics. The bioinformatics…

  19. XML schemas for common bioinformatic data types and their application in workflow systems.

    Science.gov (United States)

    Seibel, Philipp N; Krüger, Jan; Hartmeier, Sven; Schwarzer, Knut; Löwenthal, Kai; Mersch, Henning; Dandekar, Thomas; Giegerich, Robert

    2006-11-06

    Today, there is a growing need in bioinformatics to combine available software tools into chains, thus building complex applications from existing single-task tools. To create such workflows, the tools involved have to be able to work with each other's data--therefore, a common set of well-defined data formats is needed. Unfortunately, current bioinformatic tools use a great variety of heterogeneous formats. Acknowledging the need for common formats, the Helmholtz Open BioInformatics Technology network (HOBIT) identified several basic data types used in bioinformatics and developed appropriate format descriptions, formally defined by XML schemas, and incorporated them in a Java library (BioDOM). These schemas currently cover sequence, sequence alignment, RNA secondary structure and RNA secondary structure alignment formats in a form that is independent of any specific program, thus enabling seamless interoperation of different tools. All XML formats are available at http://bioschemas.sourceforge.net, the BioDOM library can be obtained at http://biodom.sourceforge.net. The HOBIT XML schemas and the BioDOM library simplify adding XML support to newly created and existing bioinformatic tools, enabling these tools to interoperate seamlessly in workflow scenarios.

  20. XML schemas for common bioinformatic data types and their application in workflow systems

    Science.gov (United States)

    Seibel, Philipp N; Krüger, Jan; Hartmeier, Sven; Schwarzer, Knut; Löwenthal, Kai; Mersch, Henning; Dandekar, Thomas; Giegerich, Robert

    2006-01-01

    Background Today, there is a growing need in bioinformatics to combine available software tools into chains, thus building complex applications from existing single-task tools. To create such workflows, the tools involved have to be able to work with each other's data – therefore, a common set of well-defined data formats is needed. Unfortunately, current bioinformatic tools use a great variety of heterogeneous formats. Results Acknowledging the need for common formats, the Helmholtz Open BioInformatics Technology network (HOBIT) identified several basic data types used in bioinformatics and developed appropriate format descriptions, formally defined by XML schemas, and incorporated them in a Java library (BioDOM). These schemas currently cover sequence, sequence alignment, RNA secondary structure and RNA secondary structure alignment formats in a form that is independent of any specific program, thus enabling seamless interoperation of different tools. All XML formats are available at , the BioDOM library can be obtained at . Conclusion The HOBIT XML schemas and the BioDOM library simplify adding XML support to newly created and existing bioinformatic tools, enabling these tools to interoperate seamlessly in workflow scenarios. PMID:17087823

  1. A Reference Viral Database (RVDB) To Enhance Bioinformatics Analysis of High-Throughput Sequencing for Novel Virus Detection.

    Science.gov (United States)

    Goodacre, Norman; Aljanahi, Aisha; Nandakumar, Subhiksha; Mikailov, Mike; Khan, Arifa S

    2018-01-01

    Detection of distantly related viruses by high-throughput sequencing (HTS) is bioinformatically challenging because of the lack of a public database containing all viral sequences, without abundant nonviral sequences, which can extend runtime and obscure viral hits. Our reference viral database (RVDB) includes all viral, virus-related, and virus-like nucleotide sequences (excluding bacterial viruses), regardless of length, and with overall reduced cellular sequences. Semantic selection criteria (SEM-I) were used to select viral sequences from GenBank, resulting in a first-generation viral database (VDB). This database was manually and computationally reviewed, resulting in refined, semantic selection criteria (SEM-R), which were applied to a new download of updated GenBank sequences to create a second-generation VDB. Viral entries in the latter were clustered at 98% by CD-HIT-EST to reduce redundancy while retaining high viral sequence diversity. The viral identity of the clustered representative sequences (creps) was confirmed by BLAST searches in NCBI databases and HMMER searches in PFAM and DFAM databases. The resulting RVDB contained a broad representation of viral families, sequence diversity, and a reduced cellular content; it includes full-length and partial sequences and endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Testing of RVDBv10.2, with an in-house HTS transcriptomic data set indicated a significantly faster run for virus detection than interrogating the entirety of the NCBI nonredundant nucleotide database, which contains all viral sequences but also nonviral sequences. RVDB is publically available for facilitating HTS analysis, particularly for novel virus detection. It is meant to be updated on a regular basis to include new viral sequences added to GenBank. IMPORTANCE To facilitate bioinformatics analysis of high-throughput sequencing (HTS) data for the detection of both known and novel viruses, we have

  2. Novel C16orf57 mutations in patients with Poikiloderma with Neutropenia: bioinformatic analysis of the protein and predicted effects of all reported mutations

    Directory of Open Access Journals (Sweden)

    Colombo Elisa A

    2012-01-01

    Full Text Available Abstract Background Poikiloderma with Neutropenia (PN is a rare autosomal recessive genodermatosis caused by C16orf57 mutations. To date 17 mutations have been identified in 31 PN patients. Results We characterize six PN patients expanding the clinical phenotype of the syndrome and the mutational repertoire of the gene. We detect the two novel C16orf57 mutations, c.232C>T and c.265+2T>G, as well as the already reported c.179delC, c.531delA and c.693+1G>T mutations. cDNA analysis evidences the presence of aberrant transcripts, and bioinformatic prediction of C16orf57 protein structure gauges the mutations effects on the folded protein chain. Computational analysis of the C16orf57 protein shows two conserved H-X-S/T-X tetrapeptide motifs marking the active site of a two-fold pseudosymmetric structure recalling the 2H phosphoesterase superfamily. Based on this model C16orf57 is likely a 2H-active site enzyme functioning in RNA processing, as a presumptive RNA ligase. According to bioinformatic prediction, all known C16orf57 mutations, including the novel mutations herein described, impair the protein structure by either removing one or both tetrapeptide motifs or by destroying the symmetry of the native folding. Finally, we analyse the geographical distribution of the recurrent mutations that depicts clusters featuring a founder effect. Conclusions In cohorts of patients clinically affected by genodermatoses with overlapping symptoms, the molecular screening of C16orf57 gene seems the proper way to address the correct diagnosis of PN, enabling the syndrome-specific oncosurveillance. The bioinformatic prediction of the C16orf57 protein structure denotes a very basic enzymatic function consistent with a housekeeping function. Detection of aberrant transcripts, also in cells from PN patients carrying early truncated mutations, suggests they might be translatable. Tissue-specific sensitivity to the lack of functionally correct protein accounts for the

  3. Implementing bioinformatic workflows within the bioextract server

    Science.gov (United States)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed servi...

  4. Bioinformatics in Middle East Program Curricula--A Focus on the Arabian Gulf

    Science.gov (United States)

    Loucif, Samia

    2014-01-01

    The purpose of this paper is to investigate the inclusion of bioinformatics in program curricula in the Middle East, focusing on educational institutions in the Arabian Gulf. Bioinformatics is a multidisciplinary field which has emerged in response to the need for efficient data storage and retrieval, and accurate and fast computational and…

  5. Quantum Bio-Informatics II From Quantum Information to Bio-Informatics

    Science.gov (United States)

    Accardi, L.; Freudenberg, Wolfgang; Ohya, Masanori

    2009-02-01

    / H. Kamimura -- Massive collection of full-length complementary DNA clones and microarray analyses: keys to rice transcriptome analysis / S. Kikuchi -- Changes of influenza A(H5) viruses by means of entropic chaos degree / K. Sato and M. Ohya -- Basics of genome sequence analysis in bioinformatics - its fundamental ideas and problems / T. Suzuki and S. Miyazaki -- A basic introduction to gene expression studies using microarray expression data analysis / D. Wanke and J. Kilian -- Integrating biological perspectives: a quantum leap for microarray expression analysis / D. Wanke ... [et al.].

  6. A Survey of Bioinformatics Database and Software Usage through Mining the Literature.

    Directory of Open Access Journals (Sweden)

    Geraint Duck

    Full Text Available Computer-based resources are central to much, if not most, biological and medical research. However, while there is an ever expanding choice of bioinformatics resources to use, described within the biomedical literature, little work to date has provided an evaluation of the full range of availability or levels of usage of database and software resources. Here we use text mining to process the PubMed Central full-text corpus, identifying mentions of databases or software within the scientific literature. We provide an audit of the resources contained within the biomedical literature, and a comparison of their relative usage, both over time and between the sub-disciplines of bioinformatics, biology and medicine. We find that trends in resource usage differs between these domains. The bioinformatics literature emphasises novel resource development, while database and software usage within biology and medicine is more stable and conservative. Many resources are only mentioned in the bioinformatics literature, with a relatively small number making it out into general biology, and fewer still into the medical literature. In addition, many resources are seeing a steady decline in their usage (e.g., BLAST, SWISS-PROT, though some are instead seeing rapid growth (e.g., the GO, R. We find a striking imbalance in resource usage with the top 5% of resource names (133 names accounting for 47% of total usage, and over 70% of resources extracted being only mentioned once each. While these results highlight the dynamic and creative nature of bioinformatics research they raise questions about software reuse, choice and the sharing of bioinformatics practice. Is it acceptable that so many resources are apparently never reused? Finally, our work is a step towards automated extraction of scientific method from text. We make the dataset generated by our study available under the CC0 license here: http://dx.doi.org/10.6084/m9.figshare.1281371.

  7. pocketZebra: a web-server for automated selection and classification of subfamily-specific binding sites by bioinformatic analysis of diverse protein families.

    Science.gov (United States)

    Suplatov, Dmitry; Kirilin, Eugeny; Arbatsky, Mikhail; Takhaveev, Vakil; Svedas, Vytas

    2014-07-01

    The new web-server pocketZebra implements the power of bioinformatics and geometry-based structural approaches to identify and rank subfamily-specific binding sites in proteins by functional significance, and select particular positions in the structure that determine selective accommodation of ligands. A new scoring function has been developed to annotate binding sites by the presence of the subfamily-specific positions in diverse protein families. pocketZebra web-server has multiple input modes to meet the needs of users with different experience in bioinformatics. The server provides on-site visualization of the results as well as off-line version of the output in annotated text format and as PyMol sessions ready for structural analysis. pocketZebra can be used to study structure-function relationship and regulation in large protein superfamilies, classify functionally important binding sites and annotate proteins with unknown function. The server can be used to engineer ligand-binding sites and allosteric regulation of enzymes, or implemented in a drug discovery process to search for potential molecular targets and novel selective inhibitors/effectors. The server, documentation and examples are freely available at http://biokinet.belozersky.msu.ru/pocketzebra and there are no login requirements. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  8. Differential Expression of Proteins Associated with the Hair Follicle Cycle - Proteomics and Bioinformatics Analyses.

    Directory of Open Access Journals (Sweden)

    Lei Wang

    Full Text Available Hair follicle cycling can be divided into the following three stages: anagen, catagen, and telogen. The molecular signals that orchestrate the follicular transition between phases are still unknown. To better understand the detailed protein networks controlling this process, proteomics and bioinformatics analyses were performed to construct comparative protein profiles of mouse skin at specific time points (0, 8, and 20 days. Ninety-five differentially expressed protein spots were identified by MALDI-TOF/TOF as 44 proteins, which were found to change during hair follicle cycle transition. Proteomics analysis revealed that these changes in protein expression are involved in Ca2+-regulated biological processes, migration, and regulation of signal transduction, among other processes. Subsequently, three proteins were selected to validate the reliability of expression patterns using western blotting. Cluster analysis revealed three expression patterns, and each pattern correlated with specific cell processes that occur during the hair cycle. Furthermore, bioinformatics analysis indicated that the differentially expressed proteins impacted multiple biological networks, after which detailed functional analyses were performed. Taken together, the above data may provide insight into the three stages of mouse hair follicle morphogenesis and provide a solid basis for potential therapeutic molecular targets for this hair disease.

  9. Using registries to integrate bioinformatics tools and services into workbench environments

    DEFF Research Database (Denmark)

    Ménager, Hervé; Kalaš, Matúš; Rapacki, Kristoffer

    2016-01-01

    The diversity and complexity of bioinformatics resources presents significant challenges to their localisation, deployment and use, creating a need for reliable systems that address these issues. Meanwhile, users demand increasingly usable and integrated ways to access and analyse data, especially......, a software component that will ease the integration of bioinformatics resources in a workbench environment, using their description provided by the existing ELIXIR Tools and Data Services Registry....

  10. A new clan of CBM families based on bioinformatics of starch-binding domains from families CBM20 and CBM21

    DEFF Research Database (Denmark)

    Marhovic, M.; Svensson, Birte; MacGregor, E. A.

    2005-01-01

    many nonamylolytic proteins have been recognized as possessing sequence segments that exhibit similarities with the experimentally observed CBM20 and CBM21. These facts have stimulated interest in carrying out a rigorous bioinformatics analysis of the two CBM families. The present analysis showed...

  11. Comprehensive School Reform and Achievement: A Meta-Analysis

    Science.gov (United States)

    Borman, Geoffrey D.; Hewes, Gina M.; Overman, Laura T.; Brown, Shelly

    2003-01-01

    This meta-analysis reviews research on the achievement effects of comprehensive school reform (CSR) and summarizes the specific effects of 29 widely implemented models. There are limitations on the overall quantity and quality of the research base, but the overall effects of CSR appear promising. The combined quantity, quality, and statistical…

  12. Comprehensive benefit analysis of regional water resources based on multi-objective evaluation

    Science.gov (United States)

    Chi, Yixia; Xue, Lianqing; Zhang, Hui

    2018-01-01

    The purpose of the water resources comprehensive benefits analysis is to maximize the comprehensive benefits on the aspects of social, economic and ecological environment. Aiming at the defects of the traditional analytic hierarchy process in the evaluation of water resources, it proposed a comprehensive benefit evaluation of social, economic and environmental benefits index from the perspective of water resources comprehensive benefit in the social system, economic system and environmental system; determined the index weight by the improved fuzzy analytic hierarchy process (AHP), calculated the relative index of water resources comprehensive benefit and analyzed the comprehensive benefit of water resources in Xiangshui County by the multi-objective evaluation model. Based on the water resources data in Xiangshui County, 20 main comprehensive benefit assessment factors of 5 districts belonged to Xiangshui County were evaluated. The results showed that the comprehensive benefit of Xiangshui County was 0.7317, meanwhile the social economy has a further development space in the current situation of water resources.

  13. A comparison of common programming languages used in bioinformatics.

    Science.gov (United States)

    Fourment, Mathieu; Gillings, Michael R

    2008-02-05

    The performance of different programming languages has previously been benchmarked using abstract mathematical algorithms, but not using standard bioinformatics algorithms. We compared the memory usage and speed of execution for three standard bioinformatics methods, implemented in programs using one of six different programming languages. Programs for the Sellers algorithm, the Neighbor-Joining tree construction algorithm and an algorithm for parsing BLAST file outputs were implemented in C, C++, C#, Java, Perl and Python. Implementations in C and C++ were fastest and used the least memory. Programs in these languages generally contained more lines of code. Java and C# appeared to be a compromise between the flexibility of Perl and Python and the fast performance of C and C++. The relative performance of the tested languages did not change from Windows to Linux and no clear evidence of a faster operating system was found. Source code and additional information are available from http://www.bioinformatics.org/benchmark/. This benchmark provides a comparison of six commonly used programming languages under two different operating systems. The overall comparison shows that a developer should choose an appropriate language carefully, taking into account the performance expected and the library availability for each language.

  14. Bioinformatics process management: information flow via a computational journal

    Directory of Open Access Journals (Sweden)

    Lushington Gerald

    2007-12-01

    Full Text Available Abstract This paper presents the Bioinformatics Computational Journal (BCJ, a framework for conducting and managing computational experiments in bioinformatics and computational biology. These experiments often involve series of computations, data searches, filters, and annotations which can benefit from a structured environment. Systems to manage computational experiments exist, ranging from libraries with standard data models to elaborate schemes to chain together input and output between applications. Yet, although such frameworks are available, their use is not widespread–ad hoc scripts are often required to bind applications together. The BCJ explores another solution to this problem through a computer based environment suitable for on-site use, which builds on the traditional laboratory notebook paradigm. It provides an intuitive, extensible paradigm designed for expressive composition of applications. Extensive features facilitate sharing data, computational methods, and entire experiments. By focusing on the bioinformatics and computational biology domain, the scope of the computational framework was narrowed, permitting us to implement a capable set of features for this domain. This report discusses the features determined critical by our system and other projects, along with design issues. We illustrate the use of our implementation of the BCJ on two domain-specific examples.

  15. [Comparison of gut microbiotal compositional analysis of patients with irritable bowel syndrome through different bioinformatics pipelines].

    Science.gov (United States)

    Zhu, S W; Liu, Z J; Li, M; Zhu, H Q; Duan, L P

    2018-04-18

    To assess whether the same biological conclusion, diagnostic or curative effects regarding microbial composition of irritable bowel syndrome (IBS) patients could be reached through different bioinformatics pipelines, we used two common bioinformatics pipelines (Uparse V2.0 and Mothur V1.39.5)to analyze the same fecal microbial 16S rRNA high-throughput sequencing data. The two pipelines were used to analyze the diversity and richness of fecal microbial 16S rRNA high-throughput sequencing data of 27 samples, including 9 healthy controls (HC group), 9 diarrhea IBS patients before (IBS group) and after Rifaximin treatment (IBS-treatment, IBSt group). Analyses such as microbial diversity, principal co-ordinates analysis (PCoA), nonmetric multidimensional scaling (NMDS) and linear discriminant analysis effect size (LEfSe) were used to find out the microbial differences among HC group vs. IBS group and IBS group vs. IBSt group. (1) Microbial composition comparison of the 27 samples in the two pipelines showed significant variations at both family and genera levels while no significant variations at phylum level; (2) There was no significant difference in the comparison of HC vs. IBS or IBS vs. IBSt (Uparse: HC vs. IBS, F=0.98, P=0.445; IBS vs. IBSt, F=0.47,P=0.926; Mothur: HC vs.IBS, F=0.82, P=0.646; IBS vs. IBSt, F=0.37, P=0.961). The Shannon index was significantly decreased in IBSt; (3) Both workshops distinguished the significantly enriched genera between HC and IBS groups. For example, Nitrosomonas and Paraprevotella increased while Pseudoalteromonadaceae and Anaerotruncus decreased in HC group through Uparse pipeline, nevertheless Roseburia 62 increased while Butyricicoccus and Moraxellaceae decreased in HC group through Mothur pipeline.Only Uparse pipeline could pick out significant genera between IBS and IBSt, such as Pseudobutyricibrio, Clostridiaceae 1 and Clostridiumsensustricto 1. There were taxonomic and phylogenetic diversity differences between the two

  16. Bioinformatics analysis of the predicted polyprenol reductase genes in higher plants

    Science.gov (United States)

    Basyuni, M.; Wati, R.

    2018-03-01

    The present study evaluates the bioinformatics methods to analyze twenty-four predicted polyprenol reductase genes from higher plants on GenBank as well as predicted the structure, composition, similarity, subcellular localization, and phylogenetic. The physicochemical properties of plant polyprenol showed diversity among the observed genes. The percentage of the secondary structure of plant polyprenol genes followed the ratio order of α helix > random coil > extended chain structure. The values of chloroplast but not signal peptide were too low, indicated that few chloroplast transit peptide in plant polyprenol reductase genes. The possibility of the potential transit peptide showed variation among the plant polyprenol reductase, suggested the importance of understanding the variety of peptide components of plant polyprenol genes. To clarify this finding, a phylogenetic tree was drawn. The phylogenetic tree shows several branches in the tree, suggested that plant polyprenol reductase genes grouped into divergent clusters in the tree.

  17. The SIB Swiss Institute of Bioinformatics' resources: focus on curated databases

    OpenAIRE

    Bultet, Lisandra Aguilar; Aguilar Rodriguez, Jose; Ahrens, Christian H; Ahrne, Erik Lennart; Ai, Ni; Aimo, Lucila; Akalin, Altuna; Aleksiev, Tyanko; Alocci, Davide; Altenhoff, Adrian; Alves, Isabel; Ambrosini, Giovanna; Pedone, Pascale Anderle; Angelina, Paolo; Anisimova, Maria

    2016-01-01

    The SIB Swiss Institute of Bioinformatics (www.isb-sib.ch) provides world-class bioinformatics databases, software tools, services and training to the international life science community in academia and industry. These solutions allow life scientists to turn the exponentially growing amount of data into knowledge. Here, we provide an overview of SIB's resources and competence areas, with a strong focus on curated databases and SIB's most popular and widely used resources. In particular, SIB'...

  18. Rough-fuzzy pattern recognition applications in bioinformatics and medical imaging

    CERN Document Server

    Maji, Pradipta

    2012-01-01

    Learn how to apply rough-fuzzy computing techniques to solve problems in bioinformatics and medical image processing Emphasizing applications in bioinformatics and medical image processing, this text offers a clear framework that enables readers to take advantage of the latest rough-fuzzy computing techniques to build working pattern recognition models. The authors explain step by step how to integrate rough sets with fuzzy sets in order to best manage the uncertainties in mining large data sets. Chapters are logically organized according to the major phases of pattern recognition systems dev

  19. Prokaryotic Expression of Rice Ospgip1 Gene and Bioinformatic Analysis of Encoded Product

    Directory of Open Access Journals (Sweden)

    Xi-jun CHEN

    2011-12-01

    Full Text Available Using the reference sequences of pgip genes in GenBank, a fragment of 930 bp covering the open reading frame (ORF of rice Ospgip1 (Oryza sativa polygalacturonase-inhibiting protein 1 was amplified. The prokaryotic expression product of the gene inhibited the growth of Rhizoctonia solani, the causal agent of rice sheath blight, and reduced its polygalacturonase activity. Bioinformatic analysis showed that OsPGIP1 is a hydrophobic protein with a molecular weight of 32.8 kDa and an isoelectric point (pI of 7.26. The protein is mainly located in the cell wall of rice, and its signal peptide cleavage site is located between the 17th and 18th amino acids. There are four cysteines in both the N- and C-termini of the deduced protein, which can form three disulfide bonds (between the 56th and 63rd, the 278th and 298th, and the 300th and 308th amino acids. The protein has a typical leucine-rich repeat (LRR domain, and its secondary structure comprises α-helices, β-sheets and irregular coils. Compared with polygalacturonase-inhibiting proteins (PGIPs from other plants, the 7th LRR is absent in OsPGIP1. The nine LRRs could form a cleft that might associate with proteins from pathogenic fungi, such as polygalacturonase.

  20. Best practices in bioinformatics training for life scientists.

    KAUST Repository

    Via, Allegra; Blicher, Thomas; Bongcam-Rudloff, Erik; Brazas, Michelle D; Brooksbank, Cath; Budd, Aidan; De Las Rivas, Javier; Dreyer, Jacqueline; Fernandes, Pedro L; van Gelder, Celia; Jacob, Joachim; Jimenez, Rafael C; Loveland, Jane; Moran, Federico; Mulder, Nicola; Nyrö nen, Tommi; Rother, Kristian; Schneider, Maria Victoria; Attwood, Teresa K

    2013-01-01

    concepts. Providing bioinformatics training to empower life scientists to handle and analyse their data efficiently, and progress their research, is a challenge across the globe. Delivering good training goes beyond traditional lectures and resource

  1. Identification of key genes and pathways associated with neuropathic pain in uninjured dorsal root ganglion by using bioinformatic analysis

    Directory of Open Access Journals (Sweden)

    Chen CJ

    2017-11-01

    Full Text Available Chao-Jin Chen,* De-Zhao Liu,* Wei-Feng Yao, Yu Gu, Fei Huang, Zi-Qing Hei, Xiang Li Department of Anesthesiology, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, People’s Republic of China *These authors contributed equally to this work Purpose: Neuropathic pain is a complex chronic condition occurring post-nervous system damage. The transcriptional reprogramming of injured dorsal root ganglia (DRGs drives neuropathic pain. However, few comparative analyses using high-throughput platforms have investigated uninjured DRG in neuropathic pain, and potential interactions among differentially expressed genes (DEGs and pathways were not taken into consideration. The aim of this study was to identify changes in genes and pathways associated with neuropathic pain in uninjured L4 DRG after L5 spinal nerve ligation (SNL by using bioinformatic analysis.Materials and methods: The microarray profile GSE24982 was downloaded from the Gene Expression Omnibus database to identify DEGs between DRGs in SNL and sham rats. The prioritization for these DEGs was performed using the Toppgene database followed by gene ontology and pathway enrichment analyses. The relationships among DEGs from the protein interactive perspective were analyzed using protein–protein interaction (PPI network and module analysis. Real-time polymerase chain reaction (PCR and Western blotting were used to confirm the expression of DEGs in the rodent neuropathic pain model.Results: A total of 206 DEGs that might play a role in neuropathic pain were identified in L4 DRG, of which 75 were upregulated and 131 were downregulated. The upregulated DEGs were enriched in biological processes related to transcription regulation and molecular functions such as DNA binding, cell cycle, and the FoxO signaling pathway. Ctnnb1 protein had the highest connectivity degrees in the PPI network. The in vivo studies also validated that mRNA and protein levels of Ctnnb1 were

  2. Privacy Preserving PCA on Distributed Bioinformatics Datasets

    Science.gov (United States)

    Li, Xin

    2011-01-01

    In recent years, new bioinformatics technologies, such as gene expression microarray, genome-wide association study, proteomics, and metabolomics, have been widely used to simultaneously identify a huge number of human genomic/genetic biomarkers, generate a tremendously large amount of data, and dramatically increase the knowledge on human…

  3. Engaging Students in a Bioinformatics Activity to Introduce Gene Structure and Function

    Directory of Open Access Journals (Sweden)

    Barbara J. May

    2013-02-01

    Full Text Available Bioinformatics spans many fields of biological research and plays a vital role in mining and analyzing data. Therefore, there is an ever-increasing need for students to understand not only what can be learned from this data, but also how to use basic bioinformatics tools.  This activity is designed to provide secondary and undergraduate biology students to a hands-on activity meant to explore and understand gene structure with the use of basic bioinformatic tools.  Students are provided an “unknown” sequence from which they are asked to use a free online gene finder program to identify the gene. Students then predict the putative function of this gene with the use of additional online databases.

  4. Rise and demise of bioinformatics? Promise and progress.

    Directory of Open Access Journals (Sweden)

    Christos A Ouzounis

    Full Text Available The field of bioinformatics and computational biology has gone through a number of transformations during the past 15 years, establishing itself as a key component of new biology. This spectacular growth has been challenged by a number of disruptive changes in science and technology. Despite the apparent fatigue of the linguistic use of the term itself, bioinformatics has grown perhaps to a point beyond recognition. We explore both historical aspects and future trends and argue that as the field expands, key questions remain unanswered and acquire new meaning while at the same time the range of applications is widening to cover an ever increasing number of biological disciplines. These trends appear to be pointing to a redefinition of certain objectives, milestones, and possibly the field itself.

  5. Meeting review: 2002 O'Reilly Bioinformatics Technology Conference.

    Science.gov (United States)

    Counsell, Damian

    2002-01-01

    At the end of January I travelled to the States to speak at and attend the first O'Reilly Bioinformatics Technology Conference. It was a large, well-organized and diverse meeting with an interesting history. Although the meeting was not a typical academic conference, its style will, I am sure, become more typical of meetings in both biological and computational sciences.Speakers at the event included prominent bioinformatics researchers such as Ewan Birney, Terry Gaasterland and Lincoln Stein; authors and leaders in the open source programming community like Damian Conway and Nat Torkington; and representatives from several publishing companies including the Nature Publishing Group, Current Science Group and the President of O'Reilly himself, Tim O'Reilly. There were presentations, tutorials, debates, quizzes and even a 'jam session' for musical bioinformaticists.

  6. Bioinformatics: Cheap and robust method to explore biomaterial from Indonesia biodiversity

    Science.gov (United States)

    Widodo

    2015-02-01

    Indonesia has a huge amount of biodiversity, which may contain many biomaterials for pharmaceutical application. These resources potency should be explored to discover new drugs for human wealth. However, the bioactive screening using conventional methods is very expensive and time-consuming. Therefore, we developed a methodology for screening the potential of natural resources based on bioinformatics. The method is developed based on the fact that organisms in the same taxon will have similar genes, metabolism and secondary metabolites product. Then we employ bioinformatics to explore the potency of biomaterial from Indonesia biodiversity by comparing species with the well-known taxon containing the active compound through published paper or chemical database. Then we analyze drug-likeness, bioactivity and the target proteins of the active compound based on their molecular structure. The target protein was examined their interaction with other proteins in the cell to determine action mechanism of the active compounds in the cellular level, as well as to predict its side effects and toxicity. By using this method, we succeeded to screen anti-cancer, immunomodulators and anti-inflammation from Indonesia biodiversity. For example, we found anticancer from marine invertebrate by employing the method. The anti-cancer was explore based on the isolated compounds of marine invertebrate from published article and database, and then identified the protein target, followed by molecular pathway analysis. The data suggested that the active compound of the invertebrate able to kill cancer cell. Further, we collect and extract the active compound from the invertebrate, and then examined the activity on cancer cell (MCF7). The MTT result showed that the methanol extract of marine invertebrate was highly potent in killing MCF7 cells. Therefore, we concluded that bioinformatics is cheap and robust way to explore bioactive from Indonesia biodiversity for source of drug and another

  7. A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines

    Directory of Open Access Journals (Sweden)

    Cieślik Marcin

    2011-02-01

    Full Text Available Abstract Background Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. Results To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'. A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, all flowing one to tune the trade-off between parallelism and lazy-evaluation (memory consumption. An add-on module ('NuBio' facilitates the creation of bioinformatics workflows by providing domain specific data-containers (e.g., for biomolecular sequences, alignments, structures and functionality (e.g., to parse/write standard file formats. Conclusions PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and

  8. Identification of Missense Mutation (I12T in the BSND Gene and Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Hina Iqbal

    2011-01-01

    Full Text Available Nonsyndromic hearing loss is a paradigm of genetic heterogeneity with 85 loci and 39 nuclear disease genes reported so far. Mutations of BSND have been shown to cause Bartter syndrome type IV, characterized by significant renal abnormalities and deafness and nonsyndromic nearing loss. We studied a Pakistani consanguineous family. Clinical examinations of affected individuals did not reveal the presence of any associated signs, which are hallmarks of the Bartter syndrome type IV. Linkage analysis identified an area of 18.36 Mb shared by all affected individuals between markers D1S2706 and D1S1596. A maximum two-point LOD score of 2.55 with markers D1S2700 and multipoint LOD score of 3.42 with marker D1S1661 were obtained. BSND mutation, that is, p.I12T, cosegregated in all extant members of our pedigree. BSND mutations can cause nonsyndromic hearing loss, and it is a second report for this mutation. The respected protein, that is, BSND, was first modeled, and then, the identified mutation was further analyzed by using different bioinformatics tools; finally, this protein and its mutant was docked with CLCNKB and REN, interactions of BSND, respectively.

  9. An "in silico" Bioinformatics Laboratory Manual for Bioscience Departments: "Prediction of Glycosylation Sites in Phosphoethanolamine Transferases"

    Science.gov (United States)

    Alyuruk, Hakan; Cavas, Levent

    2014-01-01

    Genomics and proteomics projects have produced a huge amount of raw biological data including DNA and protein sequences. Although these data have been stored in data banks, their evaluation is strictly dependent on bioinformatics tools. These tools have been developed by multidisciplinary experts for fast and robust analysis of biological data.…

  10. RISE OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY IN INDIA: A LOOK THROUGH PUBLICATIONS

    Directory of Open Access Journals (Sweden)

    Anjali Srivastava

    2017-09-01

    Full Text Available Computational biology and bioinformatics have been part and parcel of biomedical research for few decades now. However, the institutionalization of bioinformatics research took place with the establishment of Distributed Information Centres (DISCs in the research institutions of repute in various disciplines by the Department of Biotechnology, Government of India. Though, at initial stages, this endeavor was mainly focused on providing infrastructure for using information technology and internet based communication and tools for carrying out computational biology and in-silico assisted research in varied arena of research starting from disease biology to agricultural crops, spices, veterinary science and many more, the natural outcome of establishment of such facilities resulted into new experiments with bioinformatics tools. Thus, Biotechnology Information Systems (BTIS grew into a solid movement and a large number of publications started coming out of these centres. In the end of last century, bioinformatics started developing like a full-fledged research subject. In the last decade, a need was felt to actually make a factual estimation of the result of this endeavor of DBT which had, by then, established about two hundred centres in almost all disciplines of biomedical research. In a bid to evaluate the efforts and outcome of these centres, BTIS Centre at CSIR-CDRI, Lucknow was entrusted with collecting and collating the publications of these centres. However, when the full data was compiled, the DBT task force felt that the study must include Non-BTIS centres also so as to expand the report to have a glimpse of bioinformatics publications from the country.

  11. Role of remote sensing, geographical information system (GIS) and bioinformatics in kala-azar epidemiology.

    Science.gov (United States)

    Bhunia, Gouri Sankar; Dikhit, Manas Ranjan; Kesari, Shreekant; Sahoo, Ganesh Chandra; Das, Pradeep

    2011-11-01

    Visceral leishmaniasis or kala-azar is a potent parasitic infection causing death of thousands of people each year. Medicinal compounds currently available for the treatment of kala-azar have serious side effects and decreased efficacy owing to the emergence of resistant strains. The type of immune reaction is also to be considered in patients infected with Leishmania donovani (L. donovani). For complete eradication of this disease, a high level modern research is currently being applied both at the molecular level as well as at the field level. The computational approaches like remote sensing, geographical information system (GIS) and bioinformatics are the key resources for the detection and distribution of vectors, patterns, ecological and environmental factors and genomic and proteomic analysis. Novel approaches like GIS and bioinformatics have been more appropriately utilized in determining the cause of visearal leishmaniasis and in designing strategies for preventing the disease from spreading from one region to another.

  12. A review of bioinformatics training applied to research in molecular medicine, agriculture and biodiversity in Costa Rica and Central America.

    Science.gov (United States)

    Orozco, Allan; Morera, Jessica; Jiménez, Sergio; Boza, Ricardo

    2013-09-01

    Today, Bioinformatics has become a scientific discipline with great relevance for the Molecular Biosciences and for the Omics sciences in general. Although developed countries have progressed with large strides in Bioinformatics education and research, in other regions, such as Central America, the advances have occurred in a gradual way and with little support from the Academia, either at the undergraduate or graduate level. To address this problem, the University of Costa Rica's Medical School, a regional leader in Bioinformatics in Central America, has been conducting a series of Bioinformatics workshops, seminars and courses, leading to the creation of the region's first Bioinformatics Master's Degree. The recent creation of the Central American Bioinformatics Network (BioCANET), associated to the deployment of a supporting computational infrastructure (HPC Cluster) devoted to provide computing support for Molecular Biology in the region, is providing a foundational stone for the development of Bioinformatics in the area. Central American bioinformaticians have participated in the creation of as well as co-founded the Iberoamerican Bioinformatics Society (SOIBIO). In this article, we review the most recent activities in education and research in Bioinformatics from several regional institutions. These activities have resulted in further advances for Molecular Medicine, Agriculture and Biodiversity research in Costa Rica and the rest of the Central American countries. Finally, we provide summary information on the first Central America Bioinformatics International Congress, as well as the creation of the first Bioinformatics company (Indromics Bioinformatics), spin-off the Academy in Central America and the Caribbean.

  13. Bioinformatics: A History of Evolution "In Silico"

    Science.gov (United States)

    Ondrej, Vladan; Dvorak, Petr

    2012-01-01

    Bioinformatics, biological databases, and the worldwide use of computers have accelerated biological research in many fields, such as evolutionary biology. Here, we describe a primer of nucleotide sequence management and the construction of a phylogenetic tree with two examples; the two selected are from completely different groups of organisms:…

  14. Protein raftophilicity. How bioinformatics can help membranologists

    DEFF Research Database (Denmark)

    Nielsen, Henrik; Sperotto, Maria Maddalena

    )-based bioinformatics approach. The ANN was trained to recognize feature-based patterns in proteins that are considered to be associated with lipid rafts. The trained ANN was then used to predict protein raftophilicity. We found that, in the case of α-helical membrane proteins, their hydrophobic length does not affect...

  15. Development and implementation of a bioinformatics online ...

    African Journals Online (AJOL)

    Thus, there is the need for appropriate strategies of introducing the basic components of this emerging scientific field to part of the African populace through the development of an online distance education learning tool. This study involved the design of a bioinformatics online distance educative tool an implementation of ...

  16. Supercomputing with toys: harnessing the power of NVIDIA 8800GTX and playstation 3 for bioinformatics problem.

    Science.gov (United States)

    Wilson, Justin; Dai, Manhong; Jakupovic, Elvis; Watson, Stanley; Meng, Fan

    2007-01-01

    Modern video cards and game consoles typically have much better performance to price ratios than that of general purpose CPUs. The parallel processing capabilities of game hardware are well-suited for high throughput biomedical data analysis. Our initial results suggest that game hardware is a cost-effective platform for some computationally demanding bioinformatics problems.

  17. Computational biology of genome expression and regulation--a review of microarray bioinformatics.

    Science.gov (United States)

    Wang, Junbai

    2008-01-01

    Microarray technology is being used widely in various biomedical research areas; the corresponding microarray data analysis is an essential step toward the best utilizing of array technologies. Here we review two components of the microarray data analysis: a low level of microarray data analysis that emphasizes the designing, the quality control, and the preprocessing of microarray experiments, then a high level of microarray data analysis that focuses on the domain-specific microarray applications such as tumor classification, biomarker prediction, analyzing array CGH experiments, and reverse engineering of gene expression networks. Additionally, we will review the recent development of building a predictive model in genome expression and regulation studies. This review may help biologists grasp a basic knowledge of microarray bioinformatics as well as its potential impact on the future evolvement of biomedical research fields.

  18. Molecular Cloning, Bioinformatic Analysis, and Expression of Bombyx mori Lebocin 5 Gene Related to Beauveria bassiana Infection.

    Science.gov (United States)

    Lü, Dingding; Hou, Chengxiang; Qin, Guangxing; Gao, Kun; Chen, Tian; Guo, Xijie

    2017-01-01

    A full-length cDNA of lebocin 5 (BmLeb5) was first cloned from silkworm, Bombyx mori , by rapid amplification of cDNA ends. The BmLeb5 gene is 808 bp in length and the open reading frame encodes a 179-amino acid hydroxyproline-rich peptide. Bioinformatic analysis results showed that BmLeb5 owns an O-glycosylation site and four RXXR motifs as other lebocins. Sequence similarity and phylogenic analysis results indicated that lebocins form a multiple gene family in silkworm as cecropins. Quantitative real-time PCR analysis revealed that BmLeb5 was highest expressed in the fat body. In the silkworm larvae infected by Beauveria bassiana , the expression level of BmLeb5 was upregulated in the fat body and hemolymph which are the most important immune tissues in silkworm. The recombinant protein of BmLeb5 was for the first time successfully expressed with prokaryotic expression system and purified. There are no reports so far that the expression of lebocins could be induced by entomopathogenic fungus. Our study suggested that BmLeb5 might play an important role in the immune response of silkworm to defend B. bassiana infection. The results also provided helpful information for further studying the lebocin family functioned in antifungal immune response in the silkworm.

  19. Bioinformatics education in high school: implications for promoting science, technology, engineering, and mathematics careers.

    Science.gov (United States)

    Kovarik, Dina N; Patterson, Davis G; Cohen, Carolyn; Sanders, Elizabeth A; Peterson, Karen A; Porter, Sandra G; Chowning, Jeanne Ting

    2013-01-01

    We investigated the effects of our Bio-ITEST teacher professional development model and bioinformatics curricula on cognitive traits (awareness, engagement, self-efficacy, and relevance) in high school teachers and students that are known to accompany a developing interest in science, technology, engineering, and mathematics (STEM) careers. The program included best practices in adult education and diverse resources to empower teachers to integrate STEM career information into their classrooms. The introductory unit, Using Bioinformatics: Genetic Testing, uses bioinformatics to teach basic concepts in genetics and molecular biology, and the advanced unit, Using Bioinformatics: Genetic Research, utilizes bioinformatics to study evolution and support student research with DNA barcoding. Pre-post surveys demonstrated significant growth (n = 24) among teachers in their preparation to teach the curricula and infuse career awareness into their classes, and these gains were sustained through the end of the academic year. Introductory unit students (n = 289) showed significant gains in awareness, relevance, and self-efficacy. While these students did not show significant gains in engagement, advanced unit students (n = 41) showed gains in all four cognitive areas. Lessons learned during Bio-ITEST are explored in the context of recommendations for other programs that wish to increase student interest in STEM careers.

  20. A BIOINFORMATIC STRATEGY TO RAPIDLY CHARACTERIZE CDNA LIBRARIES

    Science.gov (United States)

    A Bioinformatic Strategy to Rapidly Characterize cDNA LibrariesG. Charles Ostermeier1, David J. Dix2 and Stephen A. Krawetz1.1Departments of Obstetrics and Gynecology, Center for Molecular Medicine and Genetics, & Institute for Scientific Computing, Wayne State Univer...

  1. Exploring Cystic Fibrosis Using Bioinformatics Tools: A Module Designed for the Freshman Biology Course

    Science.gov (United States)

    Zhang, Xiaorong

    2011-01-01

    We incorporated a bioinformatics component into the freshman biology course that allows students to explore cystic fibrosis (CF), a common genetic disorder, using bioinformatics tools and skills. Students learn about CF through searching genetic databases, analyzing genetic sequences, and observing the three-dimensional structures of proteins…

  2. PRAPI: post-transcriptional regulation analysis pipeline for Iso-Seq.

    Science.gov (United States)

    Gao, Yubang; Wang, Huiyuan; Zhang, Hangxiao; Wang, Yongsheng; Chen, Jinfeng; Gu, Lianfeng

    2018-05-01

    The single-molecule real-time (SMRT) isoform sequencing (Iso-Seq) based on Pacific Bioscience (PacBio) platform has received increasing attention for its ability to explore full-length isoforms. Thus, comprehensive tools for Iso-Seq bioinformatics analysis are extremely useful. Here, we present a one-stop solution for Iso-Seq analysis, called PRAPI to analyze alternative transcription initiation (ATI), alternative splicing (AS), alternative cleavage and polyadenylation (APA), natural antisense transcripts (NAT), and circular RNAs (circRNAs) comprehensively. PRAPI is capable of combining Iso-Seq full-length isoforms with short read data, such as RNA-Seq or polyadenylation site sequencing (PAS-seq) for differential expression analysis of NAT, AS, APA and circRNAs. Furthermore, PRAPI can annotate new genes and correct mis-annotated genes when gene annotation is available. Finally, PRAPI generates high-quality vector graphics to visualize and highlight the Iso-Seq results. The Dockerfile of PRAPI is available at http://www.bioinfor.org/tool/PRAPI. lfgu@fafu.edu.cn.

  3. Functional and shape data analysis

    CERN Document Server

    Srivastava, Anuj

    2016-01-01

    This textbook for courses on function data analysis and shape data analysis describes how to define, compare, and mathematically represent shapes, with a focus on statistical modeling and inference. It is aimed at graduate students in analysis in statistics, engineering, applied mathematics, neuroscience, biology, bioinformatics, and other related areas. The interdisciplinary nature of the broad range of ideas covered—from introductory theory to algorithmic implementations and some statistical case studies—is meant to familiarize graduate students with an array of tools that are relevant in developing computational solutions for shape and related analyses. These tools, gleaned from geometry, algebra, statistics, and computational science, are traditionally scattered across different courses, departments, and disciplines; Functional and Shape Data Analysis offers a unified, comprehensive solution by integrating the registration problem into shape analysis, better preparing graduate students for handling fu...

  4. Effect of Wnt3a on Keratinocytes Utilizing in Vitro and Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Ju-Suk Nam

    2014-03-01

    Full Text Available Wingless-type (Wnt signaling proteins participate in various cell developmental processes. A suppressive role of Wnt5a on keratinocyte growth has already been observed. However, the role of other Wnt proteins in proliferation and differentiation of keratinocytes remains unknown. Here, we investigated the effects of the Wnt ligand, Wnt3a, on proliferation and differentiation of keratinocytes. Keratinocytes from normal human skin were cultured and treated with recombinant Wnt3a alone or in combination with the inflammatory cytokine, tumor necrosis factor α (TNFα. Furthermore, using bioinformatics, we analyzed the biochemical parameters, molecular evolution, and protein–protein interaction network for the Wnt family. Application of recombinant Wnt3a showed an anti-proliferative effect on keratinocytes in a dose-dependent manner. After treatment with TNFα, Wnt3a still demonstrated an anti-proliferative effect on human keratinocytes. Exogenous treatment of Wnt3a was unable to alter mRNA expression of differentiation markers of keratinocytes, whereas an altered expression was observed in TNFα-stimulated keratinocytes. In silico phylogenetic, biochemical, and protein–protein interaction analysis showed several close relationships among the family members of the Wnt family. Moreover, a close phylogenetic and biochemical similarity was observed between Wnt3a and Wnt5a. Finally, we proposed a hypothetical mechanism to illustrate how the Wnt3a protein may inhibit the process of proliferation in keratinocytes, which would be useful for future researchers.

  5. KBWS: an EMBOSS associated package for accessing bioinformatics web services.

    Science.gov (United States)

    Oshita, Kazuki; Arakawa, Kazuharu; Tomita, Masaru

    2011-04-29

    The availability of bioinformatics web-based services is rapidly proliferating, for their interoperability and ease of use. The next challenge is in the integration of these services in the form of workflows, and several projects are already underway, standardizing the syntax, semantics, and user interfaces. In order to deploy the advantages of web services with locally installed tools, here we describe a collection of proxy client tools for 42 major bioinformatics web services in the form of European Molecular Biology Open Software Suite (EMBOSS) UNIX command-line tools. EMBOSS provides sophisticated means for discoverability and interoperability for hundreds of tools, and our package, named the Keio Bioinformatics Web Service (KBWS), adds functionalities of local and multiple alignment of sequences, phylogenetic analyses, and prediction of cellular localization of proteins and RNA secondary structures. This software implemented in C is available under GPL from http://www.g-language.org/kbws/ and GitHub repository http://github.com/cory-ko/KBWS. Users can utilize the SOAP services implemented in Perl directly via WSDL file at http://soap.g-language.org/kbws.wsdl (RPC Encoded) and http://soap.g-language.org/kbws_dl.wsdl (Document/literal).

  6. KBWS: an EMBOSS associated package for accessing bioinformatics web services

    Directory of Open Access Journals (Sweden)

    Tomita Masaru

    2011-04-01

    Full Text Available Abstract The availability of bioinformatics web-based services is rapidly proliferating, for their interoperability and ease of use. The next challenge is in the integration of these services in the form of workflows, and several projects are already underway, standardizing the syntax, semantics, and user interfaces. In order to deploy the advantages of web services with locally installed tools, here we describe a collection of proxy client tools for 42 major bioinformatics web services in the form of European Molecular Biology Open Software Suite (EMBOSS UNIX command-line tools. EMBOSS provides sophisticated means for discoverability and interoperability for hundreds of tools, and our package, named the Keio Bioinformatics Web Service (KBWS, adds functionalities of local and multiple alignment of sequences, phylogenetic analyses, and prediction of cellular localization of proteins and RNA secondary structures. This software implemented in C is available under GPL from http://www.g-language.org/kbws/ and GitHub repository http://github.com/cory-ko/KBWS. Users can utilize the SOAP services implemented in Perl directly via WSDL file at http://soap.g-language.org/kbws.wsdl (RPC Encoded and http://soap.g-language.org/kbws_dl.wsdl (Document/literal.

  7. FMAP: Functional Mapping and Analysis Pipeline for metagenomics and metatranscriptomics studies.

    Science.gov (United States)

    Kim, Jiwoong; Kim, Min Soo; Koh, Andrew Y; Xie, Yang; Zhan, Xiaowei

    2016-10-10

    Given the lack of a complete and comprehensive library of microbial reference genomes, determining the functional profile of diverse microbial communities is challenging. The available functional analysis pipelines lack several key features: (i) an integrated alignment tool, (ii) operon-level analysis, and (iii) the ability to process large datasets. Here we introduce our open-sourced, stand-alone functional analysis pipeline for analyzing whole metagenomic and metatranscriptomic sequencing data, FMAP (Functional Mapping and Analysis Pipeline). FMAP performs alignment, gene family abundance calculations, and statistical analysis (three levels of analyses are provided: differentially-abundant genes, operons and pathways). The resulting output can be easily visualized with heatmaps and functional pathway diagrams. FMAP functional predictions are consistent with currently available functional analysis pipelines. FMAP is a comprehensive tool for providing functional analysis of metagenomic/metatranscriptomic sequencing data. With the added features of integrated alignment, operon-level analysis, and the ability to process large datasets, FMAP will be a valuable addition to the currently available functional analysis toolbox. We believe that this software will be of great value to the wider biology and bioinformatics communities.

  8. Implementing a Web-Based Introductory Bioinformatics Course for Non-Bioinformaticians That Incorporates Practical Exercises

    Science.gov (United States)

    Vincent, Antony T.; Bourbonnais, Yves; Brouard, Jean-Simon; Deveau, Hélène; Droit, Arnaud; Gagné, Stéphane M.; Guertin, Michel; Lemieux, Claude; Rathier, Louis; Charette, Steve J.; Lagüe, Patrick

    2018-01-01

    A recent scientific discipline, bioinformatics, defined as using informatics for the study of biological problems, is now a requirement for the study of biological sciences. Bioinformatics has become such a powerful and popular discipline that several academic institutions have created programs in this field, allowing students to become…

  9. Naturally selecting solutions: the use of genetic algorithms in bioinformatics.

    Science.gov (United States)

    Manning, Timmy; Sleator, Roy D; Walsh, Paul

    2013-01-01

    For decades, computer scientists have looked to nature for biologically inspired solutions to computational problems; ranging from robotic control to scheduling optimization. Paradoxically, as we move deeper into the post-genomics era, the reverse is occurring, as biologists and bioinformaticians look to computational techniques, to solve a variety of biological problems. One of the most common biologically inspired techniques are genetic algorithms (GAs), which take the Darwinian concept of natural selection as the driving force behind systems for solving real world problems, including those in the bioinformatics domain. Herein, we provide an overview of genetic algorithms and survey some of the most recent applications of this approach to bioinformatics based problems.

  10. Are There Gender Differences in Emotion Comprehension? Analysis of the Test of Emotion Comprehension.

    Science.gov (United States)

    Fidalgo, Angel M; Tenenbaum, Harriet R; Aznar, Ana

    2018-01-01

    This article examines whether there are gender differences in understanding the emotions evaluated by the Test of Emotion Comprehension (TEC). The TEC provides a global index of emotion comprehension in children 3-11 years of age, which is the sum of the nine components that constitute emotion comprehension: (1) recognition of facial expressions, (2) understanding of external causes of emotions, (3) understanding of desire-based emotions, (4) understanding of belief-based emotions, (5) understanding of the influence of a reminder on present emotional states, (6) understanding of the possibility to regulate emotional states, (7) understanding of the possibility of hiding emotional states, (8) understanding of mixed emotions, and (9) understanding of moral emotions. We used the answers to the TEC given by 172 English girls and 181 boys from 3 to 8 years of age. First, the nine components into which the TEC is subdivided were analysed for differential item functioning (DIF), taking gender as the grouping variable. To evaluate DIF, the Mantel-Haenszel method and logistic regression analysis were used applying the Educational Testing Service DIF classification criteria. The results show that the TEC did not display gender DIF. Second, when absence of DIF had been corroborated, it was analysed for differences between boys and girls in the total TEC score and its components controlling for age. Our data are compatible with the hypothesis of independence between gender and level of comprehension in 8 of the 9 components of the TEC. Several hypotheses are discussed that could explain the differences found between boys and girls in the belief component. Given that the Belief component is basically a false belief task, the differences found seem to support findings in the literature indicating that girls perform better on this task.

  11. CLIMB (the Cloud Infrastructure for Microbial Bioinformatics): an online resource for the medical microbiology community.

    Science.gov (United States)

    Connor, Thomas R; Loman, Nicholas J; Thompson, Simon; Smith, Andy; Southgate, Joel; Poplawski, Radoslaw; Bull, Matthew J; Richardson, Emily; Ismail, Matthew; Thompson, Simon Elwood-; Kitchen, Christine; Guest, Martyn; Bakke, Marius; Sheppard, Samuel K; Pallen, Mark J

    2016-09-01

    The increasing availability and decreasing cost of high-throughput sequencing has transformed academic medical microbiology, delivering an explosion in available genomes while also driving advances in bioinformatics. However, many microbiologists are unable to exploit the resulting large genomics datasets because they do not have access to relevant computational resources and to an appropriate bioinformatics infrastructure. Here, we present the Cloud Infrastructure for Microbial Bioinformatics (CLIMB) facility, a shared computing infrastructure that has been designed from the ground up to provide an environment where microbiologists can share and reuse methods and data.

  12. Bioboxes: standardised containers for interchangeable bioinformatics software.

    Science.gov (United States)

    Belmann, Peter; Dröge, Johannes; Bremges, Andreas; McHardy, Alice C; Sczyrba, Alexander; Barton, Michael D

    2015-01-01

    Software is now both central and essential to modern biology, yet lack of availability, difficult installations, and complex user interfaces make software hard to obtain and use. Containerisation, as exemplified by the Docker platform, has the potential to solve the problems associated with sharing software. We propose bioboxes: containers with standardised interfaces to make bioinformatics software interchangeable.

  13. Making Bioinformatics Projects a Meaningful Experience in an Undergraduate Biotechnology or Biomedical Science Programme

    Science.gov (United States)

    Sutcliffe, Iain C.; Cummings, Stephen P.

    2007-01-01

    Bioinformatics has emerged as an important discipline within the biological sciences that allows scientists to decipher and manage the vast quantities of data (such as genome sequences) that are now available. Consequently, there is an obvious need to provide graduates in biosciences with generic, transferable skills in bioinformatics. We present…

  14. Comparative Proteome Bioinformatics: Identification of Phosphotyrosine Signaling Proteins in the Unicellular Protozoan Ciliate Tetrahymena

    DEFF Research Database (Denmark)

    Gammeltoft, Steen; Christensen, Søren Tvorup; Joachimiak, Marcin

    2005-01-01

    Tetrahymena, bioinformatics, cilia, evolution, signaling, TtPTK1, PTK, Grb2, SH-PTP 2, Plcy, Src, PTP, PI3K, SH2, SH3, PH......Tetrahymena, bioinformatics, cilia, evolution, signaling, TtPTK1, PTK, Grb2, SH-PTP 2, Plcy, Src, PTP, PI3K, SH2, SH3, PH...

  15. Polygonal approximation and scale-space analysis of closed digital curves

    CERN Document Server

    Ray, Kumar S

    2013-01-01

    This book covers the most important topics in the area of pattern recognition, object recognition, computer vision, robot vision, medical computing, computational geometry, and bioinformatics systems. Students and researchers will find a comprehensive treatment of polygonal approximation and its real life applications. The book not only explains the theoretical aspects but also presents applications with detailed design parameters. The systematic development of the concept of polygonal approximation of digital curves and its scale-space analysis are useful and attractive to scholars in many fi

  16. Facilitating the use of large-scale biological data and tools in the era of translational bioinformatics

    DEFF Research Database (Denmark)

    Kouskoumvekaki, Irene; Shublaq, Nour; Brunak, Søren

    2014-01-01

    As both the amount of generated biological data and the processing compute power increase, computational experimentation is no longer the exclusivity of bioinformaticians, but it is moving across all biomedical domains. For bioinformatics to realize its translational potential, domain experts need...... access to user-friendly solutions to navigate, integrate and extract information out of biological databases, as well as to combine tools and data resources in bioinformatics workflows. In this review, we present services that assist biomedical scientists in incorporating bioinformatics tools...... into their research.We review recent applications of Cytoscape, BioGPS and DAVID for data visualization, integration and functional enrichment. Moreover, we illustrate the use of Taverna, Kepler, GenePattern, and Galaxy as open-access workbenches for bioinformatics workflows. Finally, we mention services...

  17. Novel SINEs families in Medicago truncatula and Lotus japonicus: bioinformatic analysis.

    Science.gov (United States)

    Gadzalski, Marek; Sakowicz, Tomasz

    2011-07-01

    Although short interspersed elements (SINEs) were discovered nearly 30 years ago, the studies of these genomic repeats were mostly limited to animal genomes. Very little is known about SINEs in legumes--one of the most important plant families. Here we report identification, genomic distribution and molecular features of six novel SINE elements in Lotus japonicus (named LJ_SINE-1, -2, -3) and Medicago truncatula (MT_SINE-1, -2, -3), model species of legume. They possess all the structural features commonly found in short interspersed elements including RNA polymerase III promoter, polyA tail and flanking repeats. SINEs described here are present in low to moderate copy numbers from 150 to 3000. Bioinformatic analyses were used to searched public databases, we have shown that three of new SINE elements from M. truncatula seem to be characteristic of Medicago and Trifolium genera. Two SINE families have been found in L. japonicus and one is present in both M. truncatula and L. japonicus. In addition, we are discussing potential activities of the described elements. Copyright © 2011 Elsevier B.V. All rights reserved.

  18. Bioinformatics tools and database resources for systems genetics analysis in mice-a short review and an evaluation of future needs

    NARCIS (Netherlands)

    Durrant, Caroline; Swertz, Morris A.; Alberts, Rudi; Arends, Danny; Moeller, Steffen; Mott, Richard; Prins, Pjotr; van der Velde, K. Joeri; Jansen, Ritsert C.; Schughart, Klaus

    During a meeting of the SYSGENET working group 'Bioinformatics', currently available software tools and databases for systems genetics in mice were reviewed and the needs for future developments discussed. The group evaluated interoperability and performed initial feasibility studies. To aid future

  19. MACBenAbim: A Multi-platform Mobile Application for searching keyterms in Computational Biology and Bioinformatics.

    Science.gov (United States)

    Oluwagbemi, Olugbenga O; Adewumi, Adewole; Esuruoso, Abimbola

    2012-01-01

    Computational biology and bioinformatics are gradually gaining grounds in Africa and other developing nations of the world. However, in these countries, some of the challenges of computational biology and bioinformatics education are inadequate infrastructures, and lack of readily-available complementary and motivational tools to support learning as well as research. This has lowered the morale of many promising undergraduates, postgraduates and researchers from aspiring to undertake future study in these fields. In this paper, we developed and described MACBenAbim (Multi-platform Mobile Application for Computational Biology and Bioinformatics), a flexible user-friendly tool to search for, define and describe the meanings of keyterms in computational biology and bioinformatics, thus expanding the frontiers of knowledge of the users. This tool also has the capability of achieving visualization of results on a mobile multi-platform context. MACBenAbim is available from the authors for non-commercial purposes.

  20. Establishing a distributed national research infrastructure providing bioinformatics support to life science researchers in Australia.

    Science.gov (United States)

    Schneider, Maria Victoria; Griffin, Philippa C; Tyagi, Sonika; Flannery, Madison; Dayalan, Saravanan; Gladman, Simon; Watson-Haigh, Nathan; Bayer, Philipp E; Charleston, Michael; Cooke, Ira; Cook, Rob; Edwards, Richard J; Edwards, David; Gorse, Dominique; McConville, Malcolm; Powell, David; Wilkins, Marc R; Lonie, Andrew

    2017-06-30

    EMBL Australia Bioinformatics Resource (EMBL-ABR) is a developing national research infrastructure, providing bioinformatics resources and support to life science and biomedical researchers in Australia. EMBL-ABR comprises 10 geographically distributed national nodes with one coordinating hub, with current funding provided through Bioplatforms Australia and the University of Melbourne for its initial 2-year development phase. The EMBL-ABR mission is to: (1) increase Australia's capacity in bioinformatics and data sciences; (2) contribute to the development of training in bioinformatics skills; (3) showcase Australian data sets at an international level and (4) enable engagement in international programs. The activities of EMBL-ABR are focussed in six key areas, aligning with comparable international initiatives such as ELIXIR, CyVerse and NIH Commons. These key areas-Tools, Data, Standards, Platforms, Compute and Training-are described in this article. © The Author 2017. Published by Oxford University Press.

  1. Bioinformatics Meets Virology: The European Virus Bioinformatics Center's Second Annual Meeting.

    Science.gov (United States)

    Ibrahim, Bashar; Arkhipova, Ksenia; Andeweg, Arno C; Posada-Céspedes, Susana; Enault, François; Gruber, Arthur; Koonin, Eugene V; Kupczok, Anne; Lemey, Philippe; McHardy, Alice C; McMahon, Dino P; Pickett, Brett E; Robertson, David L; Scheuermann, Richard H; Zhernakova, Alexandra; Zwart, Mark P; Schönhuth, Alexander; Dutilh, Bas E; Marz, Manja

    2018-05-14

    The Second Annual Meeting of the European Virus Bioinformatics Center (EVBC), held in Utrecht, Netherlands, focused on computational approaches in virology, with topics including (but not limited to) virus discovery, diagnostics, (meta-)genomics, modeling, epidemiology, molecular structure, evolution, and viral ecology. The goals of the Second Annual Meeting were threefold: (i) to bring together virologists and bioinformaticians from across the academic, industrial, professional, and training sectors to share best practice; (ii) to provide a meaningful and interactive scientific environment to promote discussion and collaboration between students, postdoctoral fellows, and both new and established investigators; (iii) to inspire and suggest new research directions and questions. Approximately 120 researchers from around the world attended the Second Annual Meeting of the EVBC this year, including 15 renowned international speakers. This report presents an overview of new developments and novel research findings that emerged during the meeting.

  2. A Bioinformatic Pipeline for Monitoring of the Mutational Stability of Viral Drug Targets with Deep-Sequencing Technology.

    Science.gov (United States)

    Kravatsky, Yuri; Chechetkin, Vladimir; Fedoseeva, Daria; Gorbacheva, Maria; Kravatskaya, Galina; Kretova, Olga; Tchurikov, Nickolai

    2017-11-23

    The efficient development of antiviral drugs, including efficient antiviral small interfering RNAs (siRNAs), requires continuous monitoring of the strict correspondence between a drug and the related highly variable viral DNA/RNA target(s). Deep sequencing is able to provide an assessment of both the general target conservation and the frequency of particular mutations in the different target sites. The aim of this study was to develop a reliable bioinformatic pipeline for the analysis of millions of short, deep sequencing reads corresponding to selected highly variable viral sequences that are drug target(s). The suggested bioinformatic pipeline combines the available programs and the ad hoc scripts based on an original algorithm of the search for the conserved targets in the deep sequencing data. We also present the statistical criteria for the threshold of reliable mutation detection and for the assessment of variations between corresponding data sets. These criteria are robust against the possible sequencing errors in the reads. As an example, the bioinformatic pipeline is applied to the study of the conservation of RNA interference (RNAi) targets in human immunodeficiency virus 1 (HIV-1) subtype A. The developed pipeline is freely available to download at the website http://virmut.eimb.ru/. Brief comments and comparisons between VirMut and other pipelines are also presented.

  3. Protecting innovation in bioinformatics and in-silico biology.

    Science.gov (United States)

    Harrison, Robert

    2003-01-01

    Commercial success or failure of innovation in bioinformatics and in-silico biology requires the appropriate use of legal tools for protecting and exploiting intellectual property. These tools include patents, copyrights, trademarks, design rights, and limiting information in the form of 'trade secrets'. Potentially patentable components of bioinformatics programmes include lines of code, algorithms, data content, data structure and user interfaces. In both the US and the European Union, copyright protection is granted for software as a literary work, and most other major industrial countries have adopted similar rules. Nonetheless, the grant of software patents remains controversial and is being challenged in some countries. Current debate extends to aspects such as whether patents can claim not only the apparatus and methods but also the data signals and/or products, such as a CD-ROM, on which the programme is stored. The patentability of substances discovered using in-silico methods is a separate debate that is unlikely to be resolved in the near future.

  4. Revealing and Quantifying the Impaired Phonological Analysis Underpinning Impaired Comprehension in Wernicke's Aphasia

    Science.gov (United States)

    Robson, Holly; Keidel, James L.; Lambon Ralph, Matthew A.; Sage, Karen

    2012-01-01

    Wernicke's aphasia is a condition which results in severely disrupted language comprehension following a lesion to the left temporo-parietal region. A phonological analysis deficit has traditionally been held to be at the root of the comprehension impairment in Wernicke's aphasia, a view consistent with current functional neuroimaging which finds…

  5. Operator theory a comprehensive course in analysis, part 4

    CERN Document Server

    Simon, Barry

    2015-01-01

    A Comprehensive Course in Analysis by Poincaré Prize winner Barry Simon is a five-volume set that can serve as a graduate-level analysis textbook with a lot of additional bonus information, including hundreds of problems and numerous notes that extend the text and provide important historical background. Depth and breadth of exposition make this set a valuable reference source for almost all areas of classical analysis. Part 4 focuses on operator theory, especially on a Hilbert space. Central topics are the spectral theorem, the theory of trace class and Fredholm determinants, and the study of

  6. Pay-as-you-go data integration for bio-informatics

    NARCIS (Netherlands)

    Wanders, B.

    2012-01-01

    Scientific research in bio-informatics is often data-driven and supported by numerous biological databases. A biological database contains factual information collected from scientific experiments and computational analyses about areas including genomics, proteomics, metabolomics, microarray gene

  7. Community annotation and bioinformatics workforce development in concert--Little Skate Genome Annotation Workshops and Jamborees.

    Science.gov (United States)

    Wang, Qinghua; Arighi, Cecilia N; King, Benjamin L; Polson, Shawn W; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F; Page, Shallee T; Rendino, Marc Farnum; Thomas, William Kelley; Udwary, Daniel W; Wu, Cathy H

    2012-01-01

    Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome.

  8. Community annotation and bioinformatics workforce development in concert—Little Skate Genome Annotation Workshops and Jamborees

    Science.gov (United States)

    Wang, Qinghua; Arighi, Cecilia N.; King, Benjamin L.; Polson, Shawn W.; Vincent, James; Chen, Chuming; Huang, Hongzhan; Kingham, Brewster F.; Page, Shallee T.; Farnum Rendino, Marc; Thomas, William Kelley; Udwary, Daniel W.; Wu, Cathy H.

    2012-01-01

    Recent advances in high-throughput DNA sequencing technologies have equipped biologists with a powerful new set of tools for advancing research goals. The resulting flood of sequence data has made it critically important to train the next generation of scientists to handle the inherent bioinformatic challenges. The North East Bioinformatics Collaborative (NEBC) is undertaking the genome sequencing and annotation of the little skate (Leucoraja erinacea) to promote advancement of bioinformatics infrastructure in our region, with an emphasis on practical education to create a critical mass of informatically savvy life scientists. In support of the Little Skate Genome Project, the NEBC members have developed several annotation workshops and jamborees to provide training in genome sequencing, annotation and analysis. Acting as a nexus for both curation activities and dissemination of project data, a project web portal, SkateBase (http://skatebase.org) has been developed. As a case study to illustrate effective coupling of community annotation with workforce development, we report the results of the Mitochondrial Genome Annotation Jamborees organized to annotate the first completely assembled element of the Little Skate Genome Project, as a culminating experience for participants from our three prior annotation workshops. We are applying the physical/virtual infrastructure and lessons learned from these activities to enhance and streamline the genome annotation workflow, as we look toward our continuing efforts for larger-scale functional and structural community annotation of the L. erinacea genome. PMID:22434832

  9. BioShaDock: a community driven bioinformatics shared Docker-based tools registry [version 1; referees: 2 approved

    Directory of Open Access Journals (Sweden)

    François Moreews

    2015-12-01

    Full Text Available Linux container technologies, as represented by Docker, provide an alternative to complex and time-consuming installation processes needed for scientific software. The ease of deployment and the process isolation they enable, as well as the reproducibility they permit across environments and versions, are among the qualities that make them interesting candidates for the construction of bioinformatic infrastructures, at any scale from single workstations to high throughput computing architectures. The Docker Hub is a public registry which can be used to distribute bioinformatic software as Docker images. However, its lack of curation and its genericity make it difficult for a bioinformatics user to find the most appropriate images needed. BioShaDock is a bioinformatics-focused Docker registry, which provides a local and fully controlled environment to build and publish bioinformatic software as portable Docker images. It provides a number of improvements over the base Docker registry on authentication and permissions management, that enable its integration in existing bioinformatic infrastructures such as computing platforms. The metadata associated with the registered images are domain-centric, including for instance concepts defined in the EDAM ontology, a shared and structured vocabulary of commonly used terms in bioinformatics. The registry also includes user defined tags to facilitate its discovery, as well as a link to the tool description in the ELIXIR registry if it already exists. If it does not, the BioShaDock registry will synchronize with the registry to create a new description in the Elixir registry, based on the BioShaDock entry metadata. This link will help users get more information on the tool such as its EDAM operations, input and output types. This allows integration with the ELIXIR Tools and Data Services Registry, thus providing the appropriate visibility of such images to the bioinformatics community.

  10. An Adaptive Hybrid Multiprocessor technique for bioinformatics sequence alignment

    KAUST Repository

    Bonny, Talal; Salama, Khaled N.; Zidan, Mohammed A.

    2012-01-01

    Sequence alignment algorithms such as the Smith-Waterman algorithm are among the most important applications in the development of bioinformatics. Sequence alignment algorithms must process large amounts of data which may take a long time. Here, we

  11. 'Students-as-partners' scheme enhances postgraduate students' employability skills while addressing gaps in bioinformatics education.

    Science.gov (United States)

    Mello, Luciane V; Tregilgas, Luke; Cowley, Gwen; Gupta, Anshul; Makki, Fatima; Jhutty, Anjeet; Shanmugasundram, Achchuthan

    2017-01-01

    Teaching bioinformatics is a longstanding challenge for educators who need to demonstrate to students how skills developed in the classroom may be applied to real world research. This study employed an action research methodology which utilised student-staff partnership and peer-learning. It was centred on the experiences of peer-facilitators, students who had previously taken a postgraduate bioinformatics module, and had applied knowledge and skills gained from it to their own research. It aimed to demonstrate to peer-receivers, current students, how bioinformatics could be used in their own research while developing peer-facilitators' teaching and mentoring skills. This student-centred approach was well received by the peer-receivers, who claimed to have gained improved understanding of bioinformatics and its relevance to research. Equally, peer-facilitators also developed a better understanding of the subject and appreciated that the activity was a rare and invaluable opportunity to develop their teaching and mentoring skills, enhancing their employability.

  12. Critical evaluation of the use of bioinformatics as a theoretical tool to find high-potential sources of ACE inhibitory peptides

    NARCIS (Netherlands)

    Vercruysse, L.; Smagghe, G.; Bent, van der A.; Amerongen, van A.; Ongenaert, M.; Camp, van J.

    2009-01-01

    A bioinformatics analysis to screen for high-potential sources of angiotensin converting enzyme (ACE) inhibitory peptides was conducted in the area of insect muscle proteins. Vertebrate muscle proteins are reported as good sources of ACE inhibitory peptides, while the research on invertebrate muscle

  13. A bioinformatics-based overview of protein Lys-Ne-acetylation

    Science.gov (United States)

    Among posttranslational modifications, there are some conceptual similarities between Lys-N'-acetylation and Ser/Thr/Tyr O-phosphorylation. Herein we present a bioinformatics-based overview of reversible protein Lys-acetylation, including some comparisons with reversible protein phosphorylation. T...

  14. Entropy-based analysis and bioinformatics-inspired integration of global economic information transfer.

    Directory of Open Access Journals (Sweden)

    Jinkyu Kim

    Full Text Available The assessment of information transfer in the global economic network helps to understand the current environment and the outlook of an economy. Most approaches on global networks extract information transfer based mainly on a single variable. This paper establishes an entirely new bioinformatics-inspired approach to integrating information transfer derived from multiple variables and develops an international economic network accordingly. In the proposed methodology, we first construct the transfer entropies (TEs between various intra- and inter-country pairs of economic time series variables, test their significances, and then use a weighted sum approach to aggregate information captured in each TE. Through a simulation study, the new method is shown to deliver better information integration compared to existing integration methods in that it can be applied even when intra-country variables are correlated. Empirical investigation with the real world data reveals that Western countries are more influential in the global economic network and that Japan has become less influential following the Asian currency crisis.

  15. Entropy-based analysis and bioinformatics-inspired integration of global economic information transfer.

    Science.gov (United States)

    Kim, Jinkyu; Kim, Gunn; An, Sungbae; Kwon, Young-Kyun; Yoon, Sungroh

    2013-01-01

    The assessment of information transfer in the global economic network helps to understand the current environment and the outlook of an economy. Most approaches on global networks extract information transfer based mainly on a single variable. This paper establishes an entirely new bioinformatics-inspired approach to integrating information transfer derived from multiple variables and develops an international economic network accordingly. In the proposed methodology, we first construct the transfer entropies (TEs) between various intra- and inter-country pairs of economic time series variables, test their significances, and then use a weighted sum approach to aggregate information captured in each TE. Through a simulation study, the new method is shown to deliver better information integration compared to existing integration methods in that it can be applied even when intra-country variables are correlated. Empirical investigation with the real world data reveals that Western countries are more influential in the global economic network and that Japan has become less influential following the Asian currency crisis.

  16. jORCA: easily integrating bioinformatics Web Services.

    Science.gov (United States)

    Martín-Requena, Victoria; Ríos, Javier; García, Maximiliano; Ramírez, Sergio; Trelles, Oswaldo

    2010-02-15

    Web services technology is becoming the option of choice to deploy bioinformatics tools that are universally available. One of the major strengths of this approach is that it supports machine-to-machine interoperability over a network. However, a weakness of this approach is that various Web Services differ in their definition and invocation protocols, as well as their communication and data formats-and this presents a barrier to service interoperability. jORCA is a desktop client aimed at facilitating seamless integration of Web Services. It does so by making a uniform representation of the different web resources, supporting scalable service discovery, and automatic composition of workflows. Usability is at the top of the jORCA agenda; thus it is a highly customizable and extensible application that accommodates a broad range of user skills featuring double-click invocation of services in conjunction with advanced execution-control, on the fly data standardization, extensibility of viewer plug-ins, drag-and-drop editing capabilities, plus a file-based browsing style and organization of favourite tools. The integration of bioinformatics Web Services is made easier to support a wider range of users. .

  17. Missing "Links" in Bioinformatics Education: Expanding Students' Conceptions of Bioinformatics Using a Biodiversity Database of Living and Fossil Reef Corals

    Science.gov (United States)

    Nehm, Ross H.; Budd, Ann F.

    2006-01-01

    NMITA is a reef coral biodiversity database that we use to introduce students to the expansive realm of bioinformatics beyond genetics. We introduce a series of lessons that have students use this database, thereby accessing real data that can be used to test hypotheses about biodiversity and evolution while targeting the "National Science …

  18. Evaluation of a 5-tier scheme proposed for classification of sequence variants using bioinformatic and splicing assay data

    DEFF Research Database (Denmark)

    Walker, Logan C; Whiley, Phillip J; Houdayer, Claude

    2013-01-01

    BRCA1 and 176 BRCA2 unique variants, from 77 publications. At least six independent reviewers from research and/or clinical settings comprehensively examined splicing assay methods and data reported for 22 variant assays of 21 variants in four publications, and classified the variants using the 5-tier......Splicing assays are commonly undertaken in the clinical setting to assess the clinical relevance of sequence variants in disease predisposition genes. A 5-tier classification system incorporating both bioinformatic and splicing assay information was previously proposed as a method to provide...... of results, and the lack of quantitative data for the aberrant transcripts. We propose suggestions for minimum reporting guidelines for splicing assays, and improvements to the 5-tier splicing classification system to allow future evaluation of its performance as a clinical tool....

  19. G2LC: Resources Autoscaling for Real Time Bioinformatics Applications in IaaS

    Directory of Open Access Journals (Sweden)

    Rongdong Hu

    2015-01-01

    Full Text Available Cloud computing has started to change the way how bioinformatics research is being carried out. Researchers who have taken advantage of this technology can process larger amounts of data and speed up scientific discovery. The variability in data volume results in variable computing requirements. Therefore, bioinformatics researchers are pursuing more reliable and efficient methods for conducting sequencing analyses. This paper proposes an automated resource provisioning method, G2LC, for bioinformatics applications in IaaS. It enables application to output the results in a real time manner. Its main purpose is to guarantee applications performance, while improving resource utilization. Real sequence searching data of BLAST is used to evaluate the effectiveness of G2LC. Experimental results show that G2LC guarantees the application performance, while resource is saved up to 20.14%.

  20. Meta-learning framework applied in bioinformatics inference system design.

    Science.gov (United States)

    Arredondo, Tomás; Ormazábal, Wladimir

    2015-01-01

    This paper describes a meta-learner inference system development framework which is applied and tested in the implementation of bioinformatic inference systems. These inference systems are used for the systematic classification of the best candidates for inclusion in bacterial metabolic pathway maps. This meta-learner-based approach utilises a workflow where the user provides feedback with final classification decisions which are stored in conjunction with analysed genetic sequences for periodic inference system training. The inference systems were trained and tested with three different data sets related to the bacterial degradation of aromatic compounds. The analysis of the meta-learner-based framework involved contrasting several different optimisation methods with various different parameters. The obtained inference systems were also contrasted with other standard classification methods with accurate prediction capabilities observed.

  1. Penalized feature selection and classification in bioinformatics

    OpenAIRE

    Ma, Shuangge; Huang, Jian

    2008-01-01

    In bioinformatics studies, supervised classification with high-dimensional input variables is frequently encountered. Examples routinely arise in genomic, epigenetic and proteomic studies. Feature selection can be employed along with classifier construction to avoid over-fitting, to generate more reliable classifier and to provide more insights into the underlying causal relationships. In this article, we provide a review of several recently developed penalized feature selection and classific...

  2. Bioinformatics-driven identification and examination of candidate genes for non-alcoholic fatty liver disease.

    Directory of Open Access Journals (Sweden)

    Karina Banasik

    2011-01-01

    Full Text Available Candidate genes for non-alcoholic fatty liver disease (NAFLD identified by a bioinformatics approach were examined for variant associations to quantitative traits of NAFLD-related phenotypes.By integrating public database text mining, trans-organism protein-protein interaction transferal, and information on liver protein expression a protein-protein interaction network was constructed and from this a smaller isolated interactome was identified. Five genes from this interactome were selected for genetic analysis. Twenty-one tag single-nucleotide polymorphisms (SNPs which captured all common variation in these genes were genotyped in 10,196 Danes, and analyzed for association with NAFLD-related quantitative traits, type 2 diabetes (T2D, central obesity, and WHO-defined metabolic syndrome (MetS.273 genes were included in the protein-protein interaction analysis and EHHADH, ECHS1, HADHA, HADHB, and ACADL were selected for further examination. A total of 10 nominal statistical significant associations (P<0.05 to quantitative metabolic traits were identified. Also, the case-control study showed associations between variation in the five genes and T2D, central obesity, and MetS, respectively. Bonferroni adjustments for multiple testing negated all associations.Using a bioinformatics approach we identified five candidate genes for NAFLD. However, we failed to provide evidence of associations with major effects between SNPs in these five genes and NAFLD-related quantitative traits, T2D, central obesity, and MetS.

  3. Bioinformatics analysis identifies several intrinsically disordered human E3 ubiquitin-protein ligases

    Directory of Open Access Journals (Sweden)

    Wouter Boomsma

    2016-02-01

    Full Text Available The ubiquitin-proteasome system targets misfolded proteins for degradation. Since the accumulation of such proteins is potentially harmful for the cell, their prompt removal is important. E3 ubiquitin-protein ligases mediate substrate ubiquitination by bringing together the substrate with an E2 ubiquitin-conjugating enzyme, which transfers ubiquitin to the substrate. For misfolded proteins, substrate recognition is generally delegated to molecular chaperones that subsequently interact with specific E3 ligases. An important exception is San1, a yeast E3 ligase. San1 harbors extensive regions of intrinsic disorder, which provide both conformational flexibility and sites for direct recognition of misfolded targets of vastly different conformations. So far, no mammalian ortholog of San1 is known, nor is it clear whether other E3 ligases utilize disordered regions for substrate recognition. Here, we conduct a bioinformatics analysis to examine >600 human and S. cerevisiae E3 ligases to identify enzymes that are similar to San1 in terms of function and/or mechanism of substrate recognition. An initial sequence-based database search was found to detect candidates primarily based on the homology of their ordered regions, and did not capture the unique disorder patterns that encode the functional mechanism of San1. However, by searching specifically for key features of the San1 sequence, such as long regions of intrinsic disorder embedded with short stretches predicted to be suitable for substrate interaction, we identified several E3 ligases with these characteristics. Our initial analysis revealed that another remarkable trait of San1 is shared with several candidate E3 ligases: long stretches of complete lysine suppression, which in San1 limits auto-ubiquitination. We encode these characteristic features into a San1 similarity-score, and present a set of proteins that are plausible candidates as San1 counterparts in humans. In conclusion, our work

  4. Comprehensive physical analysis of bond wire interfaces in power modules

    DEFF Research Database (Denmark)

    Popok, Vladimir; Pedersen, Kristian Bonderup; Kristensen, Peter Kjær

    2016-01-01

    causing failures. In this paper we present a review on the set of our experimental and theoretical studies allowing comprehensive physical analysis of changes in materials under active power cycling with focus on bond wire interfaces and thin metallisation layers. The developed electro-thermal and thermo...

  5. Comprehensive proteomic analysis of human pancreatic juice

    DEFF Research Database (Denmark)

    Grønborg, Mads; Bunkenborg, Jakob; Kristiansen, Troels Zakarias

    2004-01-01

    Proteomic technologies provide an excellent means for analysis of body fluids for cataloging protein constituents and identifying biomarkers for early detection of cancers. The biomarkers currently available for pancreatic cancer, such as CA19-9, lack adequate sensitivity and specificity...... contributing to late diagnosis of this deadly disease. In this study, we carried out a comprehensive characterization of the "pancreatic juice proteome" in patients with pancreatic adenocarcinoma. Pancreatic juice was first fractionated by 1-dimensional gel electrophoresis and subsequently analyzed by liquid...... in this study could be directly assessed for their potential as biomarkers for pancreatic cancer by quantitative proteomics methods or immunoassays....

  6. Architecture exploration of FPGA based accelerators for bioinformatics applications

    CERN Document Server

    Varma, B Sharat Chandra; Balakrishnan, M

    2016-01-01

    This book presents an evaluation methodology to design future FPGA fabrics incorporating hard embedded blocks (HEBs) to accelerate applications. This methodology will be useful for selection of blocks to be embedded into the fabric and for evaluating the performance gain that can be achieved by such an embedding. The authors illustrate the use of their methodology by studying the impact of HEBs on two important bioinformatics applications: protein docking and genome assembly. The book also explains how the respective HEBs are designed and how hardware implementation of the application is done using these HEBs. It shows that significant speedups can be achieved over pure software implementations by using such FPGA-based accelerators. The methodology presented in this book may also be used for designing HEBs for accelerating software implementations in other domains besides bioinformatics. This book will prove useful to students, researchers, and practicing engineers alike.

  7. Bioinformatics algorithm based on a parallel implementation of a machine learning approach using transducers

    International Nuclear Information System (INIS)

    Roche-Lima, Abiel; Thulasiram, Ruppa K

    2012-01-01

    Finite automata, in which each transition is augmented with an output label in addition to the familiar input label, are considered finite-state transducers. Transducers have been used to analyze some fundamental issues in bioinformatics. Weighted finite-state transducers have been proposed to pairwise alignments of DNA and protein sequences; as well as to develop kernels for computational biology. Machine learning algorithms for conditional transducers have been implemented and used for DNA sequence analysis. Transducer learning algorithms are based on conditional probability computation. It is calculated by using techniques, such as pair-database creation, normalization (with Maximum-Likelihood normalization) and parameters optimization (with Expectation-Maximization - EM). These techniques are intrinsically costly for computation, even worse when are applied to bioinformatics, because the databases sizes are large. In this work, we describe a parallel implementation of an algorithm to learn conditional transducers using these techniques. The algorithm is oriented to bioinformatics applications, such as alignments, phylogenetic trees, and other genome evolution studies. Indeed, several experiences were developed using the parallel and sequential algorithm on Westgrid (specifically, on the Breeze cluster). As results, we obtain that our parallel algorithm is scalable, because execution times are reduced considerably when the data size parameter is increased. Another experience is developed by changing precision parameter. In this case, we obtain smaller execution times using the parallel algorithm. Finally, number of threads used to execute the parallel algorithm on the Breezy cluster is changed. In this last experience, we obtain as result that speedup is considerably increased when more threads are used; however there is a convergence for number of threads equal to or greater than 16.

  8. Discovery of putative salivary biomarkers for Sjögren's syndrome using high resolution mass spectrometry and bioinformatics.

    Science.gov (United States)

    Zoukhri, Driss; Rawe, Ian; Singh, Mabi; Brown, Ashley; Kublin, Claire L; Dawson, Kevin; Haddon, William F; White, Earl L; Hanley, Kathleen M; Tusé, Daniel; Malyj, Wasyl; Papas, Athena

    2012-03-01

    The purpose of the current study was to determine if saliva contains biomarkers that can be used as diagnostic tools for Sjögren's syndrome (SjS). Twenty seven SjS patients and 27 age-matched healthy controls were recruited for these studies. Unstimulated glandular saliva was collected from the Wharton's duct using a suction device. Two µl of salvia were processed for mass spectrometry analyses on a prOTOF 2000 matrix-assisted laser desorption/ionization orthogonal time of flight (MALDI O-TOF) mass spectrometer. Raw data were analyzed using bioinformatic tools to identify biomarkers. MALDI O-TOF MS analyses of saliva samples were highly reproducible and the mass spectra generated were very rich in peptides and peptide fragments in the 750-7,500 Da range. Data analysis using bioinformatic tools resulted in several classification models being built and several biomarkers identified. One model based on 7 putative biomarkers yielded a sensitivity of 97.5%, specificity of 97.8% and an accuracy of 97.6%. One biomarker was present only in SjS samples and was identified as a proteolytic peptide originating from human basic salivary proline-rich protein 3 precursor. We conclude that salivary biomarkers detected by high-resolution mass spectrometry coupled with powerful bioinformatic tools offer the potential to serve as diagnostic/prognostic tools for SjS.

  9. A Bioinformatic Pipeline for Monitoring of the Mutational Stability of Viral Drug Targets with Deep-Sequencing Technology

    Directory of Open Access Journals (Sweden)

    Yuri Kravatsky

    2017-11-01

    Full Text Available The efficient development of antiviral drugs, including efficient antiviral small interfering RNAs (siRNAs, requires continuous monitoring of the strict correspondence between a drug and the related highly variable viral DNA/RNA target(s. Deep sequencing is able to provide an assessment of both the general target conservation and the frequency of particular mutations in the different target sites. The aim of this study was to develop a reliable bioinformatic pipeline for the analysis of millions of short, deep sequencing reads corresponding to selected highly variable viral sequences that are drug target(s. The suggested bioinformatic pipeline combines the available programs and the ad hoc scripts based on an original algorithm of the search for the conserved targets in the deep sequencing data. We also present the statistical criteria for the threshold of reliable mutation detection and for the assessment of variations between corresponding data sets. These criteria are robust against the possible sequencing errors in the reads. As an example, the bioinformatic pipeline is applied to the study of the conservation of RNA interference (RNAi targets in human immunodeficiency virus 1 (HIV-1 subtype A. The developed pipeline is freely available to download at the website http://virmut.eimb.ru/. Brief comments and comparisons between VirMut and other pipelines are also presented.

  10. Analysis and Comprehensive Analytical Modeling of Statistical Variations in Subthreshold MOSFET's High Frequency Characteristics

    Directory of Open Access Journals (Sweden)

    Rawid Banchuin

    2014-01-01

    Full Text Available In this research, the analysis of statistical variations in subthreshold MOSFET's high frequency characteristics defined in terms of gate capacitance and transition frequency, have been shown and the resulting comprehensive analytical models of such variations in terms of their variances have been proposed. Major imperfection in the physical level properties including random dopant fluctuation and effects of variations in MOSFET's manufacturing process, have been taken into account in the proposed analysis and modeling. The up to dated comprehensive analytical model of statistical variation in MOSFET's parameter has been used as the basis of analysis and modeling. The resulting models have been found to be both analytic and comprehensive as they are the precise mathematical expressions in terms of physical level variables of MOSFET. Furthermore, they have been verified at the nanometer level by using 65~nm level BSIM4 based benchmarks and have been found to be very accurate with smaller than 5 % average percentages of errors. Hence, the performed analysis gives the resulting models which have been found to be the potential mathematical tool for the statistical and variability aware analysis and design of subthreshold MOSFET based VHF circuits, systems and applications.

  11. Staff Scientist - RNA Bioinformatics | Center for Cancer Research

    Science.gov (United States)

    The newly established RNA Biology Laboratory (RBL) at the Center for Cancer Research (CCR), National Cancer Institute (NCI), National Institutes of Health (NIH) in Frederick, Maryland is recruiting a Staff Scientist with strong expertise in RNA bioinformatics to join the Intramural Research Program’s mission of high impact, high reward science. The RBL is the equivalent of an

  12. ‘Students-as-partners’ scheme enhances postgraduate students’ employability skills while addressing gaps in bioinformatics education

    Science.gov (United States)

    Mello, Luciane V.; Tregilgas, Luke; Cowley, Gwen; Gupta, Anshul; Makki, Fatima; Jhutty, Anjeet; Shanmugasundram, Achchuthan

    2017-01-01

    Abstract Teaching bioinformatics is a longstanding challenge for educators who need to demonstrate to students how skills developed in the classroom may be applied to real world research. This study employed an action research methodology which utilised student–staff partnership and peer-learning. It was centred on the experiences of peer-facilitators, students who had previously taken a postgraduate bioinformatics module, and had applied knowledge and skills gained from it to their own research. It aimed to demonstrate to peer-receivers, current students, how bioinformatics could be used in their own research while developing peer-facilitators’ teaching and mentoring skills. This student-centred approach was well received by the peer-receivers, who claimed to have gained improved understanding of bioinformatics and its relevance to research. Equally, peer-facilitators also developed a better understanding of the subject and appreciated that the activity was a rare and invaluable opportunity to develop their teaching and mentoring skills, enhancing their employability. PMID:29098185

  13. Bioinformatics tools for development of fast and cost effective simple ...

    African Journals Online (AJOL)

    Bioinformatics tools for development of fast and cost effective simple sequence repeat ... comparative mapping and exploration of functional genetic diversity in the ... Already, a number of computer programs have been implemented that aim at ...

  14. Virginia Bioinformatics Institute to expand cyberinfrastructure education and outreach project

    OpenAIRE

    Whyte, Barry James

    2008-01-01

    The National Science Foundation has awarded the Virginia Bioinformatics Institute at Virginia Tech $918,000 to expand its education and outreach program in Cyberinfrastructure - Training, Education, Advancement and Mentoring, commonly known as the CI-TEAM.

  15. L2 Reading Comprehension and Its Correlates: A Meta-Analysis

    Science.gov (United States)

    Jeon, Eun Hee; Yamashita, Junko

    2014-01-01

    The present meta-analysis examined the overall average correlation (weighted for sample size and corrected for measurement error) between passage-level second language (L2) reading comprehension and 10 key reading component variables investigated in the research domain. Four high-evidence correlates (with 18 or more accumulated effect sizes: L2…

  16. LipidPedia: a comprehensive lipid knowledgebase.

    Science.gov (United States)

    Kuo, Tien-Chueh; Tseng, Yufeng Jane

    2018-04-10

    Lipids are divided into fatty acyls, glycerolipids, glycerophospholipids, sphingolipids, saccharolipids, sterols, prenol lipids and polyketides. Fatty acyls and glycerolipids are commonly used as energy storage, whereas glycerophospholipids, sphingolipids, sterols and saccharolipids are common used as components of cell membranes. Lipids in fatty acyls, glycerophospholipids, sphingolipids and sterols classes play important roles in signaling. Although more than 36 million lipids can be identified or computationally generated, no single lipid database provides comprehensive information on lipids. Furthermore, the complex systematic or common names of lipids make the discovery of related information challenging. Here, we present LipidPedia, a comprehensive lipid knowledgebase. The content of this database is derived from integrating annotation data with full-text mining of 3,923 lipids and more than 400,000 annotations of associated diseases, pathways, functions, and locations that are essential for interpreting lipid functions and mechanisms from over 1,400,000 scientific publications. Each lipid in LipidPedia also has its own entry containing a text summary curated from the most frequently cited diseases, pathways, genes, locations, functions, lipids and experimental models in the biomedical literature. LipidPedia aims to provide an overall synopsis of lipids to summarize lipid annotations and provide a detailed listing of references for understanding complex lipid functions and mechanisms. LipidPedia is available at http://lipidpedia.cmdm.tw. yjtseng@csie.ntu.edu.tw. Supplementary data are available at Bioinformatics online.

  17. Bioinformatics Analysis for the Antirheumatic Effects of Huang-Lian-Jie-Du-Tang from a Network Perspective

    Directory of Open Access Journals (Sweden)

    Haiyang Fang

    2013-01-01

    Full Text Available Huang-Lian-Jie-Du-Tang (HLJDT is a classic TCM formula to clear “heat” and “poison” that exhibits antirheumatic activity. Here we investigated the therapeutic mechanisms of HLJDT at protein network level using bioinformatics approach. It was found that HLJDT shares 5 target proteins with 3 types of anti-RA drugs, and several pathways in immune system and bone formation are significantly regulated by HLJDT’s components, suggesting the therapeutic effect of HLJDT on RA. By defining an antirheumatic effect score to quantitatively measure the therapeutic effect, we found that the score of each HLJDT’s component is very low, while the whole HLJDT achieves a much higher effect score, suggesting a synergistic effect of HLJDT achieved by its multiple components acting on multiple targets. At last, topological analysis on the RA-associated PPI network was conducted to illustrate key roles of HLJDT’s target proteins on this network. Integrating our findings with TCM theory suggests that HLJDT targets on hub nodes and main pathway in the Hot ZENG network, and thus it could be applied as adjuvant treatment for Hot-ZENG-related RA. This study may facilitate our understanding of antirheumatic effect of HLJDT and it may suggest new approach for the study of TCM pharmacology.

  18. Integration of Bioinformatics into an Undergraduate Biology Curriculum and the Impact on Development of Mathematical Skills

    Science.gov (United States)

    Wightman, Bruce; Hark, Amy T.

    2012-01-01

    The development of fields such as bioinformatics and genomics has created new challenges and opportunities for undergraduate biology curricula. Students preparing for careers in science, technology, and medicine need more intensive study of bioinformatics and more sophisticated training in the mathematics on which this field is based. In this…

  19. Comprehensive two-dimensional liquid chromatographic analysis of poloxamers.

    Science.gov (United States)

    Malik, Muhammad Imran; Lee, Sanghoon; Chang, Taihyun

    2016-04-15

    Poloxamers are low molar mass triblock copolymers of poly(ethylene oxide) (PEO) and poly(propylene oxide) (PPO), having number of applications as non-ionic surfactants. Comprehensive one and two-dimensional liquid chromatographic (LC) analysis of these materials is proposed in this study. The separation of oligomers of both types (PEO and PPO) is demonstrated for several commercial poloxamers. This is accomplished at the critical conditions for one of the block while interaction for the other block. Reversed phase LC at CAP of PEO allowed for oligomeric separation of triblock copolymers with regard to PPO block whereas normal phase LC at CAP of PPO renders oligomeric separation with respect to PEO block. The oligomeric separation with regard to PEO and PPO are coupled online (comprehensive 2D-LC) to reveal two-dimensional contour plots by unconventional 2D IC×IC (interaction chromatography) coupling. The study provides chemical composition mapping of both PEO and PPO, equivalent to combined molar mass and chemical composition mapping for several commercial poloxamers. Copyright © 2016 Elsevier B.V. All rights reserved.

  20. Bioinformatic Prediction of WSSV-Host Protein-Protein Interaction

    Directory of Open Access Journals (Sweden)

    Zheng Sun

    2014-01-01

    Full Text Available WSSV is one of the most dangerous pathogens in shrimp aquaculture. However, the molecular mechanism of how WSSV interacts with shrimp is still not very clear. In the present study, bioinformatic approaches were used to predict interactions between proteins from WSSV and shrimp. The genome data of WSSV (NC_003225.1 and the constructed transcriptome data of F. chinensis were used to screen potentially interacting proteins by searching in protein interaction databases, including STRING, Reactome, and DIP. Forty-four pairs of proteins were suggested to have interactions between WSSV and the shrimp. Gene ontology analysis revealed that 6 pairs of these interacting proteins were classified into “extracellular region” or “receptor complex” GO-terms. KEGG pathway analysis showed that they were involved in the “ECM-receptor interaction pathway.” In the 6 pairs of interacting proteins, an envelope protein called “collagen-like protein” (WSSV-CLP encoded by an early virus gene “wsv001” in WSSV interacted with 6 deduced proteins from the shrimp, including three integrin alpha (ITGA, two integrin beta (ITGB, and one syndecan (SDC. Sequence analysis on WSSV-CLP, ITGA, ITGB, and SDC revealed that they possessed the sequence features for protein-protein interactions. This study might provide new insights into the interaction mechanisms between WSSV and shrimp.

  1. Bioinformatics for Undergraduates: Steps toward a Quantitative Bioscience Curriculum

    Science.gov (United States)

    Chapman, Barbara S.; Christmann, James L.; Thatcher, Eileen F.

    2006-01-01

    We describe an innovative bioinformatics course developed under grants from the National Science Foundation and the California State University Program in Research and Education in Biotechnology for undergraduate biology students. The project has been part of a continuing effort to offer students classroom experiences focused on principles and…

  2. Bioinformatic tools and guideline for PCR primer design | Abd ...

    African Journals Online (AJOL)

    Bioinformatics has become an essential tool not only for basic research but also for applied research in biotechnology and biomedical sciences. Optimal primer sequence and appropriate primer concentration are essential for maximal specificity and efficiency of PCR. A poorly designed primer can result in little or no ...

  3. ImageJS: Personalized, participated, pervasive, and reproducible image bioinformatics in the web browser

    Directory of Open Access Journals (Sweden)

    Jonas S Almeida

    2012-01-01

    Full Text Available Background: Image bioinformatics infrastructure typically relies on a combination of server-side high-performance computing and client desktop applications tailored for graphic rendering. On the server side, matrix manipulation environments are often used as the back-end where deployment of specialized analytical workflows takes place. However, neither the server-side nor the client-side desktop solution, by themselves or combined, is conducive to the emergence of open, collaborative, computational ecosystems for image analysis that are both self-sustained and user driven. Materials and Methods: ImageJS was developed as a browser-based webApp, untethered from a server-side backend, by making use of recent advances in the modern web browser such as a very efficient compiler, high-end graphical rendering capabilities, and I/O tailored for code migration. Results : Multiple versioned code hosting services were used to develop distinct ImageJS modules to illustrate its amenability to collaborative deployment without compromise of reproducibility or provenance. The illustrative examples include modules for image segmentation, feature extraction, and filtering. The deployment of image analysis by code migration is in sharp contrast with the more conventional, heavier, and less safe reliance on data transfer. Accordingly, code and data are loaded into the browser by exactly the same script tag loading mechanism, which offers a number of interesting applications that would be hard to attain with more conventional platforms, such as NIH′s popular ImageJ application. Conclusions : The modern web browser was found to be advantageous for image bioinformatics in both the research and clinical environments. This conclusion reflects advantages in deployment scalability and analysis reproducibility, as well as the critical ability to deliver advanced computational statistical procedures machines where access to sensitive data is controlled, that is, without

  4. ImageJS: Personalized, participated, pervasive, and reproducible image bioinformatics in the web browser.

    Science.gov (United States)

    Almeida, Jonas S; Iriabho, Egiebade E; Gorrepati, Vijaya L; Wilkinson, Sean R; Grüneberg, Alexander; Robbins, David E; Hackney, James R

    2012-01-01

    Image bioinformatics infrastructure typically relies on a combination of server-side high-performance computing and client desktop applications tailored for graphic rendering. On the server side, matrix manipulation environments are often used as the back-end where deployment of specialized analytical workflows takes place. However, neither the server-side nor the client-side desktop solution, by themselves or combined, is conducive to the emergence of open, collaborative, computational ecosystems for image analysis that are both self-sustained and user driven. ImageJS was developed as a browser-based webApp, untethered from a server-side backend, by making use of recent advances in the modern web browser such as a very efficient compiler, high-end graphical rendering capabilities, and I/O tailored for code migration. Multiple versioned code hosting services were used to develop distinct ImageJS modules to illustrate its amenability to collaborative deployment without compromise of reproducibility or provenance. The illustrative examples include modules for image segmentation, feature extraction, and filtering. The deployment of image analysis by code migration is in sharp contrast with the more conventional, heavier, and less safe reliance on data transfer. Accordingly, code and data are loaded into the browser by exactly the same script tag loading mechanism, which offers a number of interesting applications that would be hard to attain with more conventional platforms, such as NIH's popular ImageJ application. The modern web browser was found to be advantageous for image bioinformatics in both the research and clinical environments. This conclusion reflects advantages in deployment scalability and analysis reproducibility, as well as the critical ability to deliver advanced computational statistical procedures machines where access to sensitive data is controlled, that is, without local "download and installation".

  5. Enabling the democratization of the genomics revolution with a fully integrated web-based bioinformatics platform, Version 1.5 and 1.x.

    Energy Technology Data Exchange (ETDEWEB)

    2017-05-18

    EDGE bioinformatics was developed to help biologists process Next Generation Sequencing data (in the form of raw FASTQ files), even if they have little to no bioinformatics expertise. EDGE is a highly integrated and interactive web-based platform that is capable of running many of the standard analyses that biologists require for viral, bacterial/archaeal, and metagenomic samples. EDGE provides the following analytical workflows: quality trimming and host removal, assembly and annotation, comparisons against known references, taxonomy classification of reads and contigs, whole genome SNP-based phylogenetic analysis, and PCR analysis. EDGE provides an intuitive web-based interface for user input, allows users to visualize and interact with selected results (e.g. JBrowse genome browser), and generates a final detailed PDF report. Results in the form of tables, text files, graphic files, and PDFs can be downloaded. A user management system allows tracking of an individual’s EDGE runs, along with the ability to share, post publicly, delete, or archive their results.

  6. Identification of microRNAs from Eugenia uniflora by high-throughput sequencing and bioinformatics analysis.

    Science.gov (United States)

    Guzman, Frank; Almerão, Mauricio P; Körbes, Ana P; Loss-Morais, Guilherme; Margis, Rogerio

    2012-01-01

    microRNAs or miRNAs are small non-coding regulatory RNAs that play important functions in the regulation of gene expression at the post-transcriptional level by targeting mRNAs for degradation or inhibiting protein translation. Eugenia uniflora is a plant native to tropical America with pharmacological and ecological importance, and there have been no previous studies concerning its gene expression and regulation. To date, no miRNAs have been reported in Myrtaceae species. Small RNA and RNA-seq libraries were constructed to identify miRNAs and pre-miRNAs in Eugenia uniflora. Solexa technology was used to perform high throughput sequencing of the library, and the data obtained were analyzed using bioinformatics tools. From 14,489,131 small RNA clean reads, we obtained 1,852,722 mature miRNA sequences representing 45 conserved families that have been identified in other plant species. Further analysis using contigs assembled from RNA-seq allowed the prediction of secondary structures of 25 known and 17 novel pre-miRNAs. The expression of twenty-seven identified miRNAs was also validated using RT-PCR assays. Potential targets were predicted for the most abundant mature miRNAs in the identified pre-miRNAs based on sequence homology. This study is the first large scale identification of miRNAs and their potential targets from a species of the Myrtaceae family without genomic sequence resources. Our study provides more information about the evolutionary conservation of the regulatory network of miRNAs in plants and highlights species-specific miRNAs.

  7. Identification of Differentially Expressed Genes Induced by Aberrant Methylation in Oral Squamous Cell Carcinomas Using Integrated Bioinformatic Analysis

    Directory of Open Access Journals (Sweden)

    Xiaoqi Zhang

    2018-06-01

    Full Text Available Oral squamous cell carcinoma (OSCC is a malignant disease. Methylation plays a key role in the etiology and pathogenesis of OSCC. The goal of this study was to identify aberrantly methylated differentially expressed genes (DEGs in OSCCs, and to explore the underlying mechanisms of tumorigenesis by using integrated bioinformatic analysis. Gene expression profiles (GSE30784 and GSE38532 were analyzed using the R software to obtain aberrantly methylated DEGs. Functional enrichment analysis of screened genes was performed using the DAVID software. Protein–protein interaction (PPI networks were constructed using the STRING database. The cBioPortal software was used to exhibit the alterations of genes. Lastly, we validated the results with the Cancer Genome Atlas (TCGA data. Twenty-eight upregulated hypomethylated genes and 24 downregulated hypermethylated genes were identified. These genes were enriched in the biological process of regulation in immune response, and were mainly involved in the PI3K-AKT and EMT pathways. Additionally, three upregulated hypomethylated oncogenes and four downregulated hypermethylated tumor suppressor genes (TSGs were identified. In conclusion, our study indicated possible aberrantly methylated DEGs and pathways in OSCCs, which could improve the understanding of the underlying molecular mechanisms. Aberrantly methylated oncogenes and TSGs may also serve as biomarkers and therapeutic targets for the precise diagnosis and treatment of OSCCs in the future.

  8. Bioinformatics in the Netherlands : The value of a nationwide community

    NARCIS (Netherlands)

    van Gelder, Celia W.G.; Hooft, Rob; van Rijswijk, Merlijn; van den Berg, Linda; Kok, Ruben; Reinders, M.J.T.; Mons, Barend; Heringa, Jaap

    2017-01-01

    This review provides a historical overview of the inception and development of bioinformatics research in the Netherlands. Rooted in theoretical biology by foundational figures such as Paulien Hogeweg (at Utrecht University since the 1970s), the developments leading to organizational structures

  9. Comprehensive metabolic panel

    Science.gov (United States)

    Metabolic panel - comprehensive; Chem-20; SMA20; Sequential multi-channel analysis with computer-20; SMAC20; Metabolic panel 20 ... Chernecky CC, Berger BJ. Comprehensive metabolic panel (CMP) - blood. In: ... Tests and Diagnostic Procedures . 6th ed. St Louis, MO: ...

  10. Use of the Comprehensive Inversion method for Swarm satellite data analysis

    DEFF Research Database (Denmark)

    Sabaka, T. J.; Tøffner-Clausen, Lars; Olsen, Nils

    2013-01-01

    An advanced algorithm, known as the “Comprehensive Inversion” (CI), is presented for the analysis of Swarm measurements to generate a consistent set of Level-2 data products to be delivered by the Swarm “Satellite Constellation Application and Research Facility” (SCARF) to the European Space Agency...

  11. The FaceBase Consortium: A comprehensive program to facilitate craniofacial research

    Science.gov (United States)

    Hochheiser, Harry; Aronow, Bruce J.; Artinger, Kristin; Beaty, Terri H.; Brinkley, James F.; Chai, Yang; Clouthier, David; Cunningham, Michael L.; Dixon, Michael; Donahue, Leah Rae; Fraser, Scott E.; Hallgrimsson, Benedikt; Iwata, Junichi; Klein, Ophir; Marazita, Mary L.; Murray, Jeffrey C.; Murray, Stephen; de Villena, Fernando Pardo-Manuel; Postlethwait, John; Potter, Steven; Shapiro, Linda; Spritz, Richard; Visel, Axel; Weinberg, Seth M.; Trainor, Paul A.

    2012-01-01

    The FaceBase Consortium consists of ten interlinked research and technology projects whose goal is to generate craniofacial research data and technology for use by the research community through a central data management and integrated bioinformatics hub. Funded by the National Institute of Dental and Craniofacial Research (NIDCR) and currently focused on studying the development of the middle region of the face, the Consortium will produce comprehensive datasets of global gene expression patterns, regulatory elements and sequencing; will generate anatomical and molecular atlases; will provide human normative facial data and other phenotypes; conduct follow up studies of a completed genome-wide association study; generate independent data on the genetics of craniofacial development, build repositories of animal models and of human samples and data for community access and analysis; and will develop software tools and animal models for analyzing and functionally testing and integrating these data. The FaceBase website (http://www.facebase.org) will serve as a web home for these efforts, providing interactive tools for exploring these datasets, together with discussion forums and other services to support and foster collaboration within the craniofacial research community. PMID:21458441

  12. Learning Genetics through an Authentic Research Simulation in Bioinformatics

    Science.gov (United States)

    Gelbart, Hadas; Yarden, Anat

    2006-01-01

    Following the rationale that learning is an active process of knowledge construction as well as enculturation into a community of experts, we developed a novel web-based learning environment in bioinformatics for high-school biology majors in Israel. The learning environment enables the learners to actively participate in a guided inquiry process…

  13. Atlas – a data warehouse for integrative bioinformatics

    Directory of Open Access Journals (Sweden)

    Yuen Macaire MS

    2005-02-01

    Full Text Available Abstract Background We present a biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies. The goal of the system is to provide data, as well as a software infrastructure for bioinformatics research and development. Description The Atlas system is based on relational data models that we developed for each of the source data types. Data stored within these relational models are managed through Structured Query Language (SQL calls that are implemented in a set of Application Programming Interfaces (APIs. The APIs include three languages: C++, Java, and Perl. The methods in these API libraries are used to construct a set of loader applications, which parse and load the source datasets into the Atlas database, and a set of toolbox applications which facilitate data retrieval. Atlas stores and integrates local instances of GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD, Biomolecular Interaction Network Database (BIND, Database of Interacting Proteins (DIP, Molecular Interactions Database (MINT, IntAct, NCBI Taxonomy, Gene Ontology (GO, Online Mendelian Inheritance in Man (OMIM, LocusLink, Entrez Gene and HomoloGene. The retrieval APIs and toolbox applications are critical components that offer end-users flexible, easy, integrated access to this data. We present use cases that use Atlas to integrate these sources for genome annotation, inference of molecular interactions across species, and gene-disease associations. Conclusion The Atlas biological data warehouse serves as data infrastructure for bioinformatics research and development. It forms the backbone of the research activities in our laboratory and facilitates the integration of disparate, heterogeneous biological sources of data enabling new scientific inferences. Atlas achieves integration of diverse data sets at two levels. First

  14. RAP: RNA-Seq Analysis Pipeline, a new cloud-based NGS web application.

    Science.gov (United States)

    D'Antonio, Mattia; D'Onorio De Meo, Paolo; Pallocca, Matteo; Picardi, Ernesto; D'Erchia, Anna Maria; Calogero, Raffaele A; Castrignanò, Tiziana; Pesole, Graziano

    2015-01-01

    The study of RNA has been dramatically improved by the introduction of Next Generation Sequencing platforms allowing massive and cheap sequencing of selected RNA fractions, also providing information on strand orientation (RNA-Seq). The complexity of transcriptomes and of their regulative pathways make RNA-Seq one of most complex field of NGS applications, addressing several aspects of the expression process (e.g. identification and quantification of expressed genes and transcripts, alternative splicing and polyadenylation, fusion genes and trans-splicing, post-transcriptional events, etc.). In order to provide researchers with an effective and friendly resource for analyzing RNA-Seq data, we present here RAP (RNA-Seq Analysis Pipeline), a cloud computing web application implementing a complete but modular analysis workflow. This pipeline integrates both state-of-the-art bioinformatics tools for RNA-Seq analysis and in-house developed scripts to offer to the user a comprehensive strategy for data analysis. RAP is able to perform quality checks (adopting FastQC and NGS QC Toolkit), identify and quantify expressed genes and transcripts (with Tophat, Cufflinks and HTSeq), detect alternative splicing events (using SpliceTrap) and chimeric transcripts (with ChimeraScan). This pipeline is also able to identify splicing junctions and constitutive or alternative polyadenylation sites (implementing custom analysis modules) and call for statistically significant differences in genes and transcripts expression, splicing pattern and polyadenylation site usage (using Cuffdiff2 and DESeq). Through a user friendly web interface, the RAP workflow can be suitably customized by the user and it is automatically executed on our cloud computing environment. This strategy allows to access to bioinformatics tools and computational resources without specific bioinformatics and IT skills. RAP provides a set of tabular and graphical results that can be helpful to browse, filter and export

  15. Identification Of Protein Vaccine Candidates Using Comprehensive Proteomic Analysis Strategies

    Science.gov (United States)

    2007-12-01

    that fascinating fungus known as Coccidioides. I also want to thank the UA Mass Spectrometry Facility and the UA Proteomics Consortium, especially...W. & N. N. Kav. 2006. The proteome of the phytopathogenic fungus Sclerotinia sclerotiorum. Proteomics 6: 5995-6007. 127. de Godoy, L. M., J. V...IDENTIFICATION OF PROTEIN VACCINE CANDIDATES USING COMPREHENSIVE PROTEOMIC ANALYSIS STRATEGIES by James G. Rohrbough

  16. Quantitative analysis of target components by comprehensive two-dimensional gas chromatography

    NARCIS (Netherlands)

    Mispelaar, V.G. van; Tas, A.C.; Smilde, A.K.; Schoenmakers, P.J.; Asten, A.C. van

    2003-01-01

    Quantitative analysis using comprehensive two-dimensional (2D) gas chromatography (GC) is still rarely reported. This is largely due to a lack of suitable software. The objective of the present study is to generate quantitative results from a large GC x GC data set, consisting of 32 chromatograms.

  17. PyPedia: using the wiki paradigm as crowd sourcing environment for bioinformatics protocols.

    Science.gov (United States)

    Kanterakis, Alexandros; Kuiper, Joël; Potamias, George; Swertz, Morris A

    2015-01-01

    Today researchers can choose from many bioinformatics protocols for all types of life sciences research, computational environments and coding languages. Although the majority of these are open source, few of them possess all virtues to maximize reuse and promote reproducible science. Wikipedia has proven a great tool to disseminate information and enhance collaboration between users with varying expertise and background to author qualitative content via crowdsourcing. However, it remains an open question whether the wiki paradigm can be applied to bioinformatics protocols. We piloted PyPedia, a wiki where each article is both implementation and documentation of a bioinformatics computational protocol in the python language. Hyperlinks within the wiki can be used to compose complex workflows and induce reuse. A RESTful API enables code execution outside the wiki. Initial content of PyPedia contains articles for population statistics, bioinformatics format conversions and genotype imputation. Use of the easy to learn wiki syntax effectively lowers the barriers to bring expert programmers and less computer savvy researchers on the same page. PyPedia demonstrates how wiki can provide a collaborative development, sharing and even execution environment for biologists and bioinformaticians that complement existing resources, useful for local and multi-center research teams. PyPedia is available online at: http://www.pypedia.com. The source code and installation instructions are available at: https://github.com/kantale/PyPedia_server. The PyPedia python library is available at: https://github.com/kantale/pypedia. PyPedia is open-source, available under the BSD 2-Clause License.

  18. Bioinformatics analysis and construction of phylogenetic tree of aquaporins from Echinococcus granulosus.

    Science.gov (United States)

    Wang, Fen; Ye, Bin

    2016-09-01

    Cyst echinococcosis caused by the matacestodal larvae of Echinococcus granulosus (Eg), is a chronic, worldwide, and severe zoonotic parasitosis. The treatment of cyst echinococcosis is still difficult since surgery cannot fit the needs of all patients, and drugs can lead to serious adverse events as well as resistance. The screen of target proteins interacted with new anti-hydatidosis drugs is urgently needed to meet the prevailing challenges. Here, we analyzed the sequences and structure properties, and constructed a phylogenetic tree by bioinformatics methods. The MIP family signature and Protein kinase C phosphorylation sites were predicted in all nine EgAQPs. α-helix and random coil were the main secondary structures of EgAQPs. The numbers of transmembrane regions were three to six, which indicated that EgAQPs contained multiple hydrophobic regions. A neighbor-joining tree indicated that EgAQPs were divided into two branches, seven EgAQPs formed a clade with AQP1 from human, a "strict" aquaporins, other two EgAQPs formed a clade with AQP9 from human, an aquaglyceroporins. Unfortunately, homology modeling of EgAQPs was aborted. These results provide a foundation for understanding and researches of the biological function of E. granulosus.

  19. Watchdog - a workflow management system for the distributed analysis of large-scale experimental data.

    Science.gov (United States)

    Kluge, Michael; Friedel, Caroline C

    2018-03-13

    The development of high-throughput experimental technologies, such as next-generation sequencing, have led to new challenges for handling, analyzing and integrating the resulting large and diverse datasets. Bioinformatical analysis of these data commonly requires a number of mutually dependent steps applied to numerous samples for multiple conditions and replicates. To support these analyses, a number of workflow management systems (WMSs) have been developed to allow automated execution of corresponding analysis workflows. Major advantages of WMSs are the easy reproducibility of results as well as the reusability of workflows or their components. In this article, we present Watchdog, a WMS for the automated analysis of large-scale experimental data. Main features include straightforward processing of replicate data, support for distributed computer systems, customizable error detection and manual intervention into workflow execution. Watchdog is implemented in Java and thus platform-independent and allows easy sharing of workflows and corresponding program modules. It provides a graphical user interface (GUI) for workflow construction using pre-defined modules as well as a helper script for creating new module definitions. Execution of workflows is possible using either the GUI or a command-line interface and a web-interface is provided for monitoring the execution status and intervening in case of errors. To illustrate its potentials on a real-life example, a comprehensive workflow and modules for the analysis of RNA-seq experiments were implemented and are provided with the software in addition to simple test examples. Watchdog is a powerful and flexible WMS for the analysis of large-scale high-throughput experiments. We believe it will greatly benefit both users with and without programming skills who want to develop and apply bioinformatical workflows with reasonable overhead. The software, example workflows and a comprehensive documentation are freely

  20. Comprehensive large-scale assessment of intrinsic protein disorder.

    Science.gov (United States)

    Walsh, Ian; Giollo, Manuel; Di Domenico, Tomás; Ferrari, Carlo; Zimmermann, Olav; Tosatto, Silvio C E

    2015-01-15

    Intrinsically disordered regions are key for the function of numerous proteins. Due to the difficulties in experimental disorder characterization, many computational predictors have been developed with various disorder flavors. Their performance is generally measured on small sets mainly from experimentally solved structures, e.g. Protein Data Bank (PDB) chains. MobiDB has only recently started to collect disorder annotations from multiple experimental structures. MobiDB annotates disorder for UniProt sequences, allowing us to conduct the first large-scale assessment of fast disorder predictors on 25 833 different sequences with X-ray crystallographic structures. In addition to a comprehensive ranking of predictors, this analysis produced the following interesting observations. (i) The predictors cluster according to their disorder definition, with a consensus giving more confidence. (ii) Previous assessments appear over-reliant on data annotated at the PDB chain level and performance is lower on entire UniProt sequences. (iii) Long disordered regions are harder to predict. (iv) Depending on the structural and functional types of the proteins, differences in prediction performance of up to 10% are observed. The datasets are available from Web site at URL: http://mobidb.bio.unipd.it/lsd. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  1. Pixel-based analysis of comprehensive two-dimensional gas chromatograms (color plots) of petroleum

    DEFF Research Database (Denmark)

    Furbo, Søren; Hansen, Asger B.; Skov, Thomas

    2014-01-01

    We demonstrate how to process comprehensive two-dimensional gas chromatograms (GC × GC chromatograms) to remove nonsample information (artifacts), including background and retention time shifts. We also demonstrate how this, combined with further reduction of the influence of irrelevant informati......, allows for data analysis without integration or peak deconvolution (pixelbased analysis)....

  2. Nanoinformatics: an emerging area of information technology at the intersection of bioinformatics, computational chemistry and nanobiotechnology

    Directory of Open Access Journals (Sweden)

    Fernando González-Nilo

    2011-01-01

    Full Text Available After the progress made during the genomics era, bioinformatics was tasked with supporting the flow of information generated by nanobiotechnology efforts. This challenge requires adapting classical bioinformatic and computational chemistry tools to store, standardize, analyze, and visualize nanobiotechnological information. Thus, old and new bioinformatic and computational chemistry tools have been merged into a new sub-discipline: nanoinformatics. This review takes a second look at the development of this new and exciting area as seen from the perspective of the evolution of nanobiotechnology applied to the life sciences. The knowledge obtained at the nano-scale level implies answers to new questions and the development of new concepts in different fields. The rapid convergence of technologies around nanobiotechnologies has spun off collaborative networks and web platforms created for sharing and discussing the knowledge generated in nanobiotechnology. The implementation of new database schemes suitable for storage, processing and integrating physical, chemical, and biological properties of nanoparticles will be a key element in achieving the promises in this convergent field. In this work, we will review some applications of nanobiotechnology to life sciences in generating new requirements for diverse scientific fields, such as bioinformatics and computational chemistry.

  3. CROSSWORK for Glycans: Glycan Identificatin Through Mass Spectrometry and Bioinformatics

    DEFF Research Database (Denmark)

    Rasmussen, Morten; Thaysen-Andersen, Morten; Højrup, Peter

      We have developed "GLYCANthrope " - CROSSWORKS for glycans:  a bioinformatics tool, which assists in identifying N-linked glycosylated peptides as well as their glycan moieties from MS2 data of enzymatically digested glycoproteins. The program runs either as a stand-alone application or as a plug...

  4. Hidden in the Middle: Culture, Value and Reward in Bioinformatics

    Science.gov (United States)

    Lewis, Jamie; Bartlett, Andrew; Atkinson, Paul

    2016-01-01

    Bioinformatics--the so-called shotgun marriage between biology and computer science--is an interdiscipline. Despite interdisciplinarity being seen as a virtue, for having the capacity to solve complex problems and foster innovation, it has the potential to place projects and people in anomalous categories. For example, valorised…

  5. [Semantic Network Analysis of Online News and Social Media Text Related to Comprehensive Nursing Care Service].

    Science.gov (United States)

    Kim, Minji; Choi, Mona; Youm, Yoosik

    2017-12-01

    As comprehensive nursing care service has gradually expanded, it has become necessary to explore the various opinions about it. The purpose of this study is to explore the large amount of text data regarding comprehensive nursing care service extracted from online news and social media by applying a semantic network analysis. The web pages of the Korean Nurses Association (KNA) News, major daily newspapers, and Twitter were crawled by searching the keyword 'comprehensive nursing care service' using Python. A morphological analysis was performed using KoNLPy. Nodes on a 'comprehensive nursing care service' cluster were selected, and frequency, edge weight, and degree centrality were calculated and visualized with Gephi for the semantic network. A total of 536 news pages and 464 tweets were analyzed. In the KNA News and major daily newspapers, 'nursing workforce' and 'nursing service' were highly rated in frequency, edge weight, and degree centrality. On Twitter, the most frequent nodes were 'National Health Insurance Service' and 'comprehensive nursing care service hospital.' The nodes with the highest edge weight were 'national health insurance,' 'wards without caregiver presence,' and 'caregiving costs.' 'National Health Insurance Service' was highest in degree centrality. This study provides an example of how to use atypical big data for a nursing issue through semantic network analysis to explore diverse perspectives surrounding the nursing community through various media sources. Applying semantic network analysis to online big data to gather information regarding various nursing issues would help to explore opinions for formulating and implementing nursing policies. © 2017 Korean Society of Nursing Science

  6. A middleware-based platform for the integration of bioinformatic services

    Directory of Open Access Journals (Sweden)

    Guzmán Llambías

    2015-08-01

    Full Text Available Performing Bioinformatic´s experiments involve an intensive access to distributed services and information resources through Internet. Although existing tools facilitate the implementation of workflow-oriented applications, they lack of capabilities to integrate services beyond low-scale applications, particularly integrating services with heterogeneous interaction patterns and in a larger scale. This is particularly required to enable a large-scale distributed processing of biological data generated by massive sequencing technologies. On the other hand, such integration mechanisms are provided by middleware products like Enterprise Service Buses (ESB, which enable to integrate distributed systems following a Service Oriented Architecture. This paper proposes an integration platform, based on enterprise middleware, to integrate Bioinformatics services. It presents a multi-level reference architecture and focuses on ESB-based mechanisms to provide asynchronous communications, event-based interactions and data transformation capabilities. The paper presents a formal specification of the platform using the Event-B model.

  7. A Survey on Evolutionary Algorithm Based Hybrid Intelligence in Bioinformatics

    Directory of Open Access Journals (Sweden)

    Shan Li

    2014-01-01

    Full Text Available With the rapid advance in genomics, proteomics, metabolomics, and other types of omics technologies during the past decades, a tremendous amount of data related to molecular biology has been produced. It is becoming a big challenge for the bioinformatists to analyze and interpret these data with conventional intelligent techniques, for example, support vector machines. Recently, the hybrid intelligent methods, which integrate several standard intelligent approaches, are becoming more and more popular due to their robustness and efficiency. Specifically, the hybrid intelligent approaches based on evolutionary algorithms (EAs are widely used in various fields due to the efficiency and robustness of EAs. In this review, we give an introduction about the applications of hybrid intelligent methods, in particular those based on evolutionary algorithm, in bioinformatics. In particular, we focus on their applications to three common problems that arise in bioinformatics, that is, feature selection, parameter estimation, and reconstruction of biological networks.

  8. Introducing People – Genre Analysis and Oral Comprehension and Oral Production Tasks

    Directory of Open Access Journals (Sweden)

    Keila Rocha Reis de Carvalho

    2012-02-01

    Full Text Available This paper aims at presenting an analysis of the genre introducing people and at suggesting listening comprehension and oral production tasks. This work was developed according to the characterization of the rhetorical organization of situations taken from seventeen films that contain the genre under analysis. Although several studies in the ESP area carried out recently (Andrade, 2003; Cardoso, 2003; Shergue, 2003; Belmonte, 2003; Serafini, 2003 have identified listening comprehension and oral production as the abilities that should be prioritized in an English course, much needs to be done, especially concerning the oral genres that take into account the language the learners of English as a second language need in their target situation. This work is based on Hutchinson & Waters (1987 theoretical background on ESP, Swales’ (1990 genre analysis, Ramos’ (2004 pedagogical proposal, and also on Ellis´ (2003 tasks concept. The familiarization of learners of English as a second language with this genre will provide them with the opportunity to better understand and use the English language in their academic and professional life.

  9. Revisiting Francisella tularensis subsp. holarctica, Causative Agent of Tularemia in Germany With Bioinformatics: New Insights in Genome Structure, DNA Methylation and Comparative Phylogenetic Analysis

    Directory of Open Access Journals (Sweden)

    Anne Busch

    2018-03-01

    Full Text Available Francisella (F. tularensis is a highly virulent, Gram-negative bacterial pathogen and the causative agent of the zoonotic disease tularemia. Here, we generated, analyzed and characterized a high quality circular genome sequence of the F. tularensis subsp. holarctica strain 12T0050 that caused fatal tularemia in a hare. Besides the genomic structure, we focused on the analysis of oriC, unique to the Francisella genus and regulating replication in and outside hosts and the first report on genomic DNA methylation of a Francisella strain. The high quality genome was used to establish and evaluate a diagnostic whole genome sequencing pipeline. A genotyping strategy for F. tularensis was developed using various bioinformatics tools for genotyping. Additionally, whole genome sequences of F. tularensis subsp. holarctica isolates isolated in the years 2008–2015 in Germany were generated. A phylogenetic analysis allowed to determine the genetic relatedness of these isolates and confirmed the highly conserved nature of F. tularensis subsp. holarctica.

  10. Introducing bioinformatics, the biosciences' genomic revolution

    CERN Document Server

    Zanella, Paolo

    1999-01-01

    The general audience for these lectures is mainly physicists, computer scientists, engineers or the general public wanting to know more about what’s going on in the biosciences. What’s bioinformatics and why is all this fuss being made about it ? What’s this revolution triggered by the human genome project ? Are there any results yet ? What are the problems ? What new avenues of research have been opened up ? What about the technology ? These new developments will be compared with what happened at CERN earlier in its evolution, and it is hoped that the similiraties and contrasts will stimulate new curiosity and provoke new thoughts.

  11. Emergent Computation Emphasizing Bioinformatics

    CERN Document Server

    Simon, Matthew

    2005-01-01

    Emergent Computation is concerned with recent applications of Mathematical Linguistics or Automata Theory. This subject has a primary focus upon "Bioinformatics" (the Genome and arising interest in the Proteome), but the closing chapter also examines applications in Biology, Medicine, Anthropology, etc. The book is composed of an organized examination of DNA, RNA, and the assembly of amino acids into proteins. Rather than examine these areas from a purely mathematical viewpoint (that excludes much of the biochemical reality), the author uses scientific papers written mostly by biochemists based upon their laboratory observations. Thus while DNA may exist in its double stranded form, triple stranded forms are not excluded. Similarly, while bases exist in Watson-Crick complements, mismatched bases and abasic pairs are not excluded, nor are Hoogsteen bonds. Just as there are four bases naturally found in DNA, the existence of additional bases is not ignored, nor amino acids in addition to the usual complement of...

  12. Evaluating the Effectiveness of a Practical Inquiry-Based Learning Bioinformatics Module on Undergraduate Student Engagement and Applied Skills

    Science.gov (United States)

    Brown, James A. L.

    2016-01-01

    A pedagogic intervention, in the form of an inquiry-based peer-assisted learning project (as a practical student-led bioinformatics module), was assessed for its ability to increase students' engagement, practical bioinformatic skills and process-specific knowledge. Elements assessed were process-specific knowledge following module completion,…

  13. Neonatal Informatics: Transforming Neonatal Care Through Translational Bioinformatics

    Science.gov (United States)

    Palma, Jonathan P.; Benitz, William E.; Tarczy-Hornoch, Peter; Butte, Atul J.; Longhurst, Christopher A.

    2012-01-01

    The future of neonatal informatics will be driven by the availability of increasingly vast amounts of clinical and genetic data. The field of translational bioinformatics is concerned with linking and learning from these data and applying new findings to clinical care to transform the data into proactive, predictive, preventive, and participatory health. As a result of advances in translational informatics, the care of neonates will become more data driven, evidence based, and personalized. PMID:22924023

  14. Comprehensive NMR analysis of compositional changes of black garlic during thermal processing.

    Science.gov (United States)

    Liang, Tingfu; Wei, Feifei; Lu, Yi; Kodani, Yoshinori; Nakada, Mitsuhiko; Miyakawa, Takuya; Tanokura, Masaru

    2015-01-21

    Black garlic is a processed food product obtained by subjecting whole raw garlic to thermal processing that causes chemical reactions, such as the Maillard reaction, which change the composition of the garlic. In this paper, we report a nuclear magnetic resonance (NMR)-based comprehensive analysis of raw garlic and black garlic extracts to determine the compositional changes resulting from thermal processing. (1)H NMR spectra with a detailed signal assignment showed that 38 components were altered by thermal processing of raw garlic. For example, the contents of 11 l-amino acids increased during the first step of thermal processing over 5 days and then decreased. Multivariate data analysis revealed changes in the contents of fructose, glucose, acetic acid, formic acid, pyroglutamic acid, cycloalliin, and 5-(hydroxymethyl)furfural (5-HMF). Our results provide comprehensive information on changes in NMR-detectable components during thermal processing of whole garlic.

  15. Multiobjective optimization in bioinformatics and computational biology.

    Science.gov (United States)

    Handl, Julia; Kell, Douglas B; Knowles, Joshua

    2007-01-01

    This paper reviews the application of multiobjective optimization in the fields of bioinformatics and computational biology. A survey of existing work, organized by application area, forms the main body of the review, following an introduction to the key concepts in multiobjective optimization. An original contribution of the review is the identification of five distinct "contexts," giving rise to multiple objectives: These are used to explain the reasons behind the use of multiobjective optimization in each application area and also to point the way to potential future uses of the technique.

  16. Bicycle: a bioinformatics pipeline to analyze bisulfite sequencing data.

    Science.gov (United States)

    Graña, Osvaldo; López-Fernández, Hugo; Fdez-Riverola, Florentino; González Pisano, David; Glez-Peña, Daniel

    2018-04-15

    High-throughput sequencing of bisulfite-converted DNA is a technique used to measure DNA methylation levels. Although a considerable number of computational pipelines have been developed to analyze such data, none of them tackles all the peculiarities of the analysis together, revealing limitations that can force the user to manually perform additional steps needed for a complete processing of the data. This article presents bicycle, an integrated, flexible analysis pipeline for bisulfite sequencing data. Bicycle analyzes whole genome bisulfite sequencing data, targeted bisulfite sequencing data and hydroxymethylation data. To show how bicycle overtakes other available pipelines, we compared them on a defined number of features that are summarized in a table. We also tested bicycle with both simulated and real datasets, to show its level of performance, and compared it to different state-of-the-art methylation analysis pipelines. Bicycle is publicly available under GNU LGPL v3.0 license at http://www.sing-group.org/bicycle. Users can also download a customized Ubuntu LiveCD including bicycle and other bisulfite sequencing data pipelines compared here. In addition, a docker image with bicycle and its dependencies, which allows a straightforward use of bicycle in any platform (e.g. Linux, OS X or Windows), is also available. ograna@cnio.es or dgpena@uvigo.es. Supplementary data are available at Bioinformatics online.

  17. The secondary metabolite bioinformatics portal

    DEFF Research Database (Denmark)

    Weber, Tilmann; Kim, Hyun Uk

    2016-01-01

    . In this context, this review gives a summary of tools and databases that currently are available to mine, identify and characterize natural product biosynthesis pathways and their producers based on ‘omics data. A web portal called Secondary Metabolite Bioinformatics Portal (SMBP at http...... analytical and chemical methods gave access to this group of compounds, nowadays genomics-based methods offer complementary approaches to find, identify and characterize such molecules. This paradigm shift also resulted in a high demand for computational tools to assist researchers in their daily work......Natural products are among the most important sources of lead molecules for drug discovery. With the development of affordable whole-genome sequencing technologies and other ‘omics tools, the field of natural products research is currently undergoing a shift in paradigms. While, for decades, mainly...

  18. Comprehensive two-dimensional gas chromatography for the analysis of organohalogenated micro-contaminants

    NARCIS (Netherlands)

    Korytar, P.; Haglund, P.; Boer, de J.; Brinkman, U.A.Th.

    2006-01-01

    We explain the principles of comprehensive two-dimensional gas chromatography (GC × GC), and discuss key instrumental aspects - with emphasis on column combinations and mass spectrometric detection. As the main item of interest, we review the potential of GC × GC for the analysis of

  19. Integrated Bioinformatics, Environmental Epidemiologic and Genomic Approaches to Identify Environmental and Molecular Links between Endometriosis and Breast Cancer

    Directory of Open Access Journals (Sweden)

    Deodutta Roy

    2015-10-01

    Full Text Available We present a combined environmental epidemiologic, genomic, and bioinformatics approach to identify: exposure of environmental chemicals with estrogenic activity; epidemiologic association between endocrine disrupting chemical (EDC and health effects, such as, breast cancer or endometriosis; and gene-EDC interactions and disease associations. Human exposure measurement and modeling confirmed estrogenic activity of three selected class of environmental chemicals, polychlorinated biphenyls (PCBs, bisphenols (BPs, and phthalates. Meta-analysis showed that PCBs exposure, not Bisphenol A (BPA and phthalates, increased the summary odds ratio for breast cancer and endometriosis. Bioinformatics analysis of gene-EDC interactions and disease associations identified several hundred genes that were altered by exposure to PCBs, phthalate or BPA. EDCs-modified genes in breast neoplasms and endometriosis are part of steroid hormone signaling and inflammation pathways. All three EDCs–PCB 153, phthalates, and BPA influenced five common genes—CYP19A1, EGFR, ESR2, FOS, and IGF1—in breast cancer as well as in endometriosis. These genes are environmentally and estrogen responsive, altered in human breast and uterine tumors and endometriosis lesions, and part of Mitogen Activated Protein Kinase (MAPK signaling pathways in cancer. Our findings suggest that breast cancer and endometriosis share some common environmental and molecular risk factors.

  20. Key Concept Identification: A Comprehensive Analysis of Frequency and Topical Graph-Based Approaches

    Directory of Open Access Journals (Sweden)

    Muhammad Aman

    2018-05-01

    Full Text Available Automatic key concept extraction from text is the main challenging task in information extraction, information retrieval and digital libraries, ontology learning, and text analysis. The statistical frequency and topical graph-based ranking are the two kinds of potentially powerful and leading unsupervised approaches in this area, devised to address the problem. To utilize the potential of these approaches and improve key concept identification, a comprehensive performance analysis of these approaches on datasets from different domains is needed. The objective of the study presented in this paper is to perform a comprehensive empirical analysis of selected frequency and topical graph-based algorithms for key concept extraction on three different datasets, to identify the major sources of error in these approaches. For experimental analysis, we have selected TF-IDF, KP-Miner and TopicRank. Three major sources of error, i.e., frequency errors, syntactical errors and semantical errors, and the factors that contribute to these errors are identified. Analysis of the results reveals that performance of the selected approaches is significantly degraded by these errors. These findings can help us develop an intelligent solution for key concept extraction in the future.

  1. Hydrocarbon Fuel Thermal Performance Modeling based on Systematic Measurement and Comprehensive Chromatographic Analysis

    Science.gov (United States)

    2016-07-31

    distribution unlimited Hydrocarbon Fuel Thermal Performance Modeling based on Systematic Measurement and Comprehensive Chromatographic Analysis Matthew...vital importance for hydrocarbon -fueled propulsion systems: fuel thermal performance as indicated by physical and chemical effects of cooling passage... analysis . The selection and acquisition of a set of chemically diverse fuels is pivotal for a successful outcome since test method validation and

  2. Computational Lipidomics and Lipid Bioinformatics: Filling In the Blanks.

    Science.gov (United States)

    Pauling, Josch; Klipp, Edda

    2016-12-22

    Lipids are highly diverse metabolites of pronounced importance in health and disease. While metabolomics is a broad field under the omics umbrella that may also relate to lipids, lipidomics is an emerging field which specializes in the identification, quantification and functional interpretation of complex lipidomes. Today, it is possible to identify and distinguish lipids in a high-resolution, high-throughput manner and simultaneously with a lot of structural detail. However, doing so may produce thousands of mass spectra in a single experiment which has created a high demand for specialized computational support to analyze these spectral libraries. The computational biology and bioinformatics community has so far established methodology in genomics, transcriptomics and proteomics but there are many (combinatorial) challenges when it comes to structural diversity of lipids and their identification, quantification and interpretation. This review gives an overview and outlook on lipidomics research and illustrates ongoing computational and bioinformatics efforts. These efforts are important and necessary steps to advance the lipidomics field alongside analytic, biochemistry, biomedical and biology communities and to close the gap in available computational methodology between lipidomics and other omics sub-branches.

  3. A review of bioinformatic methods for forensic DNA analyses.

    Science.gov (United States)

    Liu, Yao-Yuan; Harbison, SallyAnn

    2018-03-01

    Short tandem repeats, single nucleotide polymorphisms, and whole mitochondrial analyses are three classes of markers which will play an important role in the future of forensic DNA typing. The arrival of massively parallel sequencing platforms in forensic science reveals new information such as insights into the complexity and variability of the markers that were previously unseen, along with amounts of data too immense for analyses by manual means. Along with the sequencing chemistries employed, bioinformatic methods are required to process and interpret this new and extensive data. As more is learnt about the use of these new technologies for forensic applications, development and standardization of efficient, favourable tools for each stage of data processing is being carried out, and faster, more accurate methods that improve on the original approaches have been developed. As forensic laboratories search for the optimal pipeline of tools, sequencer manufacturers have incorporated pipelines into sequencer software to make analyses convenient. This review explores the current state of bioinformatic methods and tools used for the analyses of forensic markers sequenced on the massively parallel sequencing (MPS) platforms currently most widely used. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. Kubernetes as an approach for solving bioinformatic problems.

    OpenAIRE

    Markstedt, Olof

    2017-01-01

    The cluster orchestration tool Kubernetes enables easy deployment and reproducibility of life science research by utilizing the advantages of the container technology. The container technology allows for easy tool creation, sharing and runs on any Linux system once it has been built. The applicability of Kubernetes as an approach to run bioinformatic workflows was evaluated and resulted in some examples of how Kubernetes and containers could be used within the field of life science and how th...

  5. Diagnostic and therapeutic implications of genetic heterogeneity in myeloid neoplasms uncovered by comprehensive mutational analysis

    Directory of Open Access Journals (Sweden)

    Sarah M. Choi

    2017-01-01

    Full Text Available While growing use of comprehensive mutational analysis has led to the discovery of innumerable genetic alterations associated with various myeloid neoplasms, the under-recognized phenomenon of genetic heterogeneity within such neoplasms creates a potential for diagnostic confusion. Here, we describe two cases where expanded mutational testing led to amendment of an initial diagnosis of chronic myelogenous leukemia with subsequent altered treatment of each patient. We demonstrate the power of comprehensive testing in ensuring appropriate classification of genetically heterogeneous neoplasms, and emphasize thoughtful analysis of molecular and genetic data as an essential component of diagnosis and management.

  6. Comprehensive Analysis Competence and Innovative Approaches for Sustainable Chemical Production.

    Science.gov (United States)

    Appel, Joerg; Colombo, Corrado; Dätwyler, Urs; Chen, Yun; Kerimoglu, Nimet

    2016-01-01

    Humanity currently sees itself facing enormous economic, ecological, and social challenges. Sustainable products and production in specialty chemistry are an important strategic element to address these megatrends. In addition to that, digitalization and global connectivity will create new opportunities for the industry. One aspect is examined in this paper, which shows the development of comprehensive analysis of production networks for a more sustainable production in which the need for innovative solutions arises. Examples from data analysis, advanced process control and automated performance monitoring are shown. These efforts have significant impact on improved yields, reduced energy and water consumption, and better product performance in the application of the products.

  7. A compilation of Web-based research tools for miRNA analysis.

    Science.gov (United States)

    Shukla, Vaibhav; Varghese, Vinay Koshy; Kabekkodu, Shama Prasada; Mallya, Sandeep; Satyamoorthy, Kapaettu

    2017-09-01

    Since the discovery of microRNAs (miRNAs), a class of noncoding RNAs that regulate the gene expression posttranscriptionally in sequence-specific manner, there has been a release of number of tools useful for both basic and advanced applications. This is because of the significance of miRNAs in many pathophysiological conditions including cancer. Numerous bioinformatics tools that have been developed for miRNA analysis have their utility for detection, expression, function, target prediction and many other related features. This review provides a comprehensive assessment of web-based tools for the miRNA analysis that does not require prior knowledge of any computing languages. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  8. Using MetaboAnalyst 3.0 for Comprehensive Metabolomics Data Analysis.

    Science.gov (United States)

    Xia, Jianguo; Wishart, David S

    2016-09-07

    MetaboAnalyst (http://www.metaboanalyst.ca) is a comprehensive Web application for metabolomic data analysis and interpretation. MetaboAnalyst handles most of the common metabolomic data types from most kinds of metabolomics platforms (MS and NMR) for most kinds of metabolomics experiments (targeted, untargeted, quantitative). In addition to providing a variety of data processing and normalization procedures, MetaboAnalyst also supports a number of data analysis and data visualization tasks using a range of univariate, multivariate methods such as PCA (principal component analysis), PLS-DA (partial least squares discriminant analysis), heatmap clustering and machine learning methods. MetaboAnalyst also offers a variety of tools for metabolomic data interpretation including MSEA (metabolite set enrichment analysis), MetPA (metabolite pathway analysis), and biomarker selection via ROC (receiver operating characteristic) curve analysis, as well as time series and power analysis. This unit provides an overview of the main functional modules and the general workflow of the latest version of MetaboAnalyst (MetaboAnalyst 3.0), followed by eight detailed protocols. © 2016 by John Wiley & Sons, Inc. Copyright © 2016 John Wiley & Sons, Inc.

  9. Fiat lux! Phylogeny and bioinformatics shed light on GABA functions in plants.

    Science.gov (United States)

    Renault, Hugues

    2013-06-01

    The non-protein amino acid γ-aminobutyric acid (GABA) accumulates in plants in response to a wide variety of environmental cues. Recent data point toward an involvement of GABA in tricarboxylic acid (TCA) cycle activity and respiration, especially in stressed roots. To gain further insights into potential GABA functions in plants, phylogenetic and bioinformatic approaches were undertaken. Phylogenetic reconstruction of the GABA transaminase (GABA-T) protein family revealed the monophyletic nature of plant GABA-Ts. However, this analysis also pointed to the common origin of several plant aminotransferases families, which were found more similar to plant GABA-Ts than yeast and human GABA-Ts. A computational analysis of AtGABA-T co-expressed genes was performed in roots and in stress conditions. This second approach uncovered a strong connection between GABA metabolism and glyoxylate cycle during stress. Both in silico analyses open new perspectives and hypotheses for GABA metabolic functions in plants.

  10. Data mining in bioinformatics using Weka.

    Science.gov (United States)

    Frank, Eibe; Hall, Mark; Trigg, Len; Holmes, Geoffrey; Witten, Ian H

    2004-10-12

    The Weka machine learning workbench provides a general-purpose environment for automatic classification, regression, clustering and feature selection-common data mining problems in bioinformatics research. It contains an extensive collection of machine learning algorithms and data pre-processing methods complemented by graphical user interfaces for data exploration and the experimental comparison of different machine learning techniques on the same problem. Weka can process data given in the form of a single relational table. Its main objectives are to (a) assist users in extracting useful information from data and (b) enable them to easily identify a suitable algorithm for generating an accurate predictive model from it. http://www.cs.waikato.ac.nz/ml/weka.

  11. XMPP for cloud computing in bioinformatics supporting discovery and invocation of asynchronous web services.

    Science.gov (United States)

    Wagener, Johannes; Spjuth, Ola; Willighagen, Egon L; Wikberg, Jarl E S

    2009-09-04

    Life sciences make heavily use of the web for both data provision and analysis. However, the increasing amount of available data and the diversity of analysis tools call for machine accessible interfaces in order to be effective. HTTP-based Web service technologies, like the Simple Object Access Protocol (SOAP) and REpresentational State Transfer (REST) services, are today the most common technologies for this in bioinformatics. However, these methods have severe drawbacks, including lack of discoverability, and the inability for services to send status notifications. Several complementary workarounds have been proposed, but the results are ad-hoc solutions of varying quality that can be difficult to use. We present a novel approach based on the open standard Extensible Messaging and Presence Protocol (XMPP), consisting of an extension (IO Data) to comprise discovery, asynchronous invocation, and definition of data types in the service. That XMPP cloud services are capable of asynchronous communication implies that clients do not have to poll repetitively for status, but the service sends the results back to the client upon completion. Implementations for Bioclipse and Taverna are presented, as are various XMPP cloud services in bio- and cheminformatics. XMPP with its extensions is a powerful protocol for cloud services that demonstrate several advantages over traditional HTTP-based Web services: 1) services are discoverable without the need of an external registry, 2) asynchronous invocation eliminates the need for ad-hoc solutions like polling, and 3) input and output types defined in the service allows for generation of clients on the fly without the need of an external semantics description. The many advantages over existing technologies make XMPP a highly interesting candidate for next generation online services in bioinformatics.

  12. Comprehension-Driven Program Analysis (CPA) for Malware Detection in Android Phones

    Science.gov (United States)

    2015-07-01

    Android source . 3.1.2.2 Analyzers An analyzer conforms to specifications defined by the Security Toolbox. Specifically an analyzer encapsulates a...COMPREHENSION-DRIVEN PROGRAM ANALYSIS (CPA) FOR MALWARE DETECTION IN ANDROID PHONES IOWA STATE UNIVERSITY JULY 2015 FINAL...average 1 hour per response, including the time for reviewing instructions, searching existing data sources , gathering and maintaining the data needed

  13. Gene expression profile analysis of colorectal cancer to investigate potential mechanisms using bioinformatics

    Directory of Open Access Journals (Sweden)

    Kou YB

    2015-04-01

    Full Text Available Yubin Kou,1,2* Suya Zhang,3* Xiaoping Chen,2 Sanyuan Hu1 1Department of General Surgery, Qilu Hospital of Shandong University, Jinan, People’s Republic of China; 2Department of General Surgery, 3Department of Neurology, Shuguang Hospital Baoshan Branch, Shanghai, People’s Republic of China *These authors contributed equally to this work Abstract: This study aimed to explore the underlying molecular mechanisms of colorectal cancer (CRC using bioinformatics analysis. Using GSE4107 datasets downloaded from the Gene Expression Omnibus, the differentially expressed genes (DEGs were screened by comparing the RNA expression from the colonic mucosa between 12 CRC patients and ten healthy controls using a paired t-test. The Gene Ontology (GO functional and pathway enrichment analyses of DEGs were performed using the Database for Annotation, Visualization and Integrated Discovery (DAVID software followed by the construction of a protein–protein interaction (PPI network. In addition, hub gene identification and GO functional and pathway enrichment analyses of the modules were performed. A total of 612 up- and 639 downregulated genes were identified. The upregulated DEGs were mainly involved in the regulation of cell growth, migration, and the MAPK signaling pathway. The downregulated DEGs were significantly associated with oxidative phosphorylation, Alzheimer’s disease, and Parkinson’s disease. Moreover, FOS, FN1, PPP1CC, and CYP2B6 were selected as hub genes in the PPI networks. Two modules (up-A and up-B in the upregulated PPI network and three modules (d-A, d-B, and d-C in the downregulated PPI were identified with the threshold of Molecular Complex Detection (MCODE Molecular Complex Detection (MCODE score ≥4 and nodes ≥6. The genes in module up-A were significantly enriched in neuroactive ligand–receptor interactions and the calcium signaling pathway. The genes in module d-A were enriched in four pathways, including oxidative

  14. Bioinformatics and structural characterization of a hypothetical protein from Streptococcus mutans

    DEFF Research Database (Denmark)

    Nan, Jie; Brostromer, Erik; Liu, Xiang-Yu

    2009-01-01

    . From the interlinking structural and bioinformatics studies, we have concluded that SMU.440 could be involved in polyketide-like antibiotic resistance, providing a better understanding of this hypothetical protein. Besides, the combination of multiple methods in this study can be used as a general...

  15. ANLN functions as a key candidate gene in cervical cancer as determined by integrated bioinformatic analysis

    Directory of Open Access Journals (Sweden)

    Xia L

    2018-04-01

    DEGs PPI network complex, contained 305 nodes and 4,962 edges, and 8 clusters were calculated according to k-core =2. Among them, cluster 1, which had 65 nodes and 1,780 edges, had the highest score in these clusters. In coexpression analysis, there were 86 hubgenes from the Brown modules that were chosen for further analysis. Sixty-one key genes were identified as the intersecting genes of the Brown module of WGCNA and DEGs. In survival analysis, only ANLN was a prognostic factor, and the survival was significantly better in the low-expression ANLN group.Conclusion: Our study suggested that ANLN may be a potential tumor oncogene and could serve as a biomarker for predicting the prognosis of cervical cancer patients. Keywords: bioinformatics analysis, cervical cancer, WGCNA, ANLN 

  16. FungiDB: An Integrated Bioinformatic Resource for Fungi and Oomycetes

    Directory of Open Access Journals (Sweden)

    Evelina Y. Basenko

    2018-03-01

    Full Text Available FungiDB (fungidb.org is a free online resource for data mining and functional genomics analysis for fungal and oomycete species. FungiDB is part of the Eukaryotic Pathogen Genomics Database Resource (EuPathDB, eupathdb.org platform that integrates genomic, transcriptomic, proteomic, and phenotypic datasets, and other types of data for pathogenic and nonpathogenic, free-living and parasitic organisms. FungiDB is one of the largest EuPathDB databases containing nearly 100 genomes obtained from GenBank, Aspergillus Genome Database (AspGD, The Broad Institute, Joint Genome Institute (JGI, Ensembl, and other sources. FungiDB offers a user-friendly web interface with embedded bioinformatics tools that support custom in silico experiments that leverage FungiDB-integrated data. In addition, a Galaxy-based workspace enables users to generate custom pipelines for large-scale data analysis (e.g., RNA-Seq, variant calling, etc.. This review provides an introduction to the FungiDB resources and focuses on available features, tools, and queries and how they can be used to mine data across a diverse range of integrated FungiDB datasets and records.

  17. Model-driven user interfaces for bioinformatics data resources: regenerating the wheel as an alternative to reinventing it

    Directory of Open Access Journals (Sweden)

    Swainston Neil

    2006-12-01

    Full Text Available Abstract Background The proliferation of data repositories in bioinformatics has resulted in the development of numerous interfaces that allow scientists to browse, search and analyse the data that they contain. Interfaces typically support repository access by means of web pages, but other means are also used, such as desktop applications and command line tools. Interfaces often duplicate functionality amongst each other, and this implies that associated development activities are repeated in different laboratories. Interfaces developed by public laboratories are often created with limited developer resources. In such environments, reducing the time spent on creating user interfaces allows for a better deployment of resources for specialised tasks, such as data integration or analysis. Laboratories maintaining data resources are challenged to reconcile requirements for software that is reliable, functional and flexible with limitations on software development resources. Results This paper proposes a model-driven approach for the partial generation of user interfaces for searching and browsing bioinformatics data repositories. Inspired by the Model Driven Architecture (MDA of the Object Management Group (OMG, we have developed a system that generates interfaces designed for use with bioinformatics resources. This approach helps laboratory domain experts decrease the amount of time they have to spend dealing with the repetitive aspects of user interface development. As a result, the amount of time they can spend on gathering requirements and helping develop specialised features increases. The resulting system is known as Pierre, and has been validated through its application to use cases in the life sciences, including the PEDRoDB proteomics database and the e-Fungi data warehouse. Conclusion MDAs focus on generating software from models that describe aspects of service capabilities, and can be applied to support rapid development of repository

  18. The Effects of Visual Attention Span and Phonological Decoding in Reading Comprehension in Dyslexia: A Path Analysis

    OpenAIRE

    Chen, C.; Schneps, M.; Masyn, K.; Thomson, J.

    2016-01-01

    Increasing evidence has shown visual attention span to be a factor, distinct from phonological skills, that explains single-word identification (pseudo-word/word reading) performance in dyslexia. Yet, little is known about how well visual attention span explains text comprehension. Observing reading comprehension in a sample of 105 high school students with dyslexia, we used a pathway analysis to examine the direct and indirect path between visual attention span and reading comprehension whil...

  19. Bioinformatics prediction of swine MHC class I epitopes from Porcine Reproductive and Respiratory Syndrome Virus

    DEFF Research Database (Denmark)

    Welner, Simon; Nielsen, Morten; Lund, Ole

    an effective CTL response against PRRSV, we have taken a bioinformatics approach to identify common PRRSV epitopes predicted to react broadly with predominant swine MHC (SLA) alleles. First, the genomic integrity and sequencing method was examined for 334 available complete PRRSV type 2 genomes leaving 104...... by the PopCover algorithm, providing a final list of 54 epitopes prioritized according to maximum coverage of PRRSV strains and SLA alleles. This bioinformatics approach provides a rational strategy for selecting peptides for a CTL-activating vaccine with broad coverage of both virus and swine diversity...

  20. A comprehensive iterative approach is highly effective in diagnosing individuals who are exome negative.

    Science.gov (United States)

    Shashi, Vandana; Schoch, Kelly; Spillmann, Rebecca; Cope, Heidi; Tan, Queenie K-G; Walley, Nicole; Pena, Loren; McConkie-Rosell, Allyn; Jiang, Yong-Hui; Stong, Nicholas; Need, Anna C; Goldstein, David B

    2018-06-15

    Sixty to seventy-five percent of individuals with rare and undiagnosed phenotypes remain undiagnosed after exome sequencing (ES). With standard ES reanalysis resolving 10-15% of the ES negatives, further approaches are necessary to maximize diagnoses in these individuals. In 38 ES negative patients an individualized genomic-phenotypic approach was employed utilizing (1) phenotyping; (2) reanalyses of FASTQ files, with innovative bioinformatics; (3) targeted molecular testing; (4) genome sequencing (GS); and (5) conferring of clinical diagnoses when pathognomonic clinical findings occurred. Certain and highly likely diagnoses were made in 18/38 (47%) individuals, including identifying two new developmental disorders. The majority of diagnoses (>70%) were due to our bioinformatics, phenotyping, and targeted testing identifying variants that were undetected or not prioritized on prior ES. GS diagnosed 3/18 individuals with structural variants not amenable to ES. Additionally, tentative diagnoses were made in 3 (8%), and in 5 individuals (13%) candidate genes were identified. Overall, diagnoses/potential leads were identified in 26/38 (68%). Our comprehensive approach to ES negatives maximizes the ES and clinical data for both diagnoses and candidate gene identification, without GS in the majority. This iterative approach is cost-effective and is pertinent to the current conundrum of ES negatives.